Publications
-
Why Did You Post That GIF? Understanding the Relationship between User Identity and Self-Expression through GIFs on Social Media
In Proceedings of the ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW), 2024
-
Understanding the Linguistic Signature of Acts and Recalls of Online Racial Microaggressions through Interpretable NLP
In Proceedings of the ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW), 2024
In this work, we examine the linguistic signature of online racial microaggressions (acts) and how it differs from that of personal narratives recalling experiences of such aggressions (recalls) by Black social media users. We manually curate and annotate a corpus of acts and recalls from in-the-wild social media discussions, and verify labels with Black workshop participants. We leverage Natural Language Processing (NLP) and qualitative analysis on this data to classify (RQ1), interpret (RQ2), and characterize (RQ3) the language underlying acts and recalls of racial microaggressions in the context of racism in the U.S. Our findings show that neural language models (LMs) can classify acts and recalls with high accuracy (RQ1), with contextual words revealing themes that associate Blacks with objects that reify negative stereotypes (RQ2). Furthermore, overlapping linguistic signatures between acts and recalls serve functionally different purposes (RQ3), with broader implications for the current challenges facing content moderation systems on social media.
-
A Design Space for Intelligent and Interactive Writing Assistants
In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI), 2024
In our era of rapid technological advancement, the research landscape for writing assistants has become increasingly fragmented across various research communities. We seek to address this challenge by proposing a design space as a structured way to examine and explore the multidimensional space of intelligent and interactive writing assistants. Through a large community collaboration, we explore five aspects of writing assistants: task, user, technology, interaction, and ecosystem. Within each aspect, we define dimensions (i.e., fundamental components of an aspect) and codes (i.e., potential options for each dimension) by systematically reviewing 115 papers. Our design space aims to offer researchers and designers a practical tool to navigate, comprehend, and compare the various possibilities of writing assistants, and aid in the envisioning and design of new writing assistants.
-
Leveraging Prompt-Based Large Language Models: Predicting Pandemic Health Decisions and Outcomes Through Social Media Language
In Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI), 2024
We introduce a multi-step reasoning framework using prompt-based LLMs to examine the relationship between social media language patterns and trends in national health outcomes. Grounded in fuzzy-trace theory, which emphasizes the importance of “gists” of causal coherence in effective health communication, we introduce Role-Based Incremental Coaching (RBIC), a prompt-based LLM framework, to identify gists at scale. Using RBIC, we systematically extract gists from subreddit discussions opposing COVID-19 health measures (Study 1). We then track how these gists evolve across key events (Study 2) and assess their influence on online engagement (Study 3). Finally, we investigate how the volume of gists is associated with national health trends like vaccine uptake and hospitalizations (Study 4). Our work is the first to empirically link social media linguistic patterns to real-world public health trends, highlighting the potential of prompt-based LLMs in identifying critical online discussion patterns that can form the basis of public health communication strategies.
-
Escalated Police Stops of Black Men Are Linguistically and Psychologically Distinct in Their Earliest Moments
In Proceedings of the National Academy of Sciences, 2023
Across the United States, police chiefs, city officials, and community leaders alike have highlighted the need to de-escalate police encounters with the public. This concern about escalation extends from encounters involving use of force to routine car stops, where Black drivers are disproportionately pulled over. Yet, despite the calls for action, we know little about the trajectory of police stops or how escalation unfolds. In study 1, we use methods from computational linguistics to analyze police body-worn camera footage from 577 stops of Black drivers. We find that stops with escalated outcomes (those ending in arrest, handcuffing, or a search) diverge from stops without these outcomes in their earliest moments—even in the first 45 words spoken by the officer. In stops that result in escalation, officers are more likely to issue commands as their opening words to the driver and less likely to tell drivers the reason why they are being stopped. In study 2, we expose Black males to audio clips of the same stops and find differences in how escalated stops are perceived: Participants report more negative emotion, appraise officers more negatively, worry about force being used, and predict worse outcomes after hearing only the officer’s initial words in escalated versus non-escalated stops. Our findings show that car stops that end in escalated outcomes sometimes begin in an escalated fashion, with adverse effects for Black male drivers and, in turn, police–community relations.
-
Same Words, Different Meanings: Semantic Polarization in Broadcast Media Language Forecasts Polarization on Social Media Discourse
In Proceedings of the International AAAI Conference on Web and Social Media (ICWSM 2023), 2023
With the growth of online news over the past decade, empirical studies on political discourse and news consumption have focused on the phenomenon of filter bubbles and echo chambers. Yet recently, scholars have revealed limited evidence around the impact of such phenomena, leading some to argue that partisan segregation across news audiences cannot be fully explained by online news consumption alone and that the role of traditional legacy media may be as salient in polarizing public discourse around current events. In this work, we expand the scope of analysis to include both online and more traditional media by investigating the relationship between broadcast news media language and social media discourse. By analyzing a decade’s worth of closed captions (2 million speaker turns) from CNN and Fox News along with topically corresponding discourse from Twitter, we provide a novel framework for measuring semantic polarization between America’s two major broadcast networks to demonstrate how semantic polarization between these outlets has evolved (Study 1), peaked (Study 2), and influenced partisan discussions on Twitter (Study 3) across the last decade. Our results demonstrate a sharp increase in polarization in how topically important keywords are discussed between the two channels, especially after 2016, with overall highest peaks occurring in 2020. The two stations discuss identical topics in drastically distinct contexts in 2020, to the extent that there is barely any linguistic overlap in how identical keywords are contextually discussed. Further, we demonstrate at scale how such partisan division in broadcast media language significantly shapes semantic polarity trends on Twitter (and vice versa), empirically linking, for the first time, online discussions to the influence of televised media.
-
Cultural Differences in the Effects of Contextual Factors and Privacy Concerns on Users’ Privacy Decision on Social Networking Sites
Behaviour & Information Technology, 2022
Many social network sites (SNSs) have become available around the world and users’ online social networks increasingly include contacts from different cultures. However, there is a lack of investigation into the concrete cultural differences in the effects of contextual factors and privacy concerns on users’ privacy decisions on SNSs. The goal of this paper is to understand how contextual factors and privacy concerns differ across countries in their impact on privacy decisions, such as friend request decisions, information disclosure, and perceived risk. We performed a quantitative study through a large-scale online survey across the US, Korea, and China to model the relationships between contextual factors, privacy concerns, and privacy decisions. We find that the contextual influence and focus of privacy concerns vary between the individualistic and collectivistic countries in our sample. We suggest that multinational SNS service providers should consider the contextual factors and focus of privacy concerns specific to each country, and customise privacy designs and friend recommendation algorithms accordingly.
-
The Design of Online Environments (Political Hashtags) and the Quality of Democratic Discourse At-Scale
2020
Facilitating democratic discourse, or people’s ability to access factual information in service of thoughtful discussion of social issues, is critical for democracies to function properly. However, with the rise of online fake news, misinformation, and political extremism, it is becoming increasingly difficult to have civil conversations on the internet. As a first step to addressing this issue, scholars need to understand how the current design of online environments shapes people’s ability to respectfully engage across social and political differences. In this dissertation, I investigate how common social media design features, such as hashtags, directly impact the quality of democratic discourse at scale. Using natural language processing, statistics, and experimental design, I empirically demonstrate how linguistic behavior and the presence of political hashtags in online social media news articles impact the quality of discussions surrounding race, gender, and equality. Through my findings, I provide a theoretical examination of functionality and intertextuality as critical aspects of online design. Online design approaches that consider functionality alone tend to promote a digital public sphere that predominantly favors hashtag (or content) producers over non-users and passive content consumers. The sole emphasis on the functionality of design features drives frequency-driven research practices that prioritize discourse conditions for hashtag producers through volume-based definitions of discussion quality. Collectively, the research studies in this thesis are motivated by a desire to understand how online spaces can be better designed to foster interaction and discourse that can bridge rather than sharpen social differences.
Results from this dissertation research strongly indicate that scholars, designers, and engineers need to rethink and evaluate how current methodological approaches that prioritize the functionality of online design choices are limiting the way we understand the quality of democratic discourse on the internet. As a step in this direction, I invoke Kristeva’s notion of intertextuality to demonstrate how online design choices facilitate the power of language in which important social topics are discussed across networks.
-
Hashtag Burnout? A Control Experiment Investigating How Political Hashtags Shape Reactions to News Content
In Proceedings of the ACM on Human-Computer Interaction, 2019
Both hashtag activists and news organizations assume that trending political hashtags effectively capture the nowness of social issues that people care about [20]. In fact, news organizations with growing social media presence increasingly capitalize on the use of political hashtags in article headlines and social media news posts, a practice aimed at generating new readership through lightweight news consumption of content by linking a particular story to a broader topic [28]. However, response to political hashtags can be complicated, as demonstrated by the events surrounding #MeToo and #BlackLivesMatter. In fact, the semantic simplicity of political hashtags often belies the complexities around the question of who gets to participate [71], what intersectional identities are included or excluded from the hashtag [45], as well as how the meaning of the hashtag expands and drifts [10] depending on the context through which it is expressed. Over time, reports show increasing backlash [70, 73, 74] and polarization [21, 52, 66, 67, 70] against key issues embodied by political hashtags. In this vein, we assume that political hashtags affect how people make sense of and engage with media content. However, we do not know how the presence of political hashtags, which signals that a news story is related to a current social issue, influences the assumptions potential readers make about the social content of an article. In this work, we conducted a randomized control experiment to examine how the presence of political hashtags (particularly the most prevalently used #MeToo and #BlackLivesMatter) in social media news posts shapes reactions across a general audience (n=1979). Our findings show that compared to the control group, people shown news posts with political hashtags perceive the news topic as less socially important and are less motivated to know more about social issues related to the post. People also find the news more partisan and controversial when hashtags are included.
In fact, negative perception associated with political hashtags (partisan bias & topic controversy) mediates people’s motivation to further engage with the news content. High-intensity Facebook users and politically moderate participants perceive news with political hashtags as more partisan compared to posts excluding hashtags. There are also significant differences in discourse patterns between the hashtag and control groups around how politically moderate respondents engage with the news content in their comments.
-
Moral and Affective Differences in U.S. Immigration Policy Debate on Twitter
Computer Supported Cooperative Work (CSCW), 2019
Understanding ideological conflict has been a topic of interest in CSCW, for example in Value Sensitive Design research. More specifically, understanding ideological conflict is important for studying social media platforms like Twitter, which provide the ability for people to freely express their thoughts and opinions on contentious political events. In this work, we examine Twitter data to understand the moral, affective, and cognitive differences in language use between two opposing sides of the political debate over immigration-related issues in the United States in the year since the 2016 presidential election. In total, we analyzed and compared the language of 45,045 pro-immigration tweets and 11,213 anti-immigration tweets spread across this period. Drawing on Moral Foundations Theory, which is used to understand ideological conflict, we found pro-immigration tweets to contain more language associated with moral foundations of harm, fairness, and loyalty. Anti-immigration tweets contained more language associated with moral foundations of authority, more words associated with cognitive rigidity, and more 3rd person pronouns and negative emotion. We discuss the implications of our research for political communication over social media, and for incorporating Moral Foundations Theory into other CSCW research. We discuss the potential application of this theory for Value Sensitive Design research.
-
Differences in Online Privacy & Security Attitudes Based on Economic Living Standards: A Global Study of 24 Countries
In Proceedings of the European Conference on Information Systems (ECIS), 2018
This work explores online privacy and security attitudes from 24,143 individuals across 24 countries with diverse economic living standards. By using k-mode analysis, we identified three distinct profiles based on similarity in Internet security and privacy attitudes measured by 83 items. By comparing the aggregated dissimilarity measures between each respondent and the centroid values of the three profiles at the country level, we assigned each country to its best-fitting privacy profile. We found significant differences in GDP per capita across the profiles, from profile 1 (highest GDP) to profile 3 (lowest). People in profiles with higher GDP per capita have significantly greater privacy concerns in relation to information being monitored or bought and sold. These individuals are also more reluctant to accept government surveillance of online communication, as well as less likely to agree that governments should work with other public and private entities to develop online security laws. As economic living standards improve, the proportion of individuals increases in profile 1, decreases in profile 2, and drops most rapidly in profile 3. To the best of our knowledge, this is the first study that systematically examines country-level privacy in relation to a national economic variable using GDP per capita.
-
Fostering Civil Discourse Online: Linguistic Behavior in Comments of #MeToo Articles across Political Perspectives
In Proceedings of the ACM on Human-Computer Interaction, 2018
Linguistic style and affect shape how users perceive and assess political content on social media. Using linguistic methods to compare political discourse on far-left, mainstream, and alt-right news articles covering the #MeToo movement, we reveal rhetorical similarities and differences in commenting behavior across the political spectrum. We employed natural language processing techniques and qualitative methods on a corpus of approximately 30,000 Facebook comments from three politically distinct news journals. Our findings show that commenting behavior reflects how social movements are framed and understood within a particular political orientation. Surprisingly, these data reveal that the structural patterns of discourse among commenters from the two alternative news sites are similar in terms of their relationship to those from the mainstream, exhibiting polarization, generalization, and othering of perspectives in political conversation. These data have implications for understanding the possibility for civil discourse in online venues and the role that commenting behavior on polarizing media sources plays in undermining such discourse.
-
Class Confessions: Restorative Properties in Online Experiences of Socioeconomic Stigma
In Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems, 2017
In this paper, we examine stigma related to class identity online through an empirical examination of Elite University Class Confessions (EUCC). EUCC is an online space that includes a Facebook page and a surrounding sociotechnical ecosystem. It is a community of, for, and about low-income and first-generation students at an elite university. By bringing in a community that learns and engages with users’ socioeconomic struggles, EUCC engenders unique restorative properties for students experiencing class stigma. EUCC’s restorative properties foster new ways of understanding one’s stigmatized identity through meaning-making interactions in a networked sociotechnical system. We discuss how EUCC’s design shapes the nature of user interactions around class stigma, and explore in depth how people experience stigma differently through the restorative properties of EUCC.
Workshop
-
Real Memes In-The-Wild: Explainable Classification of Hateful vs. Non-Hateful Memes
In Proceedings of the Combating Toxicity, Harassment, and Abuse in Online Social Spaces: A Workshop at CHI 2023, 2023
-
ToxVis: Enabling Interpretability of Implicit vs. Explicit Toxicity Detection Models with Interactive Visualization
arXiv, 2023
The rise of hate speech on online platforms has led to an urgent need for effective content moderation. However, the subjective and multi-faceted nature of hateful online content, including implicit hate speech, poses significant challenges to human moderators and content moderation systems. To address this issue, we developed ToxVis, a visually interactive and explainable tool for classifying hate speech into three categories: implicit, explicit, and non-hateful. We fine-tuned two transformer-based models using RoBERTa, XLNET, and GPT-3 and used deep learning interpretation techniques to provide explanations for the classification results. ToxVis enables users to input potentially hateful text and receive a classification result along with a visual explanation of which words contributed most to the decision. By making the classification process explainable, ToxVis provides a valuable tool for understanding the nuances of hateful content and supporting more effective content moderation. Our research contributes to the growing body of work aimed at mitigating the harms caused by online hate speech and demonstrates the potential for combining state-of-the-art natural language processing models with interpretable deep learning techniques to address this critical issue. Finally, ToxVis can serve as a resource for content moderators, social media platforms, and researchers working to combat the spread of hate speech online.
-
Quality of Democratic Discourse in the Age of Political Hashtags and Social Media News Consumption
In Conference Companion Publication of the 2019 on Computer Supported Cooperative Work and Social Computing, 2019
Whether through television, newspapers, or increasingly through Social Networking Sites (SNS), journalistic coverage of current events has long played a significant role in mediating knowledge and information to the public. The platforms and channels through which news is produced and consumed shape how the public talks about current issues, exemplifying the critical link between democratic discourse and the press. However, with the advent of social media, the display of online news content has changed considerably over the years. This implies that the conditions and avenues through which audiences make sense of mediated politics through news have possibly changed as well. This is the premise that motivates my work. In my dissertation, I examine how social media news consumption impacts the viability of online political deliberation around news content. Specifically, I investigate how civil discourse is shaped in relation to political hashtags in the headlines and texts of social media news posts. I use both qualitative and computational (natural language processing) methods on publicly available social media news comments and survey data collected through large-scale experiments.
-
Intelligent Agents in Everyday Settings: Leveraging a Multi-Methods Approach
In 2018 ACM CHI Conference on Human Factors in Computing Systems (Workshop), 2018
Conversational Agents (CAs) or Intelligent Personal Assistants (IPAs) (e.g., Apple’s Siri, Microsoft’s Cortana, Amazon’s Alexa, and Google’s Google Assistant) are voice-based interfaces designed for tasks in everyday life, including retrieval of information (e.g., weather, traffic, news), streaming of music, online shopping, controlling of home appliances, and voice calls within the home and automobiles. Continuous enhancements of their natural language processing abilities, seamless setup of miniaturized hardware, and large-scale cloud-based infrastructures render CAs as unobtrusive, artificially intelligent voice sensors. With CAs rapidly making their way into the home market, the social implications remain unclear. Some product companies have released open-source software platforms that allow third-party developers and the general public to contribute software towards the growth of CAs. However, research around user interaction with CAs in social settings is still at a nascent stage. In this workshop paper, we unpack the methods used in our ongoing work on people’s social interactions with CAs in order to generate discussion around how the research community can leverage various methodologies using both qualitative and quantitative techniques.
-
Contextual Impact on SNS Users’ Privacy Decisions: A Cross-Cultural Study
In 2018 ACM CHI Conference on Human Factors in Computing Systems (Workshop), 2018
Social network users with different cultural backgrounds have different privacy attitudes and behaviors. This study explores the mechanisms behind the cultural differences in privacy decisions. The findings have implications for customizing privacy technologies in different cultures.
-
Privacy Norms in the Context of Connected & Self-Driving Cars
In Computer Supported Cooperative Work (CSCW) Networked Privacy Workshop, 2017
The upcoming transition to self-driving cars could lead to a seismic shift in society, one that affects industry practices, regulation landscapes, as well as personal decision-making and social norms around privacy. Major tech companies and traditional auto manufacturers have started working together to conceive an optimal regulatory environment for autonomous vehicles. However, the issue of privacy in the process of collecting, managing, and using data generated from self-driving and connected cars remains one of the biggest challenges yet to be solved. In this position paper, I highlight key privacy challenges and issues in the context of connected and self-driving cars.