Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP

NAACL 2024 Tutorial, Mexico City
June 16, 2024
Venue: Doña Adelita, Time: 1400 - 1730 (CST)

Welcome to our NAACL 2024 tutorial on "Text to Context: Contextualizing Language with Humans, Groups, and Communities for Socially Aware NLP".

The tutorial will introduce research that has successfully integrated personal and social factors into traditional NLP as a foundation for cutting-edge research in the field. Unique aspects of this tutorial will include:

  1. interdisciplinary methods woven together into a coherent framework for human-centered NLP,
  2. theory and domain expertise from an interdisciplinary team of presenters, and
  3. hands-on demonstrations that facilitate immediate uptake and application by attendees.



Session Overview

1. Introduction to Socially Aware NLP

The 3-hour tutorial will begin with a brief overview of the entire session, organized from the individual to the societal levels of context. We will also introduce the key concepts in behavioral and social science that motivate the techniques discussed in the subsequent sections.

Presenters: Adithya V Ganesan

Materials: Slides

2. Personal Context in NLP

In this session, we will review methods for producing user representations from language, ranging from simple n-gram features to advanced techniques such as Latent Dirichlet Allocation, Word2Vec, and Transformer models. These language-based user representations become much more effective when integrated with user factors for analyses. We will showcase different user factor adaptation methods for merging human and social factors with language representations. While these methods produce user representations by taking a person's full picture into account, it is also pivotal to preserve the privacy of the individuals. Thus, we will also review works demonstrating the successful implementation of human-level NLP systems that incorporate differential privacy to ensure secure and privacy-preserving NLP practices.
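As a rough illustration of the simplest end of that range, the sketch below builds a user-level n-gram representation by pooling all of a user's messages and computing relative frequencies. It is not taken from the tutorial materials: the function names, the whitespace tokenizer, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def user_ngram_features(messages_by_user, n=1):
    """Map each user to the relative frequency of each n-gram they used.

    messages_by_user: dict of user id -> list of message strings
    (a hypothetical input format chosen for this sketch).
    """
    features = {}
    for user, messages in messages_by_user.items():
        counts = Counter()
        for text in messages:
            # Naive lowercase + whitespace tokenization, for illustration only.
            counts.update(ngrams(text.lower().split(), n))
        total = sum(counts.values())
        # Normalize by the user's total n-gram count so that prolific and
        # infrequent writers end up on a comparable scale.
        features[user] = {g: c / total for g, c in counts.items()} if total else {}
    return features


# Toy data, invented for this example.
messages_by_user = {
    "user_a": ["good morning", "good day"],
    "user_b": ["bad day"],
}
user_features = user_ngram_features(messages_by_user, n=1)
```

Each user ends up with a single vector regardless of how many messages they wrote, which is the unit that user factors would then be merged with.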

Presenters: Adithya V Ganesan (PhD Student), Swanie Juhng (PhD Student), H. Andrew Schwartz (Faculty)

Materials: Slides

3. Individuals With Agents

This session will begin by considering the "generator" of language and its mathematical formulation. Next, we will look at how individuals or personas make their way into dialogue and conversational AI systems. Finally, we will introduce metrics aimed at assessing conversational AI on an individual level and show how they contrast with the more traditional automatic dialogue metrics.

Presenters: Nikita Soni (PhD Student), João Sedoc (Faculty)

Materials: Slides

4. Group Context in NLP

We will cover methods that treat individuals and groups as interacting entities, where an individual's interactions within a group add context to documents. Drawing inspiration from adjacent fields, particularly computational social science, we will show how analyzing the language of user-associated groups yields insights into the context of an individual, the evolving dynamics of group language usage over time, and its influence on individual language patterns. Using code demonstrations and references, we will discuss how these methods can enrich multiple traditional NLP tasks.

Presenters: Vasudha Varadarajan (PhD Student), Ryan L. Boyd (Faculty)

Materials: Slides

5. Community Context in NLP

This tutorial session will cover the basics of creating language estimates of spatial communities (e.g., U.S. states or provinces in China). We will cover topics such as aggregation, as in how to move from documents to communities through people, selection biases, ecological fallacies (i.e., language patterns at the individual level do not always hold at the community level), and cultural considerations. Participants in this session will be provided with a code notebook to experiment with on their own to examine the gains from proper methods for handling community-level text.
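To make the idea of aggregating through people concrete, here is a minimal sketch that is separate from the notebook mentioned above. It assumes each document already has a numeric language score and averages documents within each person first, then people within each community. The function name, the `min_users` threshold, and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def community_estimates(records, min_users=1):
    """Aggregate document scores to communities through people.

    records: iterable of (community, user, score) tuples, one per document
    (an assumed input format for this sketch).
    min_users: hypothetical threshold; communities with fewer users are dropped.
    """
    # Step 1: documents -> people (average each person's documents).
    per_user = defaultdict(list)
    for community, user, score in records:
        per_user[(community, user)].append(score)

    # Step 2: people -> communities (each person counts once).
    per_community = defaultdict(list)
    for (community, _user), scores in per_user.items():
        per_community[community].append(mean(scores))

    return {c: mean(u) for c, u in per_community.items() if len(u) >= min_users}


# Toy data, invented for this example: one prolific user and one quiet user.
records = [
    ("NY", "u1", 1.0), ("NY", "u1", 1.0), ("NY", "u1", 1.0),
    ("NY", "u2", 0.0),
]
person_level = community_estimates(records)       # each person weighted equally
message_level = mean(s for _, _, s in records)    # each document weighted equally
```

In this toy case the person-level estimate is 0.5 while the document-level average is 0.75, because the prolific user dominates when documents are pooled directly.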

Presenters: Siddharth Mangalik (PhD Student), Salvatore Giorgi (Faculty)

Materials: Slides

6. Closing Note

We will end the tutorial by briefly summarizing the topics covered across all the sessions, distinguishing the situations for which each method is appropriate, and concluding with a perspective on the future of human-centered NLP.

Presenters: H. Andrew Schwartz


People

Ryan L Boyd

Stony Brook University, USA

Adithya V Ganesan

Stony Brook University, USA

Salvatore Giorgi

University of Pennsylvania & National Institute on Drug Abuse

Swanie Juhng

Stony Brook University, USA

Siddharth Mangalik

Stony Brook University, USA

H. Andrew Schwartz

Stony Brook University, USA

João Sedoc

New York University, USA

Nikita Soni

Stony Brook University, USA

Vasudha Varadarajan

Stony Brook University, USA