Deb Roy

ORCID: 0000-0002-2780-4768

  • MIT, USA

According to our database, Deb Roy authored at least 154 papers between 1996 and 2025.

Bridging Context Gaps: Enhancing Comprehension in Long-Form Social Conversations Through Contextualized Excerpts.
Proceedings of the 31st International Conference on Computational Linguistics, 2025

Anonymization of Voices in Spaces for Civic Dialogue: Measuring Impact on Empathy, Trust, and Feeling Heard.
Proc. ACM Hum. Comput. Interact., 2024

LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users.
CoRR, 2024

Prompting Large Language Models with Audio for General-Purpose Speech Summarization.
CoRR, 2024

PersonaLLM: Investigating the Ability of Large Language Models to Express Personality Traits.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

SenseMate: An Accessible and Beginner-Friendly Human-AI Platform for Qualitative Data Analysis.
Proceedings of the 29th International Conference on Intelligent User Interfaces, 2024

On the Relationship between Truth and Political Bias in Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Bridging Dictionary: AI-Generated Dictionary of Partisan Language Use.
Proceedings of the Companion Publication of the 2024 Conference on Computer-Supported Cooperative Work and Social Computing, 2024

AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism.
Proceedings of the Companion Publication of the 2024 Conference on Computer-Supported Cooperative Work and Social Computing, 2024

Topic Detection and Tracking with Time-Aware Document Embeddings.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Fora: A corpus and framework for the study of facilitated dialogue.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

In Pursuit of Constructive Communication: Designing Tools to Support Development of Constructive Communication Metrics.
Proceedings of the Companion Publication of the 2024 ACM Designing Interactive Systems Conference, 2024

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI.
CoRR, 2023

Polarized Speech on Online Platforms.
CoRR, 2023

ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings.
CoRR, 2023

Language Models Trained on Media Diets Can Predict Public Opinion.
CoRR, 2023

Redrawing attendance boundaries to promote racial and ethnic diversity in elementary schools.
CoRR, 2023

All A-board: Sharing Educational Data Science Research with School Districts.
Proceedings of the Tenth ACM Conference on Learning @ Scale, 2023

End-to-End Zero-Shot Voice Conversion with Location-Variable Convolutions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Divergences in Following Patterns between Influential Twitter Users and Their Audiences across Dimensions of Identity.
Proceedings of the Seventeenth International AAAI Conference on Web and Social Media, 2023

M-sense: Modeling Narrative Structure in Short Personal Narratives Using Protagonist's Mental Representations.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

End-to-End Zero-Shot Voice Style Transfer with Location-Variable Convolutions.
CoRR, 2022

Child-driven, machine-guided: Automatic scaffolding of constructionist-inspired early literacy play.
Comput. Educ., 2022

Technology-assisted coaching can increase engagement with learning technology at home and caregivers' awareness of it.
Comput. Educ., 2022

Annotating the Tweebank Corpus on Named Entity Recognition and Building NLP Models for Social Media Analysis.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Perspective-Taking to Reduce Affective Polarization on Social Media.
Proceedings of the Sixteenth International AAAI Conference on Web and Social Media, 2022

Engaging Politically Diverse Audiences on Social Media.
Proceedings of the Sixteenth International AAAI Conference on Web and Social Media, 2022

Real Talk, Real Listening, Real Change.
Proceedings of the International Conference on Multimodal Interaction, 2022

CommunityLM: Probing Partisan Worldviews from Language Models.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Designing building blocks for open-ended early literacy software.
Int. J. Child Comput. Interact., 2021

Interpretable Multi-Modal Hate Speech Detection.
CoRR, 2021

Modeling Human Motives and Emotions from Personal Narratives Using External Knowledge And Entity Tracking.
Proceedings of the WWW '21: The Web Conference 2021, 2021

The Structure of Toxic Conversations on Twitter.
Proceedings of the WWW '21: The Web Conference 2021, 2021

Balanced Influence Maximization in the Presence of Homophily.
Proceedings of the WSDM '21, 2021

Lifelong Knowledge-Enriched Social Event Representation Learning.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Constructing Embodied Algebra by Sketching.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Keeper: A Synchronous Online Conversation Environment Informed by In-Person Facilitation Practices.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Video SemNet: Memory-Augmented Video Semantic Network.
CoRR, 2020

Are Visual Explanations Useful? A Case Study in Model-in-the-Loop Prediction.
CoRR, 2020

DAPPER: Learning Domain-Adapted Persona Representation Using Pretrained BERT and External Memory.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

Keeper: An Online Synchronous Conversation Environment Informed by In-Person Facilitation Practices.
Proceedings of the Companion Publication of the 2020 ACM Conference on Computer Supported Cooperative Work and Social Computing, 2020

Spelling their pictures: the role of visual scaffolds in an authoring app for young children's literacy and creativity.
Proceedings of the 19th ACM International Conference on Interaction Design and Children, 2020

Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types.
Proceedings of the First Joint Workshop on Narrative Understanding, Storylines, and Events, 2020

I'm Lonely. Who should I talk to?
Proceedings of the Companion of The 2019 World Wide Web Conference, 2019

Generating Black-Box Adversarial Examples for Text Classifiers Using a Deep Reinforced Model.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2019

RadioTalk: A Large-Scale Corpus of Talk Radio Transcripts.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Context is Key: New Approaches to Neural Coherence Modeling.
CoRR, 2018

Me, My Echo Chamber, and I: Introspection on Social Media Polarization.
Proceedings of the 2018 World Wide Web Conference on World Wide Web, 2018

Child-Coach-Parent Network for Early Literacy Learning.
Proceedings of the Rethinking learning in the digital age: Making the Learning Sciences count, 2018

Learning Personas from Dialogue with Attentive Memory Networks.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Children-Centered Sensing in Early Childhood Classrooms.
Proceedings of the Extended Abstracts of the 2018 CHI Conference on Human Factors in Computing Systems, 2018

Light it up: using paper circuitry to enhance low-fidelity paper prototypes for children.
Proceedings of the 17th ACM Conference on Interaction Design and Children, 2018

Rumor Gauge: Predicting the Veracity of Rumors on Twitter.
ACM Trans. Knowl. Discov. Data, 2017

Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands.
CoRR, 2017

Auris: creating affective virtual spaces from music.
Proceedings of the 23rd ACM Symposium on Virtual Reality Software and Technology, 2017

TweetVista: An AI-Powered Interactive Tool for Exploring Conversations on Twitter.
Proceedings of the Companion Publication of the 22nd International Conference on Intelligent User Interfaces, 2017

Mapping Twitter Conversation Landscapes.
Proceedings of the Eleventh International Conference on Web and Social Media, 2017

Nasty, Brutish, and Short: What Makes Election News Popular on Twitter?
Proceedings of the Eleventh International Conference on Web and Social Media, 2017

Audio-Visual Sentiment Analysis for Learning Emotional Arcs in Movies.
Proceedings of the 2017 IEEE International Conference on Data Mining, 2017

DeepSpace: Mood-Based Image Texture Generation for Virtual Reality from Music.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Bilingual SpeechBlocks: Investigating How Bilingual Children Tinker with Words in English and Spanish.
Proceedings of the Annual Symposium on Computer-Human Interaction in Play, 2017

SpeechBlocks: A Constructionist Early Literacy App.
Proceedings of the 2017 Conference on Interaction Design and Children, 2017

Twitter Demographic Classification Using Deep Multi-modal Multi-task Learning.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Human Atlas: A Tool for Mapping Social Networks.
Proceedings of the 25th International Conference on World Wide Web, 2016

Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

DeepStance at SemEval-2016 Task 6: Detecting Stance in Tweets Using Character and Word-Level CNNs.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Tweet Acts: A Speech Act Classifier for Twitter.
Proceedings of the Tenth International Conference on Web and Social Media, 2016

A Semi-Automatic Method for Efficient Detection of Stories on Social Media.
Proceedings of the Tenth International Conference on Web and Social Media, 2016

Automatic Detection and Categorization of Election-Related Tweets.
Proceedings of the Tenth International Conference on Web and Social Media, 2016

Tracking the Yak: An Empirical Study of Yik Yak.
Proceedings of the Tenth International Conference on Web and Social Media, 2016

Measuring Responsiveness in the Online Public Sphere for the 2016 U.S. Election: Concepts.
CoRR, 2015

Enhanced Twitter Sentiment Classification Using Contextual Information.
Proceedings of the 6th Workshop on Computational Approaches to Subjectivity, 2015

Digital Stylometry: Linking Profiles Across Social Networks.
Proceedings of the Social Informatics - 7th International Conference, 2015

A Human-Machine Collaborative System for Identifying Rumors on Twitter.
Proceedings of the IEEE International Conference on Data Mining Workshop, 2015

Grounding language models in spatiotemporal context.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

An Automatic Child-Directed Speech Detector for the Study of Child Language Development.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A portable audio/video recorder for longitudinal study of child development.
Proceedings of the International Conference on Multimodal Interaction, 2012

Relating Activity Contexts to Early Word Learning in Dense Longitudinal Data.
Proceedings of the 34th Annual Meeting of the Cognitive Science Society, 2012

Understanding Speech in Interactive Narratives with Crowdsourced Data.
Proceedings of the Eighth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2012

Extracting aspects of determiner meaning from dialogue in a virtual world environment.
Proceedings of the Ninth International Conference on Computational Semantics, 2011

An Interface for Visualization and Exploration of Spatial Distributions.
Proceedings of the Scalable Integration of Analytics and Visualization, 2011

An immersive system for browsing and visualizing surveillance video.
Proceedings of the 18th International Conference on Multimedia 2010, 2010

Learning Meanings of Words and Constructions, Grounded in a Virtual Game.
Proceedings of the Semantic Approaches in Natural Language Processing: Proceedings of the 10th Conference on Natural Language Processing, 2010

Grounding Verbs of Motion in Natural Language Commands to Robots.
Proceedings of the Experimental Robotics, 2010

Natural language command of an autonomous micro-air vehicle.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Automatic estimation of transcription accuracy and difficulty.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Semi-automatic task recognition for interactive narratives with EAT & RUN.
Proceedings of the Intelligent Narrative Technologies III Workshop, 2010

Grounding spatial language for video search.
Proceedings of the 12th International Conference on Multimodal Interfaces / 7th International Workshop on Machine Learning for Multimodal Interaction, 2010

Exploiting feature dynamics for active object recognition.
Proceedings of the 11th International Conference on Control, 2010

Toward understanding natural language directions.
Proceedings of the 5th ACM/IEEE International Conference on Human Robot Interaction, 2010

Capturing and generating social behavior with the restaurant game.
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

Toward an interleaved model of actions and words in social simulation.
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

Behavior Compilation for AI in Games.
Proceedings of the Sixth AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, 2010

Semi-Automated Dialogue Act Classification for Situated Social Agents in Games.
Proceedings of the Agents for Games and Simulations II, 2010

Fast transcription of unstructured audio recordings.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

New horizons in the study of child language acquisition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Grounding spatial prepositions for video search.
Proceedings of the 11th International Conference on Multimodal Interfaces, 2009

Towards surveillance video search by natural language query.
Proceedings of the 8th ACM International Conference on Image and Video Retrieval, 2009

A human-machine collaborative approach to tracking human movement in multi-camera video.
Proceedings of the 8th ACM International Conference on Image and Video Retrieval, 2009

Automatic learning and generation of social behavior from collective human gameplay.
Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), 2009

The Restaurant Game: Learning Social Behavior and Language from Thousands of Players Online.
J. Game Dev., 2008

Object schemas for grounding language in a responsive robot.
Connect. Sci., 2008

Object schemas for responsive robotic language use.
Proceedings of the 3rd ACM/IEEE international conference on Human robot interaction, 2008

Grounded Language Modeling for Automatic Speech Recognition of Sports Video.
Proceedings of the ACL 2008, 2008

Interpretation of Spatial Language in a Map Navigation Task.
IEEE Trans. Syst. Man Cybern. Part B, 2007

Situated Language Understanding as Filtering Perceived Affordances.
Cogn. Sci., 2007

Situated Models of Meaning for Sports Video Retrieval.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Temporal feature induction for baseball highlight classification.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Unsupervised content-based indexing for sports video retrieval.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Unsupervised content-based indexing of sports video.
Proceedings of the 9th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2007

Grounding Language in Spatial Routines.
Proceedings of the Control Mechanisms for Spatial Knowledge Processing in Cognitive / Intelligent Systems, 2007

Representing Intentions in a Cognitive Model of Language Acquisition: Effects of Phrase Structure on Situated Verb Learning.
Proceedings of the Intentions in Intelligent Systems, 2007

Mining temporal patterns of movement for video content classification.
Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2006

Grounded Situation Models for Robots: Where words and percepts meet.
Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006

Grounded Situation Models: Where Words and Percepts Meet.
Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006

Modeling Interactions from Email Communication.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Spatial routines for a simulated speech-controlled vehicle.
Proceedings of the 1st ACM SIGCHI/SIGART Conference on Human-Robot Interaction, 2006

Towards situated speech understanding: visual context priming of language models.
Comput. Speech Lang., 2005

Connecting language to the world.
Artif. Intell., 2005

Semiotic schemas: A framework for grounding language in action and perception.
Artif. Intell., 2005

Learning Influence among Interacting Markov Chains.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Probabilistic grounding of situated speech using plan recognition and reference resolution.
Proceedings of the 7th International Conference on Multimodal Interfaces, 2005

Intentional Context in Situated Natural Language Learning.
Proceedings of the Ninth Conference on Computational Natural Language Learning, 2005

Speaking with your Sidekick: Understanding Situated Speech in Computer Role Playing Games.
Proceedings of the First Artificial Intelligence and Interactive Digital Entertainment Conference, 2005

Mental imagery for a conversational robot.
IEEE Trans. Syst. Man Cybern. Part B, 2004

Grounded Semantic Composition for Visual Scenes.
J. Artif. Intell. Res., 2004

Visual Memory Augmentation: Using Eye Gaze as an Attention Filter.
Proceedings of the 8th International Symposium on Wearable Computers (ISWC 2004), 31 October, 2004

Elvis: situated speech and gesture understanding for a robotic chandelier.
Proceedings of the 6th International Conference on Multimodal Interfaces, 2004

Grounding Language in the World: Signs, Schemas, and Meaning.
Proceedings of the Intersection of Cognitive Science and Robotics: From Interfaces to Intelligence, 2004

Grounded spoken language acquisition: experiments in word learning.
IEEE Trans. Multim., 2003

Coupling perception and simulation: steps towards conversational robotics.
Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, Nevada, USA, October 27, 2003

A visual context-aware multimodal system for spoken language processing.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A visually grounded natural language interface for reference to spatial scenes.
Proceedings of the 5th International Conference on Multimodal Interfaces, 2003

Augmenting user interfaces with adaptive speech commands.
Proceedings of the 5th International Conference on Multimodal Interfaces, 2003

Learning visually grounded words and syntax for a scene description task.
Comput. Speech Lang., 2002

Learning words from sights and sounds: a computational model.
Cogn. Sci., 2002

A trainable spoken language understanding system for visual object selection.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A system that learns to describe objects in visual scenes.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Towards Visually-Grounded Spoken Language Acquisition.
Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002

Grounded speech communication.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Learning from Multimodal Observations.
Proceedings of the 2000 IEEE International Conference on Multimedia and Expo, 2000

Integration of speech and vision using mutual information.
Proceedings of the IEEE International Conference on Acoustics, 2000

Perceptual Intelligence: learning gestures and words for individualized, adaptive interfaces.
Proceedings of the Human-Computer Interaction: Ergonomics and User Interfaces, 1999

Learning words from natural audio-visual input.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Word learning in a multimodal environment.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

A Phoneme Probability Display for Individuals with Hearing Disabilities.
Proceedings of the Third International ACM Conference on Assistive Technologies, 1998

Speaker indexing using neural network clustering of vowel spectra.
Int. J. Speech Technol., 1997

Toco the toucan: a synthetic character guided by perception, emotion, and story.
Proceedings of the ACM SIGGRAPH 97 Visual Proceedings: The art and interdisciplinary programs of SIGGRAPH '97, 1997

Speaker identification based text to audio alignment for an audio retrieval system.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Using Acoustic Structure in a Hand-Held Audio Playback Device.
IBM Syst. J., 1996

Automatic Spoken Affect Classification and Analysis.
Proceedings of the 2nd International Conference on Automatic Face and Gesture Recognition (FG '96), 1996

NewsComm: A Hand-Held Interface for Interactive Access to Structured Audio.
Proceedings of the Conference on Human Factors in Computing Systems: Common Ground, 1996
