Paul Pu Liang

Orcid: 0000-0001-7768-3610

According to our database, Paul Pu Liang authored at least 96 papers between 2017 and 2024.

Bibliography

2024
Foundations & Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions.
ACM Comput. Surv., October, 2024

OS-ATLAS: A Foundation Action Model for Generalist GUI Agents.
CoRR, 2024

VideoWebArena: Evaluating Long Context Multimodal Agents with Video Understanding Web Tasks.
CoRR, 2024

Progressive Compositionality In Text-to-Image Generative Models.
CoRR, 2024

TeaserGen: Generating Teasers for Long Documentaries.
CoRR, 2024

Quantitative Insights into Language Model Usage and Trust in Academia: An Empirical Study.
CoRR, 2024

MultiMed: Massively Multimodal and Multitask Medical Understanding.
CoRR, 2024

IoT-LM: Large Multisensory Language Models for the Internet of Things.
CoRR, 2024

HEMM: Holistic Evaluation of Multimodal Foundation Models.
CoRR, 2024

Foundations of Multisensory Artificial Intelligence.
CoRR, 2024

Semantically Corrected Amharic Automatic Speech Recognition.
CoRR, 2024

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MMoE: Enhancing Multimodal Models with Mixtures of Multimodal Interaction Experts.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

FLHetBench: Benchmarking Device and State Heterogeneity in Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Modeling Dense Multimodal Interactions Between Biological Pathways and Histology for Survival Prediction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning.
Trans. Mach. Learn. Res., 2023

MultiZoo and MultiBench: A Standardized Toolkit for Multimodal Deep Learning.
J. Mach. Learn. Res., 2023

MMOE: Mixture of Multimodal Interaction Experts.
CoRR, 2023

MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things.
CoRR, 2023

Comparative Knowledge Distillation.
CoRR, 2023

MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning.
CoRR, 2023

Factorized Contrastive Learning: Going Beyond Multi-view Redundancy.
CoRR, 2023

Quantifying & Modeling Feature Interactions: An Information Decomposition Framework.
CoRR, 2023

Read and Reap the Rewards: Learning to Play Atari with the Help of Instruction Manuals.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Localized Symbolic Knowledge Distillation for Visual Commonsense Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Factorized Contrastive Learning: Going Beyond Multi-view Redundancy.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Tutorial on Multimodal Machine Learning: Principles, Challenges, and Open Questions.
Proceedings of the International Conference on Multimodal Interaction, 2023

Multimodal Fusion Interactions: A Study of Human and Automatic Quantification.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

HIINT: Historical, Intra- and Inter- personal Dynamics Modeling with Cross-person Memory Transformer.
Proceedings of the 25th International Conference on Multimodal Interaction, 2023

MultiViz: Towards Visualizing and Understanding Multimodal Models.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Face-to-Face Contrastive Learning for Social Intelligence Question-Answering.
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

Difference-Masking: Choosing What to Mask in Continued Pretraining.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

MultiViz: Towards User-Centric Visualizations and Interpretations of Multimodal Models.
Proceedings of the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023

Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Demystify the Gravity Well in the Optimization Landscape (Student Abstract).
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Foundations and Recent Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions.
CoRR, 2022

Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides.
CoRR, 2022

Face-to-Face Contrastive Learning for Social Intelligence Question-Answering.
CoRR, 2022

MultiViz: An Analysis Benchmark for Visualizing and Understanding Multimodal Models.
CoRR, 2022

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code.
CoRR, 2022

Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness.
CoRR, 2022

HighMMT: Towards Modality and Task Generalization for High-Modality Representation Learning.
CoRR, 2022

Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

PACS: A Dataset for Physical Audiovisual CommonSense Reasoning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations.
Proceedings of the AIES '22: AAAI/ACM Conference on AI, Ethics, and Society, Oxford, United Kingdom, May 19, 2022

2021
Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration.
CoRR, 2021

Understanding the Tradeoffs in Client-Side Privacy for Speech Recognition.
CoRR, 2021

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Towards Understanding and Mitigating Social Biases in Language Models.
Proceedings of the 38th International Conference on Machine Learning, 2021

Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies.
Proceedings of the 9th International Conference on Learning Representations, 2021

Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Deep Neural Network for Robust Modulation Classification Under Uncertain Noise Conditions.
IEEE Trans. Veh. Technol., 2020

Foundations of Multimodal Co-learning.
Inf. Fusion, 2020

Multimodal Privacy-preserving Mood Prediction from Mobile Data: A Preliminary Study.
CoRR, 2020

An Investigation of how Label Smoothing Affects Generalization.
CoRR, 2020

Anchor & Transform: Learning Sparse Representations of Discrete Objects.
CoRR, 2020

Learning Not to Learn in the Presence of Noisy Labels.
CoRR, 2020

Think Locally, Act Globally: Federated Learning with Local and Global Representations.
CoRR, 2020

CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Diverse and Admissible Trajectory Forecasting Through Multimodal Context Understanding.
Proceedings of the Computer Vision - ECCV 2020, 2020

On Emergent Communication in Competitive Multi-Agent Teams.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Towards Debiasing Sentence Representations.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Factorized Multimodal Transformer for Multimodal Sequential Learning.
CoRR, 2019

Variational Auto-Decoder.
CoRR, 2019

Deep Gamblers: Learning to Abstain with Portfolio Theory.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Strong and Simple Baselines for Multimodal Utterance Embeddings.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Learning Factorized Multimodal Representations.
Proceedings of the 7th International Conference on Learning Representations, 2019

Social-IQ: A Question Answering Benchmark for Artificial Social Intelligence.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multimodal Transformer for Unaligned Multimodal Language Sequences.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Found in Translation: Learning Robust Joint Representations by Cyclic Translations between Modalities.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Label-Assisted Transmission for Short Packet Communications: A Machine Learning Approach.
IEEE Trans. Veh. Technol., 2018

Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis.
CoRR, 2018

Multimodal Local-Global Ranking Fusion for Emotion Recognition.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

A Machine Learning Approach to MIMO Communications.
Proceedings of the 2018 IEEE International Conference on Communications, 2018

Robust Modulation Classification under Uncertain Noise Condition Using Recurrent Neural Network.
Proceedings of the IEEE Global Communications Conference, 2018

Multimodal Language Analysis with Recurrent Multistage Fusion.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

An Empirical Evaluation of Sketched SVD and its Application to Leverage Score Ordering.
Proceedings of The 10th Asian Conference on Machine Learning, 2018

Efficient Low-rank Multimodal Fusion With Modality-Specific Factors.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Multi-attention Recurrent Network for Human Communication Comprehension.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Memory Fusion Network for Multi-view Sequential Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Multimodal sentiment analysis with word-level fusion and reinforcement learning.
Proceedings of the 19th ACM International Conference on Multimodal Interaction, 2017

