On the Alignment, Robustness, and Generalizability of Multimodal Learning
PhD thesis, 2024
Evaluating Durability: Benchmark Insights into Multimodal Watermarking.
CoRR, 2024
Entity6K: A Large Open-Domain Evaluation Dataset for Real-World Entity Recognition.
CoRR, 2024
Embodied Executable Policy Learning with Language-based Scene Summarization.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
SnapNTell: Enhancing Entity-Centric Visual Question Answering with Retrieval Augmented Multimodal LLM.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Offline Reinforcement Learning with Imbalanced Datasets.
CoRR, 2023
MultiSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos.
CoRR, 2023
Multimodal Representation Learning of Cardiovascular Magnetic Resonance Imaging.
CoRR, 2023
Converting ECG Signals to Images for Efficient Image-text Retrieval via Encoding.
CoRR, 2023
Interpolation for Robust Learning: Data Augmentation on Geodesics.
CoRR, 2023
LiveSeg: Unsupervised Multimodal Temporal Segmentation of Long Livestream Videos.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023
Automated Cardiovascular Record Retrieval by Multimodal Learning between Electrocardiogram and Clinical Report.
Proceedings of Machine Learning for Health (ML4H), 2023
Interpolation for Robust Learning: Data Augmentation on Wasserstein Geodesics.
Proceedings of the International Conference on Machine Learning, 2023
Cardiac Disease Diagnosis on Imbalanced Electrocardiography Data Through Optimal Transport Augmentation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023
Can Brain Signals Reveal Inner Alignment with Human Languages?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models?
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023
Align and Attend: Multimodal Summarization with Dual Contrastive Losses.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Group Distributionally Robust Reinforcement Learning with Hierarchical Latent Variables.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023
SCCS: Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Are Multimodal Models Robust to Image and Text Perturbations?
CoRR, 2022
Semantics-Consistent Cross-domain Summarization via Optimal Transport Alignment.
CoRR, 2022
An Empirical Exploration of Cross-domain Alignment between Language and Electroencephalogram.
CoRR, 2022
MHMS: Multimodal Hierarchical Multimedia Summarization.
CoRR, 2022
Optimal Transport based Data Augmentation for Heart Disease Diagnosis and Prediction.
CoRR, 2022
GeoECG: Data Augmentation via Wasserstein Geodesic Perturbation for Robust Electrocardiogram Prediction.
Proceedings of the Machine Learning for Healthcare Conference, 2022
Adversarial and Cooperative Correlated Domain Adaptation based Multimodal Emotion Recognition.
Proceedings of the 2nd Workshop on Affective Content Analysis (AffCon 2019) co-located with Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019), 2019