Bidirectional Human-AI Alignment: Emerging Challenges and Opportunities.
Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2025
SAGEval: The frontiers of Satisfactory Agent based NLG Evaluation for reference-free open-ended text.
CoRR, 2024
Quantifying reliance on external information over parametric knowledge during Retrieval Augmented Generation (RAG) using mechanistic analysis.
CoRR, 2024
VLMGuard: Defending VLMs against Malicious Prompts via Unlabeled Data.
CoRR, 2024
ValueCompass: A Framework of Fundamental Values for Human-AI Alignment.
CoRR, 2024
From RAGs to rich parameters: Probing how language models utilize external knowledge over parametric information for factual queries.
CoRR, 2024
Towards Bidirectional Human-AI Alignment: A Systematic Review for Clarifications, Framework, and Future Directions.
CoRR, 2024
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Frontiers of Large Language Model-Based Agentic Systems - Construction, Efficacy and Safety.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024
Leveraging Language Models to Detect Greenwashing.
CoRR, 2023
Topic Segmentation of Semi-Structured and Unstructured Conversational Datasets using Language Models.
CoRR, 2023
On Surgical Fine-tuning for Language Encoders.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Data-driven stochastic reliability assessment of the US electricity grid under large penetration of variable renewable energy resources.
PhD thesis, 2022
Topic Segmentation in the Wild: Towards Segmentation of Semi-structured & Unstructured Chats.
CoRR, 2022
Reconstruction of Long-Term Historical Demand Data.
CoRR, 2022