Zehan Qi

According to our database, Zehan Qi authored at least 15 papers between 2023 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.


Bibliography

2024
AutoGLM: Autonomous Foundation Agents for GUIs.
CoRR, 2024

Long²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall.
CoRR, 2024

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents.
CoRR, 2024

DebateQA: Evaluating Question Answering on Debatable Knowledge.
CoRR, 2024

MR-BEN: A Comprehensive Meta-Reasoning Benchmark for Large Language Models.
CoRR, 2024

ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools.
CoRR, 2024

NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts.
CoRR, 2024

Knowledge Conflicts for LLMs: A Survey.
CoRR, 2024

Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models.
CoRR, 2024

Walking in Others' Shoes: How Perspective-Taking Guides Large Language Models in Reducing Toxicity and Bias.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Knowledge Conflicts for LLMs: A Survey.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LONG²RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall.
Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Queries.
Findings of the Association for Computational Linguistics, 2024

Preemptive Answer "Attacks" on Chain-of-Thought Reasoning.
Findings of the Association for Computational Linguistics, 2024

2023
Survey on Factuality in Large Language Models: Knowledge, Retrieval and Domain-Specificity.
CoRR, 2023