Wenxuan Wang

ORCID: 0000-0002-9803-8204

Affiliations:
  • Chinese University of Hong Kong, Department of Computer Science and Engineering, Hong Kong (PhD 2023)


According to our database, Wenxuan Wang authored at least 48 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs.
CoRR, 2024

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step.
CoRR, 2024

A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How.
CoRR, 2024

Learning to Ask: When LLMs Meet Unclear Instruction.
CoRR, 2024

On the Resilience of Multi-Agent Systems with Malicious Agents.
CoRR, 2024

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training.
CoRR, 2024

Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach.
CoRR, 2024

Exploring Multi-Lingual Bias of Large Code Models in Code Generation.
CoRR, 2024

How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO.
CoRR, 2024

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments.
CoRR, 2024

Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models.
CoRR, 2024

The Earth is Flat? Unveiling Factual Errors in Large Language Models.
CoRR, 2024

A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models.
CoRR, 2024

New Job, New Gender? Measuring the Social Bias in Image Generation Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

On the Reliability of Psychological Scales on Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Does ChatGPT Know That It Does Not Know? Evaluating the Black-Box Calibration of ChatGPT.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

All Languages Matter: On the Multilingual Safety of LLMs.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
APIBench: A Benchmark Dataset for Evaluating API Recommendation Approaches in Python and Java.
Dataset, November, 2023

Revisiting, Benchmarking and Exploring API Recommendation: How Far Are We?
IEEE Trans. Software Eng., April, 2023

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models.
CoRR, 2023

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench.
CoRR, 2023

All Languages Matter: On the Multilingual Safety of Large Language Models.
CoRR, 2023

Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench.
CoRR, 2023

ChatGPT an ENFJ, Bard an ISTJ: Empirical Study on Personalities of Large Language Models.
CoRR, 2023

Constructing Effective In-Context Demonstration for Code Intelligence Tasks: An Empirical Study.
CoRR, 2023

ParroT: Translating During Chat Using Large Language Models.
CoRR, 2023

ChatGPT or Grammarly? Evaluating ChatGPT on Grammatical Error Correction Benchmark.
CoRR, 2023

Is ChatGPT A Good Translator? A Preliminary Study.
CoRR, 2023

BiasAsker: Measuring the Bias in Conversational AI System.
Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023

An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software.
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

Generative Type Inference for Python.
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

What Makes Good In-Context Demonstrations for Code Intelligence Tasks with LLMs?
Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

Validating Multimedia Content Moderation Software via Semantic Fusion.
Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023

MTTM: Metamorphic Testing for Textual Content Moderation Software.
Proceedings of the 45th IEEE/ACM International Conference on Software Engineering, 2023

ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Improving the Transferability of Adversarial Samples by Path-Augmented Method.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Understanding and Mitigating the Uncertainty in Zero-Shot Translation.
CoRR, 2022

Tencent's Multilingual Machine Translation System for WMT22 Large-Scale African Languages.
Proceedings of the Seventh Conference on Machine Translation, 2022

AEON: a method for automatic evaluation of NLP test cases.
Proceedings of the ISSTA '22: 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual Event, South Korea, July 18, 2022

Improving Adversarial Transferability via Neuron Attribution-based Attacks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
APIBench: A Benchmark Dataset for Evaluating API Recommendation Approaches in Python and Java.
Dataset, December, 2021

Language Models are Good Translators.
CoRR, 2021

2020
Rethinking the Value of Transformer Components.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

