Wenxuan Wang

Orcid: 0000-0002-9803-8204

Affiliations:

Chinese University of Hong Kong, Department of Computer Science and Engineering, Hong Kong (PhD 2023)

According to our database, Wenxuan Wang authored at least 51 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

On the shortcut learning in multilingual neural machine translation.

Neurocomputing, 2025

2024

Understanding and Mitigating the Uncertainty in Zero-Shot Translation.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

MRWeb: An Exploration of Generating Multi-Page Resource-Aware Web Code from UI Designs.

CoRR, 2024

Medchain: Bridging the Gap Between LLM Agents and Clinical Practice through Interactive Sequential Benchmarking.

CoRR, 2024

Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs.

CoRR, 2024

Chain-of-Jailbreak Attack for Image Generation Models via Editing Step by Step.

CoRR, 2024

Learning to Ask: When LLMs Meet Unclear Instruction.

CoRR, 2024

On the Resilience of Multi-Agent Systems with Malicious Agents.

CoRR, 2024

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training.

CoRR, 2024

Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach.

CoRR, 2024

Exploring Multi-Lingual Bias of Large Code Models in Code Generation.

CoRR, 2024

How Well Can LLMs Echo Us? Evaluating AI Chatbots' Role-Play Ability with ECHO.

CoRR, 2024

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments.

CoRR, 2024

Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models.

CoRR, 2024

The Earth is Flat? Unveiling Factual Errors in Large Language Models.

CoRR, 2024

A & B == B & A: Triggering Logical Reasoning Failures in Large Language Models.

CoRR, 2024

New Job, New Gender? Measuring the Social Bias in Image Generation Models.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How.

Proceedings of the 39th IEEE/ACM International Conference on Automated Software Engineering, 2024

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

On the Humanity of Conversational AI: Evaluating the Psychological Portrayal of LLMs.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

LogicAsker: Evaluating and Improving the Logical Reasoning Ability of Large Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

On the Reliability of Psychological Scales on Large Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Does ChatGPT Know That It Does Not Know? Evaluating the Black-Box Calibration of ChatGPT.

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

All Languages Matter: On the Multilingual Safety of LLMs.

Proceedings of the Findings of the Association for Computational Linguistics, 2024

Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

APIBench: A Benchmark Dataset for Evaluating API Recommendation Approaches in Python and Java.

Dataset, November, 2023

Revisiting, Benchmarking and Exploring API Recommendation: How Far Are We?

IEEE Trans. Software Eng., April, 2023

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models.

CoRR, 2023

Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench.

CoRR, 2023

All Languages Matter: On the Multilingual Safety of Large Language Models.

CoRR, 2023

Emotionally Numb or Empathetic? Evaluating How LLMs Feel Using EmotionBench.

CoRR, 2023

ChatGPT an ENFJ, Bard an ISTJ: Empirical Study on Personalities of Large Language Models.

CoRR, 2023

Constructing Effective In-Context Demonstration for Code Intelligence Tasks: An Empirical Study.

CoRR, 2023

ParroT: Translating During Chat Using Large Language Models.

CoRR, 2023

ChatGPT or Grammarly? Evaluating ChatGPT on Grammatical Error Correction Benchmark.

CoRR, 2023

Is ChatGPT A Good Translator? A Preliminary Study.

CoRR, 2023

BiasAsker: Measuring the Bias in Conversational AI System.

Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023

An Image is Worth a Thousand Toxic Words: A Metamorphic Testing Framework for Content Moderation Software.

Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

Generative Type Inference for Python.

Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

What Makes Good In-Context Demonstrations for Code Intelligence Tasks with LLMs?

Proceedings of the 38th IEEE/ACM International Conference on Automated Software Engineering, 2023

Validating Multimedia Content Moderation Software via Semantic Fusion.

Proceedings of the 32nd ACM SIGSOFT International Symposium on Software Testing and Analysis, 2023

MTTM: Metamorphic Testing for Textual Content Moderation Software.

Proceedings of the 45th IEEE/ACM International Conference on Software Engineering, 2023

ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Improving the Transferability of Adversarial Samples by Path-Augmented Method.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Tencent's Multilingual Machine Translation System for WMT22 Large-Scale African Languages.

Proceedings of the Seventh Conference on Machine Translation, 2022

AEON: a method for automatic evaluation of NLP test cases.

Proceedings of the ISSTA '22: 31st ACM SIGSOFT International Symposium on Software Testing and Analysis, Virtual Event, South Korea, July 18, 2022

Improving Adversarial Transferability via Neuron Attribution-based Attacks.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

APIBench: A Benchmark Dataset for Evaluating API Recommendation Approaches in Python and Java.

Dataset, December, 2021

Language Models are Good Translators.

CoRR, 2021

2020

Rethinking the Value of Transformer Components.

Wenxuan Wang

Zhaopeng Tu

Proceedings of the 28th International Conference on Computational Linguistics, 2020
