Xiaoyu Shen

Orcid: 0000-0002-0217-2469

Affiliations:
  • Max Planck Institute for Informatics (MPII), Saarbrücken, Germany
  • Saarland University, Spoken Language Systems, Saarbrücken, Germany


According to our database, Xiaoyu Shen authored at least 55 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Assessing "Implicit" Retrieval Robustness of Large Language Models.
CoRR, 2024

InternLM-Law: An Open Source Chinese Legal Large Language Model.
CoRR, 2024

Fine-Tuning Large Language Models to Translate: Will a Touch of Noisy Data in Misaligned Languages Suffice?
CoRR, 2024

A Preference-driven Paradigm for Enhanced Translation with Large Language Models.
CoRR, 2024

Unraveling the Mystery of Scaling Laws: Part I.
CoRR, 2024

SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
A Comprehensive Evaluation of Parameter-Efficient Fine-Tuning on Software Engineering Tasks.
CoRR, 2023

LawBench: Benchmarking Legal Knowledge of Large Language Models.
CoRR, 2023

Weaker Than You Think: A Critical Look at Weakly Supervised Learning.
CoRR, 2023

xPQA: Cross-Lingual Product Question Answering across 12 Languages.
CoRR, 2023

Meta Self-Refinement for Robust Learning with Weak Supervision.
Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

Neural Ranking with Weak Supervision for Open-Domain Question Answering: A Survey.
Proceedings of the Findings of the Association for Computational Linguistics: EACL 2023, 2023

Weaker Than You Think: A Critical Look at Weakly Supervised Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

xPQA: Cross-Lingual Product Question Answering in 12 Languages.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: Industry Track, 2023

2022
MDIA: A Benchmark for Multilingual Dialogue Generation in 46 Languages.
CoRR, 2022

Low-Resource Dense Retrieval for Open-Domain Question Answering: A Comprehensive Survey.
CoRR, 2022


AST-Trans: Code Summarization with Efficient Tree-Structured Attention.
Proceedings of the 44th IEEE/ACM International Conference on Software Engineering, 2022

FocusQA: Open-Domain Question Answering with a Context in Focus.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

From Rewriting to Remembering: Common Ground for Conversational QA Models.
Proceedings of the 4th Workshop on NLP for Conversational AI, 2022

2021
Deep latent-variable models for neural text generation.
PhD thesis, 2021

Learning Fine-Grained Fact-Article Correspondence in Legal Cases.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Knowledge-enhanced Session-based Recommendation with Temporal Transformer.
CoRR, 2021

Dependency Learning for Legal Judgment Prediction with a Unified Text-to-Text Transformer.
CoRR, 2021

Learning Fine-grained Fact-Article Correspondence in Legal Cases.
CoRR, 2021

AST-Transformer: Encoding Abstract Syntax Trees Efficiently for Code Summarization.
Proceedings of the 36th IEEE/ACM International Conference on Automated Software Engineering, 2021

The SelectGen Challenge: Finding the Best Training Samples for Few-Shot Neural Text Generation.
Proceedings of the 14th International Conference on Natural Language Generation, 2021

Preventing Author Profiling through Zero-Shot Multilingual Back-Translation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Neural Data-to-Text Generation with LM-based Text Augmentation.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

Question Rewriting for Open-Domain Conversational QA: Best Practices and Limitations.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

On Training Instance Selection for Few-Shot Neural Text Generation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Integrating Image Captioning with Rule-based Entity Masking.
CoRR, 2020

Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence.
CoRR, 2020

Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training.
Proceedings of the 1st AfricaNLP Workshop Proceedings, 2020

MovieChats: Chat like Humans in a Closed Domain.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

DART: A Lightweight Quality-Suggestive Data-to-Text Annotation Tool.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Diversifying Dialogue Generation with Non-Conversational Text.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Improving Latent Alignment in Text Summarization by Generalizing the Pointer Generator.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Select and Attend: Towards Controllable Content Selection in Text Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Unsupervised Rewriter for Multi-Sentence Compression.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Improving Multi-turn Dialogue Modelling with Utterance ReWriter.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Simulating the Large-Scale Erosion of Genomic Privacy Over Time.
IEEE ACM Trans. Comput. Biol. Bioinform., 2018

A comprehensive study: Sentence compression with linguistic knowledge-enhanced gated neural network.
Data Knowl. Eng., 2018

Nexus Network: Connecting the Preceding and the Following in Dialogue Generation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Dialogue Generation With GAN.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Improving Variational Encoder-Decoders in Dialogue Generation.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Towards Better Variational Encoder-Decoders in Seq2Seq Tasks.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Gated Neural Network for Sentence Compression Using Linguistic Knowledge.
Proceedings of the Natural Language Processing and Information Systems, 2017

Estimation of Gap Between Current Language Models and Human Performance.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Wake-Sleep Variational Autoencoders for Language Modeling.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

A Conditional Variational Framework for Dialog Generation.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

