Xin Jiang

Orcid: 0000-0002-9117-8247

Affiliations:
  • Huawei Noah's Ark Lab, China
  • Peking University, Beijing, China (PhD 2009)


According to our database, Xin Jiang authored at least 155 papers between 2009 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Subtle Errors Matter: Preference Learning via Error-injected Self-editing.
CoRR, 2024

RevisEval: Improving LLM-as-a-Judge via Response-Adapted References.
CoRR, 2024

CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration.
CoRR, 2024

ToolACE: Winning the Points of LLM Function Calling.
CoRR, 2024

Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization.
CoRR, 2024

Chain-of-Probe: Examing the Necessity and Accuracy of CoT Step-by-Step.
CoRR, 2024

Mixture of insighTful Experts (MoTE): The Synergy of Thought Chains and Expert Mixtures in Self-Alignment.
CoRR, 2024

Visually Guided Generative Text-Layout Pre-training for Document Intelligence.
CoRR, 2024

YODA: Teacher-Student Progressive Learning for Language Models.
CoRR, 2024

MART: Learning Hierarchical Music Audio Representations with Part-Whole Transformer.
Companion Proceedings of the ACM on Web Conference 2024, 2024

Poster Abstract: Tasking Heterogeneous Sensor Systems with LLMs.
Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 2024

Visually Guided Generative Text-Layout Pre-training for Document Intelligence.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Retrieval-based Disentangled Representation Learning with Natural Language Supervision.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Prompt-Based Length Controlled Generation with Multiple Control Types.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Learning to Edit: Aligning LLMs with Knowledge Editing.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Unsupervised Extractive Summarization with Learnable Length Control Strategies.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Preparing Lessons for Progressive Training on Language Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Red Alarm for Pre-trained Models: Universal Vulnerability to Neuron-level Backdoor Attacks.
Mach. Intell. Res., April, 2023

Music-PAW: Learning Music Representations via Hierarchical Part-whole Interaction and Contrast.
CoRR, 2023

Data Management For Large Language Models: A Survey.
CoRR, 2023

Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis.
CoRR, 2023

Prompt-Based Length Controlled Generation with Reinforcement Learning.
CoRR, 2023

Aligning Large Language Models with Human: A Survey.
CoRR, 2023

Enhancing Coherence of Extractive Summarization with Multitask Learning.
CoRR, 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing.
CoRR, 2023

EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge.
Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems, 2023

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Reusing Pretrained Models by Multi-linear Operators for Efficient Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning Summary-Worthy Visual Representation for Abstractive Summarization in Video.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

A Study on Transformer Configuration and Training Objective.
Proceedings of the International Conference on Machine Learning, 2023

History, Present and Future: Enhancing Dialogue Generation with Few-Shot History-Future Prompt.
Proceedings of the IEEE International Conference on Acoustics, 2023

Lexicon-injected Semantic Parsing for Task-Oriented Dialog.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Factual Consistency for Knowledge-Grounded Dialogue Systems via Knowledge Enhancement and Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Gradually Excavating External Knowledge for Implicit Complex Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

HyperPELT: Unified Parameter-Efficient Language Model Tuning for Both Language and Vision-and-Language Tasks.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Structured Pruning for Efficient Generative Pre-trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

CAME: Confidence-guided Adaptive Memory Efficient Optimization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

One Cannot Stand for Everyone! Leveraging Multiple User Simulators to train Task-oriented Dialogue Systems.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

NewsDialogues: Towards Proactive News Grounded Conversation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

mCLIP: Multilingual CLIP via Cross-lingual Transfer.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

KPT: Keyword-Guided Pre-training for Grounded Dialog Generation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Wukong-Reader: Multi-modal Pre-training for Fine-grained Visual Document Understanding.
CoRR, 2022

Retrieval-based Disentanglement with Distant Supervision.
CoRR, 2022

PanGu-Coder: Program Synthesis with Function-Level Language Modeling.
CoRR, 2022

PERT: A New Solution to Pinyin to Character Conversion Task.
CoRR, 2022

Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Understanding.
CoRR, 2022

Deeper vs Wider: A Revisit of Transformer Configuration.
CoRR, 2022

PANGUBOT: Efficient Generative Dialogue Pre-training from Pre-trained Language Model.
CoRR, 2022

Towards Identifying Social Bias in Dialog Systems: Frame, Datasets, and Benchmarks.
CoRR, 2022

Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework.
CoRR, 2022

Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Towards Efficient Post-training Quantization of Pre-trained Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

LMTurk: Few-Shot Learners as Crowdsourcing Workers in a Language-Model-as-a-Service Framework.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

FreeTransfer-X: Safe and Label-Free Cross-Lingual Transfer from Off-the-Shelf Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

CoCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation Detection and Diagnosis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Boosting Graph Structure Learning with Dummy Nodes.
Proceedings of the International Conference on Machine Learning, 2022

FILIP: Fine-grained Interactive Language-Image Pre-Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Exploring extreme parameter compression for pre-trained language models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

SPIRAL: Self-supervised Perturbation-Invariant Representation Learning for Speech Pre-Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Towards Identifying Social Bias in Dialog Systems: Framework, Dataset, and Benchmark.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Hypoformer: Hybrid Decomposition Transformer for Edge-friendly Neural Machine Translation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Pre-training Language Models with Deterministic Factual Knowledge.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Revisiting Pre-trained Language Models and their Evaluation for Arabic Natural Language Processing.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

LiteVL: Efficient Video-Language Learning with Enhanced Spatial-Temporal Modeling.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

UTC: A Unified Transformer with Inter-Task Contrastive Learning for Visual Dialog.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Pan More Gold from the Sand: Refining Open-domain Dialogue Training with Noisy Self-Retrieval Generation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Compilable Neural Code Generation with Compiler Feedback.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

ClusterFormer: Neural Clustering Attention for Efficient and Effective Transformer.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Compression of Generative Pre-trained Language Models via Quantization.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

MINER: Multi-Interest Matching Network for News Recommendation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

How Pre-trained Language Models Capture Factual Knowledge? A Causal-Inspired Analysis.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Controlled Text Generation Using Dictionary Prior in Variational Autoencoders.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

bert2BERT: Towards Reusable Pretrained Language Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

MTRec: Multi-Task Learning over BERT for News Recommendation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Read before Generate! Faithful Long Form Question Answering with Machine Reading.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

UniDS: A Unified Dialogue System for Chit-Chat and Task-oriented Dialogues.
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, 2022

UniMS: A Unified Framework for Multimodal Summarization with Knowledge Distillation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

AutoBERT-Zero: Evolving BERT Backbone from Scratch.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
EyelashNet: a dataset and a baseline method for eyelash matting.
ACM Trans. Graph., 2021

Improving task-agnostic BERT distillation with layer mapping search.
Neurocomputing, 2021

LMTurk: Few-Shot Learners as Crowdsourcing Workers.
CoRR, 2021

JABER: Junior Arabic BERt.
CoRR, 2021

CCA-MDD: A Coupled Cross-Attention based Framework for Streaming Mispronunciation detection and diagnosis.
CoRR, 2021

CINS: Comprehensive Instruction for Few-shot Learning in Task-oriented Dialog Systems.
CoRR, 2021

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models.
CoRR, 2021

Integrating Regular Expressions with Neural Networks via DFA.
CoRR, 2021

CLSEBERT: Contrastive Learning for Syntax Enhanced Code Pre-Trained Model.
CoRR, 2021

Learning Multilingual Representation for Natural Language Understanding with Enhanced Cross-Lingual Supervision.
CoRR, 2021

Improved OOD Generalization via Adversarial Training and Pre-training.
CoRR, 2021

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation.
CoRR, 2021

An Approach to Improve Robustness of NLP Systems against ASR Errors.
CoRR, 2021

LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation.
CoRR, 2021

Training Multilingual Pre-trained Language Model with Byte-level Subwords.
CoRR, 2021

Red Alarm for Pre-trained Models: Universal Vulnerabilities by Neuron-Level Backdoor Attacks.
CoRR, 2021

Improved OOD Generalization via Adversarial Training and Pretraing.
Proceedings of the 38th International Conference on Machine Learning, 2021

Reweighting Augmented Samples by Minimizing the Maximal Expected Loss.
Proceedings of the 9th International Conference on Learning Representations, 2021

On Position Embeddings in BERT.
Proceedings of the 9th International Conference on Learning Representations, 2021

Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2021, 2021

DyLex: Incorporating Dynamic Lexicons into BERT for Sequence Labeling.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Generate & Rank: A Multi-task Framework for Math Word Problems.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Improving Unsupervised Question Answering via Summarization-Informed Question Generation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

EditSpeech: A Text Based Speech Editing System Using Partial Inference and Bidirectional Fusion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Exploring Discourse Structures for Argument Impact Classification.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

GhostBERT: Generate More Features with Cheap Operations for BERT.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

BinaryBERT: Pushing the Limit of BERT Quantization.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Continuous Self-Attention Models with Neural ODE Networks.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

HopRetriever: Retrieve Hops over Wikipedia to Answer Complex Questions.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
BinaryBERT: Pushing the Limit of BERT Quantization.
CoRR, 2020

PPKE: Knowledge Representation Learning by Path-based Pre-training.
CoRR, 2020

KgPLM: Knowledge-guided Language Model Pre-training via Generative and Discriminative Learning.
CoRR, 2020

Learning to Detect Unacceptable Machine Translations for Downstream Tasks.
CoRR, 2020

DynaBERT: Dynamic BERT with Adaptive Width and Depth.
CoRR, 2020

Unsupervised Text Generation by Learning from Search.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

DynaBERT: Dynamic BERT with Adaptive Width and Depth.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Neural Subgraph Isomorphism Counting.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

An Investigation of Few-Shot Learning in Spoken Term Classification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A General Framework for Adaptation of Neural Machine Translation to Simultaneous Translation.
Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing, 2020

On the Importance of Word and Sentence Representation Learning in Implicit Discourse Relation Classification.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Progressive Memory Banks for Incremental Domain Adaptation.
Proceedings of the 8th International Conference on Learning Representations, 2020

HyperText: Endowing FastText with Hyperbolic Geometry.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

TernaryBERT: Distillation-aware Ultra-low Bit BERT.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

TinyBERT: Distilling BERT for Natural Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Integrating Graph Contextualized Knowledge into Pre-trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Accurate Word Alignment Induction from Neural Machine Translation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Dialog State Tracking with Reinforced Data Augmentation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Neural Subgraph Isomorphism Counting.
CoRR, 2019

Zero-Shot Paraphrase Generation with Multilingual Language Models.
CoRR, 2019

How to Do Simultaneous Translation Better with Consecutive Neural Machine Translation?
CoRR, 2019

Pretrained Language Models for Document-Level Neural Machine Translation.
CoRR, 2019

NEZHA: Neural Contextualized Representation for Chinese Language Understanding.
CoRR, 2019

GPT-based Generation for Classical Chinese Poetry.
CoRR, 2019

Triple-to-Text: Converting RDF Triples into High-Quality Natural Languages via Optimizing an Inverse KL Divergence.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Exploring Diverse Expressions for Paraphrase Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

ERNIE: Enhanced Language Representation with Informative Entities.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Decomposable Neural Paraphrase Generation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Meta Learning for Few-shot Keyword Spotting.
CoRR, 2018

Paraphrase Generation with Deep Reinforcement Learning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Affective Neural Response Generation.
Proceedings of the Advances in Information Retrieval, 2018

2017
Deep Active Learning for Dialogue Generation.
Proceedings of the 6th Joint Conference on Lexical and Computational Semantics, 2017

2016
Online Sequence-to-Sequence Active Learning for Open-Domain Dialogue Generation.
CoRR, 2016

Neural Generative Question Answering.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Incorporating Semantic Knowledge into Latent Matching Model in Search.
Proceedings of the Information Retrieval Technology, 2016

2014
Ranking Optimization with Constraints.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

2009
A ranking approach to keyphrase extraction.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

