Shuming Shi

ORCID: 0009-0003-1712-5619

Affiliations:
  • Tencent AI Lab, China
  • Microsoft Research Asia (former)
  • Tsinghua University, China (PhD 2004)


According to our database, Shuming Shi authored at least 211 papers between 2004 and 2024.

Bibliography

2024
Pretraining without wordpieces: learning over a vocabulary of millions of words.
Int. J. Mach. Learn. Cybern., September, 2024

An Energy-based Model for Word-level AutoCompletion in Computer-aided Translation.
Trans. Assoc. Comput. Linguistics, 2024

Exploring Human-Like Translation Strategy with Large Language Models.
Trans. Assoc. Comput. Linguistics, 2024

On the Transformations across Reward Model, Parameter Update, and In-Context Prompt.
CoRR, 2024

Sequence can Secretly Tell You What to Discard.
CoRR, 2024

Benchmarking LLMs via Uncertainty Quantification.
CoRR, 2024

Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment.
CoRR, 2024

Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models.
CoRR, 2024

Inferflow: an Efficient and Highly Configurable Inference Engine for Large Language Models.
CoRR, 2024

DoG-Instruct: Towards Premium Instruction-Tuning Data via Text-Grounded Instruction Wrapping.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

GPT4Video: A Unified Multimodal Large Language Model for Instruction-Followed Understanding and Safety-Aware Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Knowledge Fusion of Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

The Reasonableness Behind Unreasonable Translation Capability of Large Language Model.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Retrieval is Accurate Generation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills.
Proceedings of the IEEE International Conference on Acoustics, 2024

Knowledge Verification to Nip Hallucination in the Bud.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Not All Preference Pairs Are Created Equal: A Recipe for Annotation-Efficient Iterative Preference Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

On the Cultural Gap in Text-to-Image Generation.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024

A Frustratingly Simple Decoding Method for Neural Text Generation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Reasons to Reject? Aligning Language Models with Judgments.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Benchmarking and Improving Long-Text Translation with Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Addressing Entity Translation Problem via Translation Difficulty and Context Diversity.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Spotting AI's Touch: Identifying LLM-Paraphrased Spans in Text.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

MAGE: Machine-generated Text Detection in the Wild.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

TextBind: Multi-turn Interleaved Multimodal Instruction-following in the Wild.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Disperse-Then-Merge: Pushing the Limits of Instruction Tuning via Alignment Tax Reduction.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

Advancement in Graph Understanding: A Multimodal Benchmark and Fine-Tuning of Vision-Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
A benchmark dataset and evaluation methodology for Chinese zero pronoun translation.
Lang. Resour. Evaluation, September, 2023

Predicting Events in MOBA Games: Prediction, Attribution, and Evaluation.
IEEE Trans. Games, June, 2023

Alleviating Hallucinations of Large Language Models through Induced Hallucinations.
CoRR, 2023

Emage: Non-Autoregressive Text-to-Image Generation.
CoRR, 2023

When Graph Data Meets Multimodal: A New Paradigm for Graph Understanding and Reasoning.
CoRR, 2023

StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving.
CoRR, 2023

Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models.
CoRR, 2023

A Benchmark for Text Expansion: Datasets, Metrics, and Baselines.
CoRR, 2023

TeGit: Generating High-Quality Instruction-Tuning Data with Text-Grounded Task Design.
CoRR, 2023

Siren's Song in the AI Ocean: A Survey on Hallucination in Large Language Models.
CoRR, 2023

Disco-Bench: A Discourse-Aware Evaluation Benchmark for Language Modelling.
CoRR, 2023

Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration.
CoRR, 2023

Deepfake Text Detection in the Wild.
CoRR, 2023

A Simple and Plug-and-play Method for Unsupervised Sentence Representation Enhancement.
CoRR, 2023

ParroT: Translating During Chat Using Large Language Models.
CoRR, 2023

Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: A Preliminary Empirical Study.
CoRR, 2023

Is ChatGPT A Good Keyphrase Generator? A Preliminary Study.
CoRR, 2023

Findings of the WMT 2023 Shared Task on Discourse-Level Literary Translation: A Fresh Orb in the Cosmos of LLMs.
Proceedings of the Eighth Conference on Machine Translation, 2023

Findings of the Word-Level AutoCompletion Shared Task in WMT 2023.
Proceedings of the Eighth Conference on Machine Translation, 2023

Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model.
Proceedings of the 8th Workshop on Representation Learning for NLP, 2023

MarkBERT: Marking Word Boundaries Improves Chinese BERT.
Proceedings of the Natural Language Processing and Chinese Computing, 2023

Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study.
Proceedings of the Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023, 2023

A Simple Yet Effective Approach to Structured Knowledge Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Skillnet-NLG: General-Purpose Natural Language Generation with a Sparsely Activated Approach.
Proceedings of the IEEE International Conference on Acoustics, 2023

Retrieval-Augmented Few-shot Text Classification.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Document-Level Machine Translation with Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Explore-Instruct: Enhancing Domain-Specific Instruction Coverage through Active Exploration.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

ParroT: Translating during Chat using Large Language Models tuned with Human Translation and Feedback.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing Interactive Machine Translation Systems.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Rethinking Word-Level Auto-Completion in Computer-Aided Translation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

SORTIE: Dependency-Aware Symbolic Reasoning for Logical Data-to-text Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Making Better Use of Training Corpus: Retrieval-based Aspect Sentiment Triplet Extraction via Label Interpolation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

A Survey on Zero Pronoun Translation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Unsupervised Keyphrase Extraction by Learning Neural Keyphrase Set Function.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Explicit Syntactic Guidance for Neural Text Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Rethinking Translation Memory Augmented Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Improved Visual Story Generation with Adaptive Context Modeling.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Enhancing Grammatical Error Correction Systems with Explanations.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Effidit: An Assistant for Improving Writing Efficiency.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Zero-Shot Rumor Detection with Propagation Structure via Prompt Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Interpretable Real-Time Win Prediction for Honor of Kings - A Popular Mobile MOBA Esport.
IEEE Trans. Games, 2022

Effidit: Your AI Writing Assistant.
CoRR, 2022

One Model, Multiple Modalities: A Sparsely Activated Approach for Text, Sound, Image, Video and Code.
CoRR, 2022

Pretraining Chinese BERT for Detecting Word Insertion and Deletion Errors.
CoRR, 2022

MarkBERT: Marking Word Boundaries Improves Chinese BERT.
CoRR, 2022

Efficient Sub-structured Knowledge Distillation.
CoRR, 2022

One Model, Multiple Tasks: Pathways for Natural Language Understanding.
CoRR, 2022

Revisiting the Evaluation Metrics of Paraphrase Generation.
CoRR, 2022

Rethink Stealthy Backdoor Attacks in Natural Language Processing.
CoRR, 2022

Tencent's Multilingual Machine Translation System for WMT22 Large-Scale African Languages.
Proceedings of the Seventh Conference on Machine Translation, 2022

Tencent AI Lab - Shanghai Jiao Tong University Low-Resource Translation System for the WMT22 Translation Task.
Proceedings of the Seventh Conference on Machine Translation, 2022

Findings of the Word-Level AutoCompletion Shared Task in WMT 2022.
Proceedings of the Seventh Conference on Machine Translation, 2022

Recent Advances in Retrieval-Augmented Text Generation.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022

On Synthetic Data for Back Translation.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Towards Efficient Dialogue Pre-training with Transferable and Interpretable Latent Structure.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

GuoFeng: A Benchmark for Zero Pronoun Recovery and Translation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

On the Evaluation Metrics for Paraphrase Generation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

MCPG: A Flexible Multi-Level Controllable Framework for Unsupervised Paraphrase Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

BiTIIMT: A Bilingual Text-infilling Method for Interactive Machine Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Investigating Data Variance in Evaluations of Automatic Machine Translation Metrics.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Understanding and Improving Sequence-to-Sequence Pretraining for Neural Machine Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Exploring and Adapting Chinese GPT to Pinyin Input Method.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Rethinking Negative Sampling for Handling Missing Entity Annotations.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Bridging the Data Gap between Training and Inference for Unsupervised Neural Machine Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

A Model-agnostic Data Manipulation Method for Persona-based Dialogue Generation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Learning from Sibling Mentions with Scalable Graph Inference in Fine-Grained Entity Typing.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Detecting Source Contextual Barriers for Understanding Neural Machine Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Attending From Foresight: A Novel Attention Mechanism for Neural Machine Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Context-aware Self-Attention Networks for Natural Language Processing.
Neurocomputing, 2021

Rethinking Negative Sampling for Unlabeled Entity Problem in Named Entity Recognition.
CoRR, 2021

REAM#: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation.
CoRR, 2021

TranSmart: A Practical Interactive Machine Translation System.
CoRR, 2021

Tencent AI Lab Machine Translation Systems for the WMT21 Biomedical Translation Task.
Proceedings of the Sixth Conference on Machine Translation, 2021

Tencent Translation System for the WMT21 News Translation Task.
Proceedings of the Sixth Conference on Machine Translation, 2021

Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition.
Proceedings of the 9th International Conference on Learning Representations, 2021

Fine-grained Entity Typing without Knowledge Base.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Segmenting Natural Language Sentences via Lexical Unit Analysis.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Frontmatter.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2021

An Empirical Study on Multiple Information Sources for Zero-Shot Fine-Grained Entity Typing.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

On the Language Coverage Bias for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Dialogue Response Selection with Hierarchical Curriculum Learning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

TexSmart: A System for Enhanced Natural Language Understanding.
Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

On the Copying Behaviors of Pre-Training for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

GWLAN: General Word-Level AutocompletioN for Computer-Aided Translation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Tail-to-Tail Non-Autoregressive Sequence Prediction for Chinese Grammatical Error Correction.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

REAM♯: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Enhancing the Open-Domain Dialogue Evaluation in Latent Space.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Exploiting deep representations for natural language processing.
Neurocomputing, 2020

TexSmart: A Text Understanding System for Fine-Grained NER and Enhanced Semantic Analysis.
CoRR, 2020

Predicting Events in MOBA Games: Dataset, Attribution, and Evaluation.
CoRR, 2020

Interpretable Real-Time Win Prediction for Honor of Kings, a Popular Mobile MOBA Esport.
CoRR, 2020

Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models.
CoRR, 2020

Grayscale Data Construction and Multi-Level Ranking Objective for Dialogue Response Selection.
CoRR, 2020

Understanding Learning Dynamics for Neural Machine Translation.
CoRR, 2020

Detecting and Understanding Generalization Barriers for Neural Machine Translation.
CoRR, 2020

Tencent Neural Machine Translation Systems for the WMT20 News Translation Task.
Proceedings of the Fifth Conference on Machine Translation, 2020

Tencent AI Lab Machine Translation Systems for the WMT20 Biomedical Translation Task.
Proceedings of the Fifth Conference on Machine Translation, 2020

Tencent AI Lab Machine Translation Systems for WMT20 Chat Translation Task.
Proceedings of the Fifth Conference on Machine Translation, 2020

When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

On the Sub-Layer Functionalities of Transformer Decoder.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

On the Branching Bias of Syntax Extracted from Pre-trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

On the Inference Calibration of Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Rigid Formats Controlled Text Generation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Evaluating Explanation Methods for Neural Machine Translation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Balancing Quality and Human Involvement: An Effective Approach to Interactive Neural Machine Translation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Go From the General to the Particular: Multi-Domain Translation with Domain Transformation Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Neuron Interaction Based Representation Composition for Neural Machine Translation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

CASE: Context-Aware Semantic Expansion.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Microblog Hashtag Generation via Encoding Conversation Contexts.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Understanding and Improving Hidden Representations for Neural Machine Translation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Self-Attention with Structural Position Representations.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

One Model to Learn Both: Zero Pronoun Prediction and Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Semi-supervised Text Style Transfer: Cross Projection in Latent Space.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Towards Understanding Neural Machine Translation with Word Importance.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Multi-Granularity Self-Attention for Neural Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

A Discrete CVAE for Response Generation on Short-Text Conversation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Retrieval-guided Dialogue Response Generation via a Matching-to-Generation Framework.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Exploiting Sentential Context for Neural Machine Translation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Topic-Aware Neural Keyphrase Generation for Social Media Language.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

On the Word Alignment from Neural Machine Translation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Fine-Grained Sentence Functions for Short-Text Conversation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Graph Based Translation Memory for Neural Machine Translation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Neural Machine Translation with Adequacy-Oriented Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Generating Multiple Diverse Responses for Short-Text Conversation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Dynamic Layer Aggregation for Neural Machine Translation with Routing-by-Agreement.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Learning to Remember Translation History with a Continuous Cache.
Trans. Assoc. Comput. Linguistics, 2018

Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory.
CoRR, 2018

Language Style Transfer from Sentences with Arbitrary Unknown Styles.
CoRR, 2018

A Manually Annotated Chinese Corpus for Non-task-oriented Dialogue Systems.
CoRR, 2018

Generative Stock Question Answering.
CoRR, 2018

Incorporating Pseudo-Parallel Data for Quantifiable Sequence Editing.
CoRR, 2018

Directional Skip-Gram: Explicitly Distinguishing Left and Right Context for Word Embeddings.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Target Foresight Based Attention for Neural Machine Translation.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Joint Learning Embeddings for Chinese Words and their Components via Ladder Structured Networks.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Complementary Learning of Word Embeddings.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Towards Less Generic Responses in Neural Conversation Models: A Statistical Re-weighting Method.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

QuaSE: Sequence Editing under Quantifiable Guidance.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Generating Classical Chinese Poems via Conditional Variational Autoencoder and Adversarial Training.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Exploiting Deep Representations for Neural Machine Translation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

hyperdoc2vec: Distributed Representations of Hypertext Documents.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Automatic Article Commenting: the Task and Dataset.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Translating Pro-Drop Languages With Reconstruction Models.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Improving Sequence-to-Sequence Constituency Parsing.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Deep Neural Solver for Math Word Problems.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Learning Fine-Grained Expressions to Solve Math Word Problems.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

2016
How well do Computers Solve Math Word Problems? Large-Scale Dataset Construction and Evaluation.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Automatically Solving Number Word Problems by Semantic Parsing and Reasoning.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

2014
Unsupervised Template Mining for Semantic Category Understanding.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Improving Context and Category Matching for Entity Search.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Overview of the Recognizing Inference in Text (RITE-2) at NTCIR-10.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

2012
Towards Large-Scale Unsupervised Relation Extraction from the Web.
Int. J. Semantic Web Inf. Syst., 2012

Ensemble Semantics for Large-scale Unsupervised Relation Extraction.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

2011
Overview of NTCIR-9 RITE: Recognizing Inference in TExt.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Nonlinear Evidence Fusion and Propagation for Hyponymy Relation Mining.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

2010
Revisiting globally sorted indexes for efficient document retrieval.
Proceedings of the Third International Conference on Web Search and Web Data Mining, 2010

Corpus-based Semantic Class Mining: Distributional vs. Pattern-Based Approaches.
Proceedings of the COLING 2010, 2010

Efficient term proximity search with term-pair indexes.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009
Effective top-k computation with term-proximity support.
Inf. Process. Manag., 2009

Microsoft Research Asia at the Web Track of TREC 2009.
Proceedings of The Eighteenth Text REtrieval Conference, 2009

Nonlinear static-rank computation.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

Employing Topic Models for Pattern-based Semantic Class Discovery.
Proceedings of the ACL 2009, 2009

2008
Improving relevance judgment of web search results with image excerpts.
Proceedings of the 17th International Conference on World Wide Web, 2008

Can phrase indexing help to process non-phrase queries?
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

Pattern-based semantic class discovery with multi-membership support.
Proceedings of the 17th ACM Conference on Information and Knowledge Management, 2008

2007
Web page title extraction and its application.
Inf. Process. Manag., 2007

Web object retrieval.
Proceedings of the 16th International Conference on World Wide Web, 2007

Improve Ranking by Using Image Information.
Proceedings of the Advances in Information Retrieval, 2007

Effective top-k computation in retrieving structured documents with term-proximity support.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

2006
Exploring URL Hit Priors for Web Search.
Proceedings of the Advances in Information Retrieval, 2006

Pseudo-anchor text extraction for searching vertical objects.
Proceedings of the 2006 ACM CIKM International Conference on Information and Knowledge Management, 2006

2005
Gravitation-based model for information retrieval.
Proceedings of the SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

Title extraction from bodies of HTML documents and its application to web page retrieval.
Proceedings of the SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

2004
Microsoft Research Asia at Web Track and Terabyte Track of TREC 2004.
Proceedings of the Thirteenth Text REtrieval Conference, 2004

