Jianfeng Gao

Affiliations:
  • Microsoft Research, Redmond, WA, USA


According to our database, Jianfeng Gao authored at least 473 papers between 2000 and 2024.

Awards

ACM Fellow

ACM Fellow 2023, "For contributions to machine learning for web search, natural language processing, and conversational systems".

Bibliography

2024
Overview of the Ninth Dialog System Technology Challenge: DSTC9.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Retrieve What You Need: A Mutual Learning Framework for Open-domain Question Answering.
Trans. Assoc. Comput. Linguistics, 2024

Multimodal Foundation Models: From Specialists to General-Purpose Assistants.
Found. Trends Comput. Graph. Vis., 2024

Vector-ICL: In-context Learning with Continuous Vector Representations.
CoRR, 2024

ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning.
CoRR, 2024

A generative framework to bridge data-driven models and scientific theories in language neuroscience.
CoRR, 2024

Data Analysis in the Era of Generative AI.
CoRR, 2024

Contextualized Data-Wrangling Code Generation in Computational Notebooks.
CoRR, 2024

GRIN: GRadient-INformed MoE.
CoRR, 2024

Model Tells Itself Where to Attend: Faithfulness Meets Automatic Attention Steering.
CoRR, 2024

Data Formulator 2: Iteratively Creating Rich Visualizations with AI.
CoRR, 2024

Diversifying the Expert Knowledge for Task-Agnostic Pruning in Sparse Mixture-of-Experts.
CoRR, 2024

UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models.
CoRR, 2024

GUI-WORLD: A Dataset for GUI-oriented Multimodal LLM-based Agents.
CoRR, 2024

DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs.
CoRR, 2024

Matryoshka Multimodal Models.
CoRR, 2024

Crafting Interpretable Embeddings by Asking LLMs Questions.
CoRR, 2024

BiomedParse: a biomedical foundation model for image parsing of everything everywhere all at once.
CoRR, 2024

List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs.
CoRR, 2024

Training Small Multimodal Models to Bridge Biomedical Competency Gap: A Case Study in Radiology Imaging.
CoRR, 2024

Pix2Gif: Motion-Guided Diffusion for GIF Generation.
CoRR, 2024

Attribute Structuring Improves LLM-Based Evaluation of Clinical Text Summaries.
CoRR, 2024

Position Paper: Agent AI Towards a Holistic Intelligence.
CoRR, 2024

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models.
CoRR, 2024

The Essential Role of Causality in Foundation World Models for Embodied AI.
CoRR, 2024

Large Language Models: A Survey.
CoRR, 2024

An Interactive Agent Foundation Model.
CoRR, 2024

Learning a Decision Tree Algorithm with Transformers.
CoRR, 2024

Rethinking Interpretability in the Era of Large Language Models.
CoRR, 2024

Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning.
CoRR, 2024

TrustLLM: Trustworthiness in Large Language Models.
CoRR, 2024

Agent AI: Surveying the Horizons of Multimodal Interaction.
CoRR, 2024

Teaching Language Models to Self-Improve through Interactive Demonstrations.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

ReEval: Automatic Hallucination Evaluation for Retrieval-Augmented Large Language Models via Transferable Adversarial Attacks.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

MindAgent: Emergent Gaming Interaction.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024


Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Is Self-Repair a Silver Bullet for Code Generation?
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Fast-ELECTRA for Efficient Pre-training.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Toward Compositional Behavior in Neural Models: A Survey of Current Views.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Language Models as Inductive Reasoners.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Visual in-Context Prompting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Fine-tuning large neural language models for biomedical natural language processing.
Patterns, April, 2023

Neural Approaches to Conversational Information Retrieval.
The Information Retrieval Series 44, Springer, ISBN: 978-3-031-23079-0, 2023

Training Vision-Language Transformers from Captions.
Trans. Mach. Learn. Res., 2023

How Much Do Language Models Copy From Their Training Data? Evaluating Linguistic Novelty in Text Generation Using RAVEN.
Trans. Assoc. Comput. Linguistics, 2023

LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models.
CoRR, 2023

IMProv: Inpainting-based Multimodal Prompting for Computer Vision Tasks.
CoRR, 2023

GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation.
CoRR, 2023

LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents.
CoRR, 2023

LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing.
CoRR, 2023

Automatic Hallucination Assessment for Aligned Large Language Models via Transferable Adversarial Attacks.
CoRR, 2023

Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V.
CoRR, 2023

BiomedJourney: Counterfactual Biomedical Image Generation by Instruction-Learning from Multimodal Patient Journeys.
CoRR, 2023

MathVista: Evaluating Math Reasoning in Visual Contexts with GPT-4V, Bard, and Other Large Multimodal Models.
CoRR, 2023

Sparse Backpropagation for MoE Training.
CoRR, 2023

MindAgent: Emergent Gaming Interaction.
CoRR, 2023

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models.
CoRR, 2023

Semantic-SAM: Segment and Recognize Anything at Any Granularity.
CoRR, 2023

Demystifying GPT Self-Repair for Code Generation.
CoRR, 2023

Self-Verification Improves Few-Shot Clinical Information Extraction.
CoRR, 2023

Explaining black box text modules in natural language with language models.
CoRR, 2023

Chain-of-Skills: A Configurable Model for Open-domain Question Answering.
CoRR, 2023

ArK: Augmented Reality with Knowledge Interactive Emergent Ability.
CoRR, 2023

Segment Everything Everywhere All at Once.
CoRR, 2023

Instruction Tuning with GPT-4.
CoRR, 2023

Pre-training Transformers for Knowledge Graph Completion.
CoRR, 2023

A Simple Framework for Open-Vocabulary Segmentation and Detection.
CoRR, 2023

Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback.
CoRR, 2023

Enhancing Task Bot Engagement with Synthesized Open-Domain Dialog.
Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, 2023

Segment Everything Everywhere All at Once.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Augmenting Language Models with Long-Term Memory.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Localized Symbolic Knowledge Distillation for Visual Commonsense Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Bridging Discrete and Backpropagation: Straight-Through and Beyond.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Guiding Large Language Models via Directional Stimulus Prompting.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Differentiable Tree Operations Promote Compositional Generalization.
Proceedings of the International Conference on Machine Learning, 2023

Understand and Modularize Generator Optimization in ELECTRA-style Pretraining.
Proceedings of the International Conference on Machine Learning, 2023

Visually-Augmented Language Modeling.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Learning Math Reasoning from Self-Sampled Correct and Partially-Correct Solutions.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

DeBERTaV3: Improving DeBERTa using ELECTRA-Style Pre-Training with Gradient-Disentangled Embedding Sharing.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Optimizing Bi-Encoder for Named Entity Recognition via Contrastive Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Tree Prompting: Efficient Task Adaptation without Fine-Tuning.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Interactive Text Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Pre-training Multi-task Contrastive Learning Models for Scientific Literature Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Generalized Decoding for Pixel, Image, and Language.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Customized Visual Models with Retrieval-Augmented Knowledge.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GLIGEN: Open-Set Grounded Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Explaining Data Patterns in Natural Language with Language Models.
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, 2023

Logical Transformers: Infusing Logical Structures into Pre-Trained Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Chain-of-Skills: A Configurable Model for Open-Domain Question Answering.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

DIONYSUS: A Pre-trained Model for Low-Resource Dialogue Summarization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Task-Aware Specialization for Efficient and Robust Dense Retrieval for Open-Domain Question Answering.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing.
ACM Trans. Comput. Heal., 2022

Vision-Language Pre-Training: Basics, Recent Advances, and Future Trends.
Found. Trends Comput. Graph. Vis., 2022

Deep Learning-based Text Classification: A Comprehensive Review.
ACM Comput. Surv., 2022

Enhancing Task Bot Engagement with Synthesized Open-Domain Dialog.
CoRR, 2022

Efficient Long Sequence Modeling via State Space Augmented Transformer.
CoRR, 2022

ConvLab-3: A Flexible Dialogue System Toolkit Based on a Unified Data Format.
CoRR, 2022

Execution-based Evaluation for Data Science Code Generation Models.
CoRR, 2022

Lafite2: Few-shot Text-to-Image Generation.
CoRR, 2022

AutoMoE: Neural Architecture Search for Efficient Sparsely Activated Transformers.
CoRR, 2022

Explaining Patterns in Data with Language Models via Interpretable Autoprompting.
CoRR, 2022

Emb-GAM: an Interpretable and Efficient Predictor using Pre-trained Language Models.
CoRR, 2022

Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization.
CoRR, 2022

Structural Biases for Improving Transformers on Translation into Morphologically Rich Languages.
CoRR, 2022

Interactive Code Generation via Test-Driven User-Intent Formalization.
CoRR, 2022

OPERA: Harmonizing Task-Oriented Dialogs and Information Seeking Experience.
CoRR, 2022

GODEL: Large-Scale Pre-Training for Goal-Directed Dialog.
CoRR, 2022

Learning from Self-Sampled Correct and Partially-Correct Programs.
CoRR, 2022

AdaMix: Mixture-of-Adapter for Parameter-efficient Tuning of Large Language Models.
CoRR, 2022

Training Vision-Language Transformers from Captions Alone.
CoRR, 2022

Sparsely Activated Mixture-of-Experts are Robust Multi-Task Learners.
CoRR, 2022

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals.
CoRR, 2022

Focal Modulation Networks.
CoRR, 2022

Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer.
CoRR, 2022

A Survey of Knowledge-Intensive NLP with Pre-Trained Language Models.
CoRR, 2022

AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models.
CoRR, 2022

Toward Self-Learning End-to-End Dialog Systems.
CoRR, 2022

Neural Approaches to Conversational Information Retrieval.
CoRR, 2022

Neurocompositional Computing: From the Central Paradox of Cognition to a New Generation of AI Systems.
AI Mag., 2022

Toward Self-Learning End-to-End Task-oriented Dialog Systems.
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022

GLIPv2: Unifying Localization and Vision-Language Understanding.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Focal Modulation Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

K-LITE: Learning Transferable Visual Models with External Knowledge.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Fault-Aware Neural Code Rankers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

LiST: Lite Prompted Self-training Makes Parameter-efficient Few-shot Learners.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Knowledge-Grounded Dialogue Generation with a Unified Knowledge Representation.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

KAT: A Knowledge Augmented Transformer for Vision-and-Language.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Taming Sparsely Activated Transformer with Stochastic Experts.
Proceedings of the Tenth International Conference on Learning Representations, 2022

No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Efficient Self-supervised Vision Transformers for Representation Learning.
Proceedings of the Tenth International Conference on Learning Representations, 2022

AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Open-domain Question Answering via Chain of Reasoning over Heterogeneous Knowledge.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

CodeExp: Explanatory Code Document Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Grounded Keys-to-Text Generation: Towards Factual Open-Ended Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Knowledge-Rich Self-Supervision for Biomedical Entity Linking.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

RegionCLIP: Region-based Language-Image Pretraining.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Unified Contrastive Learning in Image-Text-Label Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Grounded Language-Image Pre-training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

WebQA: Multihop and Multimodal QA.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Open Domain Question Answering with A Unified Knowledge Interface.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

RetGen: A Joint Framework for Retrieval and Grounded Text Generation Modeling.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

ValueNet: A New Dataset for Human Value Driven Dialogue System.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Overview of the Eighth Dialog System Technology Challenge: DSTC8.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

SOLOIST: Building Task Bots at Scale with Transfer Learning and Machine Teaching.
Trans. Assoc. Comput. Linguistics, 2021

Vision-Language Navigation Policy Learning and Adaptation.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Knowledge-Rich Self-Supervised Entity Linking.
CoRR, 2021

Florence: A New Foundation Model for Computer Vision.
CoRR, 2021

CLUES: Few-Shot Learning Evaluation in Natural Language Understanding.
CoRR, 2021

SYNERGY: Building Task Bots at Scale Using Symbolic Knowledge and Machine Teaching.
CoRR, 2021

Open Domain Question Answering over Virtual Documents: A Unified Approach for Data and Text.
CoRR, 2021

LiST: Lite Self-training Makes Efficient Few-shot Learners.
CoRR, 2021

Image Scene Graph Generation (SGG) Benchmark.
CoRR, 2021

Focal Self-attention for Local-Global Interactions in Vision Transformers.
CoRR, 2021

XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation.
CoRR, 2021

Joint Retrieval and Generation Training for Grounded Text Generation.
CoRR, 2021

Adversarial Training as Stackelberg Game: An Unrolled Optimization Approach.
CoRR, 2021

VinVL: Making Visual Representations Matter in Vision-Language Models.
CoRR, 2021

Focal Attention for Long-Range Interactions in Vision Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Adversarial GLUE: A Multi-Task Benchmark for Robustness Evaluation of Language Models.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Few-Shot Learning Evaluation in Natural Language Understanding.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Targeted Adversarial Training for Natural Language Understanding.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Enriching Transformers with Structured Tensor-Product Representations for Abstractive Summarization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Text Editing by Command.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Posterior Differential Regularization with f-divergence for Improving Model Robustness.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Domain-Specific Pretraining for Vertical Search: Case Study on Biomedical Literature.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Data Augmentation for Spoken Language Understanding via Pretrained Language Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

DeBERTa: Decoding-enhanced BERT with Disentangled Attention.
Proceedings of the 9th International Conference on Learning Representations, 2021

Multi-Scale Vision Longformer: A New Vision Transformer for High-Resolution Image Encoding.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

TACo: Token-aware Cascade Contrastive Learning for Video-Text Alignment.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

ARCH: Efficient Adversarial Regularized Training with Caching.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Token-wise Curriculum Learning for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Few-Shot Named Entity Recognition: An Empirical Baseline Study.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

HittER: Hierarchical Transformers for Knowledge Graph Embeddings.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

NICE: Neural Image Commenting with Empathy.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Contrastive Multi-document Question Generation.
Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, 2021

VinVL: Revisiting Visual Representations in Vision-Language Models.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Compositional processing emerges in neural networks solving math problems.
Proceedings of the 43rd Annual Meeting of the Cognitive Science Society, 2021

EmailSum: Abstractive Email Thread Summarization.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Reader-Guided Passage Reranking for Open-Domain Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Generation-Augmented Retrieval for Open-Domain Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

GO FIGURE: A Meta Evaluation of Factuality in Summarization.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

UnitedQA: A Hybrid Approach for Open Domain Question Answering.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

A Controllable Model of Grounded Response Generation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Data Augmentation for Abstractive Query-Focused Multi-Document Summarization.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Challenges in Building Intelligent Open-domain Dialog Systems.
ACM Trans. Inf. Syst., 2020

Few-Shot Named Entity Recognition: A Comprehensive Study.
CoRR, 2020

Self-supervised Pre-training with Hard Examples Improves Visual Representations.
CoRR, 2020

MiniVLM: A Smaller and Faster Vision-Language Model.
CoRR, 2020

Neuro-Symbolic Representations for Video Captioning: A Case for Leveraging Inductive Biases for Vision and Language.
CoRR, 2020

CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search.
CoRR, 2020

VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training.
CoRR, 2020

Robust Conversational AI with Grounded Text Generation.
CoRR, 2020

Very Deep Transformers for Neural Machine Translation.
CoRR, 2020

Evaluation of Text Generation: A Survey.
CoRR, 2020

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-training.
CoRR, 2020

Novel Human-Object Interaction Detection via Adversarial Domain Generalization.
CoRR, 2020

SOLOIST: Few-shot Task-Oriented Dialog with A Single Pre-trained Auto-regressive Model.
CoRR, 2020

Data Augmentation for Spoken Language Understanding via Pretrained Models.
CoRR, 2020

Adversarial Training for Large Neural Language Models.
CoRR, 2020

Guided Dialog Policy Learning without Adversarial Learning in the Loop.
CoRR, 2020

Multi-View Learning for Vision-and-Language Navigation.
CoRR, 2020

The Design and Implementation of XiaoIce, an Empathetic Social Chatbot.
Comput. Linguistics, 2020

Few-Shot Generative Conversational Query Rewriting.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Recent Advances in Conversational Information Retrieval.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Is Your Goal-Oriented Dialog Model Performing Really Well? Empirical Analysis of System-wise Evaluation.
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2020


Sequential Attention GAN for Interactive Image Editing.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Feature Quantization Improves GAN Training.
Proceedings of the 37th International Conference on Machine Learning, 2020

Mapping natural-language problems to formal-language solutions using structured neural representations.
Proceedings of the 37th International Conference on Machine Learning, 2020

UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training.
Proceedings of the 37th International Conference on Machine Learning, 2020

RaCT: Toward Amortized Ranking-Critical Training For Collaborative Filtering.
Proceedings of the 8th International Conference on Learning Representations, 2020

On the Variance of the Adaptive Learning Rate and Beyond.
Proceedings of the 8th International Conference on Learning Representations, 2020

RMM: A Recursive Mental Model for Dialog Navigation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Few-shot Natural Language Generation for Task-Oriented Dialog.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Understanding the Difficulty of Training Transformers.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Guided Dialogue Policy Learning without Adversarial Learning in the Loop.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks.
Proceedings of the Computer Vision - ECCV 2020, 2020

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

DIALOGPT: Large-Scale Generative Pre-training for Conversational Response Generation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

MIND: A Large-scale Dataset for News Recommendation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Conversation Learner - A Machine Teaching Tool for Building Dialog Managers for Task-Oriented Dialog Systems.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020

SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

MagGAN: High-Resolution Face Attribute Editing with Mask-Guided Generative Adversarial Network.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

Unified Vision-Language Pre-Training for Image Captioning and VQA.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Complementary Auxiliary Classifiers for Label-Conditional Text Generation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

What Makes A Good Story? Designing Composite Rewards for Visual Storytelling.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

PIQA: Reasoning about Physical Commonsense in Natural Language.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Neural Approaches to Conversational AI.
Found. Trends Inf. Retr., 2019

The Eighth Dialog System Technology Challenge.
CoRR, 2019

Unsupervised Common Question Generation from Multiple Documents using Reinforced Contrastive Coordinator.
CoRR, 2019

HUBERT Untangles BERT to Improve Transfer across NLP Tasks.
CoRR, 2019

Enhancing the Transformer with Explicit Relational Encoding for Math Problem Solving.
CoRR, 2019

Natural- to formal-language generation using Tensor Product Representations.
CoRR, 2019

Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators.
CoRR, 2019

A Hybrid Neural Network Model for Commonsense Reasoning.
CoRR, 2019

Towards Amortized Ranking-Critical Training for Collaborative Filtering.
CoRR, 2019

Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding.
CoRR, 2019

Consistent Dialogue Generation with Self-supervised Feature Learning.
CoRR, 2019

Dialog System Technology Challenge 7.
CoRR, 2019

Unified Language Model Pre-training for Natural Language Understanding and Generation.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Unsupervised Deep Structured Semantic Models for Commonsense Reasoning.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Jointly Optimizing Diversity and Relevance in Neural Response Generation.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Cyclical Annealing Schedule: A Simple Approach to Mitigating KL Vanishing.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Towards an Open-Domain Dialog System.
Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval, 2019

Adversarial Domain Adaptation for Machine Reading Comprehension.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Robust Navigation with Language Pretraining and Stochastic Sampling.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

TIGEr: Text-to-Image Grounding for Image Caption Evaluation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

REO-Relevance, Extraness, Omission: A Fine-grained Evaluation for Image Captioning.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Structuring Latent Spaces for Stylized Response Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Implicit Deep Latent Variable Models for Text Generation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Object-Driven Text-To-Image Synthesis via Adversarial Training.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

StoryGAN: A Sequential Conditional GAN for Story Visualization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

A Hybrid Retrieval-Generation Neural Conversation Model.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

DoubleTransfer at MEDIQA 2019: Multi-Source Transfer Learning for Natural Language Understanding in the Medical Domain.
Proceedings of the 18th BioNLP Workshop and Shared Task, 2019

Budgeted Policy Learning for Task-Oriented Dialogue Systems.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Microsoft Icecaps: An Open-Source Toolkit for Conversation Modeling.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Towards Generating Long and Coherent Text with Multi-Level Latent Variable Models.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Conversing by Reading: Contentful Neural Conversation with On-demand Machine Reading.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Multi-Task Deep Neural Networks for Natural Language Understanding.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

ConvLab: Multi-Domain End-to-End Dialog System Platform.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Multi-step Reasoning via Recurrent Dual Attention for Visual Dialog.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Interactive Semantic Parsing for If-Then Recipes via Hierarchical Reinforcement Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Switch-Based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Sequential Attention GAN for Interactive Image Editing via Dialogue.
CoRR, 2018

A bird's-eye view on coherence, and a worm's-eye view on cohesion.
CoRR, 2018

ReCoRD: Bridging the Gap between Human and Machine Commonsense Reading Comprehension.
CoRR, 2018

Stochastic Answer Networks for SQuAD 2.0.
CoRR, 2018

Multi-Task Learning for Machine Reading Comprehension.
CoRR, 2018

Microsoft Dialogue Challenge: Building End-to-End Task-Completion Dialogue Systems.
CoRR, 2018

The Neural Painter: Multi-Turn Image Generation.
CoRR, 2018

Stochastic Answer Networks for Natural Language Inference.
CoRR, 2018

Integrating planning for task-completion dialogue policy learning.
CoRR, 2018

Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Generating Informative and Diverse Conversational Responses via Adversarial Information Maximization.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

M-Walk: Learning to Walk over Graphs using Monte Carlo Tree Search.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Discourse-Aware Neural Rewards for Coherent Text Generation.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

ReinforceWalk: Learning to Walk in Graph with Monte Carlo Tree Search.
Proceedings of the 6th International Conference on Learning Representations, 2018

Adversarial Advantage Actor-Critic Model for Task-Completion Dialogue Policy Learning.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Subgoal Discovery for Hierarchical Dialogue Policy Learning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Language-Based Image Editing With Recurrent Attentive Models.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Stochastic Answer Networks for Machine Reading Comprehension.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

A Knowledge-Grounded Neural Conversation Model.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Towards Human-level Machine Reading Comprehension: Reasoning and Inference with Multiple Strategies.
CoRR, 2017

Composite Task-Completion Dialogue System via Hierarchical Deep Reinforcement Learning.
CoRR, 2017

Investigation of Language Understanding Impact for Reinforcement Learning Based Dialogue Systems.
CoRR, 2017

End-to-End Task-Completion Neural Dialogue Systems.
CoRR, 2017

Modeling Large-Scale Structured Relationships with Shared Memory for Knowledge Base Completion.
Proceedings of the 2nd Workshop on Representation Learning for NLP, 2017

An Empirical Analysis of Multiple-Turn Reasoning Strategies in Reading Comprehension Tasks.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Image-Grounded Conversations: Multimodal Context for Natural Question and Response Generation.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Multi-Task Learning for Speaker-Role Adaptation in Neural Conversation Models.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

End-to-End Task-Completion Neural Dialogue Systems.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Open-Domain Neural Dialogue Systems.
Proceedings of the IJCNLP 2017, Taipei, Taiwan, November 27, 2017

TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency.
Proceedings of the 5th International Conference on Learning Representations, 2017

End-to-end joint learning of natural language understanding and dialogue manager.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Character-level deep conflation for business data analytics.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Composite Task-Completion Dialogue Policy Learning via Hierarchical Deep Reinforcement Learning.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Semantic Compositional Networks for Visual Captioning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

StyleNet: Generating Attractive Visual Captions with Styles.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep Context Modeling for Web Query Entity Disambiguation.
Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, 2017

A Nested Attention Neural Hybrid Model for Grammatical Error Correction.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Basic Reasoning with Tensor Product Representations.
CoRR, 2016

Implicit ReasoNet: Modeling Large-Scale Structured Relationships with Shared Memory.
CoRR, 2016

Efficient Exploration for Dialog Policy Learning with Deep BBQ Networks & Replay Buffer Spiking.
CoRR, 2016

Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear.
CoRR, 2016

A User Simulator for Task-Completion Dialogues.
CoRR, 2016

A Persona-Based Neural Conversation Model.
CoRR, 2016

Reasoning in Vector Space: An Exploratory Study of Question Answering.
Proceedings of the 4th International Conference on Learning Representations, 2016

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads.
CoRR, 2016

End-to-End Reinforcement Learning of Dialogue Agents for Information Access.
CoRR, 2016

Knowledge as a Teacher: Knowledge-Guided Structural Attention Networks.
CoRR, 2016

Unsupervised Learning of Predictors from Unpaired Input-Output Samples.
CoRR, 2016

Query Understanding for Search on All Devices at WSDM 2016.
Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, 2016

Syntax or semantics? knowledge-guided joint semantic frame parsing.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Learning for Efficient Supervised Query Expansion via Two-stage Feature Selection.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

ReasoNet: Learning to Stop Reading in Machine Comprehension.
Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), 2016

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset.
Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), 2016

A Diversity-Promoting Objective Function for Neural Conversation Models.
Proceedings of the NAACL HLT 2016, 2016

Multi-Domain Joint Semantic Frame Parsing Using Bi-Directional RNN-LSTM.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

End-to-End Memory Networks with Knowledge Carryover for Multi-Turn Spoken Language Understanding.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Interpreting the prediction process of a deep network constructed from supervised topic models.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Deep Reinforcement Learning for Dialogue Generation.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Bi-directional Attention with Agreement for Dependency Parsing.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

MS-Celeb-1M: Challenge of Recognizing One Million Celebrities in the Real World.
Proceedings of the Imaging and Multimedia Analytics in a Web and Mobile World 2016, 2016

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Stacked Attention Networks for Image Question Answering.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

A Persona-Based Neural Conversation Model.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Deep Reinforcement Learning with a Natural Language Action Space.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Embedding Entities and Relations for Learning and Inference in Knowledge Bases.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Deep Sentence Embedding Using the Long Short Term Memory Network: Analysis and Application to Information Retrieval.
CoRR, 2015

Recurrent Reinforcement Learning: A Hybrid Approach.
CoRR, 2015

Deep Reinforcement Learning with an Unbounded Action Space.
CoRR, 2015

End-to-end Learning of Latent Dirichlet Allocation by Mirror-Descent Back Propagation.
CoRR, 2015

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Deep Learning and Continuous Representations for Natural Language Processing.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

A Neural Network Approach to Context-Sensitive Generation of Conversational Responses.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

A Deep Embedding Model for Co-occurrence Learning.
Proceedings of the IEEE International Conference on Data Mining Workshop, 2015

From captions to visual concepts and back.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

deltaBLEU: A Discriminative Metric for Generation Tasks with Intrinsically Diverse Targets.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Learning Multi-Relational Semantics Using Neural-Embedding Models.
CoRR, 2014

Semantic Modelling with Long-Short-Term Memory for Information Retrieval.
CoRR, 2014

Learning semantic representations using convolutional neural networks for web search.
Proceedings of the 23rd International World Wide Web Conference, 2014

Modeling Interestingness with Deep Neural Networks.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Large-scale Expected BLEU Training of Phrase-based Reordering Models.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Minimum Translation Modeling with Recurrent Neural Networks.
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014

A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

Learning Continuous Phrase Representations for Translation Modeling.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Decoder Integration and Expected BLEU Training for Recurrent Neural Network Language Models.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Online Classification Using a Voted RDA Method.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Learning Semantic Representations for the Phrase Translation Model.
CoRR, 2013

Query expansion using path-constrained random walks.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Beyond Left-to-Right: Multiple Decomposition Structures for SMT.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

Training MRF-Based Phrase Translation Models using Gradient Ascent.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

End-to-end learning of parsing models for information retrieval.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Deep stacking networks for information retrieval.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Learning deep structured semantic models for web search using clickthrough data.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012
Modeling click-through based word-pairs for web search.
Proceedings of the 21st World Wide Web Conference, 2012

MSR SPLAT, a language analysis toolkit.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2012

A Unified Approach to Transliteration-based Text Input with Online Spelling Correction.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Learning Lexicon Models from Search Logs for Query Expansion.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Combining Signals for Cross-Lingual Relevance Feedback.
Proceedings of the Information Retrieval Technology, 2012

Translingual Mining from Text Data.
Proceedings of the Mining Text Data, 2012

2011
Clickthrough-based latent semantic models for web search.
Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2011

Domain Adaptation via Pseudo In-Domain Data Selection.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

2010
Adapting boosting for information retrieval measures.
Inf. Retr., 2010

Optimizing two stage bigram language models for IR.
Proceedings of the 19th International Conference on World Wide Web, 2010

Exploring web scale language models for search query processing.
Proceedings of the 19th International Conference on World Wide Web, 2010

Multi-style language model for web scale information retrieval.
Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2010

The MSRA machine translation system for IWSLT 2010.
Proceedings of the 2010 International Workshop on Spoken Language Translation, 2010

A Large Scale Ranker-Based System for Search Query Spelling Correction.
Proceedings of the COLING 2010, 2010

A comparison of unsupervised methods for Part-of-Speech Tagging in Chinese.
Proceedings of the COLING 2010, 2010

Clickthrough-based translation models for web search: from word models to phrase models.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Learning Phrase-Based Spelling Error Models from Clickthrough Data.
Proceedings of the ACL 2010, 2010

2009
Improved Monolingual Hypothesis Alignment for Machine Translation System Combination.
ACM Trans. Asian Lang. Inf. Process., 2009

Smoothing clickthrough data for web search ranking.
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009

Discovery of Term Variation in Japanese Web Search Queries.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Model Adaptation via Model Interpolation and Boosting for Web Search Ranking.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

2008
Selecting good expansion terms for pseudo-relevance feedback.
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2008

A Web-based English Proofing System for English as a Second Language Users.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008

Using Contextual Speller Techniques and Language Modeling for ESL Error Correction.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008

Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

A comparison of Bayesian estimators for unsupervised Hidden Markov Model POS taggers.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

Bayesian Semi-Supervised Chinese Word Segmentation for Statistical Machine Translation.
Proceedings of the COLING 2008, 2008

2007
A system to mine large-scale bilingual dictionaries from monolingual web pages.
Proceedings of Machine Translation Summit XI: Papers, 2007

Scalable training of L1-regularized log-linear models.
Proceedings of the Machine Learning, 2007

Compressing Trigram Language Models With Golomb Coding.
Proceedings of the EMNLP-CoNLL 2007, 2007

Extending query translation to cross-language query expansion with markov chain models.
Proceedings of the Sixteenth ACM Conference on Information and Knowledge Management, 2007

A Comparative Study of Parameter Estimation Methods for Statistical Natural Language Processing.
Proceedings of the ACL 2007, 2007

2006
An empirical study on language model adaptation.
ACM Trans. Asian Lang. Inf. Process., 2006

Statistical query translation models for cross-language information retrieval.
ACM Trans. Asian Lang. Inf. Process., 2006

A study of statistical models for query translation: finding a good unit of translation.
SIGIR 2006: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2006

An Information-Theoretic Approach to Automatic Evaluation of Summaries.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

A Comparative Study of Discriminative Methods for Reranking LVCSR N-Best Hypotheses in Domain Adaptation and Generalization.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Automatic Acquisition of Chinese-English Parallel Corpus from the Web.
Proceedings of the Advances in Information Retrieval, 2006

A Supervised Learning Approach to Entity Search.
Proceedings of the Information Retrieval Technology, 2006

A DOM Tree Alignment Model for Mining Parallel Data from the Web.
Proceedings of the ACL 2006, 2006

Approximation Lasso Methods for Language Modeling.
Proceedings of the ACL 2006, 2006

2005
Chinese Word Segmentation and Named Entity Recognition: A Pragmatic Approach.
Comput. Linguistics, 2005

Linear discriminant model for information retrieval.
SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

A Comparative Study on Language Model Adaptation Techniques Using New Evaluation Metrics.
Proceedings of the HLT/EMNLP 2005, 2005

Minimum Sample Risk Methods for Language Modeling.
Proceedings of the HLT/EMNLP 2005, 2005

Transformation Based Chinese Entity Detection and Tracking.
Proceedings of the Natural Language Processing - IJCNLP 2005, Second International Joint Conference, Jeju Island, Republic of Korea, October 11-13, 2005

An Empirical Study on Language Model Adaptation Using a Metric of Domain Similarity.
Proceedings of the Natural Language Processing, 2005

Person resolution in person search results: WebHawk.
Proceedings of the 2005 ACM CIKM International Conference on Information and Knowledge Management, Bremen, Germany, October 31, 2005

2004
Introduction to the special issue on statistical language modeling.
ACM Trans. Asian Lang. Inf. Process., 2004

Dependence language model for information retrieval.
SIGIR 2004: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004

The Use of SVM for Chinese New Word Identification.
Proceedings of the Natural Language Processing, 2004

Long Distance Dependency in Language Modeling: An Empirical Study.
Proceedings of the Natural Language Processing, 2004

Adaptive Chinese Word Segmentation.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, 2004

Chinese Chunking with Another Type of Spec.
Proceedings of the Third Workshop on Chinese Language Processing, 2004

A Semi-Supervised Approach to Build Annotated Corpus for Chinese Named Entity Recognition.
Proceedings of the Third Workshop on Chinese Language Processing, 2004

2003
A Class-based Language Model Approach to Chinese Named Entity Identification.
Int. J. Comput. Linguistics Chin. Lang. Process., 2003

Training data optimization for language model adaptation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Unsupervised Learning of Dependency Structure for Language Modeling.
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 2003

Improved Source-Channel Models for Chinese Word Segmentation.
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 2003

Single Character Chinese Named Entity Recognition.
Proceedings of the Second Workshop on Chinese Language Processing, 2003

Unsupervised Training for Overlapping Ambiguity Resolution in Chinese Word Segmentation.
Proceedings of the Second Workshop on Chinese Language Processing, 2003

2002
Toward a unified approach to statistical language modeling for Chinese.
ACM Trans. Asian Lang. Inf. Process., 2002

Improving Encarta Search Engine Performance by Mining User Logs.
Int. J. Pattern Recognit. Artif. Intell., 2002

Resolving query translation ambiguity using a decaying co-occurrence model and syntactic dependence relations.
Proceedings of the SIGIR 2002: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002

NTCIR-3 CLIR Experiments at MSRA.
Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, 2002

Improving language modeling by combining heterogeneous corpora.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Exploiting Headword Dependency and Predictive Clustering for Language Modeling.
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, 2002

Chinese Named Entity Identification Using Class-based Language Model.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

Improving Language Model Size Reduction using Better Pruning Criteria.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002

Exploring Asymmetric Clustering for Statistical Language Modeling.
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, 2002

Finding the Better Indexing units for Chinese Information Retrieval.
Proceedings of the First Workshop on Chinese Language Processing, 2002

2001
Improving the Effectiveness of Information Retrieval with Clustering and Fusion.
Int. J. Comput. Linguistics Chin. Lang. Process., 2001

The Use of Clustering Techniques for Language Modeling-Application to Asian Language.
Int. J. Comput. Linguistics Chin. Lang. Process., 2001

TREC-10 Web Track Experiments at MSRA.
Proceedings of The Tenth Text REtrieval Conference, 2001

Improving Query Translation for Cross-Language Information Retrieval Using Statistical Models.
Proceedings of the SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001

Mining Generalized Query Patterns from Web Logs.
Proceedings of the 34th Annual Hawaii International Conference on System Sciences (HICSS-34), 2001

2000
TREC-9 CLIR Experiments at MSRCN.
Proceedings of The Ninth Text REtrieval Conference, 2000

Lexicon Optimization for Chinese Language Modeling.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

On the use of words and n-grams for Chinese information retrieval.
Proceedings of the Fifth International Workshop on Information Retrieval with Asian Languages, 2000, Hong Kong, China, September 30, 2000

Language model size reduction by pruning and clustering.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

N-gram distribution based language model adaptation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A unified approach to statistical language modeling for Chinese.
Proceedings of the IEEE International Conference on Acoustics, 2000

PENS: A Machine-aided English Writing System for Chinese Users.
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, 2000

Distribution-Based Pruning of Backoff Language Models.
Proceedings of the 38th Annual Meeting of the Association for Computational Linguistics, 2000

