Xiaodong He

ORCID: 0000-0002-9463-9168

Affiliations:
  • JD AI Research, Beijing, China
  • Microsoft Corporation, Redmond, WA, USA (2003 - 2018)
  • University of Missouri-Columbia, MO, USA (PhD 2003)
  • Chinese Academy of Sciences, Beijing, China (1996 - 1999)
  • Tsinghua University, Beijing, China (former)


According to our database, Xiaodong He authored at least 276 papers between 1999 and 2024.

Bibliography

2024
An efficient confusing choices decoupling framework for multi-choice tasks over texts.
Neural Comput. Appl., January, 2024

MuJo-SF: Multimodal Joint Slot Filling for Attribute Value Prediction of E-Commerce Commodities.
IEEE Trans. Multim., 2024

Operation-Augmented Numerical Reasoning for Question Answering.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Empowering Embodied Manipulation: A Bimanual-Mobile Robot Manipulation Dataset for Household Tasks.
CoRR, 2024

Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning.
CoRR, 2024

MuEP: A Multimodal Benchmark for Embodied Planning with Foundation Models.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Embodied Multi-Modal Agent trained by an LLM from a Parallel TextWorld.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

POCE: Primal Policy Optimization with Conservative Estimation for Multi-constraint Offline Reinforcement Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Prosody Modelling With Pre-Trained Cross-Utterance Representations for Improved Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

A Group Fairness Lens for Large Language Models.
CoRR, 2023

MaskedSpeech: Context-aware Speech Synthesis with Masking Strategy.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Leveraging Label Information for Multimodal Emotion Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

OTF: Optimal Transport based Fusion of Supervised and Self-Supervised Learning Models for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

SegCLIP: Patch Aggregation with Learnable Centers for Open-Vocabulary Semantic Segmentation.
Proceedings of the International Conference on Machine Learning, 2023

Improving Disfluency Detection with Multi-Scale Self Attention and Contrastive Learning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

UFO2: A Unified Pre-Training Framework for Online and Offline Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Composable Text Controls in Latent Space with ODEs.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

POSPAN: Position-Constrained Span Masking for Language Model Pre-training.
Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, 2023

Dialog-Post: Multi-Level Self-Supervised Objectives and Hierarchical Model for Dialogue Post-Training.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

MoNET: Tackle State Momentum via Noise-Enhanced Training for Dialogue State Tracking.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Mars: Modeling Context & State Representations with Contrastive Learning for End-to-End Task-Oriented Dialog.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

AUGUST: an Automatic Generation Understudy for Synthesizing Conversational Recommendation Datasets.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

DiffusEmp: A Diffusion Model-Based Framework with Multi-Grained Control for Empathetic Response Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

MNER-QG: An End-to-End MRC Framework for Multimodal Named Entity Recognition with Query Grounding.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Visual Question Answering - From Theory to Application
Advances in Computer Vision and Pattern Recognition, Springer, ISBN: 978-981-19-0963-4, 2022

Multi-Speaker Multi-Style Speech Synthesis with Timbre and Style Disentanglement.
CoRR, 2022

P³LM: Probabilistically Permuted Prophet Language Modeling for Generative Pre-Training.
CoRR, 2022

MuGER²: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering.
CoRR, 2022

Mars: Semantic-aware Contrastive Learning for End-to-End Task-Oriented Dialog.
CoRR, 2022

Composable Text Control Operations in Latent Space with Ordinary Differential Equations.
CoRR, 2022

A Two-stage User Intent Detection Model on Complicated Utterances with Multi-task Learning.
Proceedings of the Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25, 2022

DialCSP: A Two-Stage Attention-Based Model for Customer Satisfaction Prediction in E-commerce Customer Service.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

MFDG: A Multi-Factor Dialogue Graph Model for Dialogue Intent Classification.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

Overview of the NLPCC 2022 Shared Task on Multimodal Product Summarization.
Proceedings of the Natural Language Processing and Chinese Computing, 2022

OPERA: Operation-Pivoted Discrete Reasoning over Text.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Label Anchored Contrastive Learning for Language Understanding.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

LUNA: Learning Slot-Turn Alignment for Dialogue State Tracking.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

BORT: Back and Denoising Reconstruction for End-to-End Task-Oriented Dialog.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

E-ConvRec: A Large-Scale Conversational Recommendation Dataset for E-Commerce Customer Service.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Cross-modal Transfer Learning via Multi-grained Alignment for End-to-End Spoken Language Understanding.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

SCaLa: Supervised Contrastive Learning for End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Learning to Generate Poetic Chinese Landscape Painting with Calligraphy.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Cross-modal Contrastive Distillation for Instructional Activity Anticipation.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

SE-GAN: Skeleton Enhanced Gan-Based Model for Brush Handwriting Font Generation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Gated Multimodal Fusion with Contrastive Learning for Turn-Taking Prediction in Human-Robot Dialogue.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Building Robust Spoken Language Understanding by Cross Attention Between Phoneme Sequence and ASR Hypothesis.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

UniRPG: Unified Discrete Reasoning over Table and Text as Program Generation.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

JDDC 2.1: A Multimodal Chinese Dialogue Dataset with Joint Tasks of Query Rewriting, Response Generation, Discourse Parsing, and Summarization.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Correctable-DST: Mitigating Historical Context Mismatch between Training and Inference for Improved Dialogue State Tracking.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

MuGER2: Multi-Granularity Evidence Retrieval and Reasoning for Hybrid Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

PRINCE: Prefix-Masked Decoding for Knowledge Enhanced Sequence-to-Sequence Pre-Training.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

P3LM: Probabilistically Permuted Prophet Language Modeling for Generative Pre-Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

Beyond QA: 'Heuristic QA' Strategies in JIMI.
Proceedings of the Database Systems for Advanced Applications, 2022

Tracking Satisfaction States for Customer Satisfaction Prediction in E-commerce Service Chatbots.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Few-Shot Table Understanding: A Benchmark Dataset and Pre-Training Baseline.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

AutoQGS: Auto-Prompt for Low-Resource Knowledge-based Question Generation from SPARQL.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Legal Charge Prediction via Bilinear Attention Network.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Fine- and Coarse-Granularity Hybrid Self-Attention for Efficient BERT.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

A Multi-Factor Classification Framework for Completing Users' Fuzzy Queries (Student Abstract).
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

SimCTC: A Simple Contrast Learning Method of Text Clustering (Student Abstract).
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
CUGE: A Chinese Language Understanding and Generation Evaluation Benchmark.
CoRR, 2021

SCaLa: Supervised Contrastive Learning for End-to-End Automatic Speech Recognition.
CoRR, 2021

The JDDC 2.0 Corpus: A Large-Scale Multimodal Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service.
CoRR, 2021

Joint System-Wise Optimization for Pipeline Goal-Oriented Dialog System.
CoRR, 2021

Conversational AI Systems for Social Good: Opportunities and Challenges.
CoRR, 2021

The practice of speech and language processing in China.
Commun. ACM, 2021

EviDR: Evidence-Emphasized Discrete Reasoning for Reasoning Machine Reading Comprehension.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

CUSTOM: Aspect-Oriented Product Summarization for E-Commerce.
Proceedings of the Natural Language Processing and Chinese Computing, 2021

SGG: Learning to Select, Guide, and Generate for Keyphrase Generation.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Graph Ensemble Learning over Multiple Dependency Trees for Aspect-level Sentiment Classification.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Learning to Compose Stylistic Calligraphy Artwork with Emotions.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

ViDA-MAN: Visual Dialog with Digital Humans.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Neural Kalman Filtering for Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Improving Prosody Modelling with Cross-Utterance Bert Embeddings for End-to-End Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Dian: Duration Informed Auto-Regressive Network for Voice Cloning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Conversational Query Rewriting with Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

RoR: Read-over-Read for Long Document Machine Reading Comprehension.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Learn to Copy from the Copying History: Correlational Copy Network for Abstractive Summarization.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Incremental Learning for End-to-End Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

RevCore: Review-Augmented Conversational Recommendation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications.
IEEE J. Sel. Top. Signal Process., 2020

Introduction to the Special Issue on Deep Learning for Multi-Modal Intelligence Across Speech, Language, Vision, and Heterogeneous Signals.
IEEE J. Sel. Top. Signal Process., 2020

Graph Sequential Network for Reasoning over Sequences.
CoRR, 2020

AIIS: The SIGIR 2020 Workshop on Applied Interactive Information Systems.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020

Enhancing Multi-turn Dialogue Modeling with Intent Information for E-Commerce Customer Service.
Proceedings of the Natural Language Processing and Chinese Computing, 2020

Group Contextual Encoding for 3D Point Clouds.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

MaLiang: An Emotion-driven Chinese Calligraphy Artwork Composition System.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

The JDDC Corpus: A Large-Scale Multi-Turn Chinese Dialogue Dataset for E-commerce Customer Service.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

The JD AI Speaker Verification System for the FFSVC 2020 Challenge.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Efficient WaveGlow: An Improved WaveGlow Vocoder with Enhanced Speed.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Enhancing Automated Essay Scoring Performance via Cohesion Measurement and Combination of Regression and Ranking.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Learning to Predict Charges for Legal Judgment via Self-Attentive Capsule Network.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, Santiago de Compostela, Spain, August 29 - September 8, 2020

On the Faithfulness for E-commerce Product Summarization.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Learning to Decouple Relations: Few-Shot Relation Classification with Entity-Guided Attention and Confusion-Aware Training.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Multimodal Sentence Summarization via Multimodal Selective Encoding.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Self-Attention Guided Copy Mechanism for Abstractive Summarization.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Select, Answer and Explain: Interpretable Multi-Hop Reading Comprehension over Multiple Documents.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Keywords-Guided Abstractive Sentence Summarization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Aspect-Aware Multimodal Summarization for Chinese E-Commerce Products.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Zero-Shot Text-to-SQL Learning with Auxiliary Task.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Another AI? Artificial Imagination for Artistic Mind Map Generation.
Int. J. Multim. Data Eng. Manag., 2019

Selective Attention Based Graph Convolutional Networks for Aspect-Level Sentiment Classification.
CoRR, 2019

Relation Module for Non-answerable Prediction on Question Answering.
CoRR, 2019

Multiple instance learning with graph neural networks.
CoRR, 2019

Towards adversarial learning of speaker-invariant representation for speech emotion recognition.
CoRR, 2019

Multi-Level Coupling Network for Non-IID Sequential Recommendation.
IEEE Access, 2019

Automated Thematic and Emotional Modern Chinese Poetry Composition.
Proceedings of the Natural Language Processing and Chinese Computing, 2019

Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

From Knowledge Map to Mind Map: Artificial Imagination.
Proceedings of the 2nd IEEE Conference on Multimedia Information Processing and Retrieval, 2019

Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speaker Diarization with Lexical Information.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-Stride Self-Attention for Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Knowledgeable Storyteller: A Commonsense-Driven Generative Model for Visual Storytelling.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Mappa Mundi: An Interactive Artistic Mind Map Generator with Artificial Imagination.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Discrete Trust-aware Matrix Factorization for Fast Recommendation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Dynamic Item Block and Prediction Enhancing Block for Sequential Recommendation.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Deep Speaker Embedding Learning with Multi-level Pooling for Text-independent Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Object-Driven Text-To-Image Synthesis via Adversarial Training.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Relation Module for Non-Answerable Predictions on Reading Comprehension.
Proceedings of the 23rd Conference on Computational Natural Language Learning, 2019

Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

End-to-End Structure-Aware Convolutional Networks for Knowledge Base Completion.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Hierarchically Structured Reinforcement Learning for Topically Coherent Visual Story Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Attentive Tensor Product Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
From Eliza to XiaoIce: challenges and opportunities with social chatbots.
Frontiers Inf. Technol. Electron. Eng., 2018

The Neural Painter: Multi-Turn Image Generation.
CoRR, 2018

Generating Diverse and Accurate Visual Captions by Comparative Adversarial Learning.
CoRR, 2018

Attentive Tensor Product Learning for Language Generation and Grammar Parsing.
CoRR, 2018

Natural Language to Structured Query Generation via Meta-Learning.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Tensor Product Generation Networks for Deep NLP Modeling.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Deep Communicating Agents for Abstractive Summarization.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Discourse-Aware Neural Rewards for Coherent Text Generation.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

On the Discrimination-Generalization Tradeoff in GANs.
Proceedings of the 6th International Conference on Learning Representations, 2018

Constrained Convolutional-Recurrent Networks to Improve Speech Quality with Low Impact on Recognition Accuracy.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Stacked Cross Attention for Image-Text Matching.
Proceedings of the Computer Vision - ECCV 2018, 2018

AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Tips and Tricks for Visual Question Answering: Learnings From the 2017 Challenge.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

CleanNet: Transfer Learning for Scalable Image Classifier Training With Label Noise.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Deep Reinforcement Learning for NLP.
Proceedings of ACL 2018, Melbourne, Australia, July 15-20, 2018, Tutorial Abstracts, 2018

Question-Answering with Grammatically-Interpretable Representations.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Deep Learning for Image-to-Text Generation: A Technical Overview.
IEEE Signal Process. Mag., 2017

Editorial.
Mach. Transl., 2017

Reinforcement Learning To Adapt Speech Enhancement to Instantaneous Input Signal Quality.
CoRR, 2017

A Neural-Symbolic Approach to Natural Language Tasks.
CoRR, 2017

Tensor Product Generation Networks.
CoRR, 2017

Multiple-Kernel Based Vehicle Tracking Using 3D Deformable Model and Camera Self-Calibration.
CoRR, 2017

Deep Learning of Grammatically-Interpretable Representations Through Question-Answering.
CoRR, 2017

Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads.
CoRR, 2017

Bottom-Up and Top-Down Attention for Image Captioning and VQA.
CoRR, 2017

Adversarial Ranking for Language Generation.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Character-level deep conflation for business data analytics.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

Two-Stage Synthesis Networks for Transfer Learning in Machine Comprehension.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Learning Generic Sentence Representations Using Convolutional Neural Networks.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Semantic Compositional Networks for Visual Captioning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

StyleNet: Generating Attractive Visual Captions with Styles.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep Learning with Low Precision by Half-Wave Gaussian Quantization.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Deep Sentence Embedding Using Long Short-Term Memory Networks: Analysis and Application to Information Retrieval.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Rich Image Captioning in the Wild.
CoRR, 2016

Basic Reasoning with Tensor Product Representations.
CoRR, 2016

Generating Natural Questions About an Image.
CoRR, 2016

A Corpus and Evaluation Framework for Deeper Understanding of Commonsense Stories.
CoRR, 2016

Reasoning in Vector Space: An Exploratory Study of Question Answering.
Proceedings of the 4th International Conference on Learning Representations, 2016

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting and Tracking Popular Discussion Threads.
CoRR, 2016

Unsupervised Learning of Sentence Representations using Convolutional Neural Networks.
CoRR, 2016

Unsupervised Learning of Predictors from Unpaired Input-Output Samples.
CoRR, 2016

Table Cell Search for Question Answering.
Proceedings of the 25th International Conference on World Wide Web, 2016

Multi-Rate Deep Learning for Temporal Recommendation.
Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval, 2016

Hierarchical Attention Networks for Document Classification.
Proceedings of the NAACL HLT 2016, 2016

A Corpus and Cloze Evaluation for Deeper Understanding of Commonsense Stories.
Proceedings of the NAACL HLT 2016, 2016


Interpreting the prediction process of a deep network constructed from supervised topic models.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Zero-shot learning of intent embeddings for expansion by convolutional deep structured semantic models.
Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Enhancing Retrieval and Ranking Performance for Media Search Engine by Deep Learning.
Proceedings of the 49th Hawaii International Conference on System Sciences, 2016

Deep Reinforcement Learning with a Combinatorial Action Space for Predicting Popular Reddit Threads.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Character-Level Question Answering with Attention.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Bi-directional Attention with Agreement for Dependency Parsing.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

MS-Celeb-1M: Challenge of Recognizing One Million Celebrities in the Real World.
Proceedings of the Imaging and Multimedia Analytics in a Web and Mobile World 2016, 2016

MS-Celeb-1M: A Dataset and Benchmark for Large-Scale Face Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Stacked Attention Networks for Image Question Answering.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Rich Image Captioning in the Wild.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2016

Generating Natural Questions About an Image.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Deep Reinforcement Learning with a Natural Language Action Space.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Using Recurrent Neural Networks for Slot Filling in Spoken Language Understanding.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Introduction to the Special Section on Continuous Space and Related Methods in Natural Language Processing.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Embedding Entities and Relations for Learning and Inference in Knowledge Bases.
Proceedings of the 3rd International Conference on Learning Representations, 2015

Deep Sentence Embedding Using the Long Short Term Memory Network: Analysis and Application to Information Retrieval.
CoRR, 2015

Recurrent Reinforcement Learning: A Hybrid Approach.
CoRR, 2015

Deep Reinforcement Learning with an Unbounded Action Space.
CoRR, 2015

End-to-end Learning of Latent Dirichlet Allocation by Mirror-Descent Back Propagation.
CoRR, 2015

A Multi-View Deep Learning Approach for Cross Domain User Modeling in Recommendation Systems.
Proceedings of the 24th International Conference on World Wide Web, 2015

Data Selection With Fewer Words.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015

End-to-end Learning of LDA by Mirror-Descent Back Propagation over a Deep Architecture.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Deep Learning and Continuous Representations for Natural Language Processing.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Representation Learning Using Multi-Task Deep Neural Networks for Semantic Classification and Information Retrieval.
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Generalized Learning of Neural Network Based Semantic Similarity Models and Its Application in Movie Search.
Proceedings of the IEEE International Conference on Data Mining Workshop, 2015

A Deep Embedding Model for Co-occurrence Learning.
Proceedings of the IEEE International Conference on Data Mining Workshop, 2015

From captions to visual concepts and back.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Detecting actionable items in meetings by convolutional deep structured semantic models.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Semantic Parsing via Staged Query Graph Generation: Question Answering with Knowledge Base.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Language Models for Image Captioning: The Quirks and What Works.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Learning Multi-Relational Semantics Using Neural-Embedding Models.
CoRR, 2014

Semantic Modelling with Long-Short-Term Memory for Information Retrieval.
CoRR, 2014

Learning semantic representations using convolutional neural networks for web search.
Proceedings of the 23rd International World Wide Web Conference, 2014

Adapting deep RankNet for personalized search.
Proceedings of the Seventh ACM International Conference on Web Search and Data Mining, 2014

Modeling action-level satisfaction for search task satisfaction prediction.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Modeling Interestingness with Deep Neural Networks.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval.
Proceedings of the 23rd ACM International Conference on Information and Knowledge Management, 2014

Semantic Parsing for Single-Relation Question Answering.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Learning Continuous Phrase Representations for Translation Modeling.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Optimization Algorithms and Applications for Speech and Language Processing.
IEEE Trans. Speech Audio Process., 2013

Speech-Centric Information Processing: An Optimization-Oriented Approach.
Proc. IEEE, 2013

Learning Semantic Representations for the Phrase Translation Model.
CoRR, 2013

Enhancing personalized search by mining and modeling task behavior.
Proceedings of the 22nd International World Wide Web Conference, 2013

Learning to extract cross-session search tasks.
Proceedings of the 22nd International World Wide Web Conference, 2013

Personalized ranking model adaptation for web search.
Proceedings of the 36th International ACM SIGIR conference on research and development in Information Retrieval, 2013

Training MRF-Based Phrase Translation Models using Gradient Ascent.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2013

MSR-FBK IWSLT 2013 SLT system description.
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

Investigation of recurrent-neural-network architectures and learning methods for spoken language understanding.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Random features for Kernel Deep Convex Network.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multi-style adaptive training for robust cross-lingual spoken language understanding.
Proceedings of the IEEE International Conference on Acoustics, 2013

End-to-end learning of parsing models for information retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2013

Recent advances in deep learning for speech research at Microsoft.
Proceedings of the IEEE International Conference on Acoustics, 2013

Deep stacking networks for information retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2013

Learning deep structured semantic models for web search using clickthrough data.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012
Review of Hypothesis Alignment Algorithms for MT System Combination via Confusion Network Decoding.
Proceedings of the Seventh Workshop on Statistical Machine Translation, 2012

Use of kernel deep convex networks and end-to-end learning for spoken language understanding.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Towards deeper understanding: Deep convex networks for semantic utterance classification.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Optimization in speech-centric information processing: Criteria and techniques.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

New methods and evaluation experiments on translating TED talks in the IWSLT benchmark.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Learning Lexicon Models from Search Logs for Query Expansion.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Maximum Expected BLEU Training of Phrase and Lexicon Translation Models.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Speech Recognition, Machine Translation, and Speech Translation - A Unified Discriminative Learning Paradigm [Lecture Notes].
IEEE Signal Process. Mag., 2011

The MSR SYSTEM for IWSLT 2011 evaluation.
Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

Robust Speech Translation by Domain Adaptation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A novel decision function and the associated decision-feedback learning for speech translation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Why word error rate is not a good metric for speech recognizer training for the speech translation task?
Proceedings of the IEEE International Conference on Acoustics, 2011

Domain Adaptation via Pseudo In-Domain Data Selection.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

2010
Introduction to the Issue on Statistical Learning Methods for Speech and Language Processing.
IEEE J. Sel. Top. Signal Process., 2010

Clickthrough-based translation models for web search: from word models to phrase models.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

2009
Improved Monolingual Hypothesis Alignment for Machine Translation System Combination.
ACM Trans. Asian Lang. Inf. Process., 2009

Using N-gram based Features for Machine Translation System Combination.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, Proceedings, May 31, 2009

Joint Optimization for Machine Translation System Combination.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Incremental HMM Alignment for MT System Combination.
Proceedings of the ACL 2009, 2009

2008
Discriminative Learning for Speech Recognition: Theory and Practice.
Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers, ISBN: 978-3-031-02557-0, 2008

Discriminative learning in sequential pattern recognition.
IEEE Signal Process. Mag., 2008

Large-margin minimum classification error training: A theoretical risk minimization perspective.
Comput. Speech Lang., 2008

Indirect-HMM-based Hypothesis Alignment for Combining Outputs from Machine Translation Systems.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

2007
A new look at discriminative training for hidden Markov models.
Pattern Recognit. Lett., 2007

Prior knowledge guided maximum expected likelihood based model selection and adaptation for nonnative speech recognition.
Comput. Speech Lang., 2007

Training Non-Parametric Features for Statistical Machine Translation.
Proceedings of the Second Workshop on Statistical Machine Translation, 2007

Using Word-Dependent Transition Models in HMM-Based Word Alignment for Statistical Machine Translation.
Proceedings of the Second Workshop on Statistical Machine Translation, 2007

Automatic validation of terminology translation consistency with statistical method.
Proceedings of Machine Translation Summit XI: Papers, 2007

Phone-discriminating minimum classification error (p-MCE) training for phonetic recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Large-Margin Minimum Classification Error Training for Large-Scale Speech Recognition Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
A Novel Learning Method for Hidden Markov Models in Speech and Audio Processing.
Proceedings of the IEEE 8th Workshop on Multimedia Signal Processing, 2006

Use of incrementally regulated discriminative margins in MCE training for speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Robust feature space adaptation for telephony speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2004
Prior knowledge guided MEL based model selection and adaptation for nonnative speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Fast model selection based speaker adaptation for nonnative speech.
IEEE Trans. Speech Audio Process., 2003

Minimum classification error (MCE) model adaptation of continuous density HMMS.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Maximum a posteriori linear regression (MAPLR) variance adaptation for continuous density HMMS.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Minimum classification error linear regression for acoustic model adaptation of continuous density HMMs.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Maximum expected likelihood based model selection and adaptation for nonnative English speakers.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Fast model adaptation and complexity selection for nonnative English speakers.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Model complexity optimization for nonnative English speakers.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
A combined adaptive and decision tree based speech separation technique for telemedicine applications.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
A new hybrid structure of speech recognizer based on HMM and neural network.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Study on tone classification of Chinese continuous speech in speech recognition system.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Research on speech units modeling in continuous speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

