Haifeng Wang

ORCID: 0000-0002-0672-7468

Affiliations:
  • Baidu Inc., China
  • Harbin Institute of Technology, China
  • Toshiba, Research and Development Center, Beijing, China (former)


According to our database, Haifeng Wang authored at least 221 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Learning to Select External Knowledge With Multi-Scale Negative Sampling.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Exploring the Causality of End-to-End Autonomous Driving.
CoRR, 2024

BEVWorld: A Multimodal World Model for Autonomous Driving via Unified BEV Latent Space.
CoRR, 2024

Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models.
CoRR, 2024

An Empirical Study of Consistency Regularization for End-to-End Speech-to-Text Translation.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

BASES: Large-scale Web Search User Simulation with Large Language Model based Agents.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Graph-Grounded Goal Planning for Conversational Recommendation.
IEEE Trans. Knowl. Data Eng., May, 2023

GLS-CSC: A Simple but Effective Strategy to Mitigate Chinese STM Models' Over-Reliance on Superficial Clue.
CoRR, 2023

Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation.
CoRR, 2023

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

A Thorough Examination on Zero-shot Dense Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: EMNLP 2023, 2023

ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Towards Zero-Shot Persona Dialogue Generation with In-Context Learning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

TOME: A Two-stage Approach for Model-based Retrieval.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

XDailyDialog: A Multilingual Parallel Dialogue Corpus.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Improving Zero-shot Multilingual Neural Machine Translation by Leveraging Cross-lingual Consistency Regularization.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

Towards Boosting the Open-Domain Chatbot with Human Feedback.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Evolving Decomposed Plasticity Rules for Information-Bottlenecked Meta-Learning.
Trans. Mach. Learn. Res., 2022

Geometry-enhanced molecular representation learning for property prediction.
Nat. Mach. Intell., 2022

ERNIE-UniX2: A Unified Cross-lingual Cross-modal Framework for Understanding and Generation.
CoRR, 2022

PLATO-K: Internal and External Knowledge Enhanced Dialogue Generation.
CoRR, 2022

ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model with Knowledge-Enhanced Mixture-of-Denoising-Experts.
CoRR, 2022

ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding.
CoRR, 2022

ERNIE-ViL 2.0: Multi-view Contrastive Learning for Image-Text Pre-training.
CoRR, 2022

SINC: Service Information Augmented Open-Domain Conversation.
CoRR, 2022

Less Learn Shortcut: Analyzing and Mitigating Learning of Spurious Feature-Label Correlation.
CoRR, 2022

ERNIE-Search: Bridging Cross-Encoder with Dual-Encoder via Self On-the-fly Distillation for Dense Passage Retrieval.
CoRR, 2022

Towards Multi-Turn Empathetic Dialogs with Positive Emotion Elicitation.
CoRR, 2022

ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention.
CoRR, 2022

DuReader_retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine.
CoRR, 2022

POLARIS: A Geographic Pre-trained Model and its Applications in Baidu Maps.
CoRR, 2022

HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer.
Bioinform., 2022

BatchDTA: implicit batch alignment enhances deep learning-based drug-target affinity estimation.
Briefings Bioinform., 2022

Bi-SimCut: A Simple Strategy for Boosting Neural Machine Translation.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

CLOP: Video-and-Language Pre-Training with Knowledge Regularizations.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

DuIVA: An Intelligent Voice Assistant for Hands-free and Eyes-free Voice Interaction with the Baidu Maps App.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

ERNIE-GeoL: A Geography-and-Language Pre-trained Model and its Applications in Baidu Maps.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

PLATO-XL: Exploring the Large-scale Pre-training of Dialogue Generation.
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022

DuQM: A Chinese Dataset of Linguistically Perturbed Natural Questions for Evaluating the Robustness of Question Matching Models.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

DuReader-Retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

PLATO-Ad: A Unified Advertisement Text Generation Framework with Multi-Task Prompt Learning.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: EMNLP 2022 - Industry Track, Abu Dhabi, UAE, December 7, 2022

Clip-Tuning: Towards Derivative-free Prompt Learning with a Mixture of Rewards.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

A Fine-grained Interpretability Evaluation Benchmark for Neural NLP.
Proceedings of the 26th Conference on Computational Natural Language Learning, 2022

DuTraffic: Live Traffic Condition Prediction with Trajectory Data and Street Views at Baidu Maps.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

DuARUS: Automatic Geo-object Change Detection with Street-view Imagery for Updating Road Database at Baidu Maps.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

DuIVRS: A Telephonic Interactive Voice Response System for Large-Scale POI Attribute Acquisition at Baidu Maps.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

DuETA: Traffic Congestion Propagation Pattern Modeling via Efficient Graph Learning for ETA Prediction at Baidu Maps.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

DuMapper: Towards Automatic Verification of Large-Scale POIs with Street Views at Baidu Maps.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

Learning Adaptive Segmentation Policy for End-to-End Simultaneous Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Long Time No See! Open-Domain Conversation with Long-Term Persona Memory.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

DuReader_vis: A Chinese Dataset for Open-domain Document Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Where to Go for the Holidays: Towards Mixed-Type Dialogs for Clarification of User Goals.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

UNIMO-2: End-to-End Unified Vision-Language Grounded Learning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Is Discourse Role Important for Emotion Recognition in Conversation?
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Coherent Dialog Generation with Query Graph.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2021

ERNIE-ViLG: Unified Generative Pre-training for Bidirectional Vision-Language Generation.
CoRR, 2021

ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation.
CoRR, 2021

TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations.
CoRR, 2021

CELLS: Cost-Effective Evolution in Latent Space for Goal-Directed Molecular Generation.
CoRR, 2021

Building Chinese Biomedical Language Models via Multi-Level Text Discrimination.
CoRR, 2021

Do What Nature Did To Us: Evolving Plastic Recurrent Neural Networks For Task Generalization.
CoRR, 2021

ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation.
CoRR, 2021

ChemRL-GEM: Geometry Enhanced Molecular Representation Learning for Property Prediction.
CoRR, 2021

ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression.
CoRR, 2021

A Unified Pre-training Framework for Conversational AI.
CoRR, 2021

BSTC: A Large-Scale Chinese-English Speech Translation Dataset.
CoRR, 2021

ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

HGAMN: Heterogeneous Graph Attention Matching Network for Multilingual POI Retrieval at Baidu Maps.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

SSML: Self-Supervised Meta-Learner for En Route Travel Time Estimation at Baidu Maps.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

Meta-Learned Spatial-Temporal POI Auto-Completion for the Search Engine at Baidu Maps.
Proceedings of the KDD '21: The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2021

An Investigation of Containment Measure Implementation and Public Responses to the COVID-19 Pandemic in Mainland China.
Proceedings of the IEEE International Conference on Digital Health, 2021

Data Augmentation with Hierarchical SQL-to-Question Generation for Cross-domain Text-to-SQL Parsing.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

DuRecDial 2.0: A Bilingual Parallel Corpus for Conversational Recommendation.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Mixup Decoding for Diverse Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

SgSum: Transforming Multi-document Summarization into Sub-graph Selection.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

GEDIT: Geographic-Enhanced and Dependency-Guided Tagging for Joint POI and Accessibility Extraction at Baidu Maps.
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

Docking-based Virtual Screening with Multi-Task Learning.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2021

Correcting Chinese Spelling Errors with Phonetic Pre-training.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

BASS: Boosting Abstractive Summarization with Unified Semantic Graph.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Link Prediction on N-ary Relational Facts: A Graph-based Approach.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

DuReader_robust: A Chinese Dataset Towards Evaluating Robustness and Generalization of Machine Reading Comprehension in Real-World Applications.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

ERNIE-Doc: A Retrospective Long-Document Modeling Transformer.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Discovering Dialog Structure Graph for Coherent Dialog Generation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations through Scene Graphs.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Multi-Task Learning for Entity Recommendation and Document Ranking in Web Search.
ACM Trans. Intell. Syst. Technol., 2020

Personalized Query Auto-Completion for Large-Scale POI Search at Baidu Maps.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020

ERNIE-M: Enhanced Multilingual Representation by Aligning Cross-lingual Semantics with Monolingual Corpora.
CoRR, 2020

Discovering Dialog Structure Graph for Open-Domain Dialog Generation.
CoRR, 2020

ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling for Natural Language Understanding.
CoRR, 2020

ERNIE-ViL: Knowledge Enhanced Vision-Language Representations Through Scene Graph.
CoRR, 2020

DuReader_robust: A Chinese Dataset Towards Evaluating the Robustness of Machine Reading Comprehension Models.
CoRR, 2020

Understanding the Impact of the COVID-19 Pandemic on Transportation-related Behaviors with Human Mobility Data.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Personalized Prefix Embedding for POI Auto-Completion in the Search Engine of Baidu Maps.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

ConSTGAT: Contextual Spatial-Temporal Graph Attention Network for Travel Time Estimation at Baidu Maps.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

Enhancing Dialog Coherence with Event Graph Grounded Content Planning.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Learning Adaptive Segmentation Policy for Simultaneous Translation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

DuSQL: A Large-Scale and Pragmatic Chinese Text-to-SQL Dataset.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

An Investigation of Containment Measures Against the COVID-19 Pandemic in Mainland China.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

Conversational Graph Grounded Policy Learning for Open-Domain Conversation Generation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Towards Conversational Recommendation over Multi-Type Dialogs.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Leveraging Graph to Improve Abstractive Multi-Document Summarization.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

Knowledge Graph Grounded Goal Planning for Open-Domain Conversation Generation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

ERNIE 2.0: A Continual Pre-Training Framework for Language Understanding.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
CoKE: Contextualized Knowledge Graph Embedding.
CoRR, 2019

DuTongChuan: Context-aware Translation Model for Simultaneous Interpreting.
CoRR, 2019

Proactive Human-Machine Conversation with Explicit Conversation Goals.
CoRR, 2019

Knowledge Aware Conversation Generation with Reasoning on Augmented Graph.
CoRR, 2019

Baidu Neural Machine Translation Systems for WMT19.
Proceedings of the Fourth Conference on Machine Translation, 2019

Reading Customer Reviews to Answer Product-related Questions.
Proceedings of the 2019 SIAM International Conference on Data Mining, 2019

A Key-Phrase Aware End2end Neural Response Generation Model.
Proceedings of the Natural Language Processing and Chinese Computing, 2019

End-to-End Speech Translation with Knowledge Distillation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Knowledge Aware Conversation Generation with Explainable Reasoning over Augmented Graphs.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Multi-agent Learning for Neural Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

MONOPOLY: Learning to Price Public Facilities for Revaluing Private Properties with Large-Scale Urban Data.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

Proactive Human-Machine Conversation with Explicit Conversation Goal.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

STACL: Simultaneous Translation with Implicit Anticipation and Controllable Latency using Prefix-to-Prefix Framework.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

D-NET: A Pre-Training and Fine-Tuning Framework for Improving the Generalization of Machine Reading Comprehension.
Proceedings of the 2nd Workshop on Machine Reading for Question Answering, 2019

Modeling Coherence for Discourse Neural Machine Translation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Joint Extraction of Entities and Overlapping Relations Using Position-Attentive Sequence Labeling.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Entity Highlight Generation as Statistical and Neural Machine Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Learning to Recommend Related Entities With Serendipity for Web Search Users.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2018

Learning a unified embedding space of web search from large-scale query log.
Knowl. Based Syst., 2018

STACL: Simultaneous Translation with Integrated Anticipation and Controllable Latency.
CoRR, 2018

Improving Entity Recommendation with Search Log and Multi-Task Learning.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Improving Neural Machine Translation with Neural Sentence Rewriting.
Proceedings of the 2018 International Conference on Asian Language Processing, 2018

Multi-Task Neural Learning Architecture for End-to-End Identification of Helpful Reviews.
Proceedings of the IEEE/ACM 2018 International Conference on Advances in Social Networks Analysis and Mining, 2018

Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications.
Proceedings of the Workshop on Machine Reading for Question Answering@ACL 2018, 2018

2017
DuReader: a Chinese Machine Reading Comprehension Dataset from Real-world Applications.
CoRR, 2017

Learning to Explain Entity Relationships by Pairwise Ranking with Convolutional Neural Networks.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Multi-task Attention-based Neural Networks for Implicit Discourse Relationship Representation and Identification.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

2016
A Distributed Representation-Based Framework for Cross-Lingual Transfer Parsing.
J. Artif. Intell. Res., 2016

Exploiting Multi-typed Treebanks for Parsing with Deep Multi-task Learning.
CoRR, 2016

Generating Recommendation Evidence Using Translation Model.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Chinese Poetry Generation with Planning based Neural Network.
Proceedings of the COLING 2016, 2016

A Unified Architecture for Semantic Role Labeling and Relation Classification.
Proceedings of the COLING 2016, 2016

A Universal Framework for Inductive Transfer Parsing across Multi-typed Treebanks.
Proceedings of the COLING 2016, 2016

Duer: Intelligent Personal Assistant.
Proceedings of the 25th ACM International Conference on Information and Knowledge Management, 2016

Learning a Semantic Space of Web Search via Session Data.
Proceedings of the Information Retrieval Technology, 2016

Active Learning for Dependency Parsing with Partial Annotation.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Improved Neural Machine Translation with SMT Features.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

A Representation Learning Framework for Multi-Source Transfer Parsing.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Learning Semantic Hierarchies: A Continuous Vector Space Approach.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Exploiting Collective Hidden Structures in Webpage Titles for Open Domain Entity Extraction.
Proceedings of the 24th International Conference on World Wide Web, 2015

Improved beam search with constrained softmax for NMT.
Proceedings of Machine Translation Summit XV: Papers, 2015

Cross-lingual Dependency Parsing Based on Distributed Representations.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Multi-Task Learning for Multiple Language Translation.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Web page segmentation with structured prediction and its application in web page classification.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

Improving Pivot-Based Statistical Machine Translation by Pivoting the Co-occurrence Count of Phrase Pairs.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Improve Statistical Machine Translation with Context-Sensitive Bilingual Semantic Embedding Model.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Policy Learning for Domain Selection in an Extensible Multi-domain Spoken Dialogue System.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Transformation from Discontinuous to Continuous Word Alignment Improves Translation Quality.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Revisiting Embedding Features for Simple Semi-supervised Learning.
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing, 2014

Learning Sense-specific Word Embeddings By Exploiting Bilingual Resources.
Proceedings of the COLING 2014, 2014

Learning Semantic Hierarchies via Word Embeddings.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
Introduction to special section on paraphrasing.
ACM Trans. Intell. Syst. Technol., 2013

Generalization of Words for Chinese Dependency Parsing.
Proceedings of The 13th International Conference on Parsing Technologies, 2013

Bootstrapping Large-scale Named Entities using URL-Text Hybrid Patterns.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

A Hierarchical Semantics-Aware Distributional Similarity Scheme.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Improving Pivot-Based Statistical Machine Translation Using Random Walk.
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013

2012
User Behaviors Lend a Helping Hand: Learning Paraphrase Query Patterns from Search Log Sessions.
Proceedings of the COLING 2012, 2012

Opening Machine Translation Black Box for Cross-Language Information Retrieval.
Proceedings of the Information Retrieval Technology, 2012

Translation Model Adaptation for Statistical Machine Translation with Monolingual Topic Information.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

Improve SMT Quality with Automatically Extracted Paraphrase Rules.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Two-Word Collocation Extraction Using Monolingual Word Alignment Method.
ACM Trans. Intell. Syst. Technol., 2011

A conversation with Dr. Haifeng Wang.
SIGKDD Explor., 2011

Report on the first summer school on NLP and IR in Beijing.
SIGIR Forum, 2011

Dependency-based n-gram models for general purpose sentence realisation.
Nat. Lang. Eng., 2011

Automatically Generating Questions from Queries for Community-based Question Answering.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011

Harvesting Related Entities with a Search Engine.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011

Enriching SMT Training Data via Paraphrasing.
Proceedings of the Fifth International Joint Conference on Natural Language Processing, 2011

Reordering with Source Language Collocations.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

2010
A Linguistically Inspired Statistical Model for Chinese Punctuation Generation.
ACM Trans. Asian Lang. Inf. Process., 2010

Leveraging Multiple MT Engines for Paraphrase Generation.
Proceedings of the COLING 2010, 2010

Paraphrasing with Search Engine Query Logs.
Proceedings of the COLING 2010, 2010

Paraphrases and Applications.
Proceedings of the COLING 2010, 2010

Improving Statistical Machine Translation with Monolingual Collocation.
Proceedings of the ACL 2010, 2010

2009
Extracting paraphrase patterns from bilingual parallel corpora.
Nat. Lang. Eng., 2009

Collocation Extraction Using Monolingual Word Alignment Method.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

Revisiting Pivot Language Approach for Machine Translation.
Proceedings of the ACL 2009, 2009

Exploiting Heterogeneous Treebanks for Parsing.
Proceedings of the ACL 2009, 2009

Dependency Based Chinese Sentence Realization.
Proceedings of the ACL 2009, 2009

2008
The TCH machine translation system for IWSLT 2008.
Proceedings of the 2008 International Workshop on Spoken Language Translation, 2008

Predicting and Tagging Dialog-Act Using MDP and SVM.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Accurate and Robust LFG-Based Generation for Chinese.
Proceedings of the INLG 2008, 2008

Domain Adaptation for Statistical Machine Translation with Domain Dictionary and Monolingual Corpora.
Proceedings of the COLING 2008, 2008

Pivot Approach for Extracting Paraphrase Patterns from Bilingual Corpora.
Proceedings of the ACL 2008, 2008

2007
Pivot language approach for phrase-based statistical machine translation.
Mach. Transl., 2007

Improving statistical word alignment with various clues.
Proceedings of Machine Translation Summit XI: Papers, 2007

Log-linear generation models for example-based machine translation.
Proceedings of Machine Translation Summit XI: Papers, 2007

Comparative study of word alignment heuristics and phrase-based SMT.
Proceedings of Machine Translation Summit XI: Papers, 2007

Using RBMT Systems to Produce Bilingual Corpus for SMT.
Proceedings of the EMNLP-CoNLL 2007, 2007

Recovering Non-Local Dependencies for Chinese.
Proceedings of the EMNLP-CoNLL 2007, 2007

2006
Example-based machine translation based on tree-string correspondence and statistical generation.
Mach. Transl., 2006

The Effect of Translation Quality in MT-Based Cross-Language Information Retrieval.
Proceedings of the ACL 2006, 2006

Boosting Statistical Word Alignment Using Labeled and Unlabeled Data.
Proceedings of the ACL 2006, 2006

Word Alignment for Languages with Scarce Resources Using Bilingual Corpora of Other Language Pairs.
Proceedings of the ACL 2006, 2006

An Equivalent Pseudoword Solution to Chinese Word Sense Disambiguation.
Proceedings of the ACL 2006, 2006

Discriminative Pruning of Language Models for Chinese Word Segmentation.
Proceedings of the ACL 2006, 2006

2005
The Effect of Adding Rules into the Rule-based MT System.
Proceedings of Machine Translation Summit X: Papers, 2005

Boosting Statistical Word Alignment.
Proceedings of Machine Translation Summit X: Papers, 2005

Example-based Machine Translation Based on TSC and Statistical Generation.
Proceedings of Machine Translation Summit X: Papers, 2005

Improving Statistical Word Alignment with Ensemble Methods.
Proceedings of the Natural Language Processing, 2005

Alignment Model Adaptation for Domain-Specific Word Alignment.
Proceedings of the ACL 2005, 2005

2004
Improving Statistical Word Alignment with a Rule-Based Machine Translation System.
Proceedings of the COLING 2004, 2004

Improving Domain-Specific Word Alignment with a General Bilingual Corpus.
Proceedings of the Machine Translation: From Real Users to Research, 2004

Improving Domain-Specific Word Alignment for Computer Assisted Translation.
Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics, Barcelona, Spain, July 21-26, 2004, 2004

2000
Towards a Next-Generation Search Engine.
Proceedings of the PRICAI 2000, Topics in Artificial Intelligence, 6th Pacific Rim International Conference on Artificial Intelligence, Melbourne, Australia, August 28, 2000

A unified approach to statistical language modeling for Chinese.
Proceedings of the IEEE International Conference on Acoustics, 2000

