Bo Xu

Orcid: 0000-0002-1111-1529

Affiliations:
  • University of Science and Technology of China, Department of Automation, Hefei, China
  • Chinese Academy of Sciences, Center for Excellence in Brain Science and Intelligence Technology, Beijing, China
  • Chinese Academy of Sciences, Institute of Automation, National Laboratory of Pattern Recognition, Beijing, China


According to our database, Bo Xu authored at least 495 papers between 1991 and 2024.

Bibliography

2024
Self-Lateral Propagation Elevates Synaptic Modifications in Spiking Neural Networks for the Efficient Spatial and Temporal Classification.
IEEE Trans. Neural Networks Learn. Syst., November, 2024

Tuning Synaptic Connections Instead of Weights by Genetic Algorithm in Spiking Policy Network.
Mach. Intell. Res., October, 2024

Network model with internal complexity bridges artificial intelligence and neuroscience.
Nat. Comput. Sci., August, 2024

Learning Top-K Subtask Planning Tree Based on Discriminative Representation Pretraining for Decision-making.
Mach. Intell. Res., August, 2024

Enhancing Multi-agent Coordination via Dual-channel Consensus.
Mach. Intell. Res., April, 2024

A Knowledge-enhanced Two-stage Generative Framework for Medical Dialogue Information Extraction.
Mach. Intell. Res., February, 2024

Multi-Cue Guided Semi-Supervised Learning Toward Target Speaker Separation in Real Environments.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

SSCFormer: Push the Limit of Chunk-Wise Conformer for Streaming ASR Using Sequentially Sampled Chunks and Chunked Causal Convolution.
IEEE Signal Process. Lett., 2024

SNN-BERT: Training-efficient Spiking Neural Networks for energy-efficient BERT.
Neural Networks, 2024

Multi-scale full spike pattern for semantic segmentation.
Neural Networks, 2024

Multiscale fusion enhanced spiking neural network for invasive BCI neural signal decoding.
CoRR, 2024

Integer-Valued Training and Spike-Driven Inference Spiking Neural Network for High-performance and Energy-efficient Object Detection.
CoRR, 2024

Efficient Preference-based Reinforcement Learning via Aligned Experience Estimation.
CoRR, 2024

Enhanced Spatiotemporal Prediction Using Physical-guided And Frequency-enhanced Recurrent Neural Networks.
CoRR, 2024

Biologically-Plausible Topology Improved Spiking Actor Network for Efficient Deep Reinforcement Learning.
CoRR, 2024

Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classification.
CoRR, 2024

RSC-SNN: Exploring the Trade-off Between Adversarial Robustness and Accuracy in Spiking Neural Networks via Randomized Smoothing Coding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

CIEASR: Contextual Image-Enhanced Automatic Speech Recognition for Improved Homophone Discrimination.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Bridge the Query and Document: Contrastive Learning for Generative Document Retrieval.
Proceedings of the International Joint Conference on Neural Networks, 2024

TaCoD: Tasks-Commonality-Aware World in Meta Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2024

Long Short-Term Reasoning Network with Theory of Mind for Efficient Multi-Agent Cooperation.
Proceedings of the International Joint Conference on Neural Networks, 2024

T-Agent: A Term-Aware Agent for Medical Dialogue Generation.
Proceedings of the International Joint Conference on Neural Networks, 2024

SA-MPF: A Status-Aware Mask Prediction Framework for Online Disease Diagnosis.
Proceedings of the International Joint Conference on Neural Networks, 2024

High-Performance Temporal Reversible Spiking Neural Networks with O(L) Training Memory and O(1) Inference Cost.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

UNeC: Unsupervised Exploring In Controllable Space.
Proceedings of the IEEE International Conference on Acoustics, 2024

MaDE: Multi-Scale Decision Enhancement for Multi-Agent Reinforcement Learning.
Proceedings of the IEEE International Conference on Acoustics, 2024

ViLaS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

A New Pre-Training Paradigm for Offline Multi-Agent Reinforcement Learning with Suboptimal Data.
Proceedings of the IEEE International Conference on Acoustics, 2024

SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
A Brain-Inspired Approach for Probabilistic Estimation and Efficient Planning in Precision Physical Interaction.
IEEE Trans. Cybern., October, 2023

Attention Spiking Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2023

Meta neurons improve spiking neural networks for efficient spatio-temporal learning.
Neurocomputing, April, 2023

Offline Pre-trained Multi-agent Decision Transformer.
Mach. Intell. Res., April, 2023

Origin of the efficiency of spike timing-based neural computation for processing temporal information.
Neural Networks, March, 2023

VLP: A Survey on Vision-language Pre-training.
Int. J. Autom. Comput., 2023

Local-to-Global Causal Reasoning for Cross-Document Relation Extraction.
IEEE CAA J. Autom. Sinica, 2023

Learning Top-k Subtask Planning Tree based on Discriminative Representation Pre-training for Decision Making.
CoRR, 2023

Neuromorphic Incremental on-chip Learning with Hebbian Weight Consolidation.
CoRR, 2023

Double Reverse Regularization Network Based on Self-Knowledge Distillation for SAR Object Classification.
CoRR, 2023

Local Convolution Enhanced Global Fourier Neural Operator For Multiscale Dynamic Spaces Prediction.
CoRR, 2023

ApproBiVT: Lead ASR Models to Generalize Better Using Approximated Bias-Variance Tradeoff Guided Early Stopping and Checkpoint Averaging.
CoRR, 2023

Attention-free Spikformer: Mixing Spike Sequences with Simple Linear Transforms.
CoRR, 2023

ViLaS: Integrating Vision and Language into Automatic Speech Recognition.
CoRR, 2023

Probabilistic Modeling: Proving the Lottery Ticket Hypothesis in Spiking Neural Network.
CoRR, 2023

Mixture of personality improved Spiking actor network for efficient multi-agent cooperation.
CoRR, 2023

X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages.
CoRR, 2023

Filtered Observations for Model-Based Multi-agent Reinforcement Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

ODE-based Recurrent Model-free Reinforcement Learning for POMDPs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Spike-driven Transformer.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Generalized Robot Dynamics Learning and Gen2Real Transfer.
IROS, 2023

Knowledge Transfer from Pre-trained Language Models to Cif-based Speech Recognizers via Hierarchical Distillation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing Visual Question Answering via Deconstructing Questions and Explicating Answers.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs.
Proceedings of the International Joint Conference on Neural Networks, 2023

Make Spoken Document Readable: Leveraging Graph Attention Networks for Chinese Document-Level Spoken-to-Written Simplification.
Proceedings of the Neural Information Processing - 30th International Conference, 2023

Replay Memory as An Empirical MDP: Combining Conservative Estimation with Experience Replay.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Inherent Redundancy in Spiking Neural Networks.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Matching-Based Term Semantics Pre-Training for Spoken Patient Query Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2023

Task-Prompt Generalised World Model in Multi-Environment Offline Reinforcement Learning.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

Cardsformer: Grounding Language to Learn a Generalizable Policy in Hearthstone.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

M3: Modularization for Multi-task and Multi-agent Offline Pre-training.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

Complex Dynamic Neurons Improved Spiking Transformer Network for Efficient Automatic Speech Recognition.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

PiCor: Multi-Task Deep Reinforcement Learning with Policy Correction.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Corrigendum: A brain-inspired decision-making spiking neural network and its application in unmanned aerial vehicle.
Frontiers Neurorobotics, September, 2022

Tuning Convolutional Spiking Neural Network With Biologically Plausible Reward Propagation.
IEEE Trans. Neural Networks Learn. Syst., 2022

A Brain-Inspired Approach for Collision-Free Movement Planning in the Small Operational Space.
IEEE Trans. Neural Networks Learn. Syst., 2022

Sequence-Level Speaker Change Detection With Difference-Based Continuous Integrate-and-Fire.
IEEE Signal Process. Lett., 2022

Efficient Spatiotemporal Transformer for Robotic Reinforcement Learning.
IEEE Robotics Autom. Lett., 2022

Compressing speaker extraction model with ultra-low precision quantization and knowledge distillation.
Neural Networks, 2022

Train from scratch: Single-stage joint training of speech separation and recognition.
Comput. Speech Lang., 2022

Motif-topology improved Spiking Neural Network for the Cocktail Party Effect and McGurk Effect.
CoRR, 2022

MCascade R-CNN: A Modified Cascade R-CNN for Detection of Calcified on Coronary Artery Angiography Images.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2022

LiMuSE: Lightweight Multi-Modal Speaker Extraction.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Joint Modeling of Document and Label with Clause Interaction Hypergraph for ICD Medical Code Assignment.
Proceedings of the International Joint Conference on Neural Networks, 2022

Learning in Bi-level Markov Games.
Proceedings of the International Joint Conference on Neural Networks, 2022

Recent Advances and New Frontiers in Spiking Neural Networks.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Kinematics Learning of Massive Heterogeneous Serial Robots.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Artificial Neural Network-assisted Amplitude Thresholding Improves Spike Detection.
Proceedings of the 2022 11th International Conference on Computing and Pattern Recognition, 2022

Motif-Topology and Reward-Learning Improved Spiking Neural Network for Efficient Multi-Sensory Integration.
Proceedings of the IEEE International Conference on Acoustics, 2022

Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection.
Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Cross-Modal Understanding in Visual Dialog Via Contrastive Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

A Multi Domain Knowledge Enhanced Matching Network for Response Selection in Retrieval-Based Dialogue Systems.
Proceedings of the IEEE International Conference on Acoustics, 2022

GCS: Graph-Based Coordination Strategy for Multi-Agent Reinforcement Learning.
Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

Learning Multi-Agent Action Coordination via Electing First-Move Agent.
Proceedings of the Thirty-Second International Conference on Automated Planning and Scheduling, 2022

Multi-Scale Dynamic Coding Improved Spiking Actor Network for Reinforcement Learning.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Simultaneous Control in Belief Space for Circular Insertion in Precision Assembly.
IEEE Trans. Ind. Informatics, 2021

Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-Resource Speech Recognition.
IEEE Signal Process. Lett., 2021

Discretization and Re-synthesis: an alternative method to solve the Cocktail Party Problem.
CoRR, 2021

Offline Pre-trained Multi-Agent Decision Transformer: One Big Sequence Model Tackles All SMAC Tasks.
CoRR, 2021

Promoting Coordination Through Electing First-move Agent in Multi-Agent Reinforcement Learning.
CoRR, 2021

A Simple Unified Framework for Anomaly Detection in Deep Reinforcement Learning.
CoRR, 2021

Population-coding and Dynamic-neurons improved Spiking Actor Network for Reinforcement Learning.
CoRR, 2021

General Robot Dynamics Learning and Gen2Real.
CoRR, 2021

Counterfactual Supporting Facts Extraction for Explainable Medical Record Based Diagnosis with Graph Network.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Exploring wav2vec 2.0 on Speaker Verification and Language Identification.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Audio-Visual Speech Separation with Visual Features Enhanced by Adversarial Training.
Proceedings of the International Joint Conference on Neural Networks, 2021

A Language Model Based Pseudo-Sample Deliberation for Semi-supervised Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Transfer Ability of Monolingual Wav2vec2.0 for Low-resource Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Towards Modeling Auditory Restoration in Noisy Environments.
Proceedings of the International Joint Conference on Neural Networks, 2021

Two-Stage Pre-Training for Sequence to Sequence Speech Recognition.
Proceedings of the International Joint Conference on Neural Networks, 2021

Online Audio-Visual Speech Separation with Generative Adversarial Training.
Proceedings of the ICCAI '21: 2021 7th International Conference on Computing and Artificial Intelligence, Tianjin China, April 23, 2021

MixSpeech: Data Augmentation for Low-Resource Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speaker and Direction Inferred Dual-Channel Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021

Wase: Learning When to Attend for Speaker Extraction in Cocktail Party Environments.
Proceedings of the IEEE International Conference on Acoustics, 2021

Cif-Based Collaborative Decoding for End-to-End Contextual Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

MACCIF-TDNN: Multi Aspect Aggregation of Channel and Context Interdependence Features in TDNN-Based Speaker Verification.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Listen, Understand and Translate: Triple Supervision Decouples End-to-end Speech-to-text Translation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Consecutive Decoding for Speech-to-text Translation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
A Brain-Inspired Visual Fear Responses Model for UAV Emergent Obstacle Dodging.
IEEE Trans. Cogn. Dev. Syst., 2020

Chinese Short Text Classification with Mutual-Attention Convolutional Neural Networks.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020

A biologically plausible supervised learning method for spiking neural networks using the symmetric STDP rule.
Neural Networks, 2020

Applying wav2vec2.0 to Speech Recognition in various low-resource languages.
CoRR, 2020

Audio-visual Speech Separation with Adversarially Disentangled Visual Representation.
CoRR, 2020

Finite Meta-Dynamic Neurons in Spiking Neural Networks for Spatio-temporal Learning.
CoRR, 2020

SDST: Successive Decoding for Speech-to-text Translation.
CoRR, 2020

TED: Triple Supervision Decouples End-to-end Speech-to-text Translation.
CoRR, 2020

A Comparison of Label-Synchronous and Frame-Synchronous End-to-End Models for Speech Recognition.
CoRR, 2020

Sequence to Multi-Sequence Learning via Conditional Chain Mapping for Mixture Signals.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

A Unified Framework for Low-Latency Speaker Extraction in Cocktail Party Environments.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speaker-Conditional Chain Model for Speech Separation and Extraction.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

LISNN: Improving Spiking Neural Networks with Lateral Interactions for Robust Object Recognition.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Class-Balanced Loss for Scene Text Detection.
Proceedings of the Neural Information Processing - 27th International Conference, 2020

Low-Frequency Guided Self-Supervised Learning For High-Fidelity 3d Face Reconstruction In The Wild.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

CIF: Continuous Integrate-And-Fire for End-To-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Bridging the Gap between Prior and Posterior Knowledge Selection for Knowledge-Grounded Dialogue Generation.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

UENet: A Novel Generative Adversarial Network for Angiography Image Segmentation.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

Convolution Pyramid Network: A Classification Network on Coronary Artery Angiogram Images.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

Knowledge Aware Emotion Recognition in Textual Conversations via Multi-Task Incremental Transformer.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

DMRM: A Dual-Channel Multi-Hop Reasoning Model for Visual Dialog.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Pyrboxes: An efficient multi-scale scene text detector with feature pyramids.
Pattern Recognit. Lett., 2019

Concept learning through deep reinforcement learning with memory-augmented neural networks.
Neural Networks, 2019

Effectively training neural machine translation models with monolingual data.
Neurocomputing, 2019

Hybrid Attention for Chinese Character-Level Neural Machine Translation.
Neurocomputing, 2019

How social media usage affects employees' job satisfaction and turnover intention: An empirical study in China.
Inf. Manag., 2019

Unsupervised pre-training for sequence to sequence speech recognition.
CoRR, 2019

Iterative Update and Unified Representation for Multi-Agent Reinforcement Learning.
CoRR, 2019

Biological Neuron Coding Inspired Binary Word Embeddings.
Cogn. Comput., 2019

Modelling Speaker-dependent Auditory Attention Using A Spiking Neural Network with Temporal Coding and Supervised Learning.
Aust. J. Intell. Inf. Process. Syst., 2019

The World in My Mind: Visual Dialog with Adversarial Multi-modal Feature Encoding.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Boosting Character-Based Chinese Speech Synthesis via Multi-Task Learning and Dictionary Tutoring.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

RevCuT Tree Search Method in Complex Single-player Game with Continuous Search Space.
Proceedings of the International Joint Conference on Neural Networks, 2019

A Unified Multi-output Semi-supervised Network for 3D Face Reconstruction.
Proceedings of the International Joint Conference on Neural Networks, 2019

Text Attention and Focal Negative Loss for Scene Text Detection.
Proceedings of the International Joint Conference on Neural Networks, 2019

Strong-Background Restrained Cross Entropy Loss for Scene Text Detection.
Proceedings of the International Joint Conference on Neural Networks, 2019

A Single-Shot Oriented Scene Text Detector with Learnable Anchors.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

Efficient and Accurate Face Shape Reconstruction by Fusion of Multiple Landmark Databases.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

NRTR: A No-Recurrence Sequence-to-Sequence Model for Scene Text Recognition.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Self-attention Aligner: A Latency-control End-to-end Model for ASR Using Self-attention Network and Chunk-hopping.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speaker-Aware Speech-Transformer.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

A Working Memory Model for Task-oriented Dialog Response Generation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Adapting Translation Models for Transcript Disfluency Detection.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
A Basal Ganglia Network Centric Reinforcement Learning Model and Its Application in Unmanned Aerial Vehicle.
IEEE Trans. Cogn. Dev. Syst., 2018

Distant supervision for relation extraction with hierarchical selective attention.
Neural Networks, 2018

Learning to activate logic rules for textual reasoning.
Neural Networks, 2018

Efficient coding matters in the organization of the early visual system.
Neural Networks, 2018

Generative adversarial training for neural machine translation.
Neurocomputing, 2018

A Brain-Inspired Decision-Making Spiking Neural Network and Its Application in Unmanned Aerial Vehicle.
Frontiers Neurorobotics, 2018

A Fast Contour Detection Model Inspired by Biological Mechanisms in Primary Vision System.
Frontiers Comput. Neurosci., 2018

Multilingual End-to-End Speech Recognition with A Single Transformer on Low-Resource Languages.
CoRR, 2018

A Brain-Inspired Decision Making Model Based on Top-Down Biasing of Prefrontal Cortex to Basal Ganglia and Its Application in Autonomous UAV Explorations.
Cogn. Comput., 2018

Toward Robot Self-Consciousness (II): Brain-Inspired Robot Bodily Self Model for Self-Recognition.
Cogn. Comput., 2018

Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Syllable-Based Sequence-to-Sequence Speech Recognition with the Transformer in Mandarin Chinese.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Single-channel Speech Dereverberation via Generative Adversarial Training.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

An End-to-End Text-Independent Speaker Identification System on Short Utterances.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Extending Recurrent Neural Aligner for Streaming End-to-End Speech Recognition in Mandarin.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Syllable-Based Acoustic Modeling with CTC for Multi-Scenarios Mandarin speech recognition.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Paraphrase Recognition via Combination of Neural Classifier and Keywords.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Hierarchical Tree Long Short-Term Memory for Sentence Representations.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Improving Speech Separation with Adversarial Network and Reinforcement Learning.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Distilled Binary Neural Network for Monaural Speech Separation.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Brain-inspired Balanced Tuning for Spiking Neural Networks.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Listen, Think and Listen Again: Capturing Top-down Auditory Attention for Speaker-independent Speech Separation.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Unsupervised Domain Adaptation for Neural Machine Translation.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Self-Attention Based Network for Punctuation Restoration.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Recurrent Neural Network Based Small-footprint Wake-up-word Speech Recognition System with a Score Calibration Method.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Compression of Acoustic Model via Knowledge Distillation and Pruning.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

A Comparison of Modeling Units in Sequence-to-Sequence Speech Recognition with the Transformer on Mandarin Chinese.
Proceedings of the Neural Information Processing - 25th International Conference, 2018

A Cascaded Framework for Model-Based 3D Face Reconstruction.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

CBLDNN-Based Speaker-Independent Speech Separation Via Generative Adversarial Training.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Cascaded Mutual Modulation for Visual Reasoning.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Semi-Supervised Disfluency Detection.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

Which Mapping Rule in the Fireworks Algorithm is Better for Large Scale Optimization.
Proceedings of the 2018 IEEE Congress on Evolutionary Computation, 2018

Unsupervised Neural Machine Translation with Weight Sharing.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

Modeling Attention and Memory for Auditory Selection in a Cocktail Party Environment.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Encoder-decoder recurrent network model for interactive character animation generation.
Vis. Comput., 2017

Self-Taught convolutional neural networks for short text clustering.
Neural Networks, 2017

Joint entity and relation extraction based on a hybrid neural network.
Neurocomputing, 2017

Hybrid Attention Networks for Chinese Short Text Classification.
Computación y Sistemas, 2017

Improving multi-layer spiking neural networks by incorporating brain-inspired rules.
Sci. China Inf. Sci., 2017

Constructing a Chinese Conversation Corpus for Sentiment Analysis.
Proceedings of the Natural Language Processing and Chinese Computing, 2017

Multilingual Recurrent Neural Networks with Residual Learning for Low-Resource Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Multi-sense based neural machine translation.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

A class-specific copy network for handling the rare word problem in neural machine translation.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Convolutional Neural Network with Word Embeddings for Chinese Word Segmentation.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Hierarchical Hybrid Attention Networks for Chinese Conversation Topic Classification.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Word-Level Permutation and Improved Lower Frame Rate for RNN-Based Acoustic Modeling.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Towards a Brain-Inspired Developmental Neural Network by Adaptive Synaptic Pruning.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

End-to-End Chinese Image Text Recognition with Attention Model.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Measuring Word Semantic Similarity Based on Transferred Vectors.
Proceedings of the Neural Information Processing - 24th International Conference, 2017

Combining unidirectional long short-term memory with convolutional output layer for high-performance speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Towards Compact and Fast Neural Machine Translation Using a Combined Method.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Joint Extraction of Multiple Relations and Entities by Using a Hybrid Neural Network.
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2017

Named Entity Recognition with Gated Convolutional Neural Networks.
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2017

Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
A neural network framework for relation extraction: Learning entity semantic and relation pattern.
Knowl. Based Syst., 2016

HCNN: A Neural Network Model for Combining Local and Global Features Towards Human-Like Classification.
Int. J. Pattern Recognit. Artif. Intell., 2016

Semantic expansion using word embedding clustering and convolutional neural network for improving short text classification.
Neurocomputing, 2016

Parallel Brain Simulator: A Multi-scale and Parallel Brain-Inspired Neural Network Modeling and Simulation Platform.
Cogn. Comput., 2016

Compositional Recurrent Neural Networks for Chinese Short Text Classification.
Proceedings of the 2016 IEEE/WIC/ACM International Conference on Web Intelligence, 2016

HMSNN: Hippocampus inspired Memory Spiking Neural Network.
Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics, 2016

SHTM: A neocortex-inspired algorithm for one-shot text generation.
Proceedings of the 2016 IEEE International Conference on Systems, Man, and Cybernetics, 2016

Joint Learning of Entity Semantics and Relation Pattern for Relation Extraction.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2016

Ensemble of Feature Sets and Classification Methods for Stance Detection.
Proceedings of the Natural Language Understanding and Intelligent Applications, 2016

Investigating gated recurrent neural networks for acoustic modeling.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Applying connectionist temporal classification objective function to Chinese Mandarin speech recognition.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Multidimensional Residual Learning Based on Recurrent Neural Networks for Acoustic Modeling.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

First Step Towards End-to-End Parametric TTS Synthesis: Generating Spectral Parameters with Neural Attention.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Gating Recurrent Enhanced Memory Neural Networks on Language Identification.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

End-to-End Language Identification Using Attention-Based Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Gating recurrent mixture density networks for acoustic modeling in statistical parametric speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling.
Proceedings of the COLING 2016, 2016

A Character-Aware Encoder for Neural Machine Translation.
Proceedings of the COLING 2016, 2016

Hierarchical Memory Networks for Answer Selection on Unknown Words.
Proceedings of the COLING 2016, 2016

Chinese Image Text Recognition with BLSTM-CTC: A Segmentation-Free Method.
Proceedings of the Pattern Recognition - 7th Chinese Conference, 2016

Relation Inference and Type Identification Based on Brain Knowledge Graph.
Proceedings of the Brain Informatics and Health - International Conference, 2016

Brain Knowledge Graph Analysis Based on Complex Network Theory.
Proceedings of the Brain Informatics and Health - International Conference, 2016

Brain-Inspired Obstacle Detection Based on the Biological Visual Pathway.
Proceedings of the Brain Informatics and Health - International Conference, 2016

A Spiking Neural Network Based Autonomous Reinforcement Learning Model and Its Application in Decision Making.
Proceedings of the Advances in Brain Inspired Cognitive Systems, 2016

Attention-Based Bidirectional Long Short-Term Memory Networks for Relation Classification.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Speckle Projection Systems Based on GPU (基于GPU的散斑三维重建系统).
Computer Science (计算机科学), 2015

PTHMM: Beyond Single Specific Behavior Prediction.
Proceedings of the 24th International Conference on World Wide Web Companion, 2015

A Convolutional Architecture for Short Text Expansion and Classification.
Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology, 2015

Parallel Recursive Deep Model for Sentiment Analysis.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2015

Bilingually-Constrained Recursive Neural Networks with Syntactic Constraints for Hierarchical Translation Model.
Proceedings of the Natural Language Processing and Chinese Computing - 4th CCF Conference, 2015

Short Text Clustering via Convolutional Neural Networks.
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, 2015

Modeling emotion entrainment of online users in emergency events.
Proceedings of the 2015 IEEE International Conference on Intelligence and Security Informatics, 2015

Towards end-to-end speech recognition for Chinese Mandarin using long short-term memory recurrent neural networks.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multi-task learning deep neural networks for speech feature denoising.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multilingual tandem bottleneck feature for language identification.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Convolutional Neural Networks for Text Hashing.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Semi-supervised Chinese Word Segmentation based on Bilingual Information.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Short Text Hashing Improved by Integrating Multi-granularity Topics and Tags.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2015

Semantic Clustering and Convolutional Neural Network for Short Text Categorization.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014
Multiple style exploration for story unit segmentation of broadcast news video.
Multim. Syst., 2014

Recursive Deep Learning for Sentiment Analysis over Social Data.
Proceedings of the 2014 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT), Warsaw, Poland, August 11-14, 2014

Emotion Evolution under Entrainment in Social Media.
Proceedings of the Social Media Processing - Third National Conference, 2014

Anchor Shot Detection with Deep Neural Network.
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Short Text Feature Enrichment Using Link Analysis on Topic-Keyword Graph.
Proceedings of the Natural Language Processing and Chinese Computing, 2014

A Hybrid Method for Chinese Entity Relation Extraction.
Proceedings of the Natural Language Processing and Chinese Computing, 2014

Video to Article Hyperlinking by Multiple Tag Property Exploration.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

Spatial Similarity Measure of Visual Phrases for Image Retrieval.
Proceedings of the MultiMedia Modeling - 20th Anniversary International Conference, 2014

CeleLabel: an interactive system for annotating celebrities in web videos.
Proceedings of the ACM International Conference on Multimedia, MM '14, Orlando, FL, USA, November 03, 2014

Ranking Online Memes in Emergency Events Based on Transfer Entropy.
Proceedings of the IEEE Joint Intelligence and Security Informatics Conference, 2014

Data-driven tree structure based UBM reconstruction for speaker verification.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

An iVector extractor using pre-trained neural networks for speaker verification.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Investigation of stochastic Hessian-Free optimization in Deep neural networks for speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

An improved pitch extraction algorithm for speech processing.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Improving wideband acoustic models using mixed-bandwidth training data via DNN adaptation.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Investigation of cross-lingual bottleneck features in hybrid ASR systems.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

An empirical study of multilingual and low-resource spoken term detection using deep neural networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A robust framework for short text categorization based on topic model and integrated classifier.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Short Text Hashing Improved by Integrating Topic Features and Tags.
Proceedings of the Neural Information Processing - 21st International Conference, 2014

Image character recognition using deep convolutional neural network learned from different languages.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Variational Bayes based I-vector for speaker diarization of telephone conversations.
Proceedings of the IEEE International Conference on Acoustics, 2014

An investigation of summed-channel speaker recognition with multi-session enrollment.
Proceedings of the IEEE International Conference on Acoustics, 2014

Recursive neural network based word topology model for hierarchical phrase-based speech translation.
Proceedings of the IEEE International Conference on Acoustics, 2014

Chinese Image Text Recognition on grayscale pixels.
Proceedings of the IEEE International Conference on Acoustics, 2014

Chinese Image Character Recognition Using DNN and Machine Simulated Training Samples.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2014, 2014

Obtaining Better Word Representations via Language Transfer.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2014

Neuronal Morphology Modeling Based on Microscopy Reconstruction Data in the Public Repositories.
Proceedings of the Brain Informatics and Health - International Conference, 2014

Characterizing emotion entrainment in social media.
Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2014

Improving Word Embeddings via Combining with Complementary Languages.
Proceedings of the Advances in Artificial Intelligence, 2014

Learning New Semi-Supervised Deep Auto-encoder Features for Statistical Machine Translation.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

2013
A-STAR: Toward translating Asian spoken languages.
Comput. Speech Lang., 2013

Entity Conceptualization and Understanding Based on Web-Scale Knowledge Bases.
Proceedings of the IEEE International Conference on Systems, 2013

A Fast Matching Method Based on Semantic Similarity for Short Texts.
Proceedings of the Natural Language Processing and Chinese Computing, 2013

Pseudo In-Domain Data Selection from Large-Scale Web Corpus for Spoken Language Translation.
Proceedings of the Natural Language Processing and Chinese Computing, 2013

Simulated Spoken Dialogue System Based on IOHMM with User History.
Proceedings of the Natural Language Processing and Chinese Computing, 2013

Fusion of Audio-Visual Features and Statistical Property for Commercial Segmentation.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

Temporal Video Segmentation to Scene Based on Conditional Random Fields.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013

A general Framework of video segmentation to logical unit based on conditional random fields.
Proceedings of the International Conference on Multimedia Retrieval, 2013

CASIA-KB: A Multi-source Chinese Semantic Knowledge Base Built from Structured and Unstructured Web Data.
Proceedings of the Semantic Technology - Third Joint International Conference, 2013

The CASIA machine translation system for IWSLT 2013.
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

Phrase-based Parallel Fragments Extraction from Comparable Corpora.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Joint and Coupled Bilingual Topic Model Based Sentence Representations for Language Model Adaptation.
Proceedings of the IJCAI 2013, 2013

Binarization of natural scene text based on L1-Norm PCA.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

Style learning based story boundary detection for Chinese broadcast news videos.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

User-defined hot topic detection in microblogging.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2013

Asynchronous stochastic gradient descent for DNN training.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multi-modal topic unit segmentation in videos using conditional random fields.
Proceedings of the IEEE International Conference on Acoustics, 2013

Understanding the dropout strategy and analyzing its effectiveness on LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2013

Integrating Multi-source Bilingual Information for Chinese Word Segmentation in Statistical Machine Translation.
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2013

2012
Automatic Prosodic Break Detection and Feature Analysis.
J. Comput. Sci. Technol., 2012

From English pitch accent detection to Mandarin stress detection, where is the difference?
Comput. Speech Lang., 2012

Statistical and Structural Analysis of Web-Based Collaborative Knowledge Bases Generated from Wiki Encyclopedia.
Proceedings of the 2012 IEEE/WIC/ACM International Conferences on Web Intelligence, 2012

Phrase-based data selection for language model adaptation in spoken language translation.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Nesting hierarchical phrase-based model for speech-to-speech translation.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Power-normalized PLP (PNPLP) feature for robust speech recognition.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Effective near-duplicate image retrieval with image-specific visual phrase selection.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012

Discriminative training of weighted polynomial vector for acoustic language recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Graph-based multi-modal scene detection for movie and teleplay.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Unsupervised training of subspace gaussian mixture models for conversational telephone speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Multi-modal information fusion for news story segmentation in broadcast video.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

TV commercial detection using constrained viterbi algorithm based on time distribution.
Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery, 2012

Translation Model Based Cross-Lingual Language Model Adaptation: from Word Models to Phrase Models.
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012

Automated Essay Scoring Based on Finite State Transducer: towards ASR Transcription of Oral English Speech.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Monaural voiced speech segregation based on elaborate harmonic grouping strategies.
Sci. China Inf. Sci., 2011

Direct quad-dominant meshing of point cloud via global parameterization.
Comput. Graph., 2011

Ridge extraction of a smooth 2-manifold surface based on vector field.
Comput. Aided Geom. Des., 2011

Data-Driven UBM Generation via Tied Gaussians for GMM-Supervector Based Accent Identification.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Data-Driven Gaussian Component Selection for Fast GMM-Based Speaker Verification.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Restoring the Residual Speaker Information in Total Variability Modeling for Speaker Verification.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Automatic Prosodic Events Detection by Using Syllable-Based Acoustic, Lexical and Syntactic Features.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

An Empirical Study of Multilingual Spoken Term Detection.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Context-Dependent Duration Modeling with Backoff Strategy and Look-Up Tables for Pronunciation Assessment and Mispronunciation Detection.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Robust Approach to Mining Repeated Sequence in Audio Stream.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Prosody dependent Mandarin speech recognition.
Proceedings of the 2011 International Joint Conference on Neural Networks, 2011

Commercial detection by mining maximal repeated sequence in audio stream.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Exploring implicit score normalization techniques in speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2011

Exploring nuisance attribute projection and score normalization for GLDS-SVM based automatic mispronunciation detection method.
Proceedings of the IEEE International Conference on Acoustics, 2011

Structured precision modelling with Cholesky Basis Superposition for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Monaural Voiced Speech Segregation Based on Dynamic Harmonic Function.
EURASIP J. Audio Speech Music. Process., 2010

Monaural speech separation based on MAXVQ and CASA for robust speech recognition.
Comput. Speech Lang., 2010

A preliminary exploration on tone error detection in Mandarin based on clustering.
Proceedings of the 4th International Universal Communication Symposium, 2010

Automated Chinese Essay Scoring using Vector Space Models.
Proceedings of the 4th International Universal Communication Symposium, 2010

Mandarin prosodic break detection based on complementary model.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

A new approach for automatic tone error detection in strong accented Mandarin based on dominant set.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On the use of Gaussian component information in the generative likelihood ratio estimation for speaker verification.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An investigation into direct scoring methods without SVM training in speaker verification.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Using prosody to improve Mandarin automatic speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Exploring goodness of prosody by diverse matching templates.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Automatic reference independent evaluation of prosody quality using multiple knowledge fusions.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Mandarin stress detection using hierarchical model based boosting classification and regression tree.
Proceedings of the International Joint Conference on Neural Networks, 2010

Simplified Residual Factor Analysis for Text-Independent Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
High performance automatic mispronunciation detection method based on neural network and TRAP features.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Monaural voiced speech segregation based on elaborate harmonic grouping strategy.
Proceedings of the IEEE International Conference on Acoustics, 2009

Automatic pronunciation error detection based on linguistic knowledge and pronunciation space.
Proceedings of the IEEE International Conference on Acoustics, 2009

An efficient mispronunciation detection method using GLDS-SVM and formant enhanced features.
Proceedings of the IEEE International Conference on Acoustics, 2009

Chinese intonation assessment using SEV features.
Proceedings of the IEEE International Conference on Acoustics, 2009

Context Dependent Feature Based Bottom-up Rescoring SVM Classifier in Children's English Stress Mis-pronunciation Detection.
Proceedings of the 9th IEEE International Conference on Advanced Learning Technologies, 2009

The Asian network-based speech-to-speech translation system.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Multipitch Detection Based on Weighted Summary Correlogram.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Automatic Prosody Boundary Labeling of Mandarin Using Both Text and Acoustic Information.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Microphone Array Post-Filter Based on Auditory Filtering.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Improving searching speed and accuracy of query by humming system based on three methods: feature fusion, candidates set reduction and multiple similarity measurement rescoring.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

An effective microphone array post-filter in arbitrary environments.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Music Genre Classification Based on Multiple Classifier Fusion.
Proceedings of the Fourth International Conference on Natural Computation, 2008

Query by humming via multiscale transportation distance in random query occurrence context.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Improved phonotactic language identification using random forest language models.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Word Sense Disambiguation through Sememe Labeling.
Proceedings of the IJCAI 2007, 2007

A Novel Phone-State Matrix Based Vocabulary-Independent Keyword Spotting Method for Spontaneous Speech.
Proceedings of the IEEE International Conference on Acoustics, 2007

Probabilistic Parsing Action Models for Multi-Lingual Dependency Parsing.
Proceedings of the EMNLP-CoNLL 2007, 2007

Probabilistic Models for Action-Based Chinese Dependency Parsing.
Proceedings of the Machine Learning: ECML 2007, 2007

Ungreedy Methods for Chinese Deterministic Dependency Parsing.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
Monaural Speech Separation Based on Computational Auditory Scene Analysis and Objective Quality Assessment of Speech.
IEEE Trans. Speech Audio Process., 2006

An approach to automatic acquisition of translation templates based on phrase structure extraction and alignment.
IEEE Trans. Speech Audio Process., 2006

A Fast Framework for the Constrained Mean Trajectory Segment Model by Avoidance of Redundant Computation on Segment.
Int. J. Comput. Linguistics Chin. Lang. Process., 2006

Robust Target Speaker Tracking in Broadcast TV Streams.
Int. J. Comput. Linguistics Chin. Lang. Process., 2006

Research and Analysis of Fast Training in SVM-based Audio Classification.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Some Improvements in Phrase-Based Statistical Machine Translation.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

All-Path Decoding Algorithm for Segmental Based Speech Recognition.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Full Utilization of Closed-captions in Broadcast News Recognition.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Multi-Pitch Detection for Co-Channel Speech Utilizing Frequency Channel Piecewise Integration and Morphological Feedback Verification Tracking.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Prosodic Word Prediction Using a Maximum Entropy Approach.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Exploiting GMM-based Quality Measure for SVM Speaker Verification.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

A quality measure method using Gaussian mixture models and divergence measure for speaker identification.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Fast SVM training based on the choice of effective samples for audio classification.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A Two-level Method for Unsupervised Speaker-based Audio Segmentation.
Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), 2006

A Comparative Study of Feature and Score Normalization for Speaker Verification.
Proceedings of the Advances in Biometrics, International Conference, 2006

One-Pass Coarse-to-Fine Segmental Speech Decoding Algorithm.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Novel Noise Robust Front-End Using First Order VTS in Construction of Mel-Warped Wiener Filter.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

An Improved Mandarin Keyword Spotting System Using MCE Training and Context-Enhanced Verification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Applying Pitch Target Model to Convert F0 Contour for Expressive Mandarin Speech Synthesis.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Cluster-Based Language Model for Sentence Retrieval in Chinese Question Answering.
Proceedings of the Fifth Workshop on Chinese Language Processing, 2006

2005
Chinese Named Entity Recognition with Multiple Features.
Proceedings of the HLT/EMNLP 2005, 2005

The CASIA phrase-based machine translation system.
Proceedings of the 2005 International Workshop on Spoken Language Translation, 2005

A Histogram Algorithm for Fast Audio Retrieval.
Proceedings of the ISMIR 2005, 2005

A Hierarchical Approach for Audio Stream Segmentation and Classification.
Proceedings of the ISMIR 2005, 2005

Optimal model order selection based on regression tree in speaker identification.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Chinese prosodic phrasing with a constraint-based approach.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Chinese Question Classification from Approach and Semantic Views.
Proceedings of the Information Retrieval Technology, 2005

Product Named Entity Recognition Based on Hierarchical Hidden Markov Model.
Proceedings of the Fourth SIGHAN Workshop on Chinese Language Processing, 2005

A Hybrid GMM and Codebook Mapping Method for Spectral Conversion.
Proceedings of the Affective Computing and Intelligent Interaction, 2005

Investigation of Emotive Expressions of Spoken Sentences.
Proceedings of the Affective Computing and Intelligent Interaction, 2005

2004
Suppression of additive noise using a power spectral density MMSE estimator.
IEEE Signal Process. Lett., 2004

Outline of Research Activities on Speech-to-speech Translation in Institute of Automation, Chinese Academy of Sciences.
J. Chin. Lang. Comput., 2004

Cross-Language Acoustic Modeling in Large Vocabulary Continuous Speech Recognition.
J. Chin. Lang. Comput., 2004

Hand-Free Speech Recognition in Adverse Environment with Microphone Arrays.
J. Chin. Lang. Comput., 2004

A Novel Polyspectra-Based End Point Detector In Noisy Environments.
J. Chin. Lang. Comput., 2004

A co-chunk based method for spoken-language translation.
J. Chin. Lang. Comput., 2004

Research on IF-based Chinese and English Generation Approach.
J. Chin. Lang. Comput., 2004

Tone Modeling for Continuous Mandarin Speech Recognition.
Int. J. Speech Technol., 2004

NLPR at TREC 2004: Robust Experiments.
Proceedings of the Thirteenth Text REtrieval Conference, 2004

Bilingual chunk alignment in statistical machine translation.
Proceedings of the IEEE International Conference on Systems, 2004

Improvement of Speaker Identification by Combining Prosodic Features with Acoustic Features.
Proceedings of the Advances in Biometric Person Authentication, 2004

Text-independent speaker identification using GMM-UBM and frame level likelihood normalization.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

A framework for fast segment model by avoidance of redundant computation on segment.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Trigram duration modeling in speech recognition.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Grapheme-to-phoneme conversion in Chinese TTS system.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Task-specific adaptation in Chinese name recognition.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Robust speaker recognition integrating pitch and Wiener filter.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

A new multicomponent AM-FM demodulation with predicting frequency boundaries and its application to formant estimation.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Multi-layer structure MLLR adaptation algorithm with subspace regression classes and tying.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Combining agglomerative and tree-based state clustering for high accuracy acoustic modeling.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A novel target-driven generalized JMAP adaptation algorithm.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Exploring high-performance speech recognition in noisy environments using high-order Taylor series expansion.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Approach to interchange-format based Chinese generation.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Bilingual Chunk Alignment Based on Interactional Matching and Probabilistic Latent Semantic Indexing.
Proceedings of the Natural Language Processing, 2004

Window-Based Method for Information Retrieval.
Proceedings of the Natural Language Processing, 2004

Chinese-English bilingual phone modeling for cross-language speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
NLPR at TREC 2003: Novelty and Robust.
Proceedings of The Twelfth Text REtrieval Conference, 2003

Geometric constrained maximum likelihood linear regression on Mandarin dialect adaptation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Dynamic channel compensation based on maximum a posteriori estimation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Statistical speech-to-speech translation with multilingual speech recognition and bilingual-chunk parsing.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Joint model and feature based compensation for robust speech recognition under non-stationary noise environments.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Discriminative optimization of large vocabulary Mandarin conversational speech recognition system.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Sequential MAP estimation based speech feature enhancement for noise robust speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

A vector statistical piecewise polynomial approximation algorithm for environment compensation in telephone LVCSR.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Comparison and study of some variants of partially tied covariance modeling.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Fast speaker adaptation using triple diagonal and shared block diagonal transform matrices.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

A Maximum Entropy Approach for Spoken Chinese Understanding.
Proceedings of the Computational Linguistics and Intelligent Text Processing, 2003

Chinese Named Entity Recognition Combining Statistical Model with Human Knowledge.
Proceedings of the Workshop on Multilingual and Mixed-language Named Entity Recognition, 2003

2002
Bridging the Gap between Dialogue management and dialogue models.
Proceedings of the SIGDIAL 2002 Workshop, 2002

Improvement of the post-processing method for isolated word OOV rejection.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Comparison between the spectral estimation techniques by different spectral-distortion measures.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

An approach to automatic identification of Chinese base noun phrases.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Accuracy improving method for parametric trajectory modeling and its use in A* search.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Improving performance of telephone-based Mandarin speech recognition.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Some issues on the study of vocal tract normalization.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Constrained maximum a posteriori approach for speech enhancement.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

An improved entropy-based endpoint detection algorithm.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Investigation and analysis on designing Chinese balance corpus.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Structure-based compensation using an improved statistical linear approximation for Mandarin speech recognition over telephone.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Comparisons of MLLR and CDCN for speech recognition in additive noise by experiments.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Chinese person name identification based on rules and statistics.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Linguistic and acoustic analysis of Chinese person names.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Improving parametric trajectory modeling by integration of pitch and tone information.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Codebook dependent dynamic channel estimation for Mandarin speech recognition over telephone.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Chinese spoken language analyzing based on combination of statistical and rule methods.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Parametric trajectory segment model for LVCSR.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Implementing vocal tract length normalization in the MLLR framework.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Factor analyzed Gaussian mixture models for speaker identification.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Covariance-Tied Clustering Method In Speaker Identification.
Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002

Pitch and tone's modeling in parametric trajectory model.
Proceedings of the IEEE International Conference on Acoustics, 2002

Using nonstandard SVM for combination of Speaker Verification and Verbal Information Verification in speaker authentication system.
Proceedings of the IEEE International Conference on Acoustics, 2002

Including detailed information feature in MFCC for large vocabulary continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

Study on prosodic boundary location in Chinese Mandarin.
Proceedings of the IEEE International Conference on Acoustics, 2002

Asymmetrical Support Vector Machines and applications in speech processing.
Proceedings of the IEEE International Conference on Acoustics, 2002

Chinese Syntactic Parsing Based on Extended GLR Parsing Algorithm with PCFG.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

Interactive Chinese-to-English Speech Translation Based on Dialogue Management.
Proceedings of the Workshop on Speech-to-Speech Translation: Algorithms and Systems@ACL 2002, 2002

2001
The study of the effect of training set on statistical language modeling.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Study and auto-detection of stress based on tonal pitch range in Mandarin.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A novel target-driven MLLR adaptation algorithm with multi-layer structure.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
Design And Implementation of A Chinese-To-English Spoken Language Translation System.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Rule-based Post-Processing of Pinyin To Chinese Characters Conversion System.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Block Analysis of Bilingual Corpus for Chinese-English Statistical Machine Translation.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

An Interlingua for Dialogue Translation.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

A CART-Based Hierarchical Stochastic Model for Prosodic Phrasing in Chinese.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

An Adaptive Information Retrieval System Based on Fuzzy Set.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

A New Framework For Mandarin LVCSR Based On One-pass Decoder.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Statistical Approach to Chinese-English Spoken-language Translation in Hotel Reservation Domain.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

A Robust Method Based on Likelihood Estimation for Speech Signal Detection.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Processing Some Special Features in Chinese Speech Recognition.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Japanese-to-Chinese spoken language translation based on the simple expression.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

An improved template-based approach to spoken language translation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Incorporating HMM-state sequence confusion for rapid MLLR adaptation to new speakers.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

How to choose training set for language modeling.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A generation system for Chinese texts.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Accent-specific Mandarin adaptation based on pronunciation modeling technology.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Update progress of Sinohear: advanced Mandarin LVCSR system at NLPR.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Chinese spoken language understanding across domain.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Towards high performance continuous Mandarin digit string recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Neural network based integration of multiple confidence measures for OOV detection.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A stochastic polynomial tone model for continuous Mandarin speech.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Approach to Recognition and Understanding of the Time Constituents in the Spoken Chinese Language Translation.
Proceedings of the Advances in Multimodal Interfaces, 2000

Statistical Analysis of Chinese Language and Language Modeling Based on Huge Text Corpora.
Proceedings of the Advances in Multimodal Interfaces, 2000

Mandarin accent adaptation based on context-independent/context-dependent pronunciation modeling.
Proceedings of the IEEE International Conference on Acoustics, 2000

Acoustic modeling for Chinese speech recognition: a comparative study of Mandarin and Cantonese.
Proceedings of the IEEE International Conference on Acoustics, 2000

Decision tree based Mandarin tone model and its application to speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Regression class selection and speaker adaptation with MLLR in Mandarin continuous speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

LODESTAR: a Mandarin spoken dialogue system for travel information retrieval.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A novel model TD-PSPTP for speech synthesis.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Class-Triphone Acoustic Modelling Based On Decision Tree for Mandarin Continuous Speech Recognition.
Proceedings of the 1998 International Symposium on Chinese Spoken Language Processing, 1998

Speaker Normalization and A Robust Speech Feature Based on the Mellin Transform.
Proceedings of the 1998 International Symposium on Chinese Spoken Language Processing, 1998

A novel robust feature of speech signal based on the Mellin transform for speaker-independent speech recognition.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1996
Speaker-independent dictation of Chinese speech with 32k vocabulary.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Context-dependent acoustic models for Chinese speech recognition.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1994
Adaptation of neural network model: comparison of multilayer perceptron and LVQ.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1992
A 46 500 word Chinese speech recognition system.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

1991
A real-time Chinese speech recognition system with unlimited vocabulary.
Proceedings of the 1991 International Conference on Acoustics, 1991

