Audio Mixing Inversion via Embodied Self-supervised Learning.
Mach. Intell. Res., February, 2024

Employing feature mixture for active learning of object detection.
Neurocomputing, 2024

Using Ear-EEG to Decode Auditory Attention in Multiple-speaker Environment.
CoRR, 2024

Leveraging Moving Sound Source Trajectories for Universal Sound Separation.
CoRR, 2024

Cross-attention Inspired Selective State Space Models for Target Sound Extraction.
CoRR, 2024

TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information.
CoRR, 2024

Self-supervised speech representation and contextual text embedding for match-mismatch classification with EEG recording.
CoRR, 2024

Comparing Human-Labeled and LLM-Generated Semantic Features via Cortical Neural Representation.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Representation of Articulatory Features in EEG During Speech Production Tasks.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

Encoding and Decoding of Chinese Phonemes Based on MEG Signals.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

A Spectral Change Enhancement Method Based on Self-Supervised Learning Framework.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

ConvConcatNet: A Deep Convolutional Neural Network to Reconstruct Mel Spectrogram from the EEG.
Proceedings of the IEEE International Conference on Acoustics, 2024

A DenseNet-Based Method for Decoding Auditory Spatial Attention with EEG.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Hybrid Deep-Online Learning Based Method for Active Noise Control in Wave Domain.
Proceedings of the IEEE International Conference on Acoustics, 2024

Self-Supervised Speech Representation and Contextual Text Embedding for Match-Mismatch Classification with EEG Recording.
Proceedings of the IEEE International Conference on Acoustics, 2024

Semantic Reconstruction of Continuous Language from MEG Signals.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Physical Model-Based Self-Supervised Learning Method for Signal Enhancement Under Reverberant Environment.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Embodied Self-Supervised Learning (EMSSL) with Sampling and Training Coordination for Robot Arm Inverse Kinematics Model Learning.
CoRR, 2023

Emotion Classification with EEG Responses Evoked by Emotional Prosody of Speech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Embodied Self-Supervised Learning (EMSSL) with Sampling and Training Coordination for Robot Arm Inverse Kinematic Model Learning.
Proceedings of the IEEE International Conference on Development and Learning, 2023

TT-Net: Dual-Path Transformer Based Sound Field Translation in the Spherical Harmonic Domain.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Model-Based Hearing Compensation Method Using a Self-Supervised Framework.
Proceedings of the IEEE International Conference on Acoustics, 2023

PGSS: Pitch-Guided Speech Separation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Room geometry blind inference based on the localization of real sound source and first order reflections.
CoRR, 2022

Unsupervised Inference of Physiologically Meaningful Articulatory Trajectories with VocalTractLab.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Unsupervised Acoustic-to-Articulatory Inversion with Variable Vocal Tract Anatomy.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Advanced Face Anti-Spoofing with Depth Segmentation.
Proceedings of the International Joint Conference on Neural Networks, 2022

Multi-Speaker Pitch Tracking via Embodied Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022

Direct source and early reflections localization using deep deconvolution network under reverberant environment.
CoRR, 2021

Auditory Attention Decoding from EEG using Convolutional Recurrent Neural Network.
Proceedings of the 29th European Signal Processing Conference, 2021

Eye-gaze Estimation with HEOG and Neck EMG using Deep Neural Networks.
Proceedings of the 29th European Signal Processing Conference, 2021

Modeling of Individual HRTFs Based on Spatial Principal Component Analysis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Forming the Concept of Direction Developmentally.
IEEE Trans. Cogn. Dev. Syst., 2020

Spectral-change enhancement with prior SNR for the hearing impaired.
CoRR, 2020

Embodied Self-supervised Learning by Coordinated Sampling and Training.
CoRR, 2020

Competing Speaker Count Estimation on the Fusion of the Spectral and Spatial Embedding Space.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Individual Distance-Dependent HRTFs Modeling Through A Few Anthropometric Measurements.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Single-Channel Speech Separation Integrating Pitch Information Based on a Multi Task Learning Framework.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Effects of Spectral and Temporal Cues to Mandarin Concurrent-Vowels Identification for Normal-Hearing and Hearing-Impaired Listeners.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Hierarchical Model for StarCraft II Mini-Game.
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019

Action Selection Based on Prediction for Robot Planning.
Proceedings of the Joint IEEE 9th International Conference on Development and Learning and Epigenetic Robotics, 2019

Distance-dependent Modeling of Head-related Transfer Functions.
Proceedings of the IEEE International Conference on Acoustics, 2019

Integrating Spectrotemporal Context into Features Based on Auditory Perception for Classification-based Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Spectral-change-aware Loss Function for DNN-based Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Improvements to the Matching Projection Decoding Method for Ambisonic System with Irregular Loudspeaker Layouts.
Proceedings of the IEEE International Conference on Acoustics, 2019

How Does a Robot Develop Its Reaching Ability Like Human Infants Do?
IEEE Trans. Cogn. Dev. Syst., 2018

Robot Learning to Play Drums with an Open-Ended Internal Model.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2018

Improving Minority Language Speech Recognition Based on Distinctive Features.
Proceedings of the Intelligence Science and Big Data Engineering, 2018

Measuring the Band Importance Function for Mandarin Chinese with a Bayesian Adaptive Procedure.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Self-developing Proprioception-Based Robot Internal Models.
Proceedings of the Intelligence Science II, 2018

A Modified Frequency Weighted MUSIC Algorithm for Multiple Sound Sources Localization.
Proceedings of the 23rd IEEE International Conference on Digital Signal Processing, 2018

Developing Robot Reaching Skill with Relative-Location based Approximating.
Proceedings of the 2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics, 2018

A Time-Weighted Method for Predicting the Intelligibility of Speech in the Presence of Interfering Sounds.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Matching Projection Decoding Method for Ambisonics System.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Multi-Sensor Fusion Based Robot Self-Activity Recognition.
Proceedings of the 18th IEEE-RAS International Conference on Humanoid Robots, 2018

Towards human-like and transhuman perception in AI 2.0: a review.
Frontiers Inf. Technol. Electron. Eng., 2017

Developing Robot Drumming Skill with Listening-Playing Loop.
Proceedings of the Advances in Swarm Intelligence - 8th International Conference, 2017

Corner detection based real-time workpiece recognition for robot manipulation.
Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics, 2017

Learning to chase a ball efficiently and smoothly for a wheeled robot.
Proceedings of the 24th International Conference on Mechatronics and Machine Vision in Practice, 2017

The Microphone Array Arrangement Method for High Order Ambisonics Recordings.
Proceedings of the Intelligence Science and Big Data Engineering, 2017

A hierarchical inverse model based on proprioception and DNN for robot reaching.
Proceedings of the IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society, Beijing, China, October 29, 2017

Multi-scale feature based convolutional neural networks for large vocabulary speech recognition.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017

Electrically-evoked frequency following responses (EFFRs) and electrically-evoked auditory brainstem responses (EABRs) in guinea pigs.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Frequency importance function of the speech intelligibility index for Mandarin Chinese.
Speech Commun., 2016

Biped robot falling motion control with human-inspired active compliance.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Learning basic unit movements with gate-model auto-encoder for humanoid arm motion control.
Proceedings of the IEEE International Conference on Information and Automation, 2016

Robot learns the concept of direction through motion activity.
Proceedings of the 2016 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, 2016

An infant-inspired model for robot developing its reaching ability.
Proceedings of the 2016 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, 2016

Learning task transition from standing-up to walking for a squatted bipedal humanoid robot.
Proceedings of the 16th IEEE-RAS International Conference on Humanoid Robots, 2016

Autonomously achieving bipedal locomotion skill via hierarchical motion modelling.
Proceedings of the IEEE 14th International Workshop on Advanced Motion Control, 2016

Learning basic unit movements for humanoid arm motion control.
Proceedings of the IEEE 14th International Workshop on Advanced Motion Control, 2016

A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition.
Neurocomputing, 2015

Real-Time Activity Recognition on Smartphones Using Deep Neural Networks.
Proceedings of the 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015

Semantic Parsing Using Construction Categorization.
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015

Semantic Parsing Using Hierarchical Concept Base.
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015

I-vector dependent feature space transformations for adaptive speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Long short-term memory based convolutional recurrent neural networks for large vocabulary speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Modeling speaker variability using long short-term memory networks for speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Coarse-to-fine trained multi-scale Convolutional Neural Networks for image classification.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Learning to Reconstruct 3D Structure from Object Motion.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Convolutional Networks Based Edge Detector Learned via Contrast Sensitivity Function.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015

Recognizing Human Activities from Raw Accelerometer Data Using Deep Neural Networks.
Proceedings of the 14th IEEE International Conference on Machine Learning and Applications, 2015

Learning arm movements of target reaching for humanoid robot.
Proceedings of the IEEE International Conference on Information and Automation, 2015

Improving long short-term memory networks using maxout units for large vocabulary speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Learning push recovery for a bipedal humanoid robot with Dynamical Movement Primitives.
Proceedings of the 15th IEEE-RAS International Conference on Humanoid Robots, 2015

Chinese syllable-to-character conversion with recurrent neural network based supervised sequence labelling.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Integrating prosodic information into recurrent neural network language model for speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Semi-global depth from focus.
Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition, 2015

Human activity recognition with HMM-DNN model.
Proceedings of the 14th IEEE International Conference on Cognitive Informatics & Cognitive Computing, 2015

A comparative study of RPCL and MCE based discriminative training methods for LVCSR.
Neurocomputing, 2014

Visual gesture recognition for human robot interaction using dynamic movement primitives.
Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Error-driven pronunciation dictionary construction for Mandarin speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Decision tree based state tying for speech recognition using DNN derived embeddings.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Labeling unsegmented sequence data with DNN-HMM and its application for speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Recurrent neural network language model with part-of-speech for Mandarin speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Parsing named entity as syntactic structure.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A Cyclic Contrastive Divergence Learning Algorithm for High-Order RBMs.
Proceedings of the 13th International Conference on Machine Learning and Applications, 2014

Query-based composition for large-scale language model in LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2014

Modelling and generalizing achieved robot skills with temporal Restricted Boltzmann Machines.
Proceedings of the 14th IEEE-RAS International Conference on Humanoid Robots, 2014

Learning the Taxonomy of Function Words for Parsing.
Proceedings of the COLING 2014, 2014

A nonlinear digital feedback oscillator based vibrato control model for singing synthesis.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Voice conversion using conditional restricted Boltzmann machine.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Learning latent variable grammars from complementary perspectives.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Improved parsing with taxonomy of conjunctions.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Learning Grammar with Explicit Annotations for Subordinating Conjunctions.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014

Exploiting limited data for parsing.
Proceedings of the 2014 IEEE/ACIS 13th International Conference on Computer and Information Science, 2014

Improved Chinese Parsing Using Named Entity Cue.
Proceedings of The 13th International Conference on Parsing Technologies, 2013

Multi-level Linguistic Knowledge Based Chinese Grapheme-to-Phoneme Conversion.
Proceedings of the Intelligence Science and Big Data Engineering, 2013

A Comparative Study on Selecting Acoustic Modeling Units in Deep Neural Networks Based Large Vocabulary Chinese Speech Recognition.
Proceedings of the Intelligence Science and Big Data Engineering, 2013

Discriminative Apprenticeship Learning with Both Preference and Non-preference Behavior.
Proceedings of the 12th International Conference on Machine Learning and Applications, 2013

Multi-speaker prosodic instance selection for HMM-based speech synthesis.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Prosodic modeling with rich syntactic context in HMM-based Mandarin speech synthesis.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Overview of SHRC-Ginkgo speech synthesis system for Blizzard Challenge 2013.
Proceedings of the Blizzard Challenge 2013, 2013

Deep neural networks for syllable based acoustic modeling in Chinese speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

The effect of part-of-speech on Mandarin speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Effects of aging on the ability to benefit from prior knowledge of message content in masked speech recognition.
Speech Commun., 2012

Discriminative GMM-HMM Acoustic Model Selection Using Two-Level Bayesian Ying-Yang Harmony Learning.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2012

Lightly Supervised Acoustic Model Training for Mandarin Continuous Speech Recognition.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2012

Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Parsing TCT with a Coarse-to-fine Approach.
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2012

Parsing TCT with Split Conjunction Categories.
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2012

Perceptual Fusion Tendency of Speech Sounds.
J. Cogn. Neurosci., 2011

Parsing-based automatic Chinese term extraction.
Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 2011

Parsing-based Chinese word segmentation integrating morphological and syntactic information.
Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 2011

A Comparative Study of RPCL and MCE Based Discriminative Training Methods for LVCSR.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2011

Active online learning of the bipedal walking.
Proceedings of the 11th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2011), 2011

PKU@TRECVID2010: Pair-Wise Event Detection in Surveillance Video.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010

A morphology-based Chinese word segmentation method.
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, 2010

Distributed training for Conditional Random Fields.
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, 2010

Hierarchical pitch target model for Mandarin speech.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Maximum entropy based tone modeling for Mandarin speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection.
Proceedings of the IEEE International Conference on Acoustics, 2010

Distance-Dependent Head-Related Transfer Functions Measured With High Spatial Resolution Using a Spark Gap.
IEEE Trans. Speech Audio Process., 2009

PKU@TRECVID2009: Single-Actor and Pair-Activity Event Detection in Surveillance Video.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009

PHMM based asynchronous acoustic model for Chinese large vocabulary continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Refining Grammars for Parsing with Hierarchical Semantic Knowledge.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009

PKU Mandarin Speech Synthesis System for Blizzard 2009.
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009

A Joint Segmenting and Labeling Approach for Chinese Lexical Analysis.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2008

Probabilistic latent speaker training for large vocabulary speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

An Improved CRF based Chinese Language Processing System for SIGHAN Bakeoff 2007.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008

Monaural speech separation based on multi-scale Fan-Chirp Transform.
Proceedings of the IEEE International Conference on Acoustics, 2008

Exploiting prosodic and lexical features for tone modeling in a conditional random field framework.
Proceedings of the IEEE International Conference on Acoustics, 2008

Integrating Multi-level Linguistic Knowledge with a Unified Framework for Mandarin Speech Recognition.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008

Text Segmentation with LDA-Based Fisher Kernel.
Proceedings of the ACL 2008, 2008

The effect of voice cuing on releasing Chinese speech from informational masking.
Speech Commun., 2007

Context dependent syllable acoustic model for continuous Chinese speech recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Effect of number of masking talkers on speech-on-speech masking in Chinese.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Probabilistic latent speaker analysis for large vocabulary speech recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Refine bigram PLSA model by assigning latent topics unevenly.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Just-in-Time Latent Semantic Adaptation on Language Model for Chinese Speech Recognition Using Web Data.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

CASA based speech separation for robust speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Chinese Word Segmentation with Maximum Entropy and N-gram Language Model.
Proceedings of the Fifth Workshop on Chinese Language Processing, 2006

Learning Outliers to Refine a Corpus for Chinese Webpage Categorization.
Proceedings of the Advances in Natural Computation, First International Conference, 2005

Boosting Local Binary Pattern (LBP)-Based Face Recognition.
Proceedings of the Advances in Biometric Person Authentication, 2004

Methodologies of the Personalized Courseware Construction Tools for e-Learning.
Proceedings of the Advances in Web-Based Learning, 2004

Biomimetics speaker identification systems for network security gatekeepers.
Proceedings of the International Joint Conference on Neural Networks, 2003

An Enhanced RASTA Processing for Speaker Identification.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Modeling of Three Types of Auditory Nerve and Its Application in Speech Recognition.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

On the use of bandpass liftering in speaker recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

On the importance of components of the MFCC in speech and speaker recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Auditory model based speech feature extraction and its application to speaker identification.
Proceedings of the International Joint Conference on Neural Networks, 1999