2024
Audio Mixing Inversion via Embodied Self-supervised Learning.
Mach. Intell. Res., February, 2024
Employing feature mixture for active learning of object detection.
Neurocomputing, 2024
Using Ear-EEG to Decode Auditory Attention in Multiple-speaker Environment.
CoRR, 2024
Leveraging Moving Sound Source Trajectories for Universal Sound Separation.
CoRR, 2024
Cross-attention Inspired Selective State Space Models for Target Sound Extraction.
CoRR, 2024
TSE-PI: Target Sound Extraction under Reverberant Environments with Pitch Information.
CoRR, 2024
Self-supervised speech representation and contextual text embedding for match-mismatch classification with EEG recording.
CoRR, 2024
Comparing Human-Labeled and LLM-Generated Semantic Features via Cortical Neural Representation.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
Representation of Articulatory Features in EEG During Speech Production Tasks.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
Encoding and Decoding of Chinese Phonemes Based on MEG Signals.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
A Spectral Change Enhancement Method Based on Self-Supervised Learning Framework.
Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024
ConvConcatNet: A Deep Convolutional Neural Network to Reconstruct Mel Spectrogram from the EEG.
Proceedings of the IEEE International Conference on Acoustics, 2024
A DenseNet-Based Method for Decoding Auditory Spatial Attention with EEG.
Proceedings of the IEEE International Conference on Acoustics, 2024
A Hybrid Deep-Online Learning Based Method for Active Noise Control in Wave Domain.
Proceedings of the IEEE International Conference on Acoustics, 2024
Self-Supervised Speech Representation and Contextual Text Embedding for Match-Mismatch Classification with EEG Recording.
Proceedings of the IEEE International Conference on Acoustics, 2024
Semantic Reconstruction of Continuous Language from MEG Signals.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
A Physical Model-Based Self-Supervised Learning Method for Signal Enhancement Under Reverberant Environment.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Embodied Self-Supervised Learning (EMSSL) with Sampling and Training Coordination for Robot Arm Inverse Kinematics Model Learning.
CoRR, 2023
Emotion Classification with EEG Responses Evoked by Emotional Prosody of Speech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Embodied Self-Supervised Learning (EMSSL) with Sampling and Training Coordination for Robot Arm Inverse Kinematic Model Learning.
Proceedings of the IEEE International Conference on Development and Learning, 2023
TT-Net: Dual-Path Transformer Based Sound Field Translation in the Spherical Harmonic Domain.
Proceedings of the IEEE International Conference on Acoustics, 2023
A Model-Based Hearing Compensation Method Using a Self-Supervised Framework.
Proceedings of the IEEE International Conference on Acoustics, 2023
PGSS: Pitch-Guided Speech Separation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Sparse DNN Model for Frequency Expanding of Higher Order Ambisonics Encoding Process.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Room geometry blind inference based on the localization of real sound source and first order reflections.
CoRR, 2022
Unsupervised Inference of Physiologically Meaningful Articulatory Trajectories with VocalTractLab.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Unsupervised Acoustic-to-Articulatory Inversion with Variable Vocal Tract Anatomy.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Advanced Face Anti-Spoofing with Depth Segmentation.
Proceedings of the International Joint Conference on Neural Networks, 2022
Multi-Speaker Pitch Tracking via Embodied Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Direct source and early reflections localization using deep deconvolution network under reverberant environment.
CoRR, 2021
Auditory Attention Decoding from EEG using Convolutional Recurrent Neural Network.
Proceedings of the 29th European Signal Processing Conference, 2021
Eye-gaze Estimation with HEOG and Neck EMG using Deep Neural Networks.
Proceedings of the 29th European Signal Processing Conference, 2021
2020
Modeling of Individual HRTFs Based on Spatial Principal Component Analysis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Forming the Concept of Direction Developmentally.
IEEE Trans. Cogn. Dev. Syst., 2020
Spectral-change enhancement with prior SNR for the hearing impaired.
CoRR, 2020
Embodied Self-supervised Learning by Coordinated Sampling and Training.
CoRR, 2020
Competing Speaker Count Estimation on the Fusion of the Spectral and Spatial Embedding Space.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Individual Distance-Dependent HRTFs Modeling Through A Few Anthropometric Measurements.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Single-Channel Speech Separation Integrating Pitch Information Based on a Multi Task Learning Framework.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Effects of Spectral and Temporal Cues to Mandarin Concurrent-Vowels Identification for Normal-Hearing and Hearing-Impaired Listeners.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
A Hierarchical Model for StarCraft II Mini-Game.
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019
Action Selection Based on Prediction for Robot Planning.
Proceedings of the Joint IEEE 9th International Conference on Development and Learning and Epigenetic Robotics, 2019
Distance-dependent Modeling of Head-related Transfer Functions.
Proceedings of the IEEE International Conference on Acoustics, 2019
Integrating Spectrotemporal Context into Features Based on Auditory Perception for Classification-based Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2019
A Spectral-change-aware Loss Function for DNN-based Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2019
Improvements to the Matching Projection Decoding Method for Ambisonic System with Irregular Loudspeaker Layouts.
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
How Does a Robot Develop Its Reaching Ability Like Human Infants Do?
IEEE Trans. Cogn. Dev. Syst., 2018
Robot Learning to Play Drums with an Open-Ended Internal Model.
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2018
Improving Minority Language Speech Recognition Based on Distinctive Features.
Proceedings of the Intelligence Science and Big Data Engineering, 2018
Measuring the Band Importance Function for Mandarin Chinese with a Bayesian Adaptive Procedure.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Self-developing Proprioception-Based Robot Internal Models.
Proceedings of the Intelligence Science II, 2018
A Modified Frequency Weighted MUSIC Algorithm for Multiple Sound Sources Localization.
Proceedings of the 23rd IEEE International Conference on Digital Signal Processing, 2018
Developing Robot Reaching Skill with Relative-Location based Approximating.
Proceedings of the 2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics, 2018
A Time-Weighted Method for Predicting the Intelligibility of Speech in the Presence of Interfering Sounds.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Matching Projection Decoding Method for Ambisonics System.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Multi-Sensor Fusion Based Robot Self-Activity Recognition.
Proceedings of the 18th IEEE-RAS International Conference on Humanoid Robots, 2018
2017
Towards human-like and transhuman perception in AI 2.0: a review.
Frontiers Inf. Technol. Electron. Eng., 2017
Developing Robot Drumming Skill with Listening-Playing Loop.
Proceedings of the Advances in Swarm Intelligence - 8th International Conference, 2017
Corner detection based real-time workpiece recognition for robot manipulation.
Proceedings of the 2017 IEEE International Conference on Robotics and Biomimetics, 2017
Learning to chase a ball efficiently and smoothly for a wheeled robot.
Proceedings of the 24th International Conference on Mechatronics and Machine Vision in Practice, 2017
The Microphone Array Arrangement Method for High Order Ambisonics Recordings.
Proceedings of the Intelligence Science and Big Data Engineering, 2017
A hierarchical inverse model based on proprioception and DNN for robot reaching.
Proceedings of the IECON 2017 - 43rd Annual Conference of the IEEE Industrial Electronics Society, Beijing, China, October 29, 2017
Multi-scale feature based convolutional neural networks for large vocabulary speech recognition.
Proceedings of the 2017 IEEE International Conference on Multimedia and Expo, 2017
Electrically-evoked frequency following responses (EFFRs) and electrically-evoked auditory brainstem responses (EABRs) in guinea pigs.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
Frequency importance function of the speech intelligibility index for Mandarin Chinese.
Speech Commun., 2016
Biped robot falling motion control with human-inspired active compliance.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016
Learning basic unit movements with gate-model auto-encoder for humanoid arm motion control.
Proceedings of the IEEE International Conference on Information and Automation, 2016
Robot learns the concept of direction through motion activity.
Proceedings of the 2016 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, 2016
An infant-inspired model for robot developing its reaching ability.
Proceedings of the 2016 Joint IEEE International Conference on Development and Learning and Epigenetic Robotics, 2016
Learning task transition from standing-up to walking for a squatted bipedal humanoid robot.
Proceedings of the 16th IEEE-RAS International Conference on Humanoid Robots, 2016
Autonomously achieving bipedal locomotion skill via hierarchical motion modelling.
Proceedings of the IEEE 14th International Workshop on Advanced Motion Control, 2016
Learning basic unit movements for humanoid arm motion control.
Proceedings of the IEEE 14th International Workshop on Advanced Motion Control, 2016
2015
A comparative study on selecting acoustic modeling units in deep neural networks based large vocabulary Chinese speech recognition.
Neurocomputing, 2015
Real-Time Activity Recognition on Smartphones Using Deep Neural Networks.
Proceedings of the 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015
Semantic Parsing Using Construction Categorization.
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015
Semantic Parsing Using Hierarchical Concept Base.
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015
I-vector dependent feature space transformations for adaptive speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Long short-term memory based convolutional recurrent neural networks for large vocabulary speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Modeling speaker variability using long short-term memory networks for speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Coarse-to-fine trained multi-scale Convolutional Neural Networks for image classification.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015
Learning to Reconstruct 3D Structure from Object Motion.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015
Convolutional Networks Based Edge Detector Learned via Contrast Sensitivity Function.
Proceedings of the Neural Information Processing - 22nd International Conference, 2015
Recognizing Human Activities from Raw Accelerometer Data Using Deep Neural Networks.
Proceedings of the 14th IEEE International Conference on Machine Learning and Applications, 2015
Learning arm movements of target reaching for humanoid robot.
Proceedings of the IEEE International Conference on Information and Automation, 2015
Improving long short-term memory networks using maxout units for large vocabulary speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Constructing long short-term memory based deep recurrent neural networks for large vocabulary speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Learning push recovery for a bipedal humanoid robot with Dynamical Movement Primitives.
Proceedings of the 15th IEEE-RAS International Conference on Humanoid Robots, 2015
Chinese syllable-to-character conversion with recurrent neural network based supervised sequence labelling.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Integrating prosodic information into recurrent neural network language model for speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Semi-global depth from focus.
Proceedings of the 3rd IAPR Asian Conference on Pattern Recognition, 2015
Human activity recognition with HMM-DNN model.
Proceedings of the 14th IEEE International Conference on Cognitive Informatics & Cognitive Computing, 2015
2014
A comparative study of RPCL and MCE based discriminative training methods for LVCSR.
Neurocomputing, 2014
Visual gesture recognition for human robot interaction using dynamic movement primitives.
Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014
Error-driven pronunciation dictionary construction for Mandarin speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Decision tree based state tying for speech recognition using DNN derived embeddings.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Labeling unsegmented sequence data with DNN-HMM and its application for speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Recurrent neural network language model with part-of-speech for Mandarin speech recognition.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Parsing named entity as syntactic structure.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
A Cyclic Contrastive Divergence Learning Algorithm for High-Order RBMs.
Proceedings of the 13th International Conference on Machine Learning and Applications, 2014
Query-based composition for large-scale language model in LVCSR.
Proceedings of the IEEE International Conference on Acoustics, 2014
Modelling and generalizing achieved robot skills with temporal Restricted Boltzmann Machines.
Proceedings of the 14th IEEE-RAS International Conference on Humanoid Robots, 2014
Learning the Taxonomy of Function Words for Parsing.
Proceedings of the COLING 2014, 2014
A nonlinear digital feedback oscillator based vibrato control model for singing synthesis.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014
Voice conversion using conditional restricted Boltzmann machine.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014
Learning latent variable grammars from complementary perspectives.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014
Improved parsing with taxonomy of conjunctions.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014
Learning Grammar with Explicit Annotations for Subordinating Conjunctions.
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014
Exploiting limited data for parsing.
Proceedings of the 2014 IEEE/ACIS 13th International Conference on Computer and Information Science, 2014
2013
Improved Chinese Parsing Using Named Entity Cue.
Proceedings of The 13th International Conference on Parsing Technologies, 2013
Multi-level Linguistic Knowledge Based Chinese Grapheme-to-Phoneme Conversion.
Proceedings of the Intelligence Science and Big Data Engineering, 2013
A Comparative Study on Selecting Acoustic Modeling Units in Deep Neural Networks Based Large Vocabulary Chinese Speech Recognition.
Proceedings of the Intelligence Science and Big Data Engineering, 2013
Discriminative Apprenticeship Learning with Both Preference and Non-preference Behavior.
Proceedings of the 12th International Conference on Machine Learning and Applications, 2013
Multi-speaker prosodic instance selection for HMM-based speech synthesis.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
Prosodic modeling with rich syntactic context in HMM-based Mandarin speech synthesis.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013
Overview of SHRC-Ginkgo speech synthesis system for Blizzard Challenge 2013.
Proceedings of the Blizzard Challenge 2013, 2013
Deep neural networks for syllable based acoustic modeling in Chinese speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
The effect of part-of-speech on Mandarin speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
2012
Effects of aging on the ability to benefit from prior knowledge of message content in masked speech recognition.
Speech Commun., 2012
Discriminative GMM-HMM Acoustic Model Selection Using Two-Level Bayesian Ying-Yang Harmony Learning.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2012
Lightly Supervised Acoustic Model Training for Mandarin Continuous Speech Recognition.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2012
Probabilistic Speaker-Class based Acoustic Modeling for Large Vocabulary Continuous Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Parsing TCT with a Coarse-to-fine Approach.
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2012
Parsing TCT with Split Conjunction Categories.
Proceedings of the Second CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2012
2011
Perceptual Fusion Tendency of Speech Sounds.
J. Cogn. Neurosci., 2011
Parsing-based automatic Chinese term extraction.
Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 2011
Parsing-based Chinese word segmentation integrating morphological and syntactic information.
Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 2011
A Comparative Study of RPCL and MCE Based Discriminative Training Methods for LVCSR.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2011
Active online learning of the bipedal walking.
Proceedings of the 11th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2011), 2011
2010
PKU@TRECVID2010: Pair-Wise Event Detection in Surveillance Video.
Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010
A morphology-based Chinese word segmentation method.
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, 2010
Distributed training for Conditional Random Fields.
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, 2010
Hierarchical pitch target model for Mandarin speech.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Maximum entropy based tone modeling for mandarin speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010
GMM-HMM acoustic model training by a two level procedure with Gaussian components determined by automatic model selection.
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
Distance-Dependent Head-Related Transfer Functions Measured With High Spatial Resolution Using a Spark Gap.
IEEE Trans. Speech Audio Process., 2009
PKU@TRECVID2009: Single-Actor and Pair-Activity Event Detection in surveillance Video.
Proceedings of the TRECVID 2009 workshop participants notebook papers, 2009
PHMM based asynchronous acoustic model for Chinese large vocabulary continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009
Refining Grammars for Parsing with Hierarchical Semantic Knowledge.
Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, 2009
PKU Mandarin Speech Synthesis System for Blizzard 2009.
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009
2008
A Joint Segmenting and Labeling Approach for Chinese Lexical Analysis.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2008
Probabilistic latent speaker training for large vocabulary speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
An Improved CRF based Chinese Language Processing System for SIGHAN Bakeoff 2007.
Proceedings of the Third International Joint Conference on Natural Language Processing, 2008
Monaural speech separation based on multi-scale Fan-Chirp Transform.
Proceedings of the IEEE International Conference on Acoustics, 2008
Exploiting prosodic and lexical features for tone modeling in a conditional random field framework.
Proceedings of the IEEE International Conference on Acoustics, 2008
Integrating Multi-level Linguistic Knowledge with a Unified Framework for Mandarin Speech Recognition.
Proceedings of the 2008 Conference on Empirical Methods in Natural Language Processing, 2008
Text Segmentation with LDA-Based Fisher Kernel.
Proceedings of the ACL 2008, 2008
2007
The effect of voice cuing on releasing Chinese speech from informational masking.
Speech Commun., 2007
Context dependent syllable acoustic model for continuous Chinese speech recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Effect of number of masking talkers on speech-on-speech masking in Chinese.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Probabilistic latent speaker analysis for large vocabulary speech recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Refine bigram PLSA model by assigning latent topics unevenly.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007
2006
Just-in-Time Latent Semantic Adaptation on Language Model for Chinese speech Recognition Using Web Data.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006
CASA based speech separation for robust speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Chinese Word Segmentation with Maximum Entropy and N-gram Language Model.
Proceedings of the Fifth Workshop on Chinese Language Processing, 2006
2005
Learning Outliers to Refine a Corpus for Chinese Webpage Categorization.
Proceedings of the Advances in Natural Computation, First International Conference, 2005
2004
Boosting Local Binary Pattern (LBP)-Based Face Recognition.
Proceedings of the Advances in Biometric Person Authentication, 2004
Methodologies of the Personalized Courseware Construction Tools for e-Learning.
Proceedings of the Advances in Web-Based Learning, 2004
2003
Biomimetics speaker identification systems for network security gatekeepers.
Proceedings of the International Joint Conference on Neural Networks, 2003
2000
An Enhanced RASTA Processing for Speaker Identification.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000
Modeling of Three Types of Auditory Nerve and Its Application in Speech Recognition.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000
On the use of bandpass liftering in speaker recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
On the importance of components of the MFCC in speech and speaker recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
An auditory feature extraction method based on forward-masking and its application in robust speaker identification and speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
1999
Auditory model based speech feature extraction and its application to speaker identification.
Proceedings of the International Joint Conference Neural Networks, 1999