Jiqing Han

Expert Syst. Appl., December, 2023

A Glance is Enough: Extract Target Sentence By Looking at A keyword.

CoRR, 2023

Patch-level contrastive embedding learning for respiratory sound classification.

Wenjie Song

Biomed. Signal Process. Control., 2023

Mutual Information-based Embedding Decoupling for Generalizable Speaker Verification.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Personality-aware Training based Speaker Adaptation for End-to-end Speech Recognition.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Spot Keywords From Very Noisy and Mixed Speech.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Using Auxiliary Tasks In Multimodal Fusion of Wav2vec 2.0 And Bert for Multimodal Emotion Recognition.

Dekai Sun

Yancheng He

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Subband Dependency Modeling for Sound Event Detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Time-Weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Graph-Based Spectro-Temporal Dependency Modeling for Anti-Spoofing.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Sentiment Knowledge Enhanced Self-supervised Learning for Multimodal Sentiment Analysis.

Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Exploring Inter-Node Relations in CNNs for Environmental Sound Classification.

IEEE Signal Process. Lett., 2022

Contrastive Regularization for Multimodal Emotion Recognition Using Audio and Text.

Fan Qian

CoRR, 2022

Word-wise Sparse Attention for Multimodal Sentiment Analysis.

Fan Qian

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploring Transformer's Potential on Automatic Piano Transcription.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

CDMA: Cross-Domain Distance Metric Adaptation for Speaker Verification.

Jianchen Li

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Sparse Self-Attention for Semi-Supervised Sound Event Detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Exploring attention mechanisms based on summary information for end-to-end automatic speech recognition.

Neurocomputing, 2021

Semantic feature extraction based on subspace learning with temporal constraints for acoustic event recognition.

Qiuying Shi

Digit. Signal Process., 2021

Can We Trust Deep Speech Prior?

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Model-Agnostic Fast Adaptive Multi-Objective Balancing Algorithm for Multilingual Automatic Speech Recognition Model Training.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multimodal Sentiment Analysis with Temporal Modality Attention.

Fan Qian

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Gradient Regularization for Noise-Robust Speaker Verification.

Jianchen Li

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Capturing Temporal Dependencies Through Future Prediction for CNN-Based Audio Classifiers.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Contrastive Embedding Learning Method for Respiratory Sound Classification.

Wenjie Song

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

2020

Pyramidal Temporal Pooling With Discriminative Mapping for Audio Classification.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Nonnegative Matrix Factorization Based Transfer Subspace Learning for Cross-Corpus Speech Emotion Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

A Joint Framework of Denoising Autoencoder and Generative Vocoder for Monaural Speech Enhancement.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Learning Temporal Relations from Semantic Neighbors for Acoustic Scene Classification.

IEEE Signal Process. Lett., 2020

Task-Driven Variability Model for Speaker Verification.

Circuits Syst. Signal Process., 2020

Toward the pre-cocktail party problem with TasTas+.

Anyan Shi

CoRR, 2020

La Furca: Iterative Context-Aware End-to-End Monaural Speech Separation Based on Dual-Path Deep Parallel Inter-Intra Bi-LSTM with Attention.

Rujie Liu

CoRR, 2020

FurcaNeXt: End-to-End Monaural Speech Separation with Dynamic Gated Dilated Temporal Convolutional Networks.

Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

ATReSN-Net: Capturing Attentive Temporal Relations in Semantic Neighborhood for Acoustic Scene Classification.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speech Separation Based on Multi-Stage Elaborated Dual-Path Deep BiLSTM with Auxiliary Identity Loss.

Rujie Liu

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Self-Supervised Adversarial Multi-Task Learning for Vocoder-Based Monaural Speech Enhancement.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Double Adversarial Network Based Monaural Speech Enhancement for Robust Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Structured Sparse Attention for end-to-end Automatic Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

PAN: Phoneme-Aware Network for Monaural Speech Enhancement.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

TDMF: Task-Driven Multilevel Framework for End-to-End Speaker Verification.

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019

A bilevel framework for joint optimization of session compensation and classification for speaker identification.

Digit. Signal Process., 2019

A Multi-Task Learning Framework for Overcoming the Catastrophic Forgetting in Automatic Speech Recognition.

CoRR, 2019

Hard Sample Mining for the Improved Retraining of Automatic Speech Recognition.

CoRR, 2019

FurcaNeXt: End-to-end monaural speech separation with dynamic gated dilated temporal convolutional networks.

CoRR, 2019

FurcaNet: An end-to-end deep gated convolutional, long short-term memory, deep neural networks for single channel speech separation.

CoRR, 2019

Is CQT more suitable for monaural speech separation than STFT? an empirical study.

CoRR, 2019

Abnormal heart sound detection using temporal quasi-periodic features and long short-term memory without segmentation.

Biomed. Signal Process. Control., 2019

Trace Ratio Criterion Based Large Margin Subspace Learning for Feature Selection.

IEEE Access, 2019

Acoustic Scene Classification by Implicitly Identifying Distinct Sound Events.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Deep Attention Gated Dilated Temporal Convolutional Networks with Intra-Parallel Convolutional Modules for End-to-End Monaural Speech Separation.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Monaural Speech Separation with Multi-Scale Dynamic Weighted Gated Dilated Convolutional Pyramid Network.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Subspace Pooling Based Temporal Features Extraction for Audio Event Recognition.

Qiuying Shi

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cross-Corpus Speech Emotion Recognition Using Semi-Supervised Transfer Non-Negative Matrix Factorization with Adaptation Regularization.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Convolutional Grid Long Short-Term Memory Recurrent Neural Network for Automatic Speech Recognition.

Proceedings of the Neural Information Processing - 26th International Conference, 2019

Furcax: End-to-end Monaural Speech Separation Based on Deep Gated (De)convolutional Neural Networks with Adversarial Example Training.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Investigation of Monaural Front-End Processing for Robust Speech Recognition Without Retraining or Joint-Training.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Efficient general sparse denoising with non-convex sparse constraint and total variation regularization.

Digit. Signal Process., 2018

Investigation of Monaural Front-End Processing for Robust ASR without Retraining or Joint-Training.

CoRR, 2018

Adaptive overlapping-group sparse denoising for heart sound signals.

Biomed. Signal Process. Control., 2018

Unsupervised Temporal Feature Learning Based on Sparse Coding Embedded BoAW for Acoustic Event Recognition.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Deep Neural Network Based Discriminative Training for I-Vector/PLDA Speaker Verification.

Guibin Zheng

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

Heart sound classification based on scaled spectrogram and tensor decomposition.

Expert Syst. Appl., 2017

Heart sound classification based on scaled spectrogram and partial least squares regression.

Biomed. Signal Process. Control., 2017

Speaker Verification via Estimating Total Variability Space Using Probabilistic Partial Least Squares.

Yilin Pan

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Learning Deep Neural Network Based Kernel Functions for Small Sample Size Classification.

Guibin Zheng

Proceedings of the Neural Information Processing - 24th International Conference, 2017

Towards Heart Sound Classification Without Segmentation Using Convolutional Neural Network.

Proceedings of the Computing in Cardiology, 2017

2016

Signal Periodic Decomposition With Conjugate Subspaces.

IEEE Trans. Signal Process., 2016

Sparse Decomposition for Signal Periodic Model Over Complex Exponential Dictionary.

IEEE Signal Process. Lett., 2016

Speaker Verification via Modeling Kurtosis Using Sparse Coding.

Int. J. Pattern Recognit. Artif. Intell., 2016

Optimization of learned dictionary for sparse coding in speech processing.

Guanglu Sun

Neurocomputing, 2016

Towards heart sound classification without segmentation via autocorrelation feature and diffusion maps.

Future Gener. Comput. Syst., 2016

Towards optimal VLAD for human action recognition from still images.

Lei Zhang

Xiantong Zhen

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Realistic human action recognition: When deep learning meets VLAD.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Abnormal Heart Sounds detection based on the Scaled Time-Frequency Representation and Feature Selection.

Proceedings of the Computing in Cardiology, CinC 2016, Vancouver, 2016

2015

Soft Margin Based Low-Rank Audio Signal Classification.

Neural Process. Lett., 2015

Dictionary evaluation and optimization for sparse coding based speech processing.

Inf. Sci., 2015

Spectrum enhancement with sparse coding for robust speech recognition.

Guanglu Sun

Digit. Signal Process., 2015

Ramanujan subspace pursuit for signal periodic decomposition.

CoRR, 2015

Noise-robust speaker recognition based on morphological component analysis.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

Confidence Measure Based on Context Consistency Using Word Occurrence Probability and Topic Adaptation for Spoken Term Detection.

IEICE Trans. Inf. Syst., 2014

A new framework for robust speech recognition in complex channel environments.

Digit. Signal Process., 2014

Sparse Representation with Optimized Learned Dictionary for Robust Voice Activity Detection.

Circuits Syst. Signal Process., 2014

Evaluation of dictionary for sparse coding in speech processing.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Learning semantic kernels for scene classification.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Robust minimum statistics project coefficients feature for acoustic environment recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

Audio classification with low-rank matrix representation features.

ACM Trans. Intell. Syst. Technol., 2013

Identification of Objectionable Audio Segments Based on Pseudo and Heterogeneous Mixture Models.

IEEE Trans. Speech Audio Process., 2013

Audio Segment Classification Using Online Learning Based Tensor Representation Feature Discrimination.

IEEE Trans. Speech Audio Process., 2013

Statistical voice activity detection based on sparse representation over learned dictionary.

Digit. Signal Process., 2013

Guarantees of Augmented Trace Norm Models in Tensor Recovery.

Proceedings of the IJCAI 2013, 2013

Case based reasoning solution to the problem of sustained learning in keyword spotting.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Upper and lower bounds for approximation of the Kullback-Leibler divergence between Hidden Markov models.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

2012

Sparse-Based auditory Model for robust speaker Recognition.

Int. J. Pattern Recognit. Artif. Intell., 2012

Likelihood ratio sign test for voice activity detection.

IET Signal Process., 2012

Identifiability of multivariate logistic mixture models

CoRR, 2012

Guarantees of Augmented Trace Norm Models in Tensor Recovery

CoRR, 2012

Low-rank Audio Signal Classification Under Soft Margin and Trace Norm Constraints.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A Novel Confidence Measure Based on Context Consistency for Spoken Term Detection.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Sparse power spectrum based robust voice activity detector.

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

A solution to residual noise in speech denoising with sparse representation.

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

2011

Gaussian Specific Compensation for Channel Distortion in Speech Recognition.

IEEE Signal Process. Lett., 2011

MAP-based Audio Coding Compensation for Speaker Recognition.

Tao Jiang

J. Signal Inf. Process., 2011

Voice activity detection based on conjugate subspace matching pursuit and likelihood ratio test.

EURASIP J. Audio Speech Music. Process., 2011

Online Learning for Classification of Low-rank Representation Features and Its Applications in Audio Segment Classification

CoRR, 2011

Trace Norm Regularized Tensor Classification and Its Online Learning Approaches

CoRR, 2011

Heterogeneous mixture models using sparse representation features for applause and laugh detection.

Proceedings of the 2011 IEEE International Workshop on Machine Learning for Signal Processing, 2011

Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

AUC Optimization Based Confidence Measure for Keyword Spotting.

Haiyang Li

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Novel Framework Based on Trace Norm Minimization for Audio Event Detection.

Proceedings of the Neural Information Processing - 18th International Conference, 2011

A cochlear neuron based robust feature for speaker recognition.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

Compensation of partly reliable components for band-limited speech recognition with missing data techniques.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

A modified MAP criterion based on hidden Markov model for voice activity detection.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2011

2010

Particle-based realistic simulation of fluid-solid interaction.

Hongquan Sun

Comput. Animat. Virtual Worlds, 2010

Study on the Recognition of Objectionable Audio.

Int. J. Pattern Recognit. Artif. Intell., 2010

Compensation of signal with erasures via sparse representation into its significant subspace.

Proceedings of the 10th International Conference on Information Sciences, 2010

Model synthesis for band-limited speech recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Robust statistical voice activity detection using a likelihood ratio sign test.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Voice Activity Detection Based on Complex Exponential Atomic Decomposition and Likelihood Ratio Test.

Proceedings of the 20th International Conference on Pattern Recognition, 2010

2009

Speaker identification and verification from audio coded speech in matched and mismatched conditions.

Tao Jiang

Boyang Gao

Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2009

A Fast Audio Retrieval Method Based on Negativity Judgment.

Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009

2008

Text-independent Speaker Identification Based on MAP Channel Compensation and Pitch-dependent Features.

Rongchun Gao

Proceedings of the 2008 International Conference on Information & Knowledge Engineering, 2008

2007

Automatic conversion from lexical words to prosodic words for mandarin text-to-speech system.

Int. J. Speech Technol., 2007

2006

Automatic Music Transcription Based on Harmonic Structure Information.

Guibin Zheng

J. Comput. Res. Dev., 2006

Improved Mandarin Speech Recognition by Lattice Rescoring with Enhanced Tone Models.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

A multi-space distribution (MSD) approach to speech recognition of tonal languages.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

Modifying Spectral Envelope to Synthetically Adjust Voice Quality and Articulation Parameters for Emotional Speech Synthesis.

Proceedings of the Affective Computing and Intelligent Interaction, 2005

2002

Sharpe Ratio-Oriented Active Trading: A Learning Approach.

Yang Liu

Xiaohui Yu

Proceedings of the MICAI 2002: Advances in Artificial Intelligence, 2002

2001

Robust Speech Recognition Method Based on Discriminative Environment Feature Extraction.

Wen Gao

J. Comput. Sci. Technol., 2001

2000

An environment model-based robust speech recognition.

Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999

Robust telephone speech recognition based on channel compensation.
