Yan Song
ORCID: 0000-0002-5668-9068
Affiliations:
- University of Science and Technology of China, National Engineering Laboratory for Speech and Language Information Processing, Hefei, China
According to our database, Yan Song authored at least 115 papers between 2005 and 2024.
Bibliography
2024
MAT-SED: A Masked Audio Transformer with Masked-Reconstruction Based Pre-training for Sound Event Detection.
CoRR, 2024
Meta Representation Learning Method for Robust Speaker Verification in Unseen Domains.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Fine-tuning Audio Spectrogram Transformer with Task-aware Adapters for Sound Event Detection.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
An Effective Anomalous Sound Detection Method Based on Representation Learning with Simulated Anomalies.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
Convolutional Recurrent Neural Network and Multitask Learning for Manipulation Region Location.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
2022
Cross-Lingual Self-training to Learn Multilingual Representation for Low-Resource Speech Recognition.
Circuits Syst. Signal Process., 2022
Class-Aware Distribution Alignment based Unsupervised Domain Adaptation for Speaker Verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Self-Supervised Representation Learning for Unsupervised Anomalous Sound Detection Under Domain Shift.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Multi-Granularity Sequence Alignment Mapping for Encoder-Decoder Based End-to-End ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Circuits Syst. Signal Process., 2021
XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition.
CoRR, 2021
An Effective Mutual Mean Teaching Based Domain Adaptation Method for Sound Event Detection.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
A Weight Moving Average Based Alternate Decoupled Learning Algorithm for Long-Tailed Language Identification.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
An Improved Mean Teacher Based Method for Large Scale Weakly Labeled Semi-Supervised Sound Event Detection.
Proceedings of the IEEE International Conference on Acoustics, 2021
An Effective Deep Embedding Learning Method Based on Dense-Residual Networks for Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the CAA Symposium on Fault Detection, 2021
2020
Segment boundary detection directed attention for online end-to-end speech recognition.
EURASIP J. Audio Speech Music. Process., 2020
Circuits Syst. Signal Process., 2020
Effective Exploitation of Posterior Information for Attention-Based Speech Recognition.
IEEE Access, 2020
An Effective Perturbation Based Semi-Supervised Learning Method for Sound Event Detection.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Semi-Supervised End-to-End ASR via Teacher-Student Learning with Conditional Posterior Distribution.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
An Effective Speaker Recognition Method Based on Joint Identification and Verification Supervisions.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Task-Aware Mean Teacher Method for Large Scale Weakly Labeled Semi-Supervised Sound Event Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
An Online Speaker-aware Speech Separation Approach Based on Time-domain Representation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Listening and Grouping: An Online Autoregressive Approach for Monaural Speech Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Improving Aggregation and Loss Function for Better Embedding Learning in End-to-End Speaker Verification System.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
A Region Based Attention Method for Weakly Supervised Sound Event Detection and Classification.
Proceedings of the IEEE International Conference on Acoustics, 2019
Topic Detection in Conversational Telephone Speech Using CNN with Multi-stream Inputs.
Proceedings of the IEEE International Conference on Acoustics, 2019
Knowledge Distillation from Multilingual and Monolingual Teachers for End-to-End Multilingual Speech Recognition.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Speaker to Emotion: Domain Adaptation for Speech Emotion Recognition with Residual Adapters.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Circuits Syst. Signal Process., 2018
Improved Supervised Locality Preserving Projection for I-vector Based Speaker Verification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Acoustic Modeling with Densely Connected Residual Network for Multichannel Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
An Attention Pooling Based Representation Learning Method for Speech Emotion Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
End-to-End Language Identification Using High-Order Utterance Representation with Bilinear Pooling.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 IEEE International Conference on Image Processing, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Topic classification based on distributed document representation and latent topic information.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
A new variance-based approach for discriminative feature extraction in machine hearing classification using spectrogram features.
Digit. Signal Process., 2016
Circuits Syst. Signal Process., 2016
Proceedings of the 2016 Visual Communications and Image Processing, 2016
Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016
LID-senone Extraction via Deep Neural Networks for End-to-End Language Identification.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Compact convolutional neural network transfer learning for small-scale image classification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
IEEE ACM Trans. Audio Speech Lang. Process., 2015
Reconstruction of Phonated Speech from Whispers Using Formant-Derived Plausible Pitch Modulation.
ACM Trans. Access. Comput., 2015
Circuits Syst. Signal Process., 2015
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Low frequency ultrasonic voice activity detection using convolutional neural networks.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Performance evaluation of deep bottleneck features for spoken language identification.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the International Conference on Audio, 2014
2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Joint spectral distribution modeling using restricted boltzmann machines for voice conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
2011
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011
Proceedings of the First Asian Conference on Pattern Recognition, 2011
2010
The description of iFlyTek Speech Lab system for NIST2009 Language Recognition Evaluation.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010
2009
IEEE Trans. Circuits Syst. Video Technol., 2009
Comput. Vis. Image Underst., 2009
Proceedings of the IEEE International Conference on Systems, 2009
Proceedings of the IEEE International Conference on Systems, 2009
Proceedings of the 32nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2009
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009
2008
IEEE Trans. Multim., 2008
EURASIP J. Adv. Signal Process., 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
2007
Int. J. Semantic Comput., 2007
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2007
Proceedings of the Advances in Multimedia Modeling, 2007
Proceedings of the 15th International Conference on Multimedia 2007, 2007
Proceedings of the 15th International Conference on Multimedia 2007, 2007
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
2006
Automatic video annotation by semi-supervised learning with kernel density estimation.
Proceedings of the 14th ACM International Conference on Multimedia, 2006
Proceedings of the 14th ACM International Conference on Multimedia, 2006
Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2006
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006
Proceedings of the 6th IEEE International Conference on Data Mining (ICDM 2006), 2006
An Automatic Video Semantic Annotation Scheme Based on Combination of Complementary Predictors.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2006
2005
Semi-automatic video annotation based on active learning with multiple complementary predictors.
Proceedings of the 7th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2005