Shansong Liu

Orcid: 0000-0001-6202-5615

According to our database, Shansong Liu authored at least 32 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer.
CoRR, 2024

Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow Information.
Proceedings of the IEEE International Conference on Acoustics, 2024

HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond.
Proceedings of the IEEE International Conference on Acoustics, 2024

Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
M²UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models.
CoRR, 2023

Prosody Modeling with 3D Visual Information for Expressive Video Dubbing.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Recent Progress in the CUHK Dysarthric Speech Recognition System.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Adversarial Data Augmentation for Disordered Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Development of the CUHK Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the DementiaBank Corpus.
Proceedings of the IEEE International Conference on Acoustics, 2021

Bayesian Transformer Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Neural Architecture Search for Speech Recognition.
CoRR, 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigation of Data Augmentation Techniques for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On the Use of Pitch Features for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The CUHK Dysarthric Speech Recognition Systems for English and Cantonese.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Comprehensive simulation of metagenomic sequencing data with non-uniform sampling distribution.
Quant. Biol., 2018

Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Gaussian Process Neural Networks for Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Reading the Underlying Information From Massive Metagenomic Sequencing Data.
Proc. IEEE, 2017
