Shansong Liu

Orcid: 0000-0001-6202-5615

According to our database¹, Shansong Liu authored at least 33 papers between 2017 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Timeline

2017

2018

2019

2020

2021

2022

2023

2024

Legend:

Book

In proceedings

Article

PhD thesis

Dataset

Other

Links

On csauthors.net:

Bibliography

2024

MuMu-LLaMA: Multi-modal Music Understanding and Generation via Large Language Models.

[BibT_eX]

[DOI]

CoRR, 2024

Editing Music with Melody and Text: Using ControlNet for Diffusion Transformer.

[BibT_eX]

[DOI]

CoRR, 2024

Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow Information.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

Humtrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

M<sup>2</sup>UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models.

[BibT_eX]

[DOI]

CoRR, 2023

Prosody Modeling with 3D Visual Information for Expressive Video Dubbing.

[BibT_eX]

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.

[BibT_eX]

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2022

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion.

[BibT_eX]

[DOI]

Xu Li

Shansong Liu

Ying Shan

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.

[BibT_eX]

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Recent Progress in the CUHK Dysarthric Speech Recognition System.

[BibT_eX]

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition.

[BibT_eX]

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Adversarial Data Augmentation for Disordered Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition.

[BibT_eX]

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition.

[BibT_eX]

[DOI]

Jiajun Deng

Fabian Ritter Gutierrez

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Development of the Cuhk Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the Dementiabank Corpus.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

Bayesian Transformer Language Models for Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Neural Architecture Search for Speech Recognition.

[BibT_eX]

[DOI]

CoRR, 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigation of Data Augmentation Techniques for Disordered Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On the Use of Pitch Features for Disordered Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The CUHK Dysarthric Speech Recognition Systems for English and Cantonese.

[BibT_eX]

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Comprehensive simulation of metagenomic sequencing data with non-uniform sampling distribution.

[BibT_eX]

[DOI]

Quant. Biol., 2018

Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus.

[BibT_eX]

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Gaussian Process Neural Networks for Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Reading the Underlying Information From Massive Metagenomic Sequencing Data.

[BibT_eX]

[DOI]

Proc. IEEE, 2017

Shansong Liu

Timeline

Legend:

Links

On csauthors.net:

Bibliography

Loading...