CoRR, 2024

YOLO-Stutter: End-to-end Region-Wise Speech Dysfluency Detection.

[DOI]

CoRR, 2024

VoxGenesis: Unsupervised Discovery of Latent Speaker Manifold for Speech Synthesis.

[DOI]

CoRR, 2024

Towards Hierarchical Spoken Language Dysfluency Modeling.

[DOI]

CoRR, 2024

Stutter-Solver: End-To-End Multi-Lingual Dysfluency Detection.

[DOI]

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

SSDM: Scalable Speech Dysfluency Modeling.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Enhancing GAN-based Vocoders with Contrastive Learning Under Data-Limited Condition.

[DOI]

Haoming Guo

Seth Z. Zhao

Gerald Friedland

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Towards Hierarchical Spoken Language Disfluency Modeling.

[DOI]

Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

2023

Unsupervised TTS Acoustic Modeling for TTS With Conditional Disentangled Sequential VAE.

[DOI]

IEEE/ACM Trans. Audio Speech Lang. Process., 2023

Deep Speech Synthesis from MRI-Based Articulatory Representations.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Articulatory Representation Learning via Joint Factor Analysis and Neural Matrix Factorization.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Unconstrained Dysfluency Modeling for Dysfluent Speech Transcription and Detection.

[DOI]

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

AV-data2vec: Self-Supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations.

[DOI]

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

UTTS: Unsupervised TTS with Conditional Disentangled Sequential Variational Auto-encoder.

[DOI]

CoRR, 2022

Towards Improved Zero-shot Voice Conversion with Conditional DSVAE.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition.

[DOI]

Alan W. Black

Louis Goldstein

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Robust Disentangled Variational Speech Representation Learning for Zero-Shot Voice Conversion.

[DOI]