Wei-Ning Hsu
Orcid: 0000-0001-5546-5217
According to our database, Wei-Ning Hsu authored at least 96 papers between 2015 and 2024.
Bibliography
2024
Audiobox TTA-RAG: Improving Zero-Shot and Few-Shot Text-To-Audio with Retrieval-Augmented Generation.
CoRR, 2024
High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching.
CoRR, 2024
Learning Fine-Grained Controllability on Speech Generation via Efficient Fine-Tuning.
CoRR, 2024
Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Attention or Convolution: Transformer Encoders in Audio Language Models for Inference Efficiency.
Proceedings of the IEEE International Conference on Acoustics, 2024
M2BART: Multilingual and Multimodal Encoder-Decoder Pre-Training for Any-to-Any Machine Translation.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Trans. Assoc. Comput. Linguistics, 2023
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Efficient Self-supervised Learning with Contextualized Target Representations for Vision, Speech and Language.
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Cocktail HuBERT: Generalized Self-Supervised Pre-Training for Mixture and Single-Source Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Regeneration.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
AV-data2vec: Self-Supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
ReVISE: Self-Supervised Speech Resynthesis with Visual Input for Universal and Generalized Speech Enhancement.
CoRR, 2022
A Single Self-Supervised Model for Many Speech Modalities Enables Zero-Shot Modality Transfer.
CoRR, 2022
CoRR, 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
u-HuBERT: Unified Mixed-Modal Speech Pretraining And Zero-Shot Transfer to Unlabeled Modality.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Enhanced Direct Speech-to-Speech Translation Using Self-supervised Pre-training and Data Augmentation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language.
Proceedings of the International Conference on Machine Learning, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
CoRR, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2021
Kaizen: Continuously Improving Teacher Using Exponential Moving Average for Semi-Supervised Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 8th International Conference on Learning Representations, 2020
2019
CoRR, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2019
Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Disentangling by Partitioning: A Representation Learning Framework for Multimodal Sensory Data.
CoRR, 2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
A Study of Enhancement, Augmentation and Autoencoder Methods for Domain Adaptation in Distant Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 24th International Conference on Pattern Recognition, 2018
Extracting Domain Invariant Features by Unsupervised Learning for Robust Automatic Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
CoRR, 2016
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Development of the MIT ASR system for the 2016 Arabic Multi-genre Broadcast Challenge.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the COLING 2016, 2016
2015
Enhancing automatically discovered multi-level acoustic patterns considering context consistency with applications in spoken term detection.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015