2024
M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses.
CoRR, 2024
Faster Speech-LLaMA Inference with Multi-token Prediction.
CoRR, 2024
Listening to Multi-talker Conversations: Modular and End-to-end Perspectives.
CoRR, 2024
On Speaker Attribution with SURT.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024
Enhancing Neural Transducer for Multilingual ASR with Synchronized Language Diarization.
Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024
Training Early-Exit Architectures for Automatic Speech Recognition: Fine-Tuning Pre-Trained Models or Training from Scratch.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
Updated Corpora and Benchmarks for Long-Form Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024
ConEC: Earnings Call Dataset with Real-world Contexts for Benchmarking Contextual Speech Recognition.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024
2023
SURT 2.0: Advances in Transducer-Based Multi-Talker Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Training dynamic models using early exits for automatic speech recognition on resource-constrained devices.
CoRR, 2023
The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios.
CoRR, 2023
GPU-accelerated Guided Source Separation for Meeting Transcription.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Anchored Speech Recognition with Neural Transducers.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023
Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023
Learning From Flawed Data: Weakly Supervised Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
Joint speaker diarization and speech recognition based on region proposal networks.
Comput. Speech Lang., 2022
A machine learning-based approach to determine infection status in recipients of BBV152 (Covaxin) whole-virion inactivated SARS-CoV-2 vaccine for serological surveys.
Comput. Biol. Medicine, 2022
Low-Latency Speech Separation Guided Diarization for Telephone Conversations.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Injecting Text and Cross-Lingual Supervision in Few-Shot Learning from Self-Supervised Models.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022
Continuous Streaming Multi-Talker ASR with Dual-Path Transducers.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022
2021
The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap.
CoRR, 2021
Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Multi-Class Spectral Clustering with Overlaps for Speaker Diarization.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Auxiliary Loss Function for Target Speech Extraction and Recognition with Weak Supervision Based on Speaker Characteristics.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Training Hybrid Models on Noisy Transliterated Transcripts for Code-Switched Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Reformulating DOVER-Lap Label Mapping as a Graph Partitioning Problem.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Target-Speaker Voice Activity Detection with Improved i-Vector Estimation for Unknown Number of Speaker.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
2020
Frustratingly Easy Noise-aware Training of Acoustic Models.
CoRR, 2020
The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge.
CoRR, 2020
2019
Using ASR Methods for OCR.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019
Probing the Information Encoded in X-Vectors.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
Analysis of Data Generated From Multidimensional Type-1 and Type-2 Fuzzy Membership Functions.
IEEE Trans. Fuzzy Syst., 2018
Uncertain fuzzy self-organization based clustering: interval type-2 fuzzy approach to adaptive resonance theory.
Inf. Sci., 2018
2017
Principal component analysis approach in selecting type-1 and type-2 fuzzy membership functions for high-dimensional data.
Proceedings of the Joint 17th World Congress of International Fuzzy Systems Association and 9th International Conference on Soft Computing and Intelligent Systems, 2017
Learning local and global contexts using a convolutional recurrent network model for relation classification in biomedical text.
Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), 2017
2016
Visual analysis and representations of type-2 fuzzy membership functions.
Proceedings of the 2016 IEEE International Conference on Fuzzy Systems, 2016