Nithin Rao Koluguri
According to our database, Nithin Rao Koluguri authored at least 21 papers between 2017 and 2024.
Bibliography
2024
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.
CoRR, 2024
Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.
CoRR, 2024
Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation.
CoRR, 2024
NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.
CoRR, 2024
Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.
CoRR, 2024
BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5.
CoRR, 2024
CoRR, 2024
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.
CoRR, 2023
Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023
A Compact End-to-End Model with Local and Global Context for Spoken Language Identification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
TitaNet: Neural Model for Speaker Representation with 1D Depth-Wise Separable Convolutions and Global Context.
Proceedings of the IEEE International Conference on Acoustics, 2022
2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
2017
Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2017