Nithin Rao Koluguri

According to our database, Nithin Rao Koluguri authored at least 21 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
META-CAT: Speaker-Informed Speech Embeddings via Meta Information Concatenation for Multi-talker ASR.
CoRR, 2024

Sortformer: Seamless Integration of Speaker Diarization and ASR by Bridging Timestamps and Tokens.
CoRR, 2024

Longer is (Not Necessarily) Stronger: Punctuated Long-Sequence Training for Enhanced Speech Recognition and Translation.
CoRR, 2024

NEST: Self-supervised Fast Conformer as All-purpose Seasoning to Speech Processing Tasks.
CoRR, 2024

Codec-ASR: Training Performant Automatic Speech Recognition Systems with Discrete Speech Representations.
CoRR, 2024

BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5.
CoRR, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data.
CoRR, 2024

Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Investigating End-to-End ASR Architectures for Long Form Audio Transcription.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
The CHiME-7 Challenge: System Description and Performance of NeMo Team's DASR System.
CoRR, 2023

Property-Aware Multi-Speaker Data Simulation: A Probabilistic Modelling Technique for Synthetic Data Generation.
CoRR, 2023

A Compact End-to-End Model with Local and Global Context for Spoken Language Identification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Fast Conformer With Linearly Scalable Attention For Efficient Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
AmberNet: A Compact End-to-End Model for Spoken Language Identification.
CoRR, 2022

NeMo Open Source Speaker Diarization System.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multi-scale Speaker Diarization with Dynamic Scale Weighting.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

TitaNet: Neural Model for Speaker Representation with 1D Depth-Wise Separable Convolutions and Global Context.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2020
Meta-Learning for Robust Child-Adult Classification from Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019
Comparison of Speech Tasks and Recording Devices for Voice Based Automatic Classification of Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2017
Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

