Sriram Ganapathy

Orcid: 0000-0002-5779-9066

According to our database, Sriram Ganapathy authored at least 168 papers between 2003 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Speech Dereverberation With Frequency Domain Autoregressive Modeling.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Representation Learning With Hidden Unit Clustering for Low Resource Speech Applications.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Summary of the DISPLACE challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments.
Speech Commun., 2024

Gradient-free Post-hoc Explainability Using Distillation Aided Learnable Approach.
CoRR, 2024

STAB: Speech Tokenizer Assessment Benchmark.
CoRR, 2024

Improving Self-supervised Pre-training using Accent-Specific Codebooks.
CoRR, 2024

Towards the Next Frontier in Speech Representation Learning Using Disentanglement.
CoRR, 2024

The Second DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments.
CoRR, 2024

Overlap-aware End-to-End Supervised Hierarchical Graph Clustering for Speaker Diarization.
CoRR, 2024

LLM Augmented LLMs: Expanding Capabilities through Composition.
CoRR, 2024

LLM Augmented LLMs: Expanding Capabilities through Composition.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Zero Shot Audio To Audio Emotion Transfer With Speaker Disentanglement.
Proceedings of the IEEE International Conference on Acoustics, 2024

Multimodal Modeling for Spoken Language Identification.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Speech enhancement with frequency domain auto-regressive modeling.
CoRR, 2023

Multimodal Modeling For Spoken Language Identification.
CoRR, 2023

MASR: Metadata Aware Speech Representation.
CoRR, 2023

Coswara: A respiratory sounds and symptoms dataset for remote screening of SARS-CoV-2 infection.
CoRR, 2023

HCAM - Hierarchical Cross Attention Model for Multi-modal Emotion Recognition.
CoRR, 2023

DISPLACE Challenge: DIarization of SPeaker and LAnguage in Conversational Environments.
CoRR, 2023

Label Aware Speech Representation Learning For Language Identification.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing the EEG Speech Match Mismatch Tasks With Word Boundaries.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

The DISPLACE Challenge 2023 - DIarization of SPeaker and LAnguage in Conversational Environments.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Supervised Hierarchical Clustering Using Graph Neural Networks for Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Influence Guided Data Reweighting for Language Model Pre-training.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Accented Speech Recognition With Accent-specific Codebooks.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

MASR: Multi-Label Aware Speech Representation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Pseudo-Label Based Supervised Contrastive Loss for Robust Speech Representations.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Towards sound based testing of COVID-19 - Summary of the first Diagnostics of COVID-19 using Acoustics (DiCOVA) Challenge.
Comput. Speech Lang., 2022

PLDA inspired Siamese networks for speaker verification.
Comput. Speech Lang., 2022

Dereverberation of autoregressive envelopes for far-field speech recognition.
Comput. Speech Lang., 2022

Interpretable Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection.
CoRR, 2022

Svadhyaya system for the Second Diagnosing COVID-19 using Acoustics Challenge 2021.
CoRR, 2022

Transformer Networks for Non-Intrusive Speech Quality Prediction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Acoustic Representation Learning on Breathing and Speech Signals for COVID-19 Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Speaker conditioned acoustic modeling for multi-speaker conversational ASR.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Analyzing the impact of SARS-CoV-2 variants on respiratory sound signals.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Coswara: A website application enabling COVID-19 screening by analysing respiratory sound samples and health symptoms.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Semi-supervised Acoustic and Language Modeling for Hindi ASR.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Leveraging Symmetrical Convolutional Transformer Networks for Speech to Singing Voice Style Transfer.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

The Second Dicova Challenge: Dataset and Performance Analysis for Diagnosis of Covid-19 Using Acoustics.
Proceedings of the IEEE International Conference on Acoustics, 2022

End-To-End Speech Recognition with Joint Dereverberation of Sub-Band Autoregressive Envelopes.
Proceedings of the IEEE International Conference on Acoustics, 2022

Self Supervised Representation Learning with Deep Clustering for Acoustic Unit Discovery from Raw Speech.
Proceedings of the IEEE International Conference on Acoustics, 2022

Multimodal Transformer with Learnable Frontend and Self Attention for Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Self-Supervised Representation Learning With Path Integral Clustering for Speaker Diarization.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

The Second DiCOVA Challenge: Dataset and performance analysis for COVID-19 diagnosis using acoustics.
CoRR, 2021

Multi-modal Point-of-Care Diagnostics for COVID-19 Based On Acoustics and Symptoms.
CoRR, 2021

Deep Correlation Analysis for Audio-EEG Decoding.
CoRR, 2021

A Multi-Head Relevance Weighting Framework for Learning Raw Waveform Audio Representations.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

LEAP Submission for the Third DIHARD Diarization Challenge.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

The Third DIHARD Diarization Challenge.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

SRIB-LEAP Submission to Far-Field Multi-Channel Speech Enhancement Challenge for Video Conferencing.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

DiCOVA Challenge: Dataset, Task, and Baseline System for COVID-19 Diagnosis Using Acoustics.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Uncovering the Acoustic Cues of COVID-19 Infection.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Investigating Feature Selection and Explainability for COVID-19 Diagnostics from Cough Sounds.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Deep Multiway Canonical Correlation Analysis For Multi-Subject EEG Normalization.
Proceedings of the IEEE International Conference on Acoustics, 2021

NISP: A Multi-lingual Multi-accent Dataset for Speaker Profiling.
Proceedings of the IEEE International Conference on Acoustics, 2021

End-to-End Lyrics Recognition with Voice to Singing Style Transfer.
Proceedings of the IEEE International Conference on Acoustics, 2021

Representation Learning for Speech Recognition Using Feedback Based Relevance Weighting.
Proceedings of the IEEE International Conference on Acoustics, 2021

Self-Supervised Metric Learning With Graph Clustering For Speaker Diarization.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Towards Relevance and Sequence Modeling in Language Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Interpretable Representation Learning for Speech and Audio Signals Based on Relevance Weighting.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Automatic speaker profiling from short duration speech data.
Speech Commun., 2020

Supervised I-vector modeling for language and accent recognition.
Comput. Speech Lang., 2020

Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition.
CoRR, 2020

Third DIHARD Challenge Evaluation Plan.
CoRR, 2020

LEAP System for SRE19 Challenge - Improvements and Error Analysis.
CoRR, 2020

Pairwise Discriminative Neural PLDA for Speaker Verification.
CoRR, 2020

LEAP System for SRE 2019 CTS Challenge - Improvements and Error Analysis.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

NPLDA: A Deep Neural PLDA Model for Speaker Verification.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

IITG-Indigo Submissions for NIST 2018 Speaker Recognition Evaluation and Post-Challenge Improvements.
Proceedings of the 2020 National Conference on Communications, 2020

Deep Self-Supervised Hierarchical Clustering for Speaker Diarization.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Coswara - A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Neural PLDA Modeling for End-to-End Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep Learning Based Dereverberation of Temporal Envelopes for Robust Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audiovisual Correspondence Learning in Humans and Machines.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Context Dependent RNNLM for Automatic Transcription of Conversations.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Robust Raw Waveform Speech Recognition Using Relevance Weighted Representations.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Improving Voice Separation by Incorporating End-To-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

On The Impact of Language Familiarity in Talker Change Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

3-D Acoustic Modeling for Far-Field Multi-Channel Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Unsupervised Neural Mask Estimator for Generalized Eigen-Value Beamforming Based ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Deep Canonical Correlation Analysis For Decoding The Auditory Brain.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

2019
Modulation Filter Learning Using Deep Variational Networks for Robust Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2019

3-D Feature and Acoustic Modeling for Far-Field Speech Recognition.
CoRR, 2019

LEAP Diarization System for the Second DIHARD Challenge.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The Second DIHARD Diarization Challenge: Dataset, Task, and Baselines.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Attention Based Hybrid i-Vector BLSTM Model for Language Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Active Learning Methods for Low Resource End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Study of x-Vector Based Speaker Recognition on Short Utterances.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unsupervised Raw Waveform Representation Learning for ASR.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Analyzing Human Reaction Time for Talker Change Detection.
Proceedings of the IEEE International Conference on Acoustics, 2019

The Leap Speaker Recognition System for NIST SRE 2018 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-end Language Recognition Using Attention Based Hierarchical Gated Recurrent Unit Models.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Deep Neural Network Based End to End Model for Joint Height and Age Estimation from Short Duration Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

Deep Variational Filter Learning Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Level-wise Subject adaptation to improve classification of motor and mental EEG tasks.
Proceedings of the 41st Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2019

Second Language Transfer Learning in Humans and Machines Using Image Supervision.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Speaker and Language Aware Training for End-to-End ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
The LEAP Language Recognition System for LRE 2017 Challenge - Improvements and Error Analysis.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Supervised I-vector Modeling - Theory and Applications.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

On Convolutional LSTM Modeling for Joint Wake-Word Detection and Text Dependent Speaker Verification.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Far-Field Speech Recognition Using Multivariate Autoregressive Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speaker and Language Recognition - From Laboratory Technologies to the Wild.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Talker Diarization in the Wild: the Case of Child-centered Daylong Audio-recordings.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Comparison of Unsupervised Modulation Filter Learning Methods for ASR.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Leveraging LSTM Models for Overlap Detection in Multi-Party Meetings.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Enhancement and Analysis of Conversational Speech: JSALT 2017.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

3-D CNN Models for Far-Field Multi-Channel Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Multivariate Autoregressive Spectrogram Modeling for Noisy Speech Recognition.
IEEE Signal Process. Lett., 2017

Increasing the robustness of CNN acoustic models using autoregressive moving average spectrogram features and channel dropout.
Pattern Recognit. Lett., 2017

IITG-Indigo System for NIST 2016 SRE Challenge.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speech Representation Learning Using Unsupervised Data-Driven Modulation Filtering for Robust ASR.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Factor analysis methods for joint speaker verification and spoof detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Leveraging native language speech for accent identification using deep Siamese networks.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Unsupervised HMM posteriograms for language independent acoustic modeling in zero resource conditions.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Deep learning methods for unsupervised acoustic modeling - Leap submission to ZeroSpeech challenge 2017.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
The IBM 2016 Speaker Recognition System.
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

The IBM Speaker Recognition System: Recent Advances and Error Analysis.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

An Investigation on the Use of i-Vectors for Robust ASR.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speaker age estimation on conversational telephone speech using senone posterior based i-vectors.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Investigating factor analysis features for deep neural networks in noisy speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Nearest neighbor discriminant analysis for language recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Robust speech processing using ARMA spectrogram models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Robust Feature Extraction Using Modulation Filtering of Autoregressive Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Robust language identification using convolutional neural network features.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions.
Proceedings of the IEEE International Conference on Acoustics, 2014

Shift-invariant features for speech activity detection in adverse radio-frequency channel conditions.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Enhancing Frequency Shifted Speech Signals in Single Side-Band Communication.
IEEE Signal Process. Lett., 2013

The IBM speech activity detection system for the DARPA RATS program.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Robust speaker recognition using spectro-temporal autoregressive models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

TRAP language identification system for RATS phase II evaluation.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Unsupervised channel adaptation for language identification using co-training.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Noisy channel adaptation in language identification.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Adaptation transforms of auto-associative neural networks as features for speaker verification.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Feature extraction using 2-d autoregressive models for speaker recognition.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Data-driven Posterior Features for Low Resource Speech Recognition Applications.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Analysis of Temporal Resolution in Frequency Domain Linear Prediction.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Multilingual MLP features for low-resource LVCSR systems.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012


Comparison of Different Approaches for Speech Recognition in Hands-free Mode.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011
Multi-layer perceptron based speech activity detection for speaker verification.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Modulation Spectrum Analysis for Recognition of Reverberant Speech.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Feature normalization for speaker verification in room reverberation.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Autoregressive Models of Amplitude Modulations in Audio Compression.
IEEE Trans. Speech Audio Process., 2010

Wide-Band Audio Coding Based on Frequency-Domain Linear Prediction.
EURASIP J. Audio Speech Music. Process., 2010

A phoneme recognition framework based on auditory spectro-temporal receptive fields.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Cross-lingual and multi-stream posterior features for low resource LVCSR systems.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Sparse auto-associative neural networks: theory and application to speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Comparison of modulation features for phoneme recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Robust spectro-temporal features based on autoregressive models of Hilbert envelopes.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Applications of signal analysis using autoregressive models for amplitude modulation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Error Resilient Speech Coding Using Sub-band Hilbert Envelopes.
Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Tandem representations of spectral envelope and modulation frequency features for ASR.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Arithmetic coding of sub-band residuals in FDLP speech/audio codec.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Static and dynamic modulation spectrum for speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Phoneme recognition using spectral envelope and modulation frequency features.
Proceedings of the IEEE International Conference on Acoustics, 2009

Temporal envelope subtraction for robust speech recognition using modulation spectrum.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Recognition of Reverberant Speech Using Frequency Domain Linear Prediction.
IEEE Signal Process. Lett., 2008

Perceptually Motivated Sub-band Decomposition for FDLP Audio Coding.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Hilbert Envelope Based Features for Far-Field Speech Recognition.
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Front-end for far-field speech recognition based on frequency domain linear prediction.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Temporal masking for bit-rate reduction in audio codec based on Frequency Domain Linear Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2008

Spectro-temporal features for Automatic Speech Recognition using Linear Prediction in spectral domain.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Non-uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

2003
Agreement strategies for cooperative control of uninhabited autonomous vehicles.
Proceedings of the American Control Conference, 2003

