Aswin Shanmugam Subramanian

ORCID: 0000-0003-4446-001X

Affiliations:
  • Microsoft, Redmond, WA, USA
  • Johns Hopkins University, Baltimore, MD, USA (PhD 2022)
  • Mitsubishi Electric Research Laboratories, Cambridge, MA, USA (2021 - 2022)
  • Indian Institute of Technology Madras, Chennai, India (former)


According to our database, Aswin Shanmugam Subramanian authored at least 34 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
TS-SEP: Joint Diarization and Separation Conditioned on Estimated Speaker Embeddings.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Soft Language Identification for Language-Agnostic Many-to-One End-to-End Speech Translation.
CoRR, 2024

Late Audio-Visual Fusion for in-the-Wild Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Hyperbolic Audio Source Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Reverberation as Supervision For Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition.
Comput. Speech Lang., 2022

Towards End-to-end Speaker Diarization in the Wild.
CoRR, 2022

Heterogeneous Target Speech Separation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improved Domain Generalization via Disentangled Multi-Task Learning in Unsupervised Anomalous Sound Detection.
Proceedings of the 7th Workshop on Detection and Classification of Acoustic Scenes and Events 2022, 2022

2021
ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization.
Proceedings of the IEEE International Conference on Acoustics, 2021

An Exploration of Self-Supervised Pretrained Representations for End-to-End Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Significance of spectral cues in automatic speech segmentation for Indian language speech synthesizers.
Speech Commun., 2020

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans.
CoRR, 2020

The JHU Multi-Microphone Multi-Speaker ASR System for the CHiME-6 Challenge.
CoRR, 2020

End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End ASR with Adaptive Span Self-Attention.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Attention-Based ASR with Lightweight and Dynamic Convolutions.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Dry, Focus, and Transcribe: End-to-End Integration of Dereverberation, Beamforming, and ASR.
CoRR, 2019

Generalized Weighted-Prediction-Error Dereverberation with Varying Source Priors For Reverberant Speech Recognition.
Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

Speech Enhancement Using End-to-End Speech Recognition Objectives.
Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

2018
Student-Teacher Learning for BLSTM Mask-based Speech Enhancement.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Building State-of-the-art Distant Speech Recognition Using the CHiME-4 Challenge with a Setup of Speech Enhancement Baseline.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
TBT (Toolkit to Build TTS): A High Performance Framework to Build Multiple Language HTS Voice.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Significance of Pseudo-syllables in building better acoustic models for Indian English TTS.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Building speech synthesis systems for Indian languages.
Proceedings of the Twenty First National Conference on Communications, 2015

Blizzard Challenge 2015 : Submission by DONLab, IIT Madras.
Proceedings of the Blizzard Challenge 2015, 2015

2014
Group delay based phone segmentation for HTS.
Proceedings of the Twentieth National Conference on Communications, 2014

A hybrid approach to segmentation of speech using group delay processing and HMM based embedded reestimation.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

IIT Madras's Submission to the Blizzard Challenge 2014.
Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014

2013
A common attribute based unified HTS framework for speech synthesis in Indian languages.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

A syllable based statistical text to speech system.
Proceedings of the 21st European Signal Processing Conference, 2013

