Anurag Kumar

Affiliations:
  • Facebook Research, Facebook Reality Labs, Redmond, WA USA
  • Indian Institute of Technology, Kanpur, India (former)


According to our database, Anurag Kumar authored at least 74 papers between 2012 and 2024.

Bibliography

2024
Neural-Network-Based Direction-of-Arrival Estimation for Reverberant Speech - The Importance of Energetic, Temporal, and Spatial Information.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Improved direction of arrival estimations with a wearable microphone array for dynamic environments by reliability weighting.
CoRR, 2024

High Fidelity Text-Guided Music Generation and Editing via Single-Stage Flow Matching.
CoRR, 2024

AV-CrossNet: an Audiovisual Complex Spectral Mapping Network for Speech Separation By Leveraging Narrow- and Cross-Band Modeling.
CoRR, 2024

URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement.
CoRR, 2024

Cross-Talk Reduction.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Ambisonics Networks - The Effect of Radial Functions Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2024

A Closer Look at Wav2vec2 Embeddings for On-Device Single-Channel Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2024

Audiovisual Speaker Separation with Full- and Sub-Band Modeling in the Time-Frequency Domain.
Proceedings of the IEEE International Conference on Acoustics, 2024

Spherical World-Locking for Audio-Visual Localization in Egocentric Videos.
Proceedings of the Computer Vision - ECCV 2024, 2024

Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch.
CoRR, 2023

Rethinking Complex-Valued Deep Neural Networks for Monaural Speech Enhancement.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Spatialization Quality Metric for Binaural Speech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Time-domain Transformer-based Audiovisual Speaker Separation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

PAAPLoss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

LA-VocE: Low-SNR Audio-Visual Speech Enhancement Using Neural Vocoders.
Proceedings of the IEEE International Conference on Acoustics, 2023

NORD: Non-Matching Reference Based Relative Depth Estimation from Binaural Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

TorchAudio-Squim: Reference-Less Speech Quality and Intelligibility Measures in TorchAudio.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-Channel Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023

TorchAudio 2.1: Advancing Speech Recognition, Self-Supervised Learning, and Audio Processing Components for PyTorch.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
RemixIT: Continual Self-Training of Speech Enhancement Models via Bootstrapped Remixing.
IEEE J. Sel. Top. Signal Process., 2022

Direction Of Arrival Estimation For Reverberant Speech Based On Neural Networks And The Direct-Path Dominance Test.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Improving Speech Enhancement through Fine-Grained Speech Characteristics.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

SAQAM: Spatial Audio Quality Assessment Metric.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Speech Quality Assessment through MOS using Non-Matching References.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Time-domain Ad-hoc Array Speech Enhancement Using a Triple-path Network.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Continual Self-Training With Bootstrapped Remixing For Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

Conformer-Based Self-Supervised Learning For Non-Speech Audio Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2022

Multichannel Speech Enhancement Without Beamforming.
Proceedings of the IEEE International Conference on Acoustics, 2022

TPARN: Triple-Path Attentive Recurrent Network for Time-Domain Multichannel Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

The Impact of Removing Head Movements on Audio-Visual Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

Audio Signal Processing for Telepresence Based on Wearable Array in Noisy and Dynamic Scenes.
Proceedings of the IEEE International Conference on Acoustics, 2022


2021
SAGRNN: Self-Attentive Gated RNN For Binaural Speaker Separation With Interaural Cue Preservation.
IEEE Signal Process. Lett., 2021

NICE-Beam: Neural Integrated Covariance Estimators for Time-Varying Beamformers.
CoRR, 2021

TADRN: Triple-Attentive Dual-Recurrent Network for Ad-hoc Array Multichannel Speech Enhancement.
CoRR, 2021

Ego4D: Around the World in 3,000 Hours of Egocentric Video.
CoRR, 2021

Online Self-Attentive Gated RNNs for Real-Time Speaker Separation.
CoRR, 2021

DPLM: A Deep Perceptual Spatial-Audio Localization Metric.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

NORESQA: A Framework for Speech Quality Assessment using Non-Matching References.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Do Sound Event Representations Generalize to Other Audio Tasks? A Case Study in Audio Transfer Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multi-Channel Speech Enhancement Using Graph Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

Incorporating Real-World Noisy Speech in Neural-Network-Based Speech Enhancement Systems.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition.
Proceedings of the 37th International Conference on Machine Learning, 2020

SeCoST: Sequential Co-Supervision for Large Scale Weakly Labeled Audio Event Detection.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
SeCoST: Sequential Co-Supervision for Weakly Labeled Audio Event Detection.
CoRR, 2019

Learning Sound Events from Webly Labeled Data.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018
A Closer Look at Weak Label Learning for Audio Events.
CoRR, 2018

NELS - Never-Ending Learner of Sounds.
CoRR, 2018

Classifier Risk Estimation Under Limited Labeling Resources.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2018

Content-Based Representations of Audio Using Siamese Neural Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Framework for Evaluation of Sound Event Detection in Web Videos.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Deep CNN Framework for Audio Event Recognition using Weakly Labeled Web Data.
CoRR, 2017

Audio Content Based Geotagging in Multimedia.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Audio event and scene recognition: A unified approach using strongly and weakly labeled data.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Discovering sound concepts and acoustic relations in text.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

An approach for self-training audio event detectors using web data.
Proceedings of the 25th European Signal Processing Conference, 2017

2016
An Approach for Self-Training Audio Event Detectors Using Web Data.
CoRR, 2016

Features and Kernels for Audio Event Recognition.
CoRR, 2016

Audio Event Detection using Weakly Labeled Data.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Speech Enhancement in Multiple-Noise Conditions Using Deep Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Weakly supervised scalable audio content analysis.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

2015
Unsupervised Fusion Weight Learning in Multiple Classifier Systems.
CoRR, 2015


A novel ranking method for multiple classifier systems.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Monaural speaker segregation using group delay spectral matrix factorization.
Proceedings of the Twentieth National Conference on Communications, 2014

Detecting sound objects in audio recordings.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Event detection in short duration audio using Gaussian Mixture Model and Random Forest Classifier.
Proceedings of the 21st European Signal Processing Conference, 2013

2012
Audio event detection from acoustic unit occurrence patterns.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

