2025
Supervised contrastive learning from weakly-labeled audio segments for musical version matching.
CoRR, February, 2025
2024
Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility.
CoRR, 2024
Sequential Contrastive Audio-Visual Learning.
CoRR, 2024
GASS: Generalizing Audio Source Separation with Large-Scale Data.
Proceedings of the IEEE International Conference on Acoustics, 2024
Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity.
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
Mono-to-Stereo Through Parametric Stereo Generation.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023
Adversarial Permutation Invariant Training for Universal Sound Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Full-Band General Audio Synthesis with Score-Based Diffusion.
Proceedings of the IEEE International Conference on Acoustics, 2023
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Upsampling Layers for Music Source Separation.
Proceedings of the 31st European Signal Processing Conference, 2023
2022
Universal Speech Enhancement with Score-based Diffusion.
CoRR, 2022
Assessing Algorithmic Biases for Musical Version Identification.
Proceedings of the WSDM '22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21, 2022
On Loss Functions and Evaluation Metrics for Music Source Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Audio-Based Musical Version Identification: Elements and challenges.
IEEE Signal Process. Mag., 2021
Heaps' law and vocabulary richness in the history of classical music harmony.
EPJ Data Sci., 2021
On tuning consistent annealed sampling for denoising score matching.
CoRR, 2021
Adversarial Auto-Encoding for Packet Loss Concealment.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021
Investigating the Efficacy of Music Version Retrieval Systems for Setlist Identification.
Proceedings of the IEEE International Conference on Acoustics, 2021
Automatic Multitrack Mixing With A Differentiable Mixing Console Of Neural Audio Effects.
Proceedings of the IEEE International Conference on Acoustics, 2021
SESQA: Semi-Supervised Learning for Speech Quality Assessment.
Proceedings of the IEEE International Conference on Acoustics, 2021
Upsampling Artifacts in Neural Audio Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Experience: advanced network operations in (Un)-connected remote communities.
Proceedings of the MobiCom '20: The 26th Annual International Conference on Mobile Computing and Networking, 2020
Less is more: Faster and better music version identification with embedding distillation.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020
Combining musical features for cover detection.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020
Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models.
Proceedings of the 8th International Conference on Learning Representations, 2020
Accurate and Scalable Version Identification Using Musically-Motivated Embeddings.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Time-domain speech enhancement using generative adversarial networks.
Speech Commun., 2019
Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019
Towards Generalized Speech Enhancement with Generative Adversarial Networks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Training Neural Audio Classifiers with Few Data.
Proceedings of the IEEE International Conference on Acoustics, 2019
From Correlation to Imagination: Deep Generative Models for Artificial Intelligence.
Proceedings of the Artificial Intelligence Research and Development, 2019
2018
MobInsight: A Framework Using Semantic Neighborhood Features for Localized Interpretations of Urban Mobility.
ACM Trans. Interact. Intell. Syst., 2018
Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour.
CoRR, 2018
Overcoming Catastrophic Forgetting with Hard Attention to the Task.
Proceedings of the 35th International Conference on Machine Learning, 2018
Language and Noise Transfer in Speech Enhancement Generative Adversarial Network.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks.
Proceedings of the Fourth International Conference, 2018
Self-Attention Linguistic-Acoustic Decoder.
Proceedings of the Fourth International Conference, 2018
Towards a Universal Neural Network Encoder for Time Series.
Proceedings of the Artificial Intelligence Research and Development, 2018
There goes Wally: Anonymously sharing your location gives you away.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018
2017
Beyond Interruptibility: Predicting Opportune Moments to Engage Mobile Phone Users.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2017
Continual Prediction of Notification Attendance with Classical and Deep Network Approaches.
CoRR, 2017
Getting Deep Recommenders Fit: Bloom Embeddings for Sparse Binary Input/Output Networks.
Proceedings of the Eleventh ACM Conference on Recommender Systems, 2017
Practical Processing of Mobile Sensor Data for Continual Deep Learning Predictions.
Proceedings of the 1st International Workshop on Embedded and Mobile Deep Learning (Deep Learning for Mobile Systems and Applications), 2017
SEGAN: Speech Enhancement Generative Adversarial Network.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Compact Embedding of Binary-coded Inputs and Outputs using Bloom Filters.
Proceedings of the 5th International Conference on Learning Representations, 2017
Hot or Not? Forecasting Cellular Network Hot Spots Using Sector Performance Indicators.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017
The Good, the Bad, and the KPIs: How to Combine Performance Metrics to Better Capture Underperforming Sectors in Mobile Networks.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017
Effect of acoustic conditions on algorithms to detect Parkinson's disease from speech.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words.
Proceedings of the First Workshop on Abusive Language Online, 2017
2016
Particle swarm optimization for time series motif discovery.
Knowl. Based Syst., 2016
Ranking and significance of variable-length similarity-based time series motifs.
Expert Syst. Appl., 2016
Time-Delayed Melody Surfaces for Rāga Recognition.
Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016
A Genetic Algorithm to Discover Flexible Motifs with Support.
Proceedings of the IEEE International Conference on Data Mining Workshops, 2016
Phrase-based rāga recognition using vector space modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Discovering rāga motifs by characterizing communities in networks of melodic patterns.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
Analysis of the Impact of a Tag Recommendation System in a Real-World Folksonomy.
ACM Trans. Intell. Syst. Technol., 2015
Improving Melodic Similarity in Indian Art Music Using Culture-Specific Melodic Characteristics.
Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015
An evaluation of methodologies for melodic similarity in audio recordings of Indian art music.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity.
IEEE Trans. Multim., 2014
An empirical evaluation of similarity measures for time series classification.
Knowl. Based Syst., 2014
Class-based tag recommendation and user-based evaluation in online audio clip sharing.
Knowl. Based Syst., 2014
Mining Melodic Patterns in Large Audio Collections of Indian Art Music.
Proceedings of the Tenth International Conference on Signal-Image Technology and Internet-Based Systems, 2014
Audio Clip Classification Using Social Tags and the Effect of Tag Expansion.
Proceedings of the AES International Conference on Semantic Audio 2014, 2014
Landmark Detection in Hindustani Music Melodies.
Proceedings of the Music Technology meets Philosophy, 2014
2013
Folksonomy-Based Tag Recommendation for Collaborative Tagging Systems.
Int. J. Semantic Web Inf. Syst., 2013
Tonal representations for music retrieval: from version identification to query-by-humming.
Int. J. Multim. Inf. Retr., 2013
Towards cover group thumbnailing.
Proceedings of the ACM Multimedia Conference, 2013
2012
MTG-QBH: Query By Humming dataset.
Dataset, November, 2012
Predictability of Music Descriptor Time Series and its Application to Cover Song Detection.
IEEE Trans. Speech Audio Process., 2012
Characterization and exploitation of community structure in cover song networks.
Pattern Recognit. Lett., 2012
Measuring the evolution of contemporary western popular music.
CoRR, 2012
Melody, bass line, and harmony representations for music version identification.
Proceedings of the 21st World Wide Web Conference, 2012
Power-law distribution in encoded MFCC frames of speech, music, and environmental sound signals.
Proceedings of the 21st World Wide Web Conference, 2012
Extracting Semantic Information from an Online Carnatic Music Forum.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012
Characterization of Intonation in Carnatic Music by Parametrizing Pitch Histograms.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012
Structure-Based Audio Fingerprinting for Music Retrieval.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012
Folksonomy-based Tag Recommendation for Online Audio Clip Sharing.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012
A Competitive Measure to Assess the Similarity between Two Time Series.
Proceedings of the Case-Based Reasoning Research and Development, 2012
Audio Content-Based Music Retrieval.
Proceedings of the Multimodal Music Processing, 2012
Sample Identification in Hip Hop Music.
Proceedings of the From Sounds to Music and Emotions - 9th International Symposium, 2012
Unsupervised Detection of Music Boundaries by Time Series Structure Features.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012
2011
Unifying Low-Level and High-Level Music Similarity Measures.
IEEE Trans. Multim., 2011
Assessing the Tuning of Sung Indian Classical Music.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011
Computational Approaches for the Understanding of Melody in Carnatic Music.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011
Nonlinear audio recurrence analysis with application to genre classification.
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
Audio Cover Song Identification and Similarity: Background, Approaches, Evaluation, and Beyond.
Proceedings of the Advances in Music Information Retrieval, 2010
Indexing music by mood: design and integration of an automatic content-based annotator.
Multim. Tools Appl., 2010
Unsupervised Accuracy Improvement for Cover Song Detection Using Spectral Connectivity Network.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010
2009
Unsupervised Detection of Cover Song Sets: Accuracy Improvement and Original Identification.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009
Music Mood Representations from Social Tags.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009
From Low-Level to High-Level: Comparative Study of Music Similarity Measures.
Proceedings of the 11th IEEE International Symposium on Multimedia, 2009
Music Mood Annotator Design and Integration.
Proceedings of the Seventh International Workshop on Content-Based Multimedia Indexing, 2009
2008
Chroma Binary Similarity and Local Alignment Applied to Cover Song Identification.
IEEE Trans. Speech Audio Process., 2008
Audio cover song identification based on tonal sequence alignment.
Proceedings of the IEEE International Conference on Acoustics, 2008
2007
A Qualitative Assessment of Measures for the Evaluation of a Cover Song Identification System.
Proceedings of the 8th International Conference on Music Information Retrieval, 2007