2025

Supervised contrastive learning from weakly-labeled audio segments for musical version matching.

[DOI]

,

Recep Oguz Araz

,

Dmitry Bogdanov

,

CoRR, February, 2025

2024

Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility.

[DOI]

,

,

,

Santiago Pascual

CoRR, 2024

Sequential Contrastive Audio-Visual Learning.

[DOI]

Ioannis Tsiamas

,

Santiago Pascual

,

,

CoRR, 2024

GASS: Generalizing Audio Source Separation with Large-Scale Data.

[DOI]

,

,

Santiago Pascual

,

Proceedings of the IEEE International Conference on Acoustics, 2024

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity.

[DOI]

Santiago Pascual

,

,

Ioannis Tsiamas

,

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Carnatic Varnam Dataset.

[DOI]

Gopala K. Koduri

,

,

,

Dataset, March, 2023

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models.

[DOI]

,

,

,

Gautam Bhattacharya

,

Santiago Pascual

,

,

Taylor Berg-Kirkpatrick

,

Julian J. McAuley

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Mono-to-Stereo Through Parametric Stereo Generation.

[DOI]

,

,

Santiago Pascual

,

,

,

Jeroen Breebaart

,

Giulio Cengarle

Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

Adversarial Permutation Invariant Training for Universal Sound Separation.

[DOI]

Emilian Postolache

,

,

Santiago Pascual

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Full-Band General Audio Synthesis with Score-Based Diffusion.

[DOI]

Santiago Pascual

,

Gautam Bhattacharya

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2023

Upsampling Layers for Music Source Separation.

[DOI]

,

,

Santiago Pascual

,

Giulio Cengarle

,

,

Proceedings of the 31st European Signal Processing Conference, 2023

2022

Universal Speech Enhancement with Score-based Diffusion.

[DOI]

,

Santiago Pascual

,

,

Recep Oguz Araz

,

CoRR, 2022

Assessing Algorithmic Biases for Musical Version Identification.

[DOI]

,

,

,

Proceedings of the WSDM '22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21, 2022

On Loss Functions and Evaluation Metrics for Music Source Separation.

[DOI]

,

,

Santiago Pascual

,

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Audio-Based Musical Version Identification: Elements and challenges.

[DOI]

,

Guillaume Doras

,

Rachel M. Bittner

,

Christopher J. Tralie

,

IEEE Signal Process. Mag., 2021

Heaps' law and vocabulary richness in the history of classical music harmony.

[DOI]

Marc Serra-Peralta

,

,

EPJ Data Sci., 2021

On tuning consistent annealed sampling for denoising score matching.

[DOI]

,

Santiago Pascual

,

CoRR, 2021

Adversarial Auto-Encoding for Packet Loss Concealment.

[DOI]

Santiago Pascual

,

,

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Investigating the Efficacy of Music Version Retrieval Systems for Setlist Identification.

[DOI]

,

,

,

Proceedings of the IEEE International Conference on Acoustics, 2021

Automatic Multitrack Mixing With A Differentiable Mixing Console Of Neural Audio Effects.

[DOI]

Christian J. Steinmetz

,

,

Santiago Pascual

,

Proceedings of the IEEE International Conference on Acoustics, 2021

SESQA: Semi-Supervised Learning for Speech Quality Assessment.

[DOI]

,

,

Santiago Pascual

Proceedings of the IEEE International Conference on Acoustics, 2021

Upsampling Artifacts in Neural Audio Synthesis.

[DOI]

,

Santiago Pascual

,

Giulio Cengarle

,

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Experience: advanced network operations in (Un)-connected remote communities.

[DOI]

,

,

,

,

Ilias Leontiadis

Proceedings of the MobiCom '20: The 26th Annual International Conference on Mobile Computing and Networking, 2020

Less is more: Faster and better music version identification with embedding distillation.

[DOI]

,

,

Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020

Combining musical features for cover detection.

[DOI]

Guillaume Doras

,

,

,

,

Geoffroy Peeters

Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020

Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models.

[DOI]

,

,

,

Olga Slizovskaia

,

José F. Núñez

,

Proceedings of the 8th International Conference on Learning Representations, 2020

Accurate and Scalable Version Identification Using Musically-Motivated Embeddings.

[DOI]

,

,

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Time-domain speech enhancement using generative adversarial networks.

[DOI]

Santiago Pascual

,

,

Antonio Bonafonte

Speech Commun., 2019

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion.

[DOI]

,

Santiago Pascual

,

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Towards Generalized Speech Enhancement with Generative Adversarial Networks.

[DOI]

Santiago Pascual

,

,

Antonio Bonafonte

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.

[DOI]

Santiago Pascual

,

Mirco Ravanelli

,

,

Antonio Bonafonte

,

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Training Neural Audio Classifiers with Few Data.

[DOI]

,

,

Proceedings of the IEEE International Conference on Acoustics, 2019

From Correlation to Imagination: Deep Generative Models for Artificial Intelligence.

[DOI]

Proceedings of the Artificial Intelligence Research and Development, 2019

2018

MobInsight: A Framework Using Semantic Neighborhood Features for Localized Interpretations of Urban Mobility.

[DOI]

,

,

Enrique Frías-Martínez

,

ACM Trans. Interact. Intell. Syst., 2018

Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour.

[DOI]

CoRR, 2018

Overcoming Catastrophic Forgetting with Hard Attention to the Task.

[DOI]

,

,

,

Alexandros Karatzoglou

Proceedings of the 35th International Conference on Machine Learning, 2018

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network.

[DOI]

Santiago Pascual

,

,

,

Antonio Bonafonte

,

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks.

[DOI]

Santiago Pascual

,

Antonio Bonafonte

,

,

José Andrés González López

Proceedings of the Fourth International Conference, 2018

Self-Attention Linguistic-Acoustic Decoder.

[DOI]

Santiago Pascual

,

Antonio Bonafonte

,

Proceedings of the Fourth International Conference, 2018

Towards a Universal Neural Network Encoder for Time Series.

[DOI]

,

Santiago Pascual

,

Alexandros Karatzoglou

Proceedings of the Artificial Intelligence Research and Development, 2018

There goes Wally: Anonymously sharing your location gives you away.

[DOI]

Apostolos Pyrgelis

,

Nicolas Kourtellis

,

Ilias Leontiadis

,

,

Claudio Soriente

Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017

Beyond Interruptibility: Predicting Opportune Moments to Engage Mobile Phone Users.

[DOI]

,

,

Kleomenis Katevas

,

,

Aleksandar Matic

,

Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2017

Continual Prediction of Notification Attendance with Classical and Deep Network Approaches.

[DOI]

Kleomenis Katevas

,

Ilias Leontiadis

,

,

CoRR, 2017

Getting Deep Recommenders Fit: Bloom Embeddings for Sparse Binary Input/Output Networks.

[DOI]

,

Alexandros Karatzoglou

Proceedings of the Eleventh ACM Conference on Recommender Systems, 2017

Practical Processing of Mobile Sensor Data for Continual Deep Learning Predictions.

[DOI]

Kleomenis Katevas

,

Ilias Leontiadis

,

,

Proceedings of the 1st International Workshop on Embedded and Mobile Deep Learning (Deep Learning for Mobile Systems and Applications), 2017

SEGAN: Speech Enhancement Generative Adversarial Network.

[DOI]

Santiago Pascual

,

Antonio Bonafonte

,

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Compact Embedding of Binary-coded Inputs and Outputs using Bloom Filters.

[DOI]

,

Alexandros Karatzoglou

Proceedings of the 5th International Conference on Learning Representations, 2017

Hot or Not? Forecasting Cellular Network Hot Spots Using Sector Performance Indicators.

[DOI]

,

Ilias Leontiadis

,

Alexandros Karatzoglou

,

Konstantina Papagiannaki

Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

The Good, the Bad, and the KPIs: How to Combine Performance Metrics to Better Capture Underperforming Sectors in Mobile Networks.

[DOI]

Ilias Leontiadis

,

,

Alessandro Finamore

,

Giorgos Dimopoulos

,

Konstantina Papagiannaki

Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Effect of acoustic conditions on algorithms to detect Parkinson's disease from speech.

[DOI]

Juan Camilo Vásquez-Correa

,

,

Juan Rafael Orozco-Arroyave

,

Jesús Francisco Vargas-Bonilla

,

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words.

[DOI]

,

Ilias Leontiadis

,

Dimitris Spathis

,

Gianluca Stringhini

,

Jeremy Blackburn

,

Proceedings of the First Workshop on Abusive Language Online, 2017

2016

Particle swarm optimization for time series motif discovery.

[DOI]

,

Josep Lluís Arcos

Knowl. Based Syst., 2016

Ranking and significance of variable-length similarity-based time series motifs.

[DOI]

,

,

,

Josep Lluís Arcos

Expert Syst. Appl., 2016

Time-Delayed Melody Surfaces for Rāga Recognition.

[DOI]

,

,

Kaustuv Kanti Ganguli

,

Sertan Sentürk

,

Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

A Genetic Algorithm to Discover Flexible Motifs with Support.

[DOI]

,

Aleksandar Matic

,

Josep Lluís Arcos

,

Alexandros Karatzoglou

Proceedings of the IEEE International Conference on Data Mining Workshops, 2016

Phrase-based rāga recognition using vector space modeling.

[DOI]

,

,

,

Sertan Sentürk

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Discovering rāga motifs by characterizing communities in networks of melodic patterns.

[DOI]

,

,

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Analysis of the Impact of a Tag Recommendation System in a Real-World Folksonomy.

[DOI]

,

,

ACM Trans. Intell. Syst. Technol., 2015

Improving Melodic Similarity in Indian Art Music Using Culture-Specific Melodic Characteristics.

[DOI]

,

,

Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

An evaluation of methodologies for melodic similarity in audio recordings of Indian art music.

[DOI]

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Carnatic Varnam Dataset.

[DOI]

Gopala K. Koduri

,

,

,

Dataset, February, 2014

Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity.

[DOI]

,

Meinard Müller

,

,

Josep Lluís Arcos

IEEE Trans. Multim., 2014

An empirical evaluation of similarity measures for time series classification.

[DOI]

,

Josep Lluís Arcos

Knowl. Based Syst., 2014

Class-based tag recommendation and user-based evaluation in online audio clip sharing.

[DOI]

,

,

Knowl. Based Syst., 2014

Mining Melodic Patterns in Large Audio Collections of Indian Art Music.

[DOI]

,

,

,

Proceedings of the Tenth International Conference on Signal-Image Technology and Internet-Based Systems, 2014

Audio Clip Classification Using Social Tags and the Effect of Tag Expansion.

[DOI]

,

,

Proceedings of the AES International Conference on Semantic Audio 2014, 2014

Landmark Detection in Hindustani Music Melodies.

[DOI]

,

,

Kaustuv Kanti Ganguli

,

Proceedings of the Music Technology meets Philosophy, 2014

2013

Folksonomy-Based Tag Recommendation for Collaborative Tagging Systems.

[DOI]

,

,

Int. J. Semantic Web Inf. Syst., 2013

Tonal representations for music retrieval: from version identification to query-by-humming.

[DOI]

,

,

Int. J. Multim. Inf. Retr., 2013

Towards cover group thumbnailing.

[DOI]

,

Meinard Müller

,

Proceedings of the ACM Multimedia Conference, 2013

2012

MTG-QBH: Query By Humming dataset.

[DOI]

,

,

Dataset, November, 2012

Predictability of Music Descriptor Time Series and its Application to Cover Song Detection.

[DOI]

,

,

,

Ralph G. Andrzejak

IEEE Trans. Speech Audio Process., 2012

Characterization and exploitation of community structure in cover song networks.

[DOI]

,

Massimiliano Zanin

,

Perfecto Herrera

,

Pattern Recognit. Lett., 2012

Measuring the evolution of contemporary western popular music.

[DOI]

,

,

Marián Boguñá

,

,

Josep Lluís Arcos

CoRR, 2012

Melody, bass line, and harmony representations for music version identification.

[DOI]

,

,

Proceedings of the 21st World Wide Web Conference, 2012

Power-law distribution in encoded MFCC frames of speech, music, and environmental sound signals.

[DOI]

,

,

,

Perfecto Herrera

Proceedings of the 21st World Wide Web Conference, 2012

Extracting Semantic Information from an Online Carnatic Music Forum.

[DOI]

,

,

Gopala K. Koduri

,

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Characterization of Intonation in Carnatic Music by Parametrizing Pitch Histograms.

[DOI]

Gopala K. Koduri

,

,

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Structure-Based Audio Fingerprinting for Music Retrieval.

[DOI]

,

,

Meinard Müller

,

Josep Lluís Arcos

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Folksonomy-based Tag Recommendation for Online Audio Clip Sharing.

[DOI]

,

,

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

A Competitive Measure to Assess the Similarity between Two Time Series.

[DOI]

,

Josep Lluís Arcos

Proceedings of the Case-Based Reasoning Research and Development, 2012

Audio Content-Based Music Retrieval.

[DOI]

,

Meinard Müller

,

Proceedings of the Multimodal Music Processing, 2012

Sample Identification in Hip Hop Music.

[DOI]

,

,

Proceedings of the From Sounds to Music and Emotions - 9th International Symposium, 2012

Unsupervised Detection of Music Boundaries by Time Series Structure Features.

[DOI]

,

Meinard Müller

,

,

Josep Lluís Arcos

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

Unifying Low-Level and High-Level Music Similarity Measures.

[DOI]

Dmitry Bogdanov

,

,

,

Perfecto Herrera

,

IEEE Trans. Multim., 2011

Assessing the Tuning of Sung Indian Classical Music.

[DOI]

,

Gopala K. Koduri

,

,

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Computational Approaches for the Understanding of Melody in Carnatic Music.

[DOI]

Gopala K. Koduri

,

,

,

Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Nonlinear audio recurrence analysis with application to genre classification.

[DOI]

,

Carlos A. de los Santos

,

Ralph G. Andrzejak

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Audio Cover Song Identification and Similarity: Background, Approaches, Evaluation, and Beyond.

[DOI]

,

,

Perfecto Herrera

Proceedings of the Advances in Music Information Retrieval, 2010

Indexing music by mood: design and integration of an automatic content-based annotator.

[DOI]

,

,

,

,

Perfecto Herrera

,

Multim. Tools Appl., 2010

Unsupervised Accuracy Improvement for Cover Song Detection Using Spectral Connectivity Network.

[DOI]

Mathieu Lagrange

,

Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

2009

Unsupervised Detection of Cover Song Sets: Accuracy Improvement and Original Identification.

[DOI]

,

Massimiliano Zanin

,

,

Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

Music Mood Representations from Social Tags.

[DOI]

,

,

,

Perfecto Herrera

Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

From Low-Level to High-Level: Comparative Study of Music Similarity Measures.

[DOI]

Dmitry Bogdanov

,

,

,

Perfecto Herrera

Proceedings of the 11th IEEE International Symposium on Multimedia, 2009

Music Mood Annotator Design and Integration.

[DOI]

,

,

,

,

Perfecto Herrera

Proceedings of the Seventh International Workshop on Content-Based Multimedia Indexing, 2009

2008

Chroma Binary Similarity and Local Alignment Applied to Cover Song Identification.

[DOI]

,

,

Perfecto Herrera

,

IEEE Trans. Speech Audio Process., 2008

Audio cover song identification based on tonal sequence alignment.

[DOI]

,

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

A Qualitative Assessment of Measures for the Evaluation of a Cover Song Identification System.

[DOI]

Proceedings of the 8th International Conference on Music Information Retrieval, 2007