Joan Serrà

Orcid: 0000-0003-1303-6558

According to our database, Joan Serrà authored at least 99 papers between 2007 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Joint Semantic Knowledge Distillation and Masked Acoustic Modeling for Full-band Speech Restoration with Improved Intelligibility.
CoRR, 2024

Masked Generative Video-to-Audio Transformers with Enhanced Synchronicity.
CoRR, 2024

Sequential Contrastive Audio-Visual Learning.
CoRR, 2024

GASS: Generalizing Audio Source Separation with Large-Scale Data.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Carnatic Varnam Dataset.
Dataset, March, 2023

CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Mono-to-Stereo Through Parametric Stereo Generation.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

Adversarial Permutation Invariant Training for Universal Sound Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Full-Band General Audio Synthesis with Score-Based Diffusion.
Proceedings of the IEEE International Conference on Acoustics, 2023

Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Upsampling Layers for Music Source Separation.
Proceedings of the 31st European Signal Processing Conference, 2023

2022
Universal Speech Enhancement with Score-based Diffusion.
CoRR, 2022

Assessing Algorithmic Biases for Musical Version Identification.
Proceedings of the WSDM '22: The Fifteenth ACM International Conference on Web Search and Data Mining, Virtual Event / Tempe, AZ, USA, February 21, 2022

On Loss Functions and Evaluation Metrics for Music Source Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Audio-Based Musical Version Identification: Elements and challenges.
IEEE Signal Process. Mag., 2021

Heaps' law and vocabulary richness in the history of classical music harmony.
EPJ Data Sci., 2021

On tuning consistent annealed sampling for denoising score matching.
CoRR, 2021

Adversarial Auto-Encoding for Packet Loss Concealment.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Investigating the Efficacy of Music Version Retrieval Systems for Setlist Identification.
Proceedings of the IEEE International Conference on Acoustics, 2021

Automatic Multitrack Mixing With A Differentiable Mixing Console Of Neural Audio Effects.
Proceedings of the IEEE International Conference on Acoustics, 2021

SESQA: Semi-Supervised Learning for Speech Quality Assessment.
Proceedings of the IEEE International Conference on Acoustics, 2021

Upsampling Artifacts in Neural Audio Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Experience: advanced network operations in (Un)-connected remote communities.
Proceedings of the MobiCom '20: The 26th Annual International Conference on Mobile Computing and Networking, 2020

Less is more: Faster and better music version identification with embedding distillation.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020

Combining musical features for cover detection.
Proceedings of the 21st International Society for Music Information Retrieval Conference, 2020

Input Complexity and Out-of-distribution Detection with Likelihood-based Generative Models.
Proceedings of the 8th International Conference on Learning Representations, 2020

Accurate and Scalable Version Identification Using Musically-Motivated Embeddings.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Time-domain speech enhancement using generative adversarial networks.
Speech Commun., 2019

Blow: a single-scale hyperconditioned flow for non-parallel raw-audio voice conversion.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Towards Generalized Speech Enhancement with Generative Adversarial Networks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Learning Problem-Agnostic Speech Representations from Multiple Self-Supervised Tasks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Training Neural Audio Classifiers with Few Data.
Proceedings of the IEEE International Conference on Acoustics, 2019

From Correlation to Imagination: Deep Generative Models for Artificial Intelligence.
Proceedings of the Artificial Intelligence Research and Development, 2019

2018
MobInsight: A Framework Using Semantic Neighborhood Features for Localized Interpretations of Urban Mobility.
ACM Trans. Interact. Intell. Syst., 2018

Assessing the impact of machine intelligence on human behaviour: an interdisciplinary endeavour.
CoRR, 2018

Overcoming Catastrophic Forgetting with Hard Attention to the Task.
Proceedings of the 35th International Conference on Machine Learning, 2018

Language and Noise Transfer in Speech Enhancement Generative Adversarial Network.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Whispered-to-voiced Alaryngeal Speech Conversion with Generative Adversarial Networks.
Proceedings of the Fourth International Conference, 2018

Self-Attention Linguistic-Acoustic Decoder.
Proceedings of the Fourth International Conference, 2018

Towards a Universal Neural Network Encoder for Time Series.
Proceedings of the Artificial Intelligence Research and Development, 2018

There goes Wally: Anonymously sharing your location gives you away.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017
Beyond Interruptibility: Predicting Opportune Moments to Engage Mobile Phone Users.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2017

Continual Prediction of Notification Attendance with Classical and Deep Network Approaches.
CoRR, 2017

Getting Deep Recommenders Fit: Bloom Embeddings for Sparse Binary Input/Output Networks.
Proceedings of the Eleventh ACM Conference on Recommender Systems, 2017

Practical Processing of Mobile Sensor Data for Continual Deep Learning Predictions.
Proceedings of the 1st International Workshop on Embedded and Mobile Deep Learning (Deep Learning for Mobile Systems and Applications), 2017

SEGAN: Speech Enhancement Generative Adversarial Network.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Compact Embedding of Binary-coded Inputs and Outputs using Bloom Filters.
Proceedings of the 5th International Conference on Learning Representations, 2017

Hot or Not? Forecasting Cellular Network Hot Spots Using Sector Performance Indicators.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

The Good, the Bad, and the KPIs: How to Combine Performance Metrics to Better Capture Underperforming Sectors in Mobile Networks.
Proceedings of the 33rd IEEE International Conference on Data Engineering, 2017

Effect of acoustic conditions on algorithms to detect Parkinson's disease from speech.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words.
Proceedings of the First Workshop on Abusive Language Online, 2017

2016
Particle swarm optimization for time series motif discovery.
Knowl. Based Syst., 2016

Ranking and significance of variable-length similarity-based time series motifs.
Expert Syst. Appl., 2016

Time-Delayed Melody Surfaces for Rāga Recognition.
Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

A Genetic Algorithm to Discover Flexible Motifs with Support.
Proceedings of the IEEE International Conference on Data Mining Workshops, 2016

Phrase-based rāga recognition using vector space modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Discovering rāga motifs by characterizing communities in networks of melodic patterns.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Analysis of the Impact of a Tag Recommendation System in a Real-World Folksonomy.
ACM Trans. Intell. Syst. Technol., 2015

Improving Melodic Similarity in Indian Art Music Using Culture-Specific Melodic Characteristics.
Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

An evaluation of methodologies for melodic similarity in audio recordings of Indian art music.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Carnatic Varnam Dataset.
Dataset, February, 2014

Unsupervised Music Structure Annotation by Time Series Structure Features and Segment Similarity.
IEEE Trans. Multim., 2014

An empirical evaluation of similarity measures for time series classification.
Knowl. Based Syst., 2014

Class-based tag recommendation and user-based evaluation in online audio clip sharing.
Knowl. Based Syst., 2014

Mining Melodic Patterns in Large Audio Collections of Indian Art Music.
Proceedings of the Tenth International Conference on Signal-Image Technology and Internet-Based Systems, 2014

Audio Clip Classification Using Social Tags and the Effect of Tag Expansion.
Proceedings of the AES International Conference on Semantic Audio 2014, 2014

Landmark Detection in Hindustani Music Melodies.
Proceedings of the Music Technology meets Philosophy, 2014

2013
Folksonomy-Based Tag Recommendation for Collaborative Tagging Systems.
Int. J. Semantic Web Inf. Syst., 2013

Tonal representations for music retrieval: from version identification to query-by-humming.
Int. J. Multim. Inf. Retr., 2013

Towards cover group thumbnailing.
Proceedings of the ACM Multimedia Conference, 2013

2012
MTG-QBH: Query By Humming dataset.
Dataset, November, 2012

Predictability of Music Descriptor Time Series and its Application to Cover Song Detection.
IEEE Trans. Speech Audio Process., 2012

Characterization and exploitation of community structure in cover song networks.
Pattern Recognit. Lett., 2012

Measuring the evolution of contemporary western popular music.
CoRR, 2012

Melody, bass line, and harmony representations for music version identification.
Proceedings of the 21st World Wide Web Conference, 2012

Power-law distribution in encoded MFCC frames of speech, music, and environmental sound signals.
Proceedings of the 21st World Wide Web Conference, 2012

Extracting Semantic Information from an Online Carnatic Music Forum.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Characterization of Intonation in Carnatic Music by Parametrizing Pitch Histograms.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Structure-Based Audio Fingerprinting for Music Retrieval.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Folksonomy-based Tag Recommendation for Online Audio Clip Sharing.
Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

A Competitive Measure to Assess the Similarity between Two Time Series.
Proceedings of the Case-Based Reasoning Research and Development, 2012

Audio Content-Based Music Retrieval.
Proceedings of the Multimodal Music Processing, 2012

Sample Identification in Hip Hop Music.
Proceedings of the From Sounds to Music and Emotions - 9th International Symposium, 2012

Unsupervised Detection of Music Boundaries by Time Series Structure Features.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Unifying Low-Level and High-Level Music Similarity Measures.
IEEE Trans. Multim., 2011

Assessing the Tuning of Sung Indian Classical Music.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Computational Approaches for the Understanding of Melody in Carnatic Music.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Nonlinear audio recurrence analysis with application to genre classification.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Audio Cover Song Identification and Similarity: Background, Approaches, Evaluation, and Beyond.
Proceedings of the Advances in Music Information Retrieval, 2010

Indexing music by mood: design and integration of an automatic content-based annotator.
Multim. Tools Appl., 2010

Unsupervised Accuracy Improvement for Cover Song Detection Using Spectral Connectivity Network.
Proceedings of the 11th International Society for Music Information Retrieval Conference, 2010

2009
Unsupervised Detection of Cover Song Sets: Accuracy Improvement and Original Identification.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

Music Mood Representations from Social Tags.
Proceedings of the 10th International Society for Music Information Retrieval Conference, 2009

From Low-Level to High-Level: Comparative Study of Music Similarity Measures.
Proceedings of the 11th IEEE International Symposium on Multimedia, 2009

Music Mood Annotator Design and Integration.
Proceedings of the Seventh International Workshop on Content-Based Multimedia Indexing, 2009

2008
Chroma Binary Similarity and Local Alignment Applied to Cover Song Identification.
IEEE Trans. Speech Audio Process., 2008

Audio cover song identification based on tonal sequence alignment.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
A Qualitative Assessment of Measures for the Evaluation of a Cover Song Identification System.
Proceedings of the 8th International Conference on Music Information Retrieval, 2007

