Neil Zeghidour

ORCID: 0000-0001-6896-3987

According to our database, Neil Zeghidour authored at least 43 papers between 2016 and 2024.

Bibliography

2024
Moshi: a speech-text foundation model for real-time dialogue.
CoRR, 2024

MAD Speech: Measures of Acoustic Diversity of Speech.
CoRR, 2024

MusicRL: Aligning Music Generation to Human Preferences.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
AudioLM: A Language Modeling Approach to Audio Generation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision.
Trans. Assoc. Comput. Linguistics, 2023

AudioPaLM: A Large Language Model That Can Speak and Listen.
CoRR, 2023

SoundStorm: Efficient Parallel Audio Generation.
CoRR, 2023

DNArch: Learning Convolutional Neural Architectures by Backpropagation.
CoRR, 2023

SingSong: Generating musical accompaniments from singing.
CoRR, 2023

MusicLM: Generating Music From Text.
CoRR, 2023

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pose-graph SLAM Using Multi-order Ultrasonic Echoes and Beamforming for Long-range Inspection Robots.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Speech Intelligibility Classifiers from 550k Disordered Speech Samples.
Proceedings of the IEEE International Conference on Acoustics, 2023

Disentangling Speech from Surroundings with Neural Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2023

LMCodec: A Low Bitrate Speech Codec with Causal Transformer Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
SoundStream: An End-to-End Neural Audio Codec.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

AudioLM: a Language Modeling Approach to Audio Generation.
CoRR, 2022

Disentangling speech from surroundings in a neural audio codec.
CoRR, 2022

Multi-instrument Music Synthesis with Spectrogram Diffusion.
Proceedings of the 23rd International Society for Music Information Retrieval Conference, 2022

Learning neural audio features without supervision.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Combined Grid and Feature-based Mapping of Metal Structures with Ultrasonic Guided Waves.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

General-purpose, long-context autoregressive modeling with Perceiver AR.
Proceedings of the International Conference on Machine Learning, 2022

Learning Strides in Convolutional Neural Networks.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Polygonal Shapes Reconstruction from Acoustic Echoes Using a Mobile Sensor and Beamforming.
Proceedings of the 30th European Signal Processing Conference, 2022

2021
Wavesplit: End-to-End Speech Separation by Speaker Clustering.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Self-Supervised Learning of Audio Representations From Permutations With Differentiable Ranking.
IEEE Signal Process. Lett., 2021

LEAF: A Learnable Frontend for Audio Classification.
Proceedings of the 9th International Conference on Learning Representations, 2021

Contrastive Learning of General-Purpose Audio Representations.
Proceedings of the IEEE International Conference on Acoustics, 2021

Learning From Heterogeneous EEG Signals with Differentiable Channel Reordering.
Proceedings of the IEEE International Conference on Acoustics, 2021

DIVE: End-to-End Speech Diarization Via Iterative Speaker Embedding.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2019
Learning representations of speech from the raw waveform. (Apprentissage de représentations de la parole à partir du signal brut).
PhD thesis, 2019

Deep multi-class learning from label proportions.
CoRR, 2019

Learning to Detect Dysarthria from Raw Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

To Reverse the Gradient or Not: an Empirical Comparison of Adversarial and Multi-task Learning in Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Fully Convolutional Speech Recognition.
CoRR, 2018

SING: Symbol-to-Instrument Neural Generator.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

End-to-End Speech Recognition from the Raw Waveform.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Sampling Strategies in Siamese Networks for Unsupervised Speech Representation Learning.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Learning Filterbanks from Raw Speech for Phone Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Fader Networks: Manipulating Images by Sliding Attributes.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Learning Weakly Supervised Multimodal Phoneme Embeddings.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Joint Learning of Speaker and Phonetic Similarities with Siamese Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A deep scattering spectrum - Deep Siamese network pipeline for unsupervised acoustic modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

