2024

DINOv2: Learning Robust Visual Features without Supervision.

Trans. Mach. Learn. Res., 2024

Multimodal Autoregressive Pre-training of Large Vision Encoders.

Victor Guilherme Turrisi da Costa

CoRR, 2024

DataComp-LM: In search of the next generation of training sets for language models.

Khyathi Raghavi Chandu

Alexandros G. Dimakis

CoRR, 2024

DataComp-LM: In search of the next generation of training sets for language models.

Khyathi Raghavi Chandu

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Scalable Pre-training of Large Autoregressive Image Models.

Alaaeldin El-Nouby

Michal Klein

Shuangfei Zhai

Miguel Ángel Bautista

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

ResMLP: Feedforward Networks for Image Classification With Data-Efficient Training.

IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Image Compression with Product Quantized Masked Image Modeling.

Trans. Mach. Learn. Res., 2023

DINOv2: Learning Robust Visual Features without Supervision.

CoRR, 2023

Are Visual Recognition Models Robust to Image Compression?

CoRR, 2023

Improving Statistical Fidelity for Neural Image Compression with Implicit Local Likelihood Models.

Proceedings of the International Conference on Machine Learning, 2023

Variable Rate Allocation for Vector-Quantized Autoencoders.

Federico Baldassarre

Alaaeldin El-Nouby

Hervé Jégou

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

OmniMAE: Single Model Masked Pretraining on Images and Videos.

Rohit Girdhar

Alaaeldin El-Nouby

Mannat Singh

Kalyan Vasudev Alwala

Armand Joulin

Ishan Misra

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

ImageBind: One Embedding Space to Bind Them All.

Kalyan Vasudev Alwala

Armand Joulin

Ishan Misra

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Three Things Everyone Should Know About Vision Transformers.

Proceedings of the European Conference on Computer Vision (ECCV), 2022

2021

Augmenting Convolutional networks with attention-based aggregation.

CoRR, 2021

Are Large-scale Datasets Necessary for Self-Supervised Pre-training?

CoRR, 2021

XCiT: Cross-Covariance Image Transformers.

CoRR, 2021

ResMLP: Feedforward networks for image classification with data-efficient training.

CoRR, 2021

Training Vision Transformers for Image Retrieval.

CoRR, 2021

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2019

Skip-Clip: Self-Supervised Spatiotemporal Representation Learning by Future Clip Order Ranking.

CoRR, 2019

Tell, Draw, and Repeat: Generating and Modifying Images Based on Continual Linguistic Instruction.

Samira Ebrahimi Kahou

Yoshua Bengio

Graham W. Taylor

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018

Keep Drawing It: Iterative language-based image generation and editing.

Samira Ebrahimi Kahou

Yoshua Bengio

Graham W. Taylor

CoRR, 2018

Real-Time End-to-End Action Detection with Two-Stream Networks.

Alaaeldin El-Nouby

Graham W. Taylor

Proceedings of the 15th Conference on Computer and Robot Vision, 2018