2025

DVMark: A Deep Multiscale Framework for Video Watermarking.

,

,

,

,

Peyman Milanfar

,

IEEE Trans. Image Process., 2025

2024

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency.

,

Sangnie Bhardwaj

,

,

,

,

,

Guillaume Lajoie

,

,

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Improve Supervised Representation Learning with Masked Image Modeling.

,

,

,

,

,

Mojtaba Seyedhosseini

CoRR, 2023

StyleDrop: Text-to-Image Generation in Any Style.

,

,

,

Daniel Castro Chin

,

,

,

,

,

,

,

,

,

Michael Rubinstein

,

CoRR, 2023

Learning Disentangled Prompts for Compositional Image Synthesis.

,

,

,

,

,

,

,

CoRR, 2023

Soundini: Sound-Guided Diffusion for Natural Video Editing.

,

,

,

,

,

,

,

,

CoRR, 2023

StraIT: Non-autoregressive Generation with Stratified Image Transformer.

,

,

,

,

,

CoRR, 2023

StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners.

,

,

,

,

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

StyleDrop: Text-to-Image Synthesis of Any Style.

,

,

,

,

,

,

,

,

,

Michael Rubinstein

,

,

,

,

Daniel Castro Chin

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Muse: Text-To-Image Generation via Masked Generative Transformers.

,

,

,

Aaron Maschinot

,

,

,

Ming-Hsuan Yang

,

Kevin Patrick Murphy

,

William T. Freeman

,

Michael Rubinstein

,

,

Proceedings of the International Conference on Machine Learning, 2023

Discrete Predictor-Corrector Diffusion Models for Image Synthesis.

,

,

,

,

,

Proceedings of the Eleventh International Conference on Learning Representations, 2023

VQ3D: Learning a 3D-Aware Generative Model on ImageNet.

,

,

,

,

Charles Herrmann

,

Pratul P. Srinivasan

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Score-Based Diffusion Models as Principled Priors for Inverse Imaging.

,

,

Michael Rubinstein

,

,

Katherine L. Bouman

,

William T. Freeman

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MAGVIT: Masked Generative Video Transformer.

,

,

,

,

,

,

Alexander G. Hauptmann

,

Ming-Hsuan Yang

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual Prompt Tuning for Generative Transfer Learning.

,

,

,

,

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis.

,

,

Shlok Kumar Mishra

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Imagic: Text-Based Real Image Editing with Diffusion Models.

,

,

,

,

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

A simple, efficient and scalable contrastive masked autoencoder for learning visual representations.

,

Joshua Robinson

,

,

,

,

Aaron Maschinot

,

CoRR, 2022

Palette: Image-to-Image Diffusion Models.

Chitwan Saharia

,

,

,

,

,

,

,

Mohammad Norouzi

Proceedings of SIGGRAPH '22: Special Interest Group on Computer Graphics and Interactive Techniques Conference, Vancouver, BC, Canada, August 7, 2022

ViTGAN: Training GANs with Vision Transformers.

,

,

,

,

,

Proceedings of the Tenth International Conference on Learning Representations, 2022

Improved Masked Image Generation with Token-Critic.

,

,

,

Computer Vision - ECCV 2022, 2022

BLT: Bidirectional Layout Transformer for Controllable Layout Generation.

,

,

,

,

,

,

Computer Vision - ECCV 2022, 2022

Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings.

,

,

,

,

,

Peyman Milanfar

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Zoom-to-Inpaint: Image Inpainting with High-Frequency Details.

,

,

,

,

,

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Pyramid Adversarial Training Improves ViT Performance.

Charles Herrmann

,

,

,

,

,

,

,

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MaskGIT: Masked Generative Image Transformer.

,

,

,

,

William T. Freeman

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

SLIDE: Single Image 3D Photography with Soft Layering and Depth-aware Inpainting.

,

,

,

,

,

Michael Krainin

,

,

William T. Freeman

,

,

,

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

LASR: Learning Articulated Shape Reconstruction From a Monocular Video.

,

,

,

,

,

,

,

William T. Freeman

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

AutoFlow: Learning a Better Training Set for Optical Flow.

,

,

Charles Herrmann

,

,

Michael Krainin

,

,

,

William T. Freeman

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

OCONet: Image Extrapolation by Object Completion.

Richard Strong Bowen

,

,

Charles Herrmann

,

,

,

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Distortion Agnostic Deep Watermarking.

,

,

,

,

Peyman Milanfar

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2018

Personal Photo Enhancement.

PhD thesis, 2018

SwapNet: Image Based Garment Transfer.

,

Patsorn Sangkloy

,

,

,

,

Computer Vision - ECCV 2018, 2018

PairedCycleGAN: Asymmetric Style Transfer for Applying and Removing Makeup.

,

,

,

Adam Finkelstein

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Panning and Zooming High-Resolution Panoramas in Virtual Reality Devices.

,

Michael F. Cohen

Proceedings of the 30th Annual ACM Symposium on User Interface Software and Technology, 2017

2016

Automatic triage for a photo series.

,

,

,

,

Adam Finkelstein

ACM Trans. Graph., 2016

2015

Palette-based photo recoloring.

,

,

,

Stephen DiVerdi

,

Adam Finkelstein

ACM Trans. Graph., 2015

2013

Rectangling panoramic images via warping.

,

,

ACM Trans. Graph., 2013

Content-Aware Rotation.

,

,

Proceedings of the IEEE International Conference on Computer Vision, 2013

Cross Segment Decoding for Improved Quality of Experience for Video Applications.

,

,

,

,

,

,

Proceedings of the 2013 Data Compression Conference, 2013