Han Zhang

ORCID: 0000-0001-7072-2189

Affiliations:

Google Brain, Mountain View, CA, USA
Rutgers University, Department of Computer Science, Piscataway, NJ, USA (PhD 2018)
Beijing University of Posts and Telecommunications, Multimedia Communication and Pattern Recognition Lab, Beijing, China (2009 - 2012)

According to our database, Han Zhang authored at least 72 papers between 2010 and 2024.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2024

ChatDiT: A Training-Free Baseline for Task-Agnostic Free-Form Chatting with Diffusion Transformers.

CoRR, 2024

The Matrix: Infinite-Horizon World Generation with Real-Time Moving Control.

CoRR, 2024

DICE: Discrete Inversion Enabling Controllable Editing for Multinomial Diffusion and Masked Generative Models.

CoRR, 2024

ε-VAE: Denoising as Visual Decoding.

CoRR, 2024

BACON: Supercharge Your VLM with Bag-of-Concept Graph to Mitigate Hallucinations.

CoRR, 2024

Steering Prototypes with Prompt-tuning for Rehearsal-free Continual Learning.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

CCM: Real-Time Controllable Visual Content Creation Using Text-to-Image Consistency Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Lipschitz Singularities in Diffusion Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

DreamClean: Restoring Clean Image Using Deep Diffusion Prior.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Parrot: Pareto-Optimal Multi-reward Reinforcement Learning Framework for Text-to-Image Generation.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

CCM: Adding Conditional Controls to Text-to-Image Consistency Models.

CoRR, 2023

Eliminating Lipschitz Singularities in Diffusion Models.

CoRR, 2023

Learning Disentangled Prompts for Compositional Image Synthesis.

CoRR, 2023

Steering Prototype with Prompt-tuning for Rehearsal-free Continual Learning.

CoRR, 2023

StraIT: Non-autoregressive Generation with Stratified Image Transformer.

CoRR, 2023

StoryBench: A Multifaceted Benchmark for Continuous Story Visualization.

Mohammad Taghi Saffar

Han Zhang

Dumitru Erhan

Vittorio Ferrari

Pieter-Jan Kindermans

Paul Voigtlaender

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Muse: Text-To-Image Generation via Masked Generative Transformers.

Proceedings of the International Conference on Machine Learning, 2023

Phenaki: Variable Length Video Generation from Open Domain Textual Descriptions.

Ruben Villegas

Mohammad Babaeizadeh

Pieter-Jan Kindermans

Hernan Moraldo

Han Zhang

Mohammad Taghi Saffar

Santiago Castro

Julius Kunze

Dumitru Erhan

Proceedings of the Eleventh International Conference on Learning Representations, 2023

VQ3D: Learning a 3D-Aware Generative Model on ImageNet.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SVDiff: Compact Parameter Space for Diffusion Fine-Tuning.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Dimensionality-Varying Diffusion Process.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MAGVIT: Masked Generative Video Transformer.

Alexander G. Hauptmann

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual Prompt Tuning for Generative Transfer Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MAGE: MAsked Generative Encoder to Unify Representation Learning and Image Synthesis.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation.

Trans. Mach. Learn. Res., 2022

Deep image synthesis from intuitive user input: A review and perspectives.

Comput. Vis. Media, 2022

Phenaki: Variable Length Video Generation From Open Domain Textual Description.

Ruben Villegas

Mohammad Babaeizadeh

Pieter-Jan Kindermans

Hernan Moraldo

Han Zhang

Mohammad Taghi Saffar

Santiago Castro

Julius Kunze

Dumitru Erhan

CoRR, 2022

Vector-quantized Image Modeling with Improved VQGAN.

Proceedings of the Tenth International Conference on Learning Representations, 2022

ViTGAN: Training GANs with Vision Transformers.

Proceedings of the Tenth International Conference on Learning Representations, 2022

Learning Instance-Specific Adaptation for Cross-Domain Segmentation.

Proceedings of the Computer Vision - ECCV 2022, 2022

MaxViT: Multi-axis Vision Transformer.

Proceedings of the Computer Vision - ECCV 2022, 2022

BLT: Bidirectional Layout Transformer for Controllable Layout Generation.

Proceedings of the Computer Vision - ECCV 2022, 2022

DualPrompt: Complementary Prompting for Rehearsal-Free Continual Learning.

Proceedings of the Computer Vision - ECCV 2022, 2022

MAXIM: Multi-Axis MLP for Image Processing.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MaskGIT: Masked Generative Image Transformer.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Learning to Prompt for Continual Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Nested Hierarchical Transformer: Towards Accurate, Data-Efficient and Interpretable Visual Understanding.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Aggregating Nested Transformers.

CoRR, 2021

Improved Transformer for High-Resolution GANs.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

PseudoSeg: Designing Pseudo Labels for Semantic Segmentation.

Proceedings of the 9th International Conference on Learning Representations, 2021

Revisiting Hierarchical Approach for Persistent Long-Term Video Prediction.

Proceedings of the 9th International Conference on Learning Representations, 2021

Cross-Modal Contrastive Learning for Text-to-Image Generation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Improved Consistency Regularization for GANs.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Image Augmentations for GAN Training.

CoRR, 2020

A Simple Semi-Supervised Learning Framework for Object Detection.

CoRR, 2020

FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Small-GAN: Speeding up GAN Training using Core-Sets.

Proceedings of the 37th International Conference on Machine Learning, 2020

Consistency Regularization for Generative Adversarial Networks.

Proceedings of the 8th International Conference on Learning Representations, 2020

ReMixMatch: Semi-Supervised Learning with Distribution Matching and Augmentation Anchoring.

Proceedings of the 8th International Conference on Learning Representations, 2020

Distilling Effective Supervision From Severe Label Noise.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models.

Giannis Daras

Augustus Odena

Han Zhang

Alexandros G. Dimakis

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

ReMixMatch: Semi-Supervised Learning with Distribution Alignment and Augmentation Anchoring.

CoRR, 2019

IEG: Robust Neural Network Training to Tackle Severe Label Noise.

CoRR, 2019

Self-Attention Generative Adversarial Networks.

Proceedings of the 36th International Conference on Machine Learning, 2019

Co-Occurrent Features in Semantic Segmentation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation.

Neuroinformatics, 2018

Improving GANs Using Optimal Transport.

Proceedings of the 6th International Conference on Learning Representations, 2018

AttnGAN: Fine-Grained Text to Image Generation With Attentional Generative Adversarial Networks.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Multi-feature based benchmark for cervical dysplasia classification evaluation.

Pattern Recognit., 2017

SegAN: Adversarial Network with Multi-scale L1 Loss for Medical Image Segmentation.

CoRR, 2017

StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks.

Han Zhang

Tao Xu

Hongsheng Li

Proceedings of the IEEE International Conference on Computer Vision, 2017

Link the Head to the "Beak": Zero Shot Learning from Noisy Text Description at Part Precision.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks.

CoRR, 2016

Multimodal Deep Learning for Cervical Dysplasia Diagnosis.

Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, 2016

SPDA-CNN: Unifying Semantic Part Detection and Abstraction for Fine-Grained Recognition.

Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2014

Robust shape prior modeling based on Gaussian-Bernoulli restricted Boltzmann Machine.

Proceedings of the IEEE 11th International Symposium on Biomedical Imaging, 2014

2013

Categorization of Underwater Habitats Using Dynamic Video Textures.

Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

2012

Video Browser Showdown by NUS.

Proceedings of the Advances in Multimedia Modeling - 18th International Conference, 2012

2011

BUPT-MCPRL at TRECVID 2011.

Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

2010

BUPT-MCPRL at TRECVID 2010.

Proceedings of the TRECVID 2010 workshop participants notebook papers, 2010
