Fan Wang

Orcid: 0000-0001-7320-1119

Affiliations:
  • Alibaba Group, Hangzhou, China


According to our database, Fan Wang authored at least 90 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Dynamic gradient reactivation for backward compatible person re-identification.
Pattern Recognit., February, 2024

Graph Convolution Based Efficient Re-Ranking for Visual Retrieval.
IEEE Trans. Multim., 2024

Dual-View Curricular Optimal Transport for Cross-Lingual Cross-Modal Retrieval.
IEEE Trans. Image Process., 2024

Region Generation and Assessment Network for Occluded Person Re-Identification.
IEEE Trans. Inf. Forensics Secur., 2024

SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels.
Int. J. Comput. Vis., 2024

MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model.
CoRR, 2024

Dynamic Diffusion Transformer.
CoRR, 2024

AnyLogo: Symbiotic Subject-Driven Diffusion System with Gemini Status.
CoRR, 2024

RealisDance: Equip controllable character animation with realistic hands.
CoRR, 2024

RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images.
CoRR, 2024

VCD-Texture: Variance Alignment based 3D-2D Co-Denoising for Text-Guided Texturing.
CoRR, 2024

Tuning-Free Alignment of Diffusion Models with Direct Noise Optimization.
CoRR, 2024

Uncovering the Text Embedding in Text-to-Image Diffusion Models.
CoRR, 2024

Text Data-Centric Image Captioning with Interactive Prompts.
CoRR, 2024

Dynamic Token-Pass Transformers for Semantic Segmentation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Dynamic Tuning Towards Parameter and Inference Efficiency for ViT Adaptation.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Adaptive Query Selection for Camouflaged Instance Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

DiffAug: Enhance Unsupervised Contrastive Learning with Domain-Knowledge-Free Diffusion-based Data Augmentation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Accelerating Parallel Sampling of Diffusion Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Language-Guided Few-Shot Semantic Segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2024

DMT: Comprehensive Distillation with Multiple Self-Supervised Teachers.
Proceedings of the IEEE International Conference on Acoustics, 2024

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer.
Proceedings of the Computer Vision - ECCV 2024, 2024

VCD-Texture: Variance Alignment Based 3D-2D Co-denoising for Text-Guided Texturing.
Proceedings of the Computer Vision - ECCV 2024, 2024

CL2CM: Improving Cross-Lingual Cross-Modal Retrieval via Cross-Lingual Knowledge Transfer.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

BVT-IMA: Binary Vision Transformer with Information-Modified Attention.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
A Unified Multimodal De- and Re-Coupling Framework for RGB-D Motion Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

What Limits the Performance of Local Self-attention?
Int. J. Comput. Vis., October, 2023

Efficient Token-Guided Image-Text Retrieval With Consistent Multimodal Contrastive Training.
IEEE Trans. Image Process., 2023

Land Use and Land Cover Mapping in China Using Multimodal Fine-Grained Dual Network.
IEEE Trans. Geosci. Remote. Sens., 2023

SingleInsert: Inserting New Concepts from a Single Image into Text-to-Image Models for Flexible Editing.
CoRR, 2023

Boosting Unsupervised Contrastive Learning Using Diffusion-Based Data Augmentation From Scratch.
CoRR, 2023

ICPC: Instance-Conditioned Prompting with Contrastive Learning for Semantic Segmentation.
CoRR, 2023

Improved Neural Radiance Fields Using Pseudo-depth and Fusion.
CoRR, 2023

RegionBLIP: A Unified Multi-modal Pre-training Framework for Holistic and Regional Comprehension.
CoRR, 2023

Dynamic Token-Pass Transformers for Semantic Segmentation.
CoRR, 2023

Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm.
CoRR, 2023

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers.
CoRR, 2023

Data Pruning via Moving-one-Sample-out.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

UniNeXt: Exploring A Unified Architecture for Vision Recognition.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Patch-level Contrastive Learning via Positional Query for Visual Pre-training.
Proceedings of the International Conference on Machine Learning, 2023

LMSeg: Language-guided Multi-dataset Segmentation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Revisiting Vision Transformer from the View of Path Ensemble.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Foundation Model Drives Weakly Incremental Learning for Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Efficient Mask Correction for Click-Based Interactive Image Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Beyond Appearance: A Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DOAD: Decoupled One Stage Action Detection Network.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Making Vision Transformers Efficient from A Token Sparsification View.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Frequency Domain Disentanglement for Arbitrary Neural Style Transfer.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Head-Free Lightweight Semantic Segmentation with Linear Transformer.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

SwinRDM: Integrate SwinRNN with Diffusion Model towards High-Resolution and High-Quality Weather Forecasting.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Multi-View Evolutionary Training for Unsupervised Domain Adaptive Person Re-Identification.
IEEE Trans. Inf. Forensics Secur., 2022

Class-Aware Feature Aggregation Network for Video Object Detection.
IEEE Trans. Circuits Syst. Video Technol., 2022

Refining pseudo labels for unsupervised Domain Adaptive Re-Identification.
Knowl. Based Syst., 2022

Revisiting instance search: A new benchmark using cycle self-training.
Neurocomputing, 2022

Effective Vision Transformer Training: A Data-Centric Perspective.
CoRR, 2022

FAKD: Feature Augmented Knowledge Distillation for Semantic Segmentation.
CoRR, 2022

Semi-supervised Semantic Segmentation with Mutual Knowledge Distillation.
CoRR, 2022

VTC-LFC: Vision Transformer Compression with Low-Frequency Components.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

TAGPerson: A Target-Aware Generation Pipeline for Person Re-identification.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation.
Proceedings of the HCMA@MM 2022: Proceedings of the 3rd International Workshop on Human-Centric Multimedia Analysis, 2022

CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Graph Convolution for Re-Ranking in Person Re-Identification.
Proceedings of the IEEE International Conference on Acoustics, 2022

Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer.
Proceedings of the IEEE International Conference on Acoustics, 2022

Adaptive Matching Strategy for Multi-Target Multi-Camera Tracking.
Proceedings of the IEEE International Conference on Acoustics, 2022

TransFGU: A Top-Down Approach to Fine-Grained Unsupervised Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

KVT: k-NN Attention for Boosting Vision Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022

Unstructured Feature Decoupling for Vehicle Re-identification.
Proceedings of the Computer Vision - ECCV 2022, 2022

Decoupling and Recoupling Spatiotemporal Representation for RGB-D-based Motion Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Scaled ReLU Matters for Training Vision Transformers.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Context and Structure Mining Network for Video Object Detection.
Int. J. Comput. Vis., 2021

ELSA: Enhanced Local Self-Attention for Vision Transformer.
CoRR, 2021

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification.
CoRR, 2021

Achieving Human Parity on Visual Question Answering.
CoRR, 2021

2nd Place Solution to Google Landmark Retrieval 2021.
CoRR, 2021

KVT: k-NN Attention for Boosting Vision Transformers.
CoRR, 2021

Augmented Transformer with Adaptive Graph for Temporal Action Proposal Generation.
CoRR, 2021

1st Place Solution to ECCV-TAO-2020: Detect and Represent Any Object for Tracking.
CoRR, 2021

Exploring the Quality of GAN Generated Images for Person Re-Identification.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

TransReID: Transformer-based Object Re-Identification.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

An Empirical Study of Vehicle Re-Identification on the AI City Challenge.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

City-Scale Multi-Camera Vehicle Tracking Guided by Crossroad Zones.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

2020
1st Place Solution to VisDA-2020: Bias Elimination for Domain Adaptive Pedestrian Re-identification.
CoRR, 2020

Exploiting Better Feature Aggregation for Video Object Detection.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Multi-Domain Learning and Identity Mining for Vehicle Re-Identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

