2025
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos.
CoRR, January, 2025

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos.
CoRR, January, 2025

Fully automated segmentation of brain and scalp blood vessels on multi-parametric magnetic resonance imaging using multi-view cascaded networks.
Comput. Methods Programs Biomed., 2025

2024
Lightweight Model Pre-Training via Language Guided Knowledge Distillation.
IEEE Trans. Multim., 2024

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.
CoRR, 2024

CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis.
CoRR, 2024

Revisiting Evolutionary Program Repair via Code Language Model.
CoRR, 2024

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention.
CoRR, 2024

Disentangled Pre-training for Image Matting.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Depth Anything V2.
Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Classification Done Right for Vision-Language Pre-Training.
Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
DCNet: Large-Scale Point Cloud Semantic Segmentation With Discriminative and Efficient Feature Aggregation.
IEEE Trans. Circuits Syst. Video Technol., August, 2023

CCNet: Criss-Cross Attention for Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Efficient image dehazing algorithm using multiple priors constraints.
J. Vis. Commun. Image Represent., February, 2023

Harnessing Diffusion Models for Visual Perception with Meta Prompts.
CoRR, 2023

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs.
CoRR, 2023

SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

SeMask: Semantically Masked Transformers for Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Executing your Commands via Motion Diffusion in Latent Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Task-aware Scheduling and Performance Optimization on Yitian710 SoC for GEMM-based Workloads on the Cloud.
Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

2022
AlignSeg: Feature-Aligned Segmentation Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Executing your Commands via Motion Diffusion in Latent Space.
CoRR, 2022

Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D representations.
Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Skeleton-based action recognition with temporal action graph and temporal adaptive graph convolution structure.
Multim. Tools Appl., 2021

Shuffle Transformer with Feature Alignment for Video Face Parsing.
CoRR, 2021

Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer.
CoRR, 2021

Fast and Accurate Single-Image Depth Estimation on Mobile Devices, Mobile AI 2021 Challenge: Report.
CoRR, 2021

Half-Real Half-Fake Distillation for Class-Incremental Semantic Segmentation.
CoRR, 2021

Human De-Occlusion: Invisible Perception and Recovery for Humans.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

A Simple Baseline for Fast and Accurate Depth Estimation on Mobile Devices.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

High-Resolution Deep Image Matting.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Semantic Image Segmentation by Scale-Adaptive Networks.
IEEE Trans. Image Process., 2020

Deep Learning-Based Automated Image Segmentation for Concrete Petrographic Analysis.
CoRR, 2020

Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

The 1st Agriculture-Vision Challenge: Methods and Results.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation.
CoRR, 2019

Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks.
Proceedings of the Intelligent Robotics and Applications - 12th International Conference, 2019

Motion-Guided Spatial Time Attention for Video Object Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

CCNet: Criss-Cross Attention for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

SPGNet: Semantic Prediction Guidance for Scene Parsing.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Devil in the Details: Towards Accurate Single and Multiple Human Parsing.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Devil in the Details: Towards Accurate Single and Multiple Human Parsing.
CoRR, 2018

Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Deep patch learning for weakly supervised object classification and discovery.
Pattern Recognit., 2017

Point Linking Network for Object Detection.
CoRR, 2017

Object-Level Proposals.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Continuous Gesture Recognition Based on Hidden Markov Model.
Proceedings of the Internet and Distributed Computing Systems, 2016