2025

Video Depth Anything: Consistent Depth Estimation for Super-Long Videos.

Sili Chen

Hengkai Guo

CoRR, January, 2025

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos.

CoRR, January, 2025

Fully automated segmentation of brain and scalp blood vessels on multi-parametric magnetic resonance imaging using multi-view cascaded networks.

Comput. Methods Programs Biomed., 2025

2024

Lightweight Model Pre-Training via Language Guided Knowledge Distillation.

IEEE Trans. Multim., 2024

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.

CoRR, 2024

CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis.

CoRR, 2024

Revisiting Evolutionary Program Repair via Code Language Model.

CoRR, 2024

DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention.

CoRR, 2024

Disentangled Pre-training for Image Matting.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Depth Anything V2.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Classification Done Right for Vision-Language Pre-Training.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

DCNet: Large-Scale Point Cloud Semantic Segmentation With Discriminative and Efficient Feature Aggregation.

IEEE Trans. Circuits Syst. Video Technol., August, 2023

CCNet: Criss-Cross Attention for Semantic Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2023

Efficient image dehazing algorithm using multiple priors constraints.

J. Vis. Commun. Image Represent., February, 2023

Harnessing Diffusion Models for Visual Perception with Meta Prompts.

CoRR, 2023

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs.

CoRR, 2023

SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

SeMask: Semantically Masked Transformers for Semantic Segmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Executing your Commands via Motion Diffusion in Latent Space.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Task-aware Scheduling and Performance Optimization on Yitian710 SoC for GEMM-based Workloads on the Cloud.

Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

2022

AlignSeg: Feature-Aligned Segmentation Networks.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

Executing your Commands via Motion Diffusion in Latent Space.

CoRR, 2022

Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D representations.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report.

Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Skeleton-based action recognition with temporal action graph and temporal adaptive graph convolution structure.

Multim. Tools Appl., 2021

Shuffle Transformer with Feature Alignment for Video Face Parsing.

CoRR, 2021

Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer.

CoRR, 2021

Fast and Accurate Single-Image Depth Estimation on Mobile Devices, Mobile AI 2021 Challenge: Report.

CoRR, 2021

Half-Real Half-Fake Distillation for Class-Incremental Semantic Segmentation.

CoRR, 2021

Human De-Occlusion: Invisible Perception and Recovery for Humans.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

A Simple Baseline for Fast and Accurate Depth Estimation on Mobile Devices.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

High-Resolution Deep Image Matting.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Semantic Image Segmentation by Scale-Adaptive Networks.

IEEE Trans. Image Process., 2020

Deep Learning-Based Automated Image Segmentation for Concrete Petrographic Analysis.

CoRR, 2020

Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

The 1st Agriculture-Vision Challenge: Methods and Results.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation.

CoRR, 2019

Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks.

Proceedings of the Intelligent Robotics and Applications - 12th International Conference, 2019

Motion-Guided Spatial Time Attention for Video Object Segmentation.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

CCNet: Criss-Cross Attention for Semantic Segmentation.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

SPGNet: Semantic Prediction Guidance for Scene Parsing.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Devil in the Details: Towards Accurate Single and Multiple Human Parsing.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Devil in the Details: Towards Accurate Single and Multiple Human Parsing.

CoRR, 2018

Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017

Deep patch learning for weakly supervised object classification and discovery.

Pattern Recognit., 2017

Point Linking Network for Object Detection.

CoRR, 2017

Object-Level Proposals.

Proceedings of the IEEE International Conference on Computer Vision, 2017

2016

Continuous Gesture Recognition Based on Hidden Markov Model.

Proceedings of the Internet and Distributed Computing Systems, 2016