2025
Video Depth Anything: Consistent Depth Estimation for Super-Long Videos.
CoRR, January, 2025
Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos.
CoRR, January, 2025
Fully automated segmentation of brain and scalp blood vessels on multi-parametric magnetic resonance imaging using multi-view cascaded networks.
Comput. Methods Programs Biomed., 2025
2024
Lightweight Model Pre-Training via Language Guided Knowledge Distillation.
IEEE Trans. Multim., 2024
LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.
CoRR, 2024
CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis.
CoRR, 2024
Revisiting Evolutionary Program Repair via Code Language Model.
CoRR, 2024
DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention.
CoRR, 2024
Disentangled Pre-training for Image Matting.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Classification Done Right for Vision-Language Pre-Training.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
DCNet: Large-Scale Point Cloud Semantic Segmentation With Discriminative and Efficient Feature Aggregation.
IEEE Trans. Circuits Syst. Video Technol., August, 2023
CCNet: Criss-Cross Attention for Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2023
Efficient image dehazing algorithm using multiple priors constraints.
J. Vis. Commun. Image Represent., February, 2023
Harnessing Diffusion Models for Visual Perception with Meta Prompts.
CoRR, 2023
BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs.
CoRR, 2023
SeaFormer: Squeeze-enhanced Axial Transformer for Mobile Semantic Segmentation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
SeMask: Semantically Masked Transformers for Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Executing your Commands via Motion Diffusion in Latent Space.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Task-aware Scheduling and Performance Optimization on Yitian710 SoC for GEMM-based Workloads on the Cloud.
Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023
2022
AlignSeg: Feature-Aligned Segmentation Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2022
Executing your Commands via Motion Diffusion in Latent Space.
CoRR, 2022
Coordinates Are NOT Lonely - Codebook Prior Helps Implicit Neural 3D representations.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022
TopFormer: Token Pyramid Transformer for Mobile Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Skeleton-based action recognition with temporal action graph and temporal adaptive graph convolution structure.
Multim. Tools Appl., 2021
Shuffle Transformer with Feature Alignment for Video Face Parsing.
CoRR, 2021
Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer.
CoRR, 2021
Fast and Accurate Single-Image Depth Estimation on Mobile Devices, Mobile AI 2021 Challenge: Report.
CoRR, 2021
Half-Real Half-Fake Distillation for Class-Incremental Semantic Segmentation.
CoRR, 2021
Human De-Occlusion: Invisible Perception and Recovery for Humans.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
A Simple Baseline for Fast and Accurate Depth Estimation on Mobile Devices.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021
High-Resolution Deep Image Matting.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Semantic Image Segmentation by Scale-Adaptive Networks.
IEEE Trans. Image Process., 2020
Deep Learning-Based Automated Image Segmentation for Concrete Petrographic Analysis.
CoRR, 2020
Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
The 1st Agriculture-Vision Challenge: Methods and Results.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Proposal, Tracking and Segmentation (PTS): A Cascaded Network for Video Object Segmentation.
CoRR, 2019
Towards End-to-End Speech Recognition with Deep Multipath Convolutional Neural Networks.
Proceedings of the Intelligent Robotics and Applications - 12th International Conference, 2019
Motion-Guided Spatial Time Attention for Video Object Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019
CCNet: Criss-Cross Attention for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
SPGNet: Semantic Prediction Guidance for Scene Parsing.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Devil in the Details: Towards Accurate Single and Multiple Human Parsing.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
Devil in the Details: Towards Accurate Single and Multiple Human Parsing.
CoRR, 2018
Weakly-Supervised Semantic Segmentation Network With Deep Seeded Region Growing.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
2017
Deep patch learning for weakly supervised object classification and discovery.
Pattern Recognit., 2017
Point Linking Network for Object Detection.
CoRR, 2017
Proceedings of the IEEE International Conference on Computer Vision, 2017
2016
Continuous Gesture Recognition Based on Hidden Markov Model.
Proceedings of the Internet and Distributed Computing Systems, 2016