Dan Xu

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation.

Fa-Ting Hong

Li Shen

IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding.

CoRR, 2024

Learning Online Scale Transformation for Talking Head Video Generation.

Fa-Ting Hong

CoRR, 2024

Holistic-Motion2D: Scalable Whole-body Human Motion Generation in 2D Space.

CoRR, 2024

Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection.

CoRR, 2024

X-VILA: Cross-Modality Alignment for Large Language Model.

CoRR, 2024

GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal.

CoRR, 2024

Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation.

CoRR, 2024

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis.

Proceedings of the Computer Vision - ECCV 2024, 2024

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal.

Proceedings of the Computer Vision - ECCV 2024, 2024

RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting.

Proceedings of the Computer Vision - ECCV 2024, 2024

Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling.

Jaehyeok Kim

Dongyoon Wee

Proceedings of the Computer Vision - ECCV 2024, 2024

CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DetCLIPv3: Towards Versatile Generative Open-Vocabulary Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Efficient Multitask Dense Predictor via Binarization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Implicit Event-RGBD Neural SLAM.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Interactive3D: Create What You Want by Interactive 3D Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Text-to-3D Generation with Bidirectional Diffusion Using Both 2D and 3D Priors.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Learning class-agnostic masks with cross-task refinement for weakly supervised semantic segmentation.

Neural Comput. Appl., September, 2023

Reducing Spatial Labeling Redundancy for Active Semi-Supervised Crowd Counting.

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

AttentionGAN: Unpaired Image-to-Image Translation Using Attention-Guided Generative Adversarial Networks.

IEEE Trans. Neural Networks Learn. Syst., April, 2023

Continual Attentive Fusion for Incremental Learning in Semantic Segmentation.

IEEE Trans. Multim., 2023

Uncertainty-Aware Contrastive Distillation for Incremental Semantic Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., 2023

Implicit Event-RGBD Neural SLAM.

CoRR, 2023

Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation.

CoRR, 2023

You Only Train Once: Multi-Identity Free-Viewpoint Neural Human Rendering from Monocular Videos.

Jaehyeok Kim

Dongyoon Wee

CoRR, 2023

CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields.

Zhenxing Mi

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis.

Proceedings of the Eleventh International Conference on Learning Representations, 2023

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis.

Yuxin Wang

Wayne Wu

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation.

Fa-Ting Hong

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Multi-Modal Multi-Task Joint 2D and 3D Scene Perception and Localization.

Proceedings of the 4th International Workshop on Human-centric Multimedia Analysis, 2023

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Contrastive Multi-Task Dense Prediction.

Siwei Yang

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Probabilistic Graph Attention Network With Conditional Kernels for Pixel-Wise Prediction.

IEEE Trans. Pattern Anal. Mach. Intell., 2022

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding.

Proceedings of the Computer Vision - ECCV 2022, 2022

Network Binarization via Contrastive Learning.

Proceedings of the Computer Vision - ECCV 2022, 2022

Lipschitz Continuity Retained Binary Neural Network.

Proceedings of the Computer Vision - ECCV 2022, 2022

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo.

Zhenxing Mi

Di Chang

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Depth-Aware Generative Adversarial Network for Talking Head Video Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting.

CoRR, 2021

Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection.

CoRR, 2021

Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks.

CoRR, 2021

Variational Structured Attention Networks for Deep Visual Representation Learning.

Guanglei Yang

Paolo Rota

Mingli Ding

CoRR, 2021

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization.

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Moving SLAM: Fully Unsupervised Deep Learning in Non-Rigid Scenes.

Andrea Vedaldi

João F. Henriques

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Delving Into Localization Errors for Monocular 3D Object Detection.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Learning How to Smile: Expression Video Generation With Conditional Adversarial Recurrent Nets.

Wei Wang

IEEE Trans. Multim., 2020

Progressive Fusion for Unsupervised Binocular Depth Estimation Using Cycled Networks.

IEEE Trans. Pattern Anal. Mach. Intell., 2020

Scope Head for Accurate Localization in Object Detection.

CoRR, 2020

Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis.

CoRR, 2020

Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation.

CoRR, 2020

Dynamic Graph Message Passing Networks.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019

Monocular Depth Estimation Using Multi-Scale Continuous CRFs as Sequential Deep Networks.

IEEE Trans. Pattern Anal. Mach. Intell., 2019

Asymmetric Generative Adversarial Networks for Image-to-Image Translation.

CoRR, 2019

Deep Micro-Dictionary Learning and Coding Network.

Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation.

Proceedings of the 27th ACM International Conference on Multimedia, 2019

Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation.

Proceedings of the International Joint Conference on Neural Networks, 2019

Expression Conditional GAN for Facial Expression-to-Expression Translation.

Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Attribute-Guided Sketch Generation.

Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Structured Coupled Generative Adversarial Networks for Unsupervised Monocular Depth Estimation.

Proceedings of the 2019 International Conference on 3D Vision, 2019

2018

Exploring Multi-Modal and Structured Representation Learning for Visual Image and Video Understanding.

PhD thesis, 2018

Cross-Paced Representation Learning With Partial Curricula for Sketch-Based Image Retrieval.

Jingkuan Song

IEEE Trans. Image Process., 2018

Every Smile is Unique: Landmark-Guided Diverse Smile Generation.

Wei Wang

CoRR, 2018

GestureGAN for Hand Gesture-to-Gesture Translation in the Wild.

Proceedings of the 2018 ACM Multimedia Conference, 2018

Group Consistent Similarity Learning via Deep CRF for Person Re-Identification.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Every Smile Is Unique: Landmark-Guided Diverse Smile Generation.

Wei Wang

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Dual Generator Generative Adversarial Networks for Multi-domain Image-to-Image Translation.

Proceedings of the Computer Vision - ACCV 2018, 2018

Unsupervised Adversarial Depth Estimation Using Cycled Generative Networks.

Proceedings of the 2018 International Conference on 3D Vision, 2018

2017

Supervised Local Descriptor Learning for Human Action Recognition.

IEEE Trans. Multim., 2017

Detecting anomalous events in videos by learning deep representations of appearance and motion.

Comput. Vis. Image Underst., 2017

Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction.

Wanli Ouyang

Xiaogang Wang

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Multi-scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning Cross-Modal Deep Representations for Robust Pedestrian Detection.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Viraliency: Pooling Local Virality.

Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016

Academic Coupled Dictionary Learning for Sketch-based Image Retrieval.

Jingkuan Song

Proceedings of the 2016 ACM Conference on Multimedia, 2016

Multi-Paced Dictionary Learning for cross-domain retrieval and recognition.

Jingkuan Song