Dan Xu

Orcid: 0000-0003-0136-9603

Affiliations:
  • Hong Kong University of Science and Technology, Department of Computer Science and Engineering, Hong Kong
  • University of Oxford, UK (2018 - 2021)
  • University of Trento, Department of Information Engineering and Computer Science, Italy (PhD 2018)


According to our database, Dan Xu authored at least 97 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
InvPT++: Inverted Pyramid Multi-Task Transformer for Visual Scene Understanding.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

MCTformer+: Multi-Class Token Transformer for Weakly Supervised Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

DaGAN++: Depth-Aware Generative Adversarial Network for Talking Head Video Generation.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

MotionGPT-2: A General-Purpose Motion-Language Model for Motion Generation and Understanding.
CoRR, 2024

Learning Online Scale Transformation for Talking Head Video Generation.
CoRR, 2024

Holistic-Motion2D: Scalable Whole-body Human Motion Generation in 2D Space.
CoRR, 2024

Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection.
CoRR, 2024

X-VILA: Cross-Modality Alignment for Large Language Model.
CoRR, 2024

GScream: Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal.
CoRR, 2024

Auxiliary Tasks Enhanced Dual-affinity Learning for Weakly Supervised Semantic Segmentation.
CoRR, 2024

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis.
Proceedings of the Computer Vision - ECCV 2024, 2024

Learning 3D Geometry and Feature Consistent Gaussian Splatting for Object Removal.
Proceedings of the Computer Vision - ECCV 2024, 2024

RoomTex: Texturing Compositional Indoor Scenes via Iterative Inpainting.
Proceedings of the Computer Vision - ECCV 2024, 2024

Motion-Oriented Compositional Neural Radiance Fields for Monocular Dynamic Human Modeling.
Proceedings of the Computer Vision - ECCV 2024, 2024

CVT-xRF: Contrastive In-Voxel Transformer for 3D Consistent Radiance Fields from Sparse Inputs.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DetCLIPv3: Towards Versatile Generative Open-Vocabulary Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GS-SLAM: Dense Visual SLAM with 3D Gaussian Splatting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Efficient Multitask Dense Predictor via Binarization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Implicit Event-RGBD Neural SLAM.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Interactive3D: Create What You Want by Interactive 3D Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Text-to-3D Generation with Bidirectional Diffusion Using Both 2D and 3D Priors.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Learning class-agnostic masks with cross-task refinement for weakly supervised semantic segmentation.
Neural Comput. Appl., September, 2023

Reducing Spatial Labeling Redundancy for Active Semi-Supervised Crowd Counting.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

AttentionGAN: Unpaired Image-to-Image Translation Using Attention-Guided Generative Adversarial Networks.
IEEE Trans. Neural Networks Learn. Syst., April, 2023

Continual Attentive Fusion for Incremental Learning in Semantic Segmentation.
IEEE Trans. Multim., 2023

Uncertainty-Aware Contrastive Distillation for Incremental Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

Implicit Event-RGBD Neural SLAM.
CoRR, 2023

Joint 2D-3D Multi-Task Learning on Cityscapes-3D: 3D Detection, Segmentation, and Depth Estimation.
CoRR, 2023

You Only Train Once: Multi-Identity Free-Viewpoint Neural Human Rendering from Monocular Videos.
CoRR, 2023

CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Fine-grained Domain Adaptive Crowd Counting via Point-derived Segmentation.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

TaskPrompter: Spatial-Channel Multi-Task Prompting for Dense Scene Understanding.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Switch-NeRF: Learning Scene Decomposition with Mixture of Experts for Large-scale Neural Radiance Fields.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Learning Unified Decompositional and Compositional NeRF for Editable Novel View Synthesis.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head Video Generation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Multi-Modal Multi-Task Joint 2D and 3D Scene Perception and Localization.
Proceedings of the 4th International Workshop on Human-centric Multimedia Analysis, 2023

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning Multi-Modal Class-Specific Tokens for Weakly Supervised Dense Object Localization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Contrastive Multi-Task Dense Prediction.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Probabilistic Graph Attention Network With Conditional Kernels for Pixel-Wise Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Inverted Pyramid Multi-task Transformer for Dense Scene Understanding.
Proceedings of the Computer Vision - ECCV 2022, 2022

Network Binarization via Contrastive Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

Lipschitz Continuity Retained Binary Neural Network.
Proceedings of the Computer Vision - ECCV 2022, 2022

Multi-class Token Transformer for Weakly Supervised Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Generalized Binary Search Network for Highly-Efficient Multi-View Stereo.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Depth-Aware Generative Adversarial Network for Talking Head Video Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Reducing Spatial Labeling Redundancy for Semi-supervised Crowd Counting.
CoRR, 2021

Learning Geometry-Guided Depth via Projective Modeling for Monocular 3D Object Detection.
CoRR, 2021

Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks.
CoRR, 2021

Variational Structured Attention Networks for Deep Visual Representation Learning.
CoRR, 2021

Cross-modal Consensus Network for Weakly Supervised Temporal Action Localization.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Moving SLAM: Fully Unsupervised Deep Learning in Non-Rigid Scenes.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Leveraging Auxiliary Tasks with Affinity Learning for Weakly Supervised Semantic Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

SA-ConvONet: Sign-Agnostic Optimization of Convolutional Occupancy Networks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Parallel Dense Correspondence From Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Delving Into Localization Errors for Monocular 3D Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Learning How to Smile: Expression Video Generation With Conditional Adversarial Recurrent Nets.
IEEE Trans. Multim., 2020

Progressive Fusion for Unsupervised Binocular Depth Estimation Using Cycled Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Scope Head for Accurate Localization in Object Detection.
CoRR, 2020

Edge Guided GANs with Semantic Preserving for Semantic Image Synthesis.
CoRR, 2020

Multi-Channel Attention Selection GANs for Guided Image-to-Image Translation.
CoRR, 2020

Dynamic Graph Message Passing Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Monocular Depth Estimation Using Multi-Scale Continuous CRFs as Sequential Deep Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Asymmetric Generative Adversarial Networks for Image-to-Image Translation.
CoRR, 2019

Deep Micro-Dictionary Learning and Coding Network.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Cycle In Cycle Generative Adversarial Networks for Keypoint-Guided Image Generation.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Attention-Guided Generative Adversarial Networks for Unsupervised Image-to-Image Translation.
Proceedings of the International Joint Conference on Neural Networks, 2019

Expression Conditional GAN for Facial Expression-to-Expression Translation.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

Structured Modeling of Joint Deep Feature and Prediction Refinement for Salient Object Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Unsupervised Collaborative Learning of Keyframe Detection and Visual Odometry Towards Monocular Deep SLAM.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Attribute-Guided Sketch Generation.
Proceedings of the 14th IEEE International Conference on Automatic Face & Gesture Recognition, 2019

Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Structured Coupled Generative Adversarial Networks for Unsupervised Monocular Depth Estimation.
Proceedings of the 2019 International Conference on 3D Vision, 2019

2018
Exploring Multi-Modal and Structured Representation Learning for Visual Image and Video Understanding.
PhD thesis, 2018

Cross-Paced Representation Learning With Partial Curricula for Sketch-Based Image Retrieval.
IEEE Trans. Image Process., 2018

Every Smile is Unique: Landmark-Guided Diverse Smile Generation.
CoRR, 2018

GestureGAN for Hand Gesture-to-Gesture Translation in the Wild.
Proceedings of the 2018 ACM Multimedia Conference, 2018

Group Consistent Similarity Learning via Deep CRF for Person Re-Identification.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Every Smile Is Unique: Landmark-Guided Diverse Smile Generation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Dual Generator Generative Adversarial Networks for Multi-domain Image-to-Image Translation.
Proceedings of the Computer Vision - ACCV 2018, 2018

Unsupervised Adversarial Depth Estimation Using Cycled Generative Networks.
Proceedings of the 2018 International Conference on 3D Vision, 2018

2017
Supervised Local Descriptor Learning for Human Action Recognition.
IEEE Trans. Multim., 2017

Detecting anomalous events in videos by learning deep representations of appearance and motion.
Comput. Vis. Image Underst., 2017

Learning Deep Structured Multi-Scale Features using Attention-Gated CRFs for Contour Prediction.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Multi-scale Continuous CRFs as Sequential Deep Networks for Monocular Depth Estimation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Learning Cross-Modal Deep Representations for Robust Pedestrian Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Viraliency: Pooling Local Virality.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Academic Coupled Dictionary Learning for Sketch-based Image Retrieval.
Proceedings of the 2016 ACM Conference on Multimedia, 2016

Multi-Paced Dictionary Learning for cross-domain retrieval and recognition.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

2015
Learning Deep Representations of Appearance and Motion for Anomalous Event Detection.
Proceedings of the British Machine Vision Conference 2015, 2015

