Xuming He

Orcid: 0000-0003-2150-1237

Affiliations:
  • ShanghaiTech University, China


According to our database, Xuming He authored at least 153 papers between 2004 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Multi-Modal Modality-Masked Diffusion Network for Brain MRI Synthesis With Random Modality Missing.
IEEE Trans. Medical Imaging, July, 2024

SGTR+: End-to-End Scene Graph Generation With Transformer.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2024

Composing Novel Classes: A Concept-Driven Approach to Generalized Category Discovery.
CoRR, 2024

Just say what you want: only-prompting self-rewarding online preference optimization.
CoRR, 2024

CIBench: Evaluating Your LLMs with a Code Interpreter Plugin.
CoRR, 2024

SP²OT: Semantic-Regularized Progressive Partial Optimal Transport for Imbalanced Clustering.
CoRR, 2024

P²OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering.
CoRR, 2024

RealDex: Towards Human-like Grasping for Robotic Dexterous Hand.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Multi-Level Progressive Reinforcement Learning for Control Policy in Physical Simulations.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

P2OT: Progressive Partial Optimal Transport for Deep Imbalanced Clustering.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Dual-Level Adaptive Self-labeling for Novel Class Discovery in Point Cloud Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

SPHINX: A Mixer of Weights, Visual Embeddings and Image Scales for Multi-modal Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

LLM-HD: Layout Language Model for Hotspot Detection with GDS Semantic Encoding.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

DSGG: Dense Relation Transformer for an End-to-End Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Mining Fine-Grained Image-Text Alignment for Zero-Shot Captioning via Text-Only Training.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Novel Class Discovery for Long-tailed Recognition.
Trans. Mach. Learn. Res., 2023

GenEM: Physics-Informed Generative Cryo-Electron Microscopy.
CoRR, 2023

SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models.
CoRR, 2023

ATTA: Anomaly-aware Test-Time Adaptation for Out-of-Distribution Detection in Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

OccluBEV: Occlusion Aware Spatiotemporal Modeling for Multi-view 3D Object Detection.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Gradient-Map-Guided Adaptive Domain Generalization for Cross Modality MRI Segmentation.
Proceedings of the Machine Learning for Health, 2023

HC-Net: Hybrid Classification Network for Automatic Periodontal Disease Diagnosis.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Semi-Supervised Domain-Adaptive Pulmonary Artery Segmentation via Uncertainty Guidance and Shape Strengthening.
Proceedings of the 20th IEEE International Symposium on Biomedical Imaging, 2023

Exploring Learning-Based Control Policy for Fish-Like Robots in Altered Background Flows.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Experts.
Proceedings of the Eleventh International Conference on Learning Representations, 2023


Human-centric Scene Understanding for 3D Large-scale Scenarios.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Grounded Image Text Matching with Mismatched Relation Reasoning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Class-relation Knowledge Distillation for Novel Class Discovery.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition.
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

HOICLIP: Efficient Knowledge Transfer for HOI Detection with Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cascade Sparse Feature Propagation Network for Interactive Segmentation.
Proceedings of the 34th British Machine Vision Conference 2023, 2023

CALIP: Zero-Shot Enhancement of CLIP with Parameter-Free Attention.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Modeling Multimodal Aleatoric Uncertainty in Segmentation with Mixture of Stochastic Expert.
CoRR, 2022

A Novel Unified Conditional Score-based Generative Framework for Multi-modal Medical Image Completion.
CoRR, 2022

Automatic spinal curvature measurement on ultrasound spine images using Faster R-CNN.
CoRR, 2022

Intention-aware Feature Propagation Network for Interactive Segmentation.
CoRR, 2022

Budget-aware Few-shot Learning via Graph Convolutional Network.
CoRR, 2022

KD-VLP: Improving End-to-End Vision-and-Language Pretraining with Object Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

ROI-Constrained Bidding via Curriculum-Guided Bayesian Reinforcement Learning.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Weakly Supervised Nuclei Segmentation Via Instance Learning.
Proceedings of the 19th IEEE International Symposium on Biomedical Imaging, 2022

FishGym: A High-Performance Physics-based Simulation Framework for Underwater Robot Learning.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Robust Temporally-Coherent Strategy for Few-shot Video Instance Segmentation.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

Generative Negative Text Replay for Continual Vision-Language Pretraining.
Proceedings of the Computer Vision - ECCV 2022, 2022

Learning Semantic Correspondence with Sparse Annotations.
Proceedings of the Computer Vision - ECCV 2022, 2022

General Incremental Learning with Domain-aware Categorical Representations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Workshop on Autonomous Driving at CVPR 2021: Technical Report for Streaming Perception Challenge.
CoRR, 2021

Automatic segmentation of vertebral features on ultrasound spine images using Stacked Hourglass Network.
CoRR, 2021

Fixed-Price Diffusion Mechanism Design.
Proceedings of the PRICAI 2021: Trends in Artificial Intelligence, 2021

Dynamic Grained Encoder for Vision Transformers.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

An EM Framework for Online Incremental Learning of Semantic Segmentation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Single Image 3D Object Estimation with Primitive Graph Networks.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Weakly Supervised Volumetric Segmentation via Self-taught Shape Denoising Model.
Proceedings of the Medical Imaging with Deep Learning, 7-9 July 2021, Lübeck, Germany, 2021

Superpixel-Guided Iterative Learning from Noisy Labels for Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

Learning Implicit Temporal Alignment for Few-shot Video Classification.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

GNeRF: GAN-based Neural Radiance Field without Posed Camera.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

DER: Dynamically Expandable Representation for Class Incremental Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Learning a Layout Transfer Network for Context Aware Object Detection.
IEEE Trans. Intell. Transp. Syst., 2020

Confidence-Aware Adversarial Learning for Self-supervised Semantic Matching.
Proceedings of the Pattern Recognition and Computer Vision, Third Chinese Conference, 2020

Towards Purely Unsupervised Disentanglement of Appearance and Shape for Person Images Generation.
Proceedings of the HuMA'20: Proceedings of the 1st International Workshop on Human-centric Multimedia Analysis, 2020

LGNN: A Context-aware Line Segment Detector.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Shape-Aware Semi-supervised 3D Semantic Segmentation for Medical Images.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Disentangled Representation Learning for Controllable Image Synthesis: An Information-Theoretic Perspective.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Part-Aware Prototype Network for Few-Shot Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Learning Context-aware Task Reasoning for Efficient Meta Reinforcement Learning.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Learning Cross-Modal Context Graph for Visual Grounding.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Learning Autonomous Exploration and Mapping with Semantic Vision.
CoRR, 2019

LatentGNN: Learning Efficient Non-local Relations for Visual Recognition.
Proceedings of the 36th International Conference on Machine Learning, 2019

Pose-Aware Multi-Level Feature Network for Human Object Interaction Detection.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Dynamic Context Correspondence Network for Semantic Alignment.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

A Dual Attention Network with Semantic Embedding for Few-Shot Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Learning to refine depth for robust stereo estimation.
Pattern Recognit., 2018

Simplifying Sentences with Sequence to Sequence Models.
CoRR, 2018

Instance-Aware Detailed Action Labeling in Videos.
Proceedings of the 2018 IEEE Winter Conference on Applications of Computer Vision, 2018

One-Shot Action Localization by Learning Sequence Matching Network.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

SemStyle: Learning to Generate Stylised Image Captions Using Unaligned Text.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Geometry-Aware Deep Network for Single-Image Novel View Synthesis.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

3D Object Structure Recovery via Semi-supervised Learning on Videos.
Proceedings of the British Machine Vision Conference 2018, 2018

3D Box Proposals From a Single Monocular Image of an Indoor Scene.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Stacked Learning to Search for Scene Labeling.
IEEE Trans. Image Process., 2017

Forest Change Detection in Incomplete Satellite Images With Deep Neural Networks.
IEEE Trans. Geosci. Remote. Sens., 2017

Learning Spatial Transforms for Refining Object Segment Proposals.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

Learning deep structured network for weakly supervised change detection.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Deep Free-Form Deformation Network for Object-Mask Registration.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Indoor Scene Parsing with Instance Segmentation, Semantic Labeling and Support Relationship Inference.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Efficient Scene Layout Aware Object Detection for Traffic Surveillance.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

Predicting Salient Face in Multiple-Face Videos.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Boundary-Aware Instance Segmentation.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Contour Completion Without Region Segmentation.
IEEE Trans. Image Process., 2016

Semantic-Aware Depth Super-Resolution in Outdoor Scenes.
CoRR, 2016

Weakly Supervised Change Detection in a Pair of Images.
CoRR, 2016

Shape-aware Instance Segmentation.
CoRR, 2016

Learning Hough Transform with Latent Structures for Joint Object Detection and Pose Estimation.
Proceedings of the MultiMedia Modeling - 22nd International Conference, 2016

Semantic context and depth-aware object proposal generation.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

Building Scene Models by Completing and Hallucinating Depth and Semantics.
Proceedings of the Computer Vision - ECCV 2016, 2016

Learning Dynamic Hierarchical Models for Anytime Scene Labeling.
Proceedings of the Computer Vision - ECCV 2016, 2016

Learning to Co-Generate Object Proposals with a Deep Structured Network.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Learning to Generate Object Segment Proposals with Multi-modal Cues.
Proceedings of the Computer Vision - ACCV 2016, 2016

Object-Aware Dictionary Learning with Deep Features.
Proceedings of the Computer Vision - ACCV 2016, 2016

SentiCap: Generating Image Descriptions with Sentiments.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Robust Face Alignment Under Occlusion via Regional Predictive Power Estimation.
IEEE Trans. Image Process., 2015

Winding Number Constrained Contour Detection.
IEEE Trans. Image Process., 2015

Structured Depth Prediction in Challenging Monocular Video Sequences.
CoRR, 2015

Motion Segmentation of Truncated Signed Distance Function Based Volumetric Surfaces.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

Choosing Basic-Level Concept Names Using Visual and Language Context.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

Multi-class Semantic Video Segmentation with Exemplar-Based Object Reasoning.
Proceedings of the 2015 IEEE Winter Conference on Applications of Computer Vision, 2015

Studying Object Naming with Online Photos and Caption.
Proceedings of the 2015 Workshop on Community-Organized Multimodal Mining: Opportunities for Novel Solutions, 2015

Structural Kernel Learning for Large Scale Multiclass Object Co-detection.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Indoor scene structure analysis for single image depth estimation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Multiclass semantic video segmentation with object-level active inference.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Separating objects and clutter in indoor scenes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Scene understanding by labeling pixels.
Commun. ACM, 2014

Joint semantic and geometric segmentation of videos with a stage model.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Object Co-detection via Efficient Inference in a Fully-Connected CRF.
Proceedings of the Computer Vision - ECCV 2014, 2014

Superpixel Graph Label Transfer with Learned Distance Metric.
Proceedings of the Computer Vision - ECCV 2014, 2014

Data-Driven Street Scene Layout Estimation for Distant Object Detection.
Proceedings of the 2014 International Conference on Digital Image Computing: Techniques and Applications, 2014

Discrete-Continuous Depth Estimation from a Single Image.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

An Exemplar-Based CRF for Multi-instance Object Segmentation.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

2013
Tracking Large-Scale Video Remix in Real-World Events.
IEEE Trans. Multim., 2013

Picture tags and world knowledge: learning tag relations from visual semantic sources.
Proceedings of the ACM Multimedia Conference, 2013

Glass object segmentation by label transfer on joint depth and appearance manifolds.
Proceedings of the IEEE International Conference on Image Processing, 2013

Symmetry detection via contour grouping.
Proceedings of the IEEE International Conference on Image Processing, 2013

Multi-instance Object Segmentation with Exemplars.
Proceedings of the 2013 IEEE International Conference on Computer Vision Workshops, 2013

Learning Structured Hough Voting for Joint Object Detection and Occlusion Reasoning.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Winding Number for Region-Boundary Consistent Salient Contour Extraction.
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

2012
Glass object localization by joint inference of boundary and depth.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Image segmentation for enhancing symbol recognition in prosthetic vision.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

An face-based visual fixation system for prosthetic vision.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

The role of vision processing in prosthetic vision.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Learning Hough Forest with Depth-Encoded Context for Object Detection.
Proceedings of the 2012 International Conference on Digital Image Computing Techniques and Applications, 2012

Connected contours: A new contour completion model that respects the closure effect.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

2011
Spatial Semantics and Classifier Cascades: The ANU 2011 Multimedia Event Detection System.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011

Efficient Image Denoising by MRF Approximation with Uniform-Sampled Multi-spanning-tree.
Proceedings of the Sixth International Conference on Image and Graphics, 2011

Laplacian Margin Distribution Boosting for Learning from Sparsely Labeled Data.
Proceedings of the 2011 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2011

Analysis on Tree Structure Selection for MRF Inference in Low-level Vision.
Proceedings of the 2011 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2011

2010
A unified model of short-range and long-range motion perception.
Proceedings of the Advances in Neural Information Processing Systems 23: 24th Annual Conference on Neural Information Processing Systems 2010. Proceedings of a meeting held 6-9 December 2010, 2010

Occlusion Boundary Detection Using Pseudo-depth.
Proceedings of the Computer Vision, 2010

2008
Learning structured prediction models for image labeling.
PhD thesis, 2008

Learning Flexible Features for Conditional Random Fields.
IEEE Trans. Pattern Anal. Mach. Intell., 2008

Learning Hybrid Models for Image Annotation with Partially Labeled Data.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Using latent Dirichlet allocation to incorporate domain knowledge for topic transition detection.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Latent topic random fields: Learning using a taxonomy of labels.
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008

2006
Topological map learning from outdoor image sequences.
J. Field Robotics, 2006

Learning and Incorporating Top-Down Cues in Image Segmentation.
Proceedings of the Computer Vision, 2006

2004
Multiscale Conditional Random Fields for Image Labeling.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

