Ruimao Zhang

Orcid: 0000-0001-9511-7532

Affiliations:
  • The Chinese University of Hong Kong, Shenzhen, China


According to our database, Ruimao Zhang authored at least 95 papers between 2011 and 2024.

Bibliography

2024
Hierarchical Weight Averaging for Deep Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., September, 2024

Advancing Medical Radiograph Representation Learning: A Hybrid Pre-training Paradigm with Multilevel Semantic Granularity.
CoRR, 2024

Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models.
CoRR, 2024

F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions.
CoRR, 2024

Open-World Human-Object Interaction Detection via Multi-modal Prompts.
CoRR, 2024

MotionLLM: Understanding Human Behaviors from Human Motions and Videos.
CoRR, 2024

SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension.
CoRR, 2024

MineDreamer: Learning to Follow Instructions via Chain-of-Imagination for Simulated-World Control.
CoRR, 2024

Toward Accurate Camera-based 3D Object Detection via Cascade Depth Estimation and Calibration.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024

HumanTOMATO: Text-aligned Whole-body Motion Generation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Enhancing Human-AI Collaboration Through Logic-Guided Reasoning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

X-Pose: Detecting Any Keypoints.
Proceedings of the Computer Vision - ECCV 2024, 2024

Open-World Human-Object Interaction Detection via Multi-Modal Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

FreeMan: Towards Benchmarking 3D Human Pose Estimation Under Real-World Conditions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SEED-Bench: Benchmarking Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

SmartEdit: Exploring Complex Instruction-Based Image Editing with Multimodal Large Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

X4D-SceneFormer: Enhanced Scene Understanding on 4D Point Cloud Videos through Cross-Modal Knowledge Transfer.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Community Channel-Net: Efficient channel-wise interactions via community graph topology.
Pattern Recognit., September, 2023

Multi-Stage Spatio-Temporal Aggregation Transformer for Video Person Re-Identification.
IEEE Trans. Multim., 2023

SEED-Bench-2: Benchmarking Multimodal Large Language Models.
CoRR, 2023

UniPose: Detecting Any Keypoints.
CoRR, 2023

Molecular Conformation Generation via Shifting Scores.
CoRR, 2023

FreeMan: Towards Benchmarking 3D Human Pose Estimation in the Wild.
CoRR, 2023

SR-OOD: Out-of-Distribution Detection via Sample Repairing.
CoRR, 2023

Boosting Human-Object Interaction Detection with Text-to-Image Diffusion Model.
CoRR, 2023

Hierarchical Weight Averaging for Deep Neural Networks.
CoRR, 2023

Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Discovering Intrinsic Spatial-Temporal Logic Rules to Explain Human Actions.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Dance with You: The Diversity Controllable Dancer Generation via Diffusion Models.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Inherent Consistent Learning for Accurate Semi-supervised Medical Image Segmentation.
Proceedings of the Medical Imaging with Deep Learning, 2023

Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency.
Proceedings of the Medical Imaging with Deep Learning, 2023

YONA: You Only Need One Adjacent Reference-Frame for Accurate and Fast Video Polyp Detection.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2023, 2023

Explicit Box Detection Unifies End-to-End Multi-Person Pose Estimation.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Neural Interactive Keypoint Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

SupFusion: Supervised LiDAR-Camera Fusion for 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Semantic Human Parsing via Scalable Semantic Transfer Over Multiple Label Domains.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Crowd Counting Via Perspective-Guided Fractional-Dilation Convolution.
IEEE Trans. Multim., 2022

MetaCloth: Learning Unseen Tasks of Dense Fashion Landmark Detection From a Few Samples.
IEEE Trans. Image Process., 2022

PolarMask++: Enhanced Polar Representation for Single-Shot Instance Segmentation and Beyond.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds.
CoRR, 2022

Toward Unpaired Multi-modal Medical Image Segmentation via Learning Structured Semantic Consistency.
CoRR, 2022

Active Domain Adaptation with Multi-level Contrastive Units for Semantic Segmentation.
CoRR, 2022

MetaDance: Few-shot Dancing Video Retargeting via Temporal-aware Meta-learning.
CoRR, 2022

AMOS: A Large-Scale Abdominal Multi-Organ Benchmark for Versatile Medical Image Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Toward Clinically Assisted Colorectal Polyp Recognition via Structured Cross-Modal Representation Consistency.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

Polygon-Free: Unconstrained Scene Text Detection with Box Annotations.
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

2DPASS: 2D Priors Assisted Semantic Segmentation on LiDAR Point Clouds.
Proceedings of the Computer Vision - ECCV 2022, 2022

Weakly Supervised Object Localization via Transformer with Implicit Spatial Calibration.
Proceedings of the Computer Vision - ECCV 2022, 2022

Active Domain Adaptation with Multi-level Contrastive Units for Semantic Segmentation.
Proceedings of the Computer Vision - ACCV 2022, 2022

2021
Switchable Normalization for Learning-to-Normalize Deep Representation.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring.
CoRR, 2021

Shallow Attention Network for Polyp Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

Multi-compound Transformer for Accurate Biomedical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2021 - 24th International Conference, Strasbourg, France, September 27, 2021

PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

InstanceRefer: Cooperative Holistic Understanding for Visual Grounding on Point Clouds through Instance Multi-level Contextual Referring.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

End-to-End Dense Video Captioning with Parallel Decoding.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Parser-Free Virtual Try-On via Distilling Appearance Flows.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Sparse Single Sweep LiDAR Point Cloud Segmentation via Learning Contextual Shape Priors from Scene Completion.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
SCN: Switchable Context Network for Semantic Segmentation of RGB-D Images.
IEEE Trans. Cybern., 2020

SSN: Learning Sparse Switchable Normalization via SparsestMax.
Int. J. Comput. Vis., 2020

SelfText Beyond Polygon: Unconstrained Text Detection with Box Supervision and Dynamic Self-Training.
CoRR, 2020

AIM 2020 Challenge on Learned Image Signal Processing Pipeline.
CoRR, 2020

UXNet: Searching Multi-level Feature Aggregation for 3D Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

Progressive Abdominal Segmentation with Adaptively Hard Region Prediction and Feature Enhancement.
Proceedings of the 17th IEEE International Symposium on Biomedical Imaging, 2020

Towards Content-Independent Multi-Reference Super-Resolution: Adaptive Pattern Matching and Feature Aggregation.
Proceedings of the Computer Vision - ECCV 2020, 2020


Exemplar Normalization for Learning Deep Representation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Towards Photo-Realistic Virtual Try-On by Adaptively Generating↔Preserving Image Content.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
SCAN: Self-and-Collaborative Attention Network for Video Person Re-Identification.
IEEE Trans. Image Process., 2019

Progressively diffused networks for semantic visual parsing.
Pattern Recognit., 2019

Hierarchical Scene Parsing by Weakly Supervised Learning with Image Descriptions.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images.
CoRR, 2019

Differentiable Dynamic Normalization for Learning Deep Representation.
Proceedings of the 36th International Conference on Machine Learning, 2019

Differentiable Learning-to-Normalize via Switchable Normalization.
Proceedings of the 7th International Conference on Learning Representations, 2019

Differentiable Learning-to-Group Channels via Groupable Convolutional Neural Networks.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Once a MAN: Towards Multi-Target Attack via Learning Multi-Target Adversarial Network Once.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

SSN: Learning Sparse Switchable Normalization via SparsestMax.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Image-to-Video Person Re-Identification With Temporally Memorized Similarity Learning.
IEEE Trans. Circuits Syst. Video Technol., 2018

Learning deep representations for semantic image parsing: a comprehensive overview.
Frontiers Comput. Sci., 2018

Do Normalization Layers in a Deep ConvNet Really Need to Be Distinct?
CoRR, 2018

Attentive Crowd Flow Machines.
Proceedings of the 2018 ACM Multimedia Conference, 2018

CUImage: A Neverending Learning Platform on a Convolutional Knowledge Graph of Billion Web Images.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

Scheduling Large-scale Distributed Training via Reinforcement Learning.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

2017
Cost-Effective Active Learning for Deep Image Classification.
IEEE Trans. Circuits Syst. Video Technol., 2017

Scene Parsing by Weakly Supervised Learning with Image Descriptions.
CoRR, 2017

Progressively Diffused Networks for Semantic Image Segmentation.
CoRR, 2017

2016
Geometric Scene Parsing with Hierarchical LSTM.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Deep Structured Scene Parsing by Learning with Image Descriptions.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
Bit-Scalable Deep Hashing With Regularized Similarity Learning for Image Retrieval and Person Re-Identification.
IEEE Trans. Image Process., 2015

Adaptive Scene Category Discovery With Generative Learning and Compositional Sampling.
IEEE Trans. Circuits Syst. Video Technol., 2015

2014
Deep boosting: Layered feature mining for general image classification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

2011
Color style transfer by constraint locally linear embedding.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

