Xiaodan Liang

Orcid: 0000-0003-3213-3062

According to our database, Xiaodan Liang authored at least 406 papers between 2011 and 2024.

Bibliography

2024
Correctable Landmark Discovery via Large Models for Vision-Language Navigation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Fine-Grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection.
IEEE Trans. Neural Networks Learn. Syst., November, 2024

Template-Based Contrastive Distillation Pretraining for Math Word Problem Solving.
IEEE Trans. Neural Networks Learn. Syst., September, 2024

DNA Family: Boosting Weight-Sharing NAS With Block-Wise Supervisions.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024

Iterative Graph Self-Distillation.
IEEE Trans. Knowl. Data Eng., March, 2024

Prototypical Graph Contrastive Learning.
IEEE Trans. Neural Networks Learn. Syst., February, 2024

Multi-scale adaptive networks for efficient inference.
Int. J. Mach. Learn. Cybern., February, 2024

Sitcom-Crafter: A Plot-Driven Human Motion Generation System in 3D Scenes.
CoRR, 2024

PIVOT-R: Primitive-Driven Waypoint-Aware World Model for Robotic Manipulation.
CoRR, 2024

Learning Interaction-aware 3D Gaussian Splatting for One-shot Hand Avatars.
CoRR, 2024

UncertaintyRAG: Span-Level Uncertainty Enhanced Long-Context Modeling for Retrieval-Augmented Generation.
CoRR, 2024

Efficient Training of Large Vision Models via Advanced Automated Progressive Learning.
CoRR, 2024

Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models.
CoRR, 2024

Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task.
CoRR, 2024

EasyControl: Transfer ControlNet to Video Diffusion for Controllable Generation and Interpolation.
CoRR, 2024

All Robots in One: A New Standard and Unified Dataset for Versatile, General-Purpose Embodied Agents.
CoRR, 2024

MUSE: Mamba is Efficient Multi-scale Learner for Text-video Retrieval.
CoRR, 2024

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance.
CoRR, 2024

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models.
CoRR, 2024

Benchmarking LLMs for Optimization Modeling and Enhancing Reasoning via Reverse Socratic Synthesis.
CoRR, 2024

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models.
CoRR, 2024

OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion.
CoRR, 2024

Affordances-Oriented Planning using Foundation Models for Continuous Vision-Language Navigation.
CoRR, 2024

Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs.
CoRR, 2024

FVEL: Interactive Formal Verification Environment with Large Language Models via Theorem Proving.
CoRR, 2024

Predicting Genetic Mutation from Whole Slide Images via Biomedical-Linguistic Knowledge Enhanced Multi-label Classification.
CoRR, 2024

UA-Track: Uncertainty-Aware End-to-End 3D Multi-Object Tracking.
CoRR, 2024

AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation.
CoRR, 2024

VITON-DiT: Learning In-the-Wild Video Try-On from Human Dance Videos via Diffusion Transformers.
CoRR, 2024

The SkatingVerse Workshop & Challenge: Methods and Results.
CoRR, 2024

Proving Theorems Recursively.
CoRR, 2024

DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data.
CoRR, 2024

Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs.
CoRR, 2024

MMTryon: Multi-Modal Multi-Reference Control for High-Quality Fashion Generation.
CoRR, 2024

TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation.
CoRR, 2024

ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving.
CoRR, 2024

DialogGen: Multi-modal Interactive Dialogue System for Multi-turn Text-to-Image Generation.
CoRR, 2024

Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation.
CoRR, 2024

NavCoT: Boosting LLM-Based Vision-and-Language Navigation via Learning Disentangled Reasoning.
CoRR, 2024

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis.
CoRR, 2024

MapGPT: Map-Guided Prompting for Unified Vision-and-Language Navigation.
CoRR, 2024

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models.
CoRR, 2024

Optimal operation of integrated electricity and gas networks with risk analysis using downside risk constraints method.
Comput. Chem. Eng., 2024

ATG: Benchmarking Automated Theorem Generation for Generative Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Learning to Generalize Unseen Domains via Multi-source Meta Learning for Text Classification.
Proceedings of the Pattern Recognition - 27th International Conference, 2024

DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for In-Context Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

LEGO-Prover: Neural Theorem Proving with Growing Libraries.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Ins-DetCLIP: Aligning Detection Model to Follow Human-Language Instruction.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MUSTARD: Mastering Uniform Synthesis of Theorem and Proof Data.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

AlignedCoT: Prompting Large Language Models via Native-Speaking Demonstrations.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

GarmentAligner: Text-to-Garment Generation via Retrieval-Augmented Multi-level Corrections.
Proceedings of the Computer Vision - ECCV 2024, 2024

Contrastive Learning with Counterfactual Explanations for Radiology Report Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

Making Large Language Models Better Planners with Reasoning-Decision Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

LayerDiff: Exploring Text-Guided Multi-layered Composable Image Synthesis via Layer-Collaborative Diffusion Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

HumanRefiner: Benchmarking Abnormal Human Generation and Refining with Coarse-to-Fine Pose-Reversible Guidance.
Proceedings of the Computer Vision - ECCV 2024, 2024

DetCLIPv3: Towards Versatile Generative Open-Vocabulary Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AlignMiF: Geometry-Aligned Multimodal Implicit Field for LiDAR-Camera Joint Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

MLP Can Be a Good Transformer Learner.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Holistic Autonomous Driving Understanding by Bird's-Eye-View Injected Multi-Modal Large Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

APTNESS: Incorporating Appraisal Theory and Emotion Support Strategies for Empathetic Response Generation.
Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, 2024

CorNav: Autonomous Agent with Self-Corrected Planning for Zero-Shot Vision-and-Language Navigation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

CLOMO: Counterfactual Logical Modification with Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

RAP: Efficient Text-Video Retrieval with Sparse-and-Correlated Adapter.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Towards Detailed Text-to-Motion Synthesis via Basic-to-Advanced Hierarchical Diffusion Model.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

PTUS: Photo-Realistic Talking Upper-Body Synthesis via 3D-Aware Motion Decomposition Warping.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Monocular 3D Hand Mesh Recovery via Dual Noise Estimation.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

3D Visibility-Aware Generalizable Neural Radiance Fields for Interacting Hands.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Construction and effect evaluation of prediction model for red blood cell transfusion requirement in cesarean section based on artificial intelligence.
BMC Medical Informatics Decis. Mak., December, 2023

Towards Causality-Aware Inferring: A Sequential Discriminative Approach for Medical Diagnosis.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Entity-Graph Enhanced Cross-Modal Pretraining for Instance-Level Product Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Towards Deviation-Robust Agent Navigation via Perturbation-Aware Contrastive Learning.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Discourse-Aware Graph Networks for Textual Logical Reasoning.
IEEE Trans. Pattern Anal. Mach. Intell., October, 2023

Knowledge Fusion Distillation: Improving Distillation with Multi-scale Attention Mechanisms.
Neural Process. Lett., October, 2023

DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Vision Transformers.
IEEE Trans. Pattern Anal. Mach. Intell., April, 2023

Dynamic Support Network for Few-Shot Class Incremental Learning.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Auxiliary signal-guided knowledge encoder-decoder for medical report generation.
World Wide Web (WWW), January, 2023

Caption-Aided Product Detection via Collaborative Pseudo-Label Harmonization.
IEEE Trans. Multim., 2023

Point-Guided Contrastive Learning for Monocular 3-D Object Detection.
IEEE Trans. Cybern., 2023

MARLlib: A Scalable and Efficient Multi-agent Reinforcement Learning Library.
J. Mach. Learn. Res., 2023

WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on.
CoRR, 2023

DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance.
CoRR, 2023

Speak Like a Native: Prompting Large Language Models in a Native Style.
CoRR, 2023

LEGO-Prover: Neural Theorem Proving with Growing Libraries.
CoRR, 2023

Fashion Matrix: Editing Photos by Just Talking.
CoRR, 2023

RM-PRT: Realistic Robotic Manipulation Simulator and Benchmark with Progressive Reasoning Tasks.
CoRR, 2023

MO-VLN: A Multi-Task Benchmark for Open-set Zero-Shot Vision-and-Language Navigation.
CoRR, 2023

UniDiff: Advancing Vision-Language Models with Generative and Discriminative Learning.
CoRR, 2023

Boosting Text-to-Image Diffusion Models with Fine-Grained Semantic Rewards.
CoRR, 2023

Boosting Visual-Language Models by Exploiting Hard Samples.
CoRR, 2023

Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining.
CoRR, 2023

LiDAR-NeRF: Novel LiDAR View Synthesis via Neural Radiance Fields.
CoRR, 2023

Applications of Strongly Regular Cayley Graphs to Codebooks.
IEEE Access, 2023

RecFormer: Recurrent Multi-modal Transformer with History-Aware Contrastive Learning for Visual Dialog.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

RIO: A Benchmark for Reasoning Intention-Oriented Objects in Open Environments.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Vision Language Navigation with Knowledge-driven Environmental Dreamer.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

ViewCo: Discovering Text-Supervised Segmentation Masks via Multi-View Semantic Consistency.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Self-Guided Noise-Free Data Generation for Efficient Zero-Shot Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

CTP: Towards Vision-Language Continual Pretraining via Compatible Momentum Contrast and Topology Preservation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DiffCloth: Diffusion Based Garment Synthesis and Manipulation via Structural Cross-modal Semantic Alignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards High-Fidelity Text-Guided 3D Face Generation and Manipulation Using only Images.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FULLER: Unified Multi-modality Multi-task 3D Perception via Multi-level Gradient Calibration.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

DiffDis: Empowering Generative Diffusion Model with Cross-Modal Discrimination Capability.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

GrowCLIP: Data-aware Automatic Model Growing for Large-scale Contrastive Language-Image Pre-training.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

MixReorg: Cross-Modal Mixed Patch Reorganization is a Good Mask Learner for Open-World Semantic Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Composable Text Controls in Latent Space with ODEs.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

CLIP²: Contrastive Language-Image-Point Pretraining from Real-World Point Cloud Data.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DetCLIPv2: Scalable Open-Vocabulary Object Detection Pre-training via Word-Region Alignment.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GP-VTON: Towards General Purpose Virtual Try-On via Collaborative Local-Flow Global-Parsing Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Learning to Segment Every Referring Object Point by Point.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CapDet: Unifying Dense Captioning and Open-World Detection Pretraining.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Visual Exemplar Driven Task-Prompting for Unified Perception in Autonomous Driving.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Dynamic Graph Enhanced Contrastive Learning for Chest X-Ray Report Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Application of Intelligent Mobile Terminal in Virtual Building Construction Training Teaching.
Proceedings of the Advanced Hybrid Information Processing, 2023

DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-level Value Function.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Actional Atomic-Concept Learning for Demystifying Vision-Language Navigation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

3D-TOGO: Towards Text-Guided Cross-Category 3D Object Generation.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

NLIP: Noise-Robust Language-Image Pre-training.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Configurable Graph Reasoning for Visual Relationship Detection.
IEEE Trans. Neural Networks Learn. Syst., 2022

Knowledge-Routed Visual Question Reasoning: Challenges for Deep Representation Embedding.
IEEE Trans. Neural Networks Learn. Syst., 2022

Unifying Relational Sentence Generation and Retrieval for Medical Image Report Composition.
IEEE Trans. Cybern., 2022

Atom correlation based graph propagation for scene graph generation.
Pattern Recognit., 2022

Adversarial Reinforced Instruction Attacker for Robust Vision-Language Navigation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Graphonomy: Universal Image Parsing via Graph Reasoning and Transfer.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

A Modified Whale Optimization Algorithm and Its Application in Seismic Inversion Problem.
Mob. Inf. Syst., 2022

elBERto: Self-supervised commonsense learning for question answering.
Knowl. Based Syst., 2022

PathReasoner: Explainable reasoning paths for commonsense question answering.
Knowl. Based Syst., 2022

P³OVD: Fine-grained Visual-Text Prompt-Driven Self-Training for Open-Vocabulary Object Detection.
CoRR, 2022

MARLlib: Extending RLlib for Multi-agent Reinforcement Learning.
CoRR, 2022

Learning Self-Regularized Adversarial Views for Self-Supervised Vision Transformers.
CoRR, 2022

Composable Text Control Operations in Latent Space with Ordinary Differential Equations.
CoRR, 2022

PASTA-GAN++: A Versatile Framework for High-Resolution Unpaired Virtual Try-on.
CoRR, 2022

Benchmarking the Robustness of LiDAR-Camera Fusion for 3D Object Detection.
CoRR, 2022

ZeroGen⁺: Self-Guided High-Quality Data Generation in Efficient Zero-Shot Learning.
CoRR, 2022

Modern Augmented Reality: Applications, Trends, and Future Directions.
CoRR, 2022

Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework.
CoRR, 2022

Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation.
CoRR, 2022

Towards robust partially supervised multi-structure medical image segmentation on small-scale data.
Appl. Soft Comput., 2022

AI on the edge: a comprehensive review.
Artif. Intell. Rev., 2022

MedDG: An Entity-Centric Medical Consultation Dataset for Entity-Aware Medical Dialogue Generation.
Proceedings of the Natural Language Processing and Chinese Computing, 2022

CoupAlign: Coupling Word-Pixel with Sentence-Mask Alignments for Referring Image Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

DetCLIP: Dictionary-Enriched Visual-Concept Paralleled Pre-training for Open-world Detection.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Effective Adaptation in Multi-Task Co-Training for Unified Autonomous Driving.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Towards Hard-pose Virtual Try-on via 3D-aware Global Correspondence Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Wukong: A 100 Million Large-scale Chinese Cross-modal Pre-training Benchmark.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Structure-Preserving 3D Garment Modeling with Neural Sewing Machines.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Unbiased Math Word Problems Benchmark for Mitigating Solving Bias.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

ARMANI: Part-level Garment-Text Alignment for Unified Cross-Modal Fashion Design.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

"My nose is running." "Are you also coughing?": Building A Medical Diagnosis Agent with Interpretable Inquiry Logics.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Policy Diagnosis via Measuring Role Diversity in Cooperative Multi-agent RL.
Proceedings of the International Conference on Machine Learning, 2022

FILIP: Fine-grained Interactive Language-Image Pre-Training.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Revisiting Over-smoothing in BERT from the Perspective of Graph.
Proceedings of the Tenth International Conference on Learning Representations, 2022

LogicSolver: Towards Interpretable Math Word Problem Solving with Logical Prompt-enhanced Learning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

MetaLogic: Logical Reasoning Explanations with Fine-Grained Structure.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Improving Multi-turn Emotional Support Dialogue Generation with Lookahead Strategy Planning.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

UniGeo: Unifying Geometry Logical Reasoning via Reformulating Mathematical Expression.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

RelCLIP: Adapting Language-Image Pretraining for Visual Relationship Detection via Relational Contrastive Learning.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

SiRi: A Simple Selective Retraining Mechanism for Transformer-Based Visual Grounding.
Proceedings of the Computer Vision - ECCV 2022, 2022

Open-World Semantic Segmentation via Contrasting and Clustering Vision-Language Embedding.
Proceedings of the Computer Vision - ECCV 2022, 2022

CODA: A Real-World Road Corner Case Dataset for Object Detection in Autonomous Driving.
Proceedings of the Computer Vision - ECCV 2022, 2022

BodyGAN: General-purpose Controllable Neural Human Body Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Continual Object Detection via Prototypical Task Correlation Guided Gating Mechanism.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Beyond Fixation: Dynamic Window Visual Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Knowledge Distillation via the Target-aware Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ADAPT: Vision-Language Navigation with Modality-Aligned Action Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Automated Progressive Learning for Efficient Training of Vision Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Cross-modal Clinical Graph Transformer for Ophthalmic Report Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Arch-Graph: Acyclic Architecture Relation Predictor for Task-Transferable Neural Architecture Search.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Dressing in the Wild by Watching Dance Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

M5Product: Self-harmonized Contrastive Learning for E-commercial Multi-modal Pretraining.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Visual-Language Navigation Pretraining via Prompt-based Environmental Self-exploration.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Contrastive Instruction-Trajectory Learning for Vision-Language Navigation.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Laneformer: Object-Aware Row-Column Transformers for Lane Detection.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

AutoBERT-Zero: Evolving BERT Backbone from Scratch.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Graph Reasoning Networks and Applications.
Proceedings of the Neuro-Symbolic Artificial Intelligence: The State of the Art, 2021

Medical-VLBERT: Medical Visual Language BERT for COVID-19 CT Report Generation With Alternate Learning.
IEEE Trans. Neural Networks Learn. Syst., 2021

GTAE: Graph Transformer-Based Auto-Encoders for Linguistic-Constrained Text Style Transfer.
ACM Trans. Intell. Syst. Technol., 2021

Image Comes Dancing With Collaborative Parsing-Flow Video Synthesis.
IEEE Trans. Image Process., 2021

Interpretable Visual Question Answering by Reasoning on Dependency Trees.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Heterogeneous graph reasoning for knowledge-grounded medical dialogue system.
Neurocomputing, 2021

Heterogeneous Excitation-and-Squeeze Network for visual dialog.
Neurocomputing, 2021

IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning.
CoRR, 2021

DS-Net++: Dynamic Weight Slicing for Efficient Inference in CNNs and Transformers.
CoRR, 2021

M5Product: A Multi-modal Pretraining Benchmark for E-commercial Product Downstream Tasks.
CoRR, 2021

Deep Learning for Embodied Vision Navigation: A Survey.
CoRR, 2021

Don't Take It Literally: An Edit-Invariant Sequence Loss for Text Generation.
CoRR, 2021

SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving.
CoRR, 2021

One Million Scenes for Autonomous Driving: ONCE Dataset.
CoRR, 2021

Prototypical Graph Contrastive Learning.
CoRR, 2021

Vision-Language Navigation with Random Environmental Mixup.
CoRR, 2021

SOON: Scenario Oriented Object Navigation with Graph-based Exploration.
CoRR, 2021

UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers.
CoRR, 2021

Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

One Million Scenes for Autonomous Driving: ONCE Dataset.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

IconQA: A New Benchmark for Abstract Diagram Understanding and Visual Language Reasoning.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

SODA10M: A Large-Scale 2D Self/Semi-Supervised Object Detection Dataset for Autonomous Driving.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

DAGN: Discourse-Aware Graph Network for Logical Reasoning.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

WAS-VTON: Warping Architecture Search for Virtual Try-on Network.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

SparseBERT: Rethinking the Importance Analysis in Self-attention.
Proceedings of the 38th International Conference on Machine Learning, 2021

Unifying Dynamic Optimizer Search and Network Architecture Search.
Proceedings of the 2021 IEEE International Conference on Multimedia and Expo, 2021

Transformer Based Multi-Agent Framework.
Proceedings of the 2021 IEEE International Conference on Multimedia & Expo Workshops, 2021

Loss Function Discovery for Object Detection via Convergence-Simulation Driven Search.
Proceedings of the 9th International Conference on Learning Representations, 2021

UPDeT: Universal Multi-agent RL via Policy Decoupling with Transformers.
Proceedings of the 9th International Conference on Learning Representations, 2021

M3D-VTON: A Monocular-to-3D Virtual Try-On Network.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Product1M: Towards Weakly Supervised Instance-Level Product Retrieval via Cross-Modal Pretraining.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

UltraPose: Synthesizing Dense Pose with 1 Billion Points by Human-body Decoupling 3D Model.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

NASOA: Towards Faster Task-oriented Online Fine-tuning with a Zoo of Models.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Pi-NAS: Improving Neural Architecture Search by Reducing Supernet Training Consistency Shift.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Voxel Transformer for 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Exploring Inter-Channel Correlation for Diversity-preserved Knowledge Distillation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Exploring Geometry-aware Contrast and Clustering Harmonization for Self-supervised 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Linguistically Routing Capsule Network for Out-of-distribution Visual Question Answering.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Vision-Language Navigation with Random Environmental Mixup.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Wav-BERT: Cooperative Acoustic and Linguistic Representation Learning for Low-Resource Speech Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

SOON: Scenario Oriented Object Navigation With Graph-Based Exploration.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Dynamic Slimmable Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

TransNAS-Bench-101: Improving Transferability and Generalizability of Cross-Task Neural Architecture Search.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Multiple Adaptive Strategies-based Rat Swarm Optimizer.
Proceedings of the 7th IEEE International Conference on Cloud Computing and Intelligent Systems, 2021

Towards Quantifiable Dialogue Coherence Evaluation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Neural-Symbolic Solver for Math Word Problems with Auxiliary Tasks.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

GeoQA: A Geometric Question Answering Benchmark Towards Multimodal Numerical Reasoning.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Ada-Segment: Automated Multi-loss Adaptation for Panoptic Segmentation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Adversarial Meta Sampling for Multilingual Low-Resource Speech Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Graph-Evolving Meta-Learning for Low-Resource Medical Dialogue Generation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

REM-Net: Recursive Erasure Memory Network for Commonsense Evidence Refinement.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Improving Deep Binary Embedding Networks by Order-Aware Reweighting of Triplets.
IEEE Trans. Circuits Syst. Video Technol., 2020

Unsupervised object-level video summarization with online motion auto-encoder.
Pattern Recognit. Lett., 2020

Guest editorial: Image/video understanding and analysis.
Pattern Recognit. Lett., 2020

A modified surrogate-assisted multi-swarm artificial bee colony for complex numerical optimization problems.
Microprocess. Microsystems, 2020

REM-Net: Recursive Erasure Memory Network for Commonsense Evidence Refinement.
CoRR, 2020

Towards Robust Medical Image Segmentation on Small-Scale Data with Incomplete Labels.
CoRR, 2020

MedDG: A Large-scale Medical Consultation Dataset for Building Medical Dialogue System.
CoRR, 2020

Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation.
CoRR, 2020

Linguistically Driven Graph Capsule Network for Visual Question Reasoning.
CoRR, 2020

Learning Reinforced Agents with Counterfactual Simulation for Medical Automatic Diagnosis.
CoRR, 2020

Towards Interpretable Natural Language Understanding with Explanations as Latent Variables.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Auto-Panoptic: Cooperative Multi-Component Architecture Search for Panoptic Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

AutoSync: Learning to Synchronize for Data-Parallel Distributed Deep Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Grammatically Recognizing Images with Tree Convolution.
Proceedings of the KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020

CP-GAN: Context Pyramid Generative Adversarial Network for Speech Enhancement.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

A Data-Centric Framework for Composable NLP Workflows.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2020

Record-to-Text Generation with Style Imitation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

CurveLane-NAS: Unifying Lane-Sensitive Architecture Search and Adaptive Point Blending.
Proceedings of the Computer Vision - ECCV 2020, 2020

CATCH: Context-Based Meta Reinforcement Learning for Transferrable Architecture Search.
Proceedings of the Computer Vision - ECCV 2020, 2020

Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Bidirectional Graph Reasoning Network for Panoptic Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Block-Wisely Supervised Neural Architecture Search With Knowledge Distillation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

SP-NAS: Serial-to-Parallel Backbone Search for Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Fashion Editing With Adversarial Parsing Learning.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Vision-Dialog Navigation by Exploring Cross-Modal Memory.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

SM-NAS: Structural-to-Modular Neural Architecture Search for Object Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Universal-RCNN: Universal Object Detector via Transferable Graph R-CNN.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Dynamic Knowledge Routing Network for Target-Guided Open-Domain Conversation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

ElixirNet: Relation-Aware Network Architecture Adaptation for Medical Lesion Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
ConnNet: A Long-Range Relation-Aware Pixel-Connectivity Network for Salient Segmentation.
IEEE Trans. Image Process., 2019

Lifecycle coevolution framework for many evolutionary and swarm intelligence algorithms fusion in solving complex optimization problems.
Swarm Evol. Comput., 2019

Look into Person: Joint Body Parsing & Pose Estimation Network and a New Benchmark.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Dilated temporal relational adversarial network for generic video summarization.
Multim. Tools Appl., 2019

Blockwisely Supervised Neural Architecture Search with Knowledge Distillation.
CoRR, 2019

Explainable High-order Visual Question Reasoning: A New Benchmark and Knowledge-routed Network.
CoRR, 2019

Fashion Editing with Multi-scale Attention Normalization.
CoRR, 2019

Towards Multi-pose Guided Virtual Try-on Network.
CoRR, 2019

Indicator-based multi-objective adaptive bacterial foraging algorithm for RFID network planning.
Clust. Comput., 2019

Soft Transfer Learning via Gradient Diagnosis for Visual Relationship Detection.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Heterogeneous Graph Learning for Visual Commonsense Reasoning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Neural Architecture Search for Adversarial Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Multivariate-Information Adversarial Ensemble for Scalable Joint Distribution Matching.
Proceedings of the 36th International Conference on Machine Learning, 2019

Part-Preserving Pose Manipulation for Person Image Synthesis.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

AutoLoss: Learning Discrete Schedule for Alternate Optimization.
Proceedings of the 7th International Conference on Learning Representations, 2019

Meta R-CNN: Towards General Solver for Instance-Level Low-Shot Learning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Auto-FPN: Automatic Network Architecture Adaptation for Object Detection Beyond Classification.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Towards Multi-Pose Guided Virtual Try-On Network.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

FW-GAN: Flow-Navigated Warping GAN for Video Virtual Try-On.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Layout-Graph Reasoning for Fashion Landmark Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Spatial-Aware Graph Relation Network for Large-Scale Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Learning Personalized Modular Network Guided by Structured Knowledge.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Rethinking Knowledge Graph Propagation for Zero-Shot Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Graphonomy: Universal Human Parsing via Graph Transfer Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Blending-Target Domain Adaptation by Adversarial Meta-Adaptation Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Target-Guided Open-Domain Conversation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

End-to-End Knowledge-Routed Relational Dialogue System for Automatic Diagnosis.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Knowledge-Driven Encode, Retrieve, Paraphrase for Medical Image Report Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Scale-Aware Fast R-CNN for Pedestrian Detection.
IEEE Trans. Multim., 2018

Multistage Object Detection With Group Recursive Learning.
IEEE Trans. Multim., 2018

Proposal-Free Network for Instance-Level Object Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2018

AutoLoss: Learning Discrete Schedules for Alternate Optimization.
CoRR, 2018

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation.
CoRR, 2018

Toward Characteristic-Preserving Image-based Virtual Try-On Network.
CoRR, 2018

Geometric Generalization Based Zero-Shot Learning Dataset Infinite World: Simple Yet Powerful.
CoRR, 2018

Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation.
CoRR, 2018

DTR-GAN: Dilated Temporal Relational Adversarial Network for Video Summarization.
CoRR, 2018

Unsupervised Real-to-Virtual Domain Unification for End-to-End Highway Driving.
CoRR, 2018

Droplet property optimization in printable electronics fabrication using root system growth algorithm.
Comput. Ind. Eng., 2018

Symbolic Graph Reasoning Meets Convolutions.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Hybrid Retrieval-Generation Reinforced Agent for Medical Image Report Generation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Hybrid Knowledge Routed Modules for Large-scale Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Deep Generative Models with Learnable Knowledge Constraints.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Adaptive Temporal Encoding Network for Video Instance-level Human Parsing.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Structured Deep Learning for Pixel-level Understanding.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Reinforced Auto-Zoom Net: Towards Accurate and Fast Breast Cancer Segmentation in Whole-Slide Images.
Proceedings of the Deep Learning in Medical Image Analysis - and - Multimodal Learning for Clinical Decision Support, 2018

Unsupervised Domain Adaptation for Automatic Estimation of Cardiothoracic Ratio.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

SCAN: Structure Correcting Adversarial Network for Organ Segmentation in Chest X-Rays.
Proceedings of the Deep Learning in Medical Image Analysis - and - Multimodal Learning for Clinical Decision Support, 2018

StepDeep: A Novel Spatial-temporal Mobility Event Prediction Framework based on Deep Neural Network.
Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2018

Teaching Robots to Predict Human Motion.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Deep Learning Based Supervised Semantic Segmentation of Electron Cryo-Subtomograms.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

A Modulation Module for Multi-task Learning with Applications in Image Retrieval.
Proceedings of the Computer Vision - ECCV 2018, 2018

Real-to-Virtual Domain Unification for End-to-End Autonomous Driving.
Proceedings of the Computer Vision - ECCV 2018, 2018

Toward Characteristic-Preserving Image-Based Virtual Try-On Network.
Proceedings of the Computer Vision - ECCV 2018, 2018

Generative Semantic Manipulation with Mask-Contrasting GAN.
Proceedings of the Computer Vision - ECCV 2018, 2018

CIRL: Controllable Imitative Reinforcement Learning for Vision-Based Self-driving.
Proceedings of the Computer Vision - ECCV 2018, 2018

Adversarial Geometry-Aware Human Motion Prediction.
Proceedings of the Computer Vision - ECCV 2018, 2018

Instance-Level Human Parsing via Part Grouping Network.
Proceedings of the Computer Vision - ECCV 2018, 2018

RCAA: Relational Context-Aware Agents for Person Search.
Proceedings of the Computer Vision - ECCV 2018, 2018

Dynamic-Structured Semantic Propagation Network.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Reinforcement Cutting-Agent Learning for Video Object Segmentation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Visual Question Reasoning on General Dependency Tree.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Query-Conditioned Three-Player Adversarial Network for Video Summarization.
Proceedings of the British Machine Vision Conference 2018, 2018

Image-derived generative modeling of pseudo-macromolecular structures - towards the statistical assessment of Electron CryoTomography template matching.
Proceedings of the British Machine Vision Conference 2018, 2018

Semantic-aware Grad-GAN for Virtual-to-Real Urban Scene Adaption.
Proceedings of the British Machine Vision Conference 2018, 2018

Adaptive Context-aware Reinforced Agent for Handwritten Text Recognition.
Proceedings of the British Machine Vision Conference 2018, 2018

2017
Artificial Bee Colony Optimizer Based on Bee Life-Cycle for Stationary and Dynamic Optimization.
IEEE Trans. Syst. Man Cybern. Syst., 2017

Attentive Contexts for Object Detection.
IEEE Trans. Multim., 2017

Root system growth biomimicry for global optimization models and emergent behaviors.
Soft Comput., 2017

Optimal layout and deployment for RFID system using a novel hybrid artificial bee colony optimizer based on bee life-cycle model.
Soft Comput., 2017

STC: A Simple to Complex Framework for Weakly-Supervised Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Human Parsing with Contextualized Convolutional Neural Network.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Learning to Segment Human by Watching YouTube.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Generative Semantic Manipulation with Contrasting GAN.
CoRR, 2017

Controllable Text Generation.
CoRR, 2017

Nonparametric Variational Auto-encoders for Hierarchical Representation Learning.
CoRR, 2017

Look into Person: Self-supervised Structure-sensitive Learning and A New Benchmark for Human Parsing.
CoRR, 2017

SCAN: Structure Correcting Adversarial Network for Chest X-rays Organ Segmentation.
CoRR, 2017

ZM-Net: Real-time Zero-shot Image Manipulation Network.
CoRR, 2017

Deep learning-based subdivision approach for large scale macromolecules structure recovery from electron cryo tomograms.
Bioinform., 2017

Poseidon: An Efficient Communication Architecture for Distributed Deep Learning on GPU Clusters.
Proceedings of the 2017 USENIX Annual Technical Conference, 2017

Structured Generative Adversarial Networks.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Deep Attribute-preserving Metric Learning for Natural Language Object Retrieval.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Toward Controlled Generation of Text.
Proceedings of the 34th International Conference on Machine Learning, 2017

Temporal Dynamic Graph LSTM for Action-Driven Video Object Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Dual Motion GAN for Future-Flow Embedded Video Prediction.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Recurrent Topic-Transition GAN for Visual Paragraph Generation.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Nonparametric Variational Auto-Encoders for Hierarchical Representation Learning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Object Region Mining with Adversarial Erasing: A Simple Classification to Semantic Segmentation Approach.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Recurrent 3D Pose Sequence Machines.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Deep Variation-Structured Reinforcement Learning for Visual Relationship and Attribute Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Interpretable Structure-Evolving LSTM.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Perceptual Generative Adversarial Networks for Small Object Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Look into Person: Self-Supervised Structure-Sensitive Learning and a New Benchmark for Human Parsing.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Attention-Aware Face Hallucination via Deep Reinforcement Learning.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

2016
Clothes Co-Parsing Via Joint Image Segmentation and Labeling With Application to Clothing Retrieval.
IEEE Trans. Multim., 2016

Recognizing Focal Liver Lesions in CEUS With Dynamically Trained Latent Structured Models.
IEEE Trans. Medical Imaging, 2016

Scale-Aware Pixelwise Object Proposal Networks.
IEEE Trans. Image Process., 2016

Multi-loss Regularized Deep Neural Network.
IEEE Trans. Circuits Syst. Video Technol., 2016

Learning to segment with image-level annotations.
Pattern Recognit., 2016

Peak-Piloted Deep Network for Facial Expression Recognition.
CoRR, 2016

Multi-stage Object Detection with Group Recursive Learning.
CoRR, 2016

RGB-D Scene Labeling with Long Short-Term Memorized Fusion Model.
CoRR, 2016

Scale-aware Pixel-wise Object Proposal Networks.
CoRR, 2016

Tree-Structured Reinforcement Learning for Sequential Object Localization.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Human Pose Estimation from Depth Images via Inference Embedded Multi-task Learning.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Geometric Scene Parsing with Hierarchical LSTM.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Peak-Piloted Deep Network for Facial Expression Recognition.
Proceedings of the Computer Vision - ECCV 2016, 2016

Is Faster R-CNN Doing Well for Pedestrian Detection?
Proceedings of the Computer Vision - ECCV 2016, 2016

Semantic Object Parsing with Graph LSTM.
Proceedings of the Computer Vision - ECCV 2016, 2016

LSTM-CF: Unifying Context Modeling and Fusion with LSTMs for RGB-D Scene Labeling.
Proceedings of the Computer Vision - ECCV 2016, 2016

Deep Structured Scene Parsing by Learning with Image Descriptions.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Reversible Recursive Instance-Level Object Segmentation.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Semantic Object Parsing with Local-Global Long Short-Term Memory.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Adaptive Bacterial Foraging Algorithm and Its Application in Mobile Robot Path Planning.
Proceedings of the Bio-inspired Computing - Theories and Applications, 2016

Biomimicry of Plant Root Foraging for Distributed Optimization: Models and Emergent Behaviors.
Proceedings of the Bio-inspired Computing - Theories and Applications, 2016

2015
Fashion Parsing With Video Context.
IEEE Trans. Multim., 2015

Deep Human Parsing with Active Template Regression.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

STC: A Simple to Complex Framework for Weakly-supervised Semantic Segmentation.
CoRR, 2015

Scale-aware Fast R-CNN for Pedestrian Detection.
CoRR, 2015

Root system growth for global optimization.
Proceedings of the IEEE International Conference on Information and Automation, 2015

Human Parsing with Contextualized Convolutional Neural Network.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Towards Computational Baby Learning: A Weakly-Supervised Approach for Object Detection.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Matching-CNN meets KNN: Quasi-parametric human parsing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

A NSGA-II with ADMM Mutation for Solving Multi-objective Robust PCA Problem.
Proceedings of the Bio-Inspired Computing - Theories and Applications, 2015

2014
Complex Background Subtraction by Pursuing Dynamic Spatio-Temporal Models.
IEEE Trans. Image Process., 2014

Computational Baby Learning.
CoRR, 2014

Fashion Parsing with Video Context.
Proceedings of the ACM International Conference on Multimedia, 2014

Recognizing focal liver lesions in contrast-enhanced ultrasound with discriminatively trained spatio-temporal model.
Proceedings of the IEEE 11th International Symposium on Biomedical Imaging, 2014

2013
Learning latent spatio-temporal compositional model for human action recognition.
Proceedings of the ACM Multimedia Conference, 2013

Bio-inspired Algorithms for FPGA Implementation of RFID Base-Band Transmission Model & IP Core.
Proceedings of the 2013 Fourth International Conference on Emerging Intelligent Data and Web Technologies, 2013

2011
The Model Design and Simulation of Automatic Placement Path for the Shell of Composite Materials.
Proceedings of the Fourth International Symposium on Parallel Architectures, Algorithms and Programming, 2011

