Yezhou Yang

ORCID: 0000-0003-0126-8976

According to our database, Yezhou Yang authored at least 152 papers between 2009 and 2024.

Bibliography

2024
Open-ti: open traffic intelligence with augmented language model.
Int. J. Mach. Learn. Cybern., October, 2024

Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping.
Remote. Sens., March, 2024

Formalizing and evaluating requirements of perception systems for automated vehicles using spatio-temporal perception logic.
Int. J. Robotics Res., February, 2024

Latent Space Energy-based Neural ODEs.
CoRR, 2024

Roundabout Dilemma Zone Data Mining and Forecasting with Trajectory Prediction and Graph Neural Networks.
CoRR, 2024

Recent Event Camera Innovations: A Survey.
CoRR, 2024

SynTraC: A Synthetic Dataset for Traffic Signal Control from Traffic Monitoring Cameras.
CoRR, 2024

Deep Geometric Moments Promote Shape Consistency in Text-to-3D Generation.
CoRR, 2024

R.A.C.E.: Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion Model.
CoRR, 2024

SEVD: Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception.
CoRR, 2024

Learning Decomposable and Debiased Representations via Attribute-Centric Information Bottlenecks.
CoRR, 2024

λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space.
CoRR, 2024

Towards Addressing the Misalignment of Object Proposal Evaluation for Vision-Language Tasks via Semantic Grounding.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Lost in Translation? Translation Errors and Challenges for Fair Assessment of Text-to-Image Models on Multilingual Concepts.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

TROPE: TRaining-Free Object-Part Enhancement for Seamlessly Improving Fine-Grained Zero-Shot Image Captioning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Precision or Recall? An Analysis of Image Captions for Training Text-to-Image Generation Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Getting it Right: Improving Spatial Consistency in Text-to-Image Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

REVISION: Rendering Tools Enable Spatial Fidelity in Vision-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

Evaluating Multimodal Large Language Models across Distribution Shifts and Augmentations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

eTraM: Event-Based Traffic Monitoring Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Grounding Stylistic Domain Generalization with Quantitative Domain Shift Measures and Synthetic Scene Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

'Eyes of a Hawk and Ears of a Fox': Part Prototype Network for Generalized Zero-Shot Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ConceptBed: Evaluating Concept Learning Abilities of Text-to-Image Diffusion Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Mole Recruitment: Poisoning of Image Classifiers via Selective Batch Sampling.
CoRR, 2023

Improving Diversity with Adversarially Learned Transformations for Domain Generalization.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

SKoPe3D: A Synthetic Dataset for Vehicle Keypoint Perception in 3D from Traffic Monitoring Cameras.
Proceedings of the 25th IEEE International Conference on Intelligent Transportation Systems, 2023

CAROM Air - Vehicle Localization and Traffic Scene Reconstruction from Aerial Videos.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Attributing Image Generative Models using Latent Fingerprints.
Proceedings of the International Conference on Machine Learning, 2023

Adversarial Bayesian Augmentation for Single-Source Domain Generalization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

End-to-end Knowledge Retrieval with Multi-modal Queries.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
A Coulomb Force Inspired Loss Function for High-Performance Pedestrian Detection.
IEEE Signal Process. Lett., 2022

Benchmarking Spatial Relationships in Text-to-Image Generation.
CoRR, 2022

Learning Action-Effect Dynamics from Pairs of Scene-graphs.
CoRR, 2022

Reasoning about Actions over Visual and Linguistic Modalities: A Survey.
CoRR, 2022

Targeted Attack on Deep RL-based Autonomous Driving with Learned Visual Patterns.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

CAVAN: Commonsense Knowledge Anchored Video Captioning.
Proceedings of the 26th International Conference on Pattern Recognition, 2022

Attributable Watermarking of Speech Generative Models.
Proceedings of the IEEE International Conference on Acoustics, 2022

Learning Action-Effect Dynamics for Hypothetical Vision-Language Reasoning Task.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

CRIPP-VQA: Counterfactual Reasoning about Implicit Physical Properties via Video Question Answering.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Injecting Semantic Concepts into End-to-End Image Captioning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

SSR-GNNs: Stroke-based Sketch Representation with Graph Neural Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Tragedy Plus Time: Capturing Unintended Human Activities from Weakly-labeled Videos.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Good, Better, Best: Textual Distractors Generation for Multiple-Choice Visual Question Answering via Reinforcement Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

To Find Waldo You Need Contextual Cues: Debiasing Who's Waldo.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2022

Semantically Distributed Robust Optimization for Vision-and-Language Inference.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Efficient Robotic Object Search Via HIEM: Hierarchical Policy Learning With Intrinsic-Extrinsic Modeling.
IEEE Robotics Autom. Lett., 2021

CLEVR_HYP: A Challenge Dataset and Baselines for Visual Question Answering with Hypothetical Actions over Images.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

CAROM - Vehicle Localization and Traffic Scene Reconstruction from Monocular Cameras on Road Infrastructures.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Decentralized Attribution of Generative Models.
Proceedings of the 9th International Conference on Learning Representations, 2021

SEED: Self-supervised Distillation For Visual Representation.
Proceedings of the 9th International Conference on Learning Representations, 2021

Compressing Visual-linguistic Model via Knowledge Distillation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Weakly Supervised Relative Spatial Reasoning for Visual Question Answering.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Hierarchical and Partially Observable Goal-Driven Policy Learning With Goals Relational Graph.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

First Workshop on Knowledge Injection in Neural Networks (KINN).
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021

SMURF: SeMantic and linguistic UndeRstanding Fusion for Caption Evaluation via Typicality Analysis.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

WeaQA: Weak Supervision via Captions for Visual Question Answering.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

Attribute-Guided Adversarial Training for Robustness to Natural Perturbations.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Neural Style Transfer: A Review.
IEEE Trans. Vis. Comput. Graph., 2020

Low to High Dimensional Modality Hallucination Using Aggregated Fields of View.
IEEE Robotics Autom. Lett., 2020

Fine-grained visual understanding and reasoning.
Neurocomputing, 2020

Self-Supervised VQA: Answering Visual Questions using Images and Captions.
CoRR, 2020

Weak Supervision and Referring Attention for Temporal-Textual Association Learning.
CoRR, 2020

Resisting the Distracting-factors in Pedestrian Detection.
CoRR, 2020

Diverse Visuo-Linguistic Question Answering (DVLQA) Challenge.
CoRR, 2020

memeBot: Towards Automatic Image Meme Generation.
CoRR, 2020

Enabling Incremental Knowledge Transfer for Object Detection at the Edge.
CoRR, 2020

From Seeing to Moving: A Survey on Learning for Visual Indoor Navigation (VIN).
CoRR, 2020

TKD: Temporal Knowledge Distillation for Active Perception.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2020

Learning hierarchical behavior and motion planning for autonomous driving.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Visuo-Linguistic Question Answering (VLQA) Challenge.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

MUTANT: A Training Paradigm for Out-of-Distribution Generalization in Visual Question Answering.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language.
Proceedings of the Computer Vision - ECCV 2020, 2020

VQA-LOL: Visual Question Answering Under the Lens of Logic.
Proceedings of the Computer Vision - ECCV 2020, 2020

Enabling Incremental Knowledge Transfer for Object Detection at the Edge.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Interpretable Partitioned Embedding for Intelligent Multi-item Fashion Outfit Composition.
ACM Trans. Multim. Comput. Commun. Appl., 2019

Robot learning of manipulation activities with overall planning through precedence graph.
Robotics Auton. Syst., 2019

A survey on semantic-based methods for the understanding of human movements.
Robotics Auton. Syst., 2019

GAPLE: Generalizable Approaching Policy LEarning for Robotic Object Searching in Indoor Environment.
IEEE Robotics Autom. Lett., 2019

Good, Better, Best: Textual Distractors Generation for Multi-Choice VQA via Policy Gradient.
CoRR, 2019

Blocksworld Revisited: Learning and Reasoning to Generate Event-Sequences from Image Pairs.
CoRR, 2019

Fluorescence Image Histology Pattern Transformation using Image Style Transfer.
CoRR, 2019

Active Adversarial Evader Tracking with a Probabilistic Pursuer under the Pursuit-Evasion Game Framework.
CoRR, 2019

Improving Model Robustness with Transformation-Invariant Attacks.
CoRR, 2019

How Shall I Drive? Interaction Modeling and Motion Planning towards Empathetic and Socially-Graceful Driving.
CoRR, 2019

Spatial Knowledge Distillation to Aid Visual Reasoning.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Integrating Knowledge and Reasoning in Image Understanding.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

How Shall I Drive? Interaction Modeling and Motion Planning towards Empathetic and Socially-Graceful Driving.
Proceedings of the International Conference on Robotics and Automation, 2019

Image Decomposition and Classification Through a Generative Model.
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019

A Novel Design of Adaptive and Hierarchical Convolutional Neural Networks using Partial Reconfiguration on FPGA.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

Cooking With Blocks: A Recipe for Visual Reasoning on Image-Pairs.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Modularized Textual Grounding for Counterfactual Resilience.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Convolutional neural networks: Ensemble modeling, fine-tuning and unsupervised semantic localization for neurosurgical CLE images.
J. Vis. Commun. Image Represent., 2018

Prediction of Manipulation Actions.
Int. J. Comput. Vis., 2018

Image Understanding using vision and reasoning through Scene Description Graph.
Comput. Vis. Image Underst., 2018

Interpretable Partitioned Embedding for Customized Fashion Outfit Composition.
CoRR, 2018

Weakly Supervised Attention Learning for Textual Phrases Grounding.
CoRR, 2018

Prospects for Theranostics in Neurosurgical Technology: Empowering Confocal Laser Endomicroscopy Diagnostics via Deep Learning.
CoRR, 2018

Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields.
CoRR, 2018

DeepSIC: Deep Semantic Image Compression.
CoRR, 2018

Combining Knowledge and Reasoning through Probabilistic Soft Logic for Image Puzzle Solving.
Proceedings of the Thirty-Fourth Conference on Uncertainty in Artificial Intelligence, 2018

Interpretable Partitioned Embedding for Customized Multi-item Fashion Outfit Composition.
Proceedings of the 2018 ACM on International Conference on Multimedia Retrieval, 2018

Weakly-Supervised Learning-Based Feature Localization for Confocal Laser Endomicroscopy Glioma Images.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2018, 2018

Active Object Perceiver: Recognition-Guided Policy Learning for Object Searching on Mobile Robots.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Extrinsic Dexterity Through Active Slip Control Using Deep Predictive Models.
Proceedings of the 2018 IEEE International Conference on Robotics and Automation, 2018

DeepSIC: Deep Semantic Image Compression.
Proceedings of the Neural Information Processing - 25th International Conference, 2018

DeepSSH: Deep Semantic Structured Hashing for Explainable Person Re-Identification.
Proceedings of the 2018 IEEE International Conference on Image Processing, 2018

Stroke Controllable Fast Style Transfer with Adaptive Receptive Fields.
Proceedings of the Computer Vision - ECCV 2018, 2018

Transductive Unbiased Embedding for Zero-Shot Learning.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Explicit Reasoning over End-to-End Neural Architectures for Visual Question Answering.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Unsupervised Linking of Visual Features to Textual Descriptions in Long Manipulation Activities.
IEEE Robotics Autom. Lett., 2017

TripletGAN: Training Generative Model with Triplet Loss.
CoRR, 2017

Convolutional Neural Networks: Ensemble Modeling, Fine-Tuning and Unsupervised Semantic Localization.
CoRR, 2017

On the Importance of Consistency in Training Deep Neural Networks.
CoRR, 2017

Neural Style Transfer: A Review.
CoRR, 2017

Improving utility of brain tumor confocal laser endomicroscopy: objective value assessment and diagnostic frame detection with convolutional neural networks.
Proceedings of the Medical Imaging 2017: Computer-Aided Diagnosis, 2017

What can I do around here? Deep functional scene understanding for cognitive robots.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Fast task-specific target detection via graph based constraints representation and checking.
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017

Collision-free trajectory planning in human-robot interaction through hand movement prediction from vision.
Proceedings of the 17th IEEE-RAS International Conference on Humanoid Robotics, 2017

Hand Movement Prediction Based Collision-Free Human-Robot Interaction.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2017

2016
What Can I Do Around Here? Deep Functional Scene Understanding for Cognitive Robots.
CoRR, 2016

Answering Image Riddles using Vision and Reasoning through Probabilistic Soft Logic.
CoRR, 2016

LightNet: A Versatile, Standalone Matlab-based Environment for Deep Learning.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Co-active learning to adapt humanoid movement for manipulation.
Proceedings of the 16th IEEE-RAS International Conference on Humanoid Robots, 2016

Reliable Attribute-Based Object Recognition Using High Predictive Value Classifiers.
Proceedings of the Computer Vision - ECCV 2016, 2016

2015
Manipulation Action Understanding for Observation and Execution.
PhD thesis, 2015

Neural Self Talk: Image Understanding via Continuous Questioning and Answering.
CoRR, 2015

From Images to Sentences through Scene Description Graphs using Commonsense Reasoning and Knowledge.
CoRR, 2015

Learning the spatial semantics of manipulation actions through preposition grounding.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Grasp type revisited: A modern perspective on a classical feature for vision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

Learning the Semantics of Manipulation Action.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

Visual Commonsense for Scene Understanding Using Perception, Semantic Parsing and Reasoning.
Proceedings of the 2015 AAAI Spring Symposia, 2015

Robot Learning Manipulation Action Plans by "Watching" Unconstrained Videos from the World Wide Web.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Low-level and high-level prior learning for visual saliency estimation.
Inf. Sci., 2014

Manipulation action tree bank: A knowledge resource for humanoids.
Proceedings of the 14th IEEE-RAS International Conference on Humanoid Robots, 2014

Learning hand movements from markerless demonstrations for humanoid tasks.
Proceedings of the 14th IEEE-RAS International Conference on Humanoid Robots, 2014

2013
Color-to-gray based on chance of happening preservation.
Neurocomputing, 2013

Minimalist plans for interpreting manipulation actions.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Robots with language: Multi-label visual recognition using NLP.
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

Detection of Manipulation Action Consequences (MAC).
Proceedings of the 2013 IEEE Conference on Computer Vision and Pattern Recognition, 2013

Action Attribute Detection from Sports Videos with Contextual Constraints.
Proceedings of the British Machine Vision Conference, 2013

2012
Synergistic methods for using language in robotics.
Proceedings of the Workshop on Performance Metrics for Intelligent Systems, College Park, MD, USA, March 20, 2012

Using a minimal action grammar for activity understanding in the real world.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

Towards a Watson that sees: Language-guided action recognition for robots.
Proceedings of the IEEE International Conference on Robotics and Automation, 2012

2011
Active scene recognition with vision and language.
Proceedings of the IEEE International Conference on Computer Vision, 2011

Corpus-Guided Sentence Generation of Natural Images.
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing, 2011

A Corpus-Guided Framework for Robotic Visual Perception.
Proceedings of the Language-Action Tools for Cognitive Artificial Agents, 2011

2010
What Is the Chance of Happening: A New Way to Predict Where People Look.
Proceedings of the Computer Vision - ECCV 2010, 2010

2009
Visual attention analysis by pseudo gravitational field.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

