Baoxiong Jia

ORCID: 0000-0002-4968-3290

According to our database, Baoxiong Jia authored at least 34 papers between 2017 and 2024.

Bibliography

2024
Multi-modal Situated Reasoning in 3D Scenes.
CoRR, 2024

PhysPart: Physically Plausible Part Completion for Interactable Objects.
CoRR, 2024

SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields.
CoRR, 2024

Task-oriented Sequential Grounding in 3D Scenes.
CoRR, 2024

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V.
CoRR, 2024

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI.
CoRR, 2024

An Embodied Generalist Agent in 3D World.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Unifying 3D Vision-Language Understanding via Promptable Queries.
Proceedings of the Computer Vision - ECCV 2024, 2024

SlotLifter: Slot-Guided Feature Lifting for Learning Object-Centric Radiance Fields.
Proceedings of the Computer Vision - ECCV 2024, 2024

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Move as you Say, Interact as you can: Language-Guided Human Motion Generation with Scene Affordance.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Learning a Causal Transition Model for Object Cutting.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2023

Improving Object-centric Learning with Query Optimization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

X-VoE: Measuring eXplanatory Violation of Expectation in Physical Events.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Diffusion-based Generation, Optimization, and Planning in 3D Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation.
CoRR, 2022

Unsupervised Object-Centric Learning with Bi-Level Optimized Query Slot Attention.
CoRR, 2022

Latent Diffusion Energy-Based Model for Interpretable Text Modeling.
CoRR, 2022

EgoTaskQA: Understanding Human Tasks in Egocentric Videos.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Latent Diffusion Energy-Based Model for Interpretable Text Modelling.
Proceedings of the International Conference on Machine Learning, 2022

Learning Algebraic Representation for Systematic Generalization in Abstract Reasoning.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
A Generalized Earley Parser for Human Activity Parsing and Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

ACRE: Abstract Causal REasoning Beyond Covariation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
LEMMA: A Multi-view Dataset for LEarning Multi-agent Multi-task Activities.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
Human Activity Understanding and Prediction with Stochastic Grammar.
PhD thesis, 2019

Learning Perceptual Inference by Contrasting.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

RAVEN: A Dataset for Relational and Analogical Visual REasoNing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Generalized Earley Parser: Bridging Symbolic Grammars and Sequence Data for Future Prediction.
Proceedings of the 35th International Conference on Machine Learning, 2018

Learning Human-Object Interactions by Graph Parsing Neural Networks.
Proceedings of the Computer Vision - ECCV 2018, 2018

2017
Mining User Reviews for Mobile App Comparisons.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2017

