Siyuan Huang

ORCID: 0000-0003-1524-7148

Affiliations:
  • Beijing Institute for General Artificial Intelligence (BIGAI), China
  • University of California, Los Angeles, CA, USA (PhD 2021)


According to our database, Siyuan Huang authored at least 57 papers between 2017 and 2024.

Bibliography

2024
Grasp Multiple Objects With One Hand.
IEEE Robotics Autom. Lett., May, 2024

PhysPart: Physically Plausible Part Completion for Interactable Objects.
CoRR, 2024

Task-oriented Sequential Grounding in 3D Scenes.
CoRR, 2024

Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations.
CoRR, 2024

PhyRecon: Physically Plausible Neural Scene Reconstruction.
CoRR, 2024

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI.
CoRR, 2024

Autonomous Character-Scene Interaction Synthesis from Text Instruction.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

An Embodied Generalist Agent in 3D World.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Neural-Symbolic Recursive Machine for Systematic Generalization.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Unifying 3D Vision-Language Understanding via Promptable Queries.
Proceedings of the Computer Vision - ECCV 2024, 2024

F-HOI: Toward Fine-Grained Semantic-Aligned 3D Human-Object Interactions.
Proceedings of the Computer Vision - ECCV 2024, 2024

SlotLifter: Slot-Guided Feature Lifting for Learning Object-Centric Radiance Fields.
Proceedings of the Computer Vision - ECCV 2024, 2024

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

Move as you Say, Interact as you can: Language-Guided Human Motion Generation with Scene Affordance.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Scaling Up Dynamic Human-Scene Interaction Modeling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Single-view 3D Scene Reconstruction with High-fidelity Shape and Texture.
Proceedings of the International Conference on 3D Vision, 2024

2023
ProBio: A Protocol-guided Multimodal Dataset for Molecular Biology Lab.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

GenDexGrasp: Generalizable Dexterous Grasping.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

SQA3D: Situated Question Answering in 3D Scenes.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Improving Object-centric Learning with Query Optimization.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

A Minimalist Dataset for Systematic Generalization of Perception, Syntax, and Semantics.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

3D-VisTA: Pre-trained Transformer for 3D Vision and Text Alignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Full-Body Articulated Human-Object Interaction.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ARNOLD: A Benchmark for Language-Grounded Task Learning With Continuous States in Realistic 3D Scenes.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Diffusion-based Generation, Optimization, and Planning in 3D Scenes.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GAPartNet: Cross-Category Domain-Generalizable Object Perception and Manipulation via Generalizable and Actionable Parts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
CHAIRS: Towards Full-Body Articulated Human-Object Interaction.
CoRR, 2022

Perceive, Ground, Reason, and Act: A Benchmark for General-purpose Visual Representation.
CoRR, 2022

Unsupervised Object-Centric Learning with Bi-Level Optimized Query Slot Attention.
CoRR, 2022

PartAfford: Part-level Affordance Discovery from 3D Objects.
CoRR, 2022

HUMANISE: Language-conditioned Human Motion Generation in 3D Scenes.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EgoTaskQA: Understanding Human Tasks in Egocentric Videos.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Learning V1 Simple Cells with Vector Representation of Local Content and Matrix Representation of Local Motion.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Human-like Holistic 3D Scene Understanding.
PhD thesis, 2021

A Generalized Earley Parser for Human Activity Parsing and Prediction.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

A HINT from Arithmetic: On Systematic Generalization of Perception, Syntax, and Semantics.
CoRR, 2021

Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

VLGrammar: Grounded Grammar Induction of Vision and Language.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

YouRefIt: Embodied Reference Understanding with Language and Gesture.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Learning Neural Representation of Camera Pose with Matrix Representation of Pose Shift via View Synthesis.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Inter-GPS: Interpretable Geometry Problem Solving with Formal Language and Symbolic Reasoning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

SMART: A Situation Model for Algebra Story Problems via Attributed Grammar.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Learning by Fixing: Solving Math Word Problems with Weak Supervision.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense.
CoRR, 2020

Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning.
Proceedings of the 37th International Conference on Machine Learning, 2020

A Competence-Aware Curriculum for Visual Concepts Learning via Question Answering.
Proceedings of the Computer Vision - ECCV 2020, 2020

LEMMA: A Multi-view Dataset for LEarning Multi-agent Multi-task Activities.
Proceedings of the Computer Vision - ECCV 2020, 2020

2019
PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Understanding Human Gaze Communication by Spatio-Temporal Graph Reasoning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Holistic++ Scene Understanding: Single-View 3D Holistic Scene Parsing and Human Pose Estimation With Human-Object Interaction and Physical Commonsense.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

2018
Configurable 3D Scene Synthesis and 2D Image Rendering with Per-pixel Ground Truth Using Stochastic Grammars.
Int. J. Comput. Vis., 2018

Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image.
Proceedings of the Computer Vision - ECCV 2018, 2018

Human-Centric Indoor Scene Synthesis Using Stochastic Grammar.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Configurable, Photorealistic Image Rendering and Ground Truth Synthesis by Sampling Stochastic Grammars Representing Indoor Scenes.
CoRR, 2017

Predicting Human Activities Using Stochastic Grammar.
Proceedings of the IEEE International Conference on Computer Vision, 2017

