Botian Shi

ORCID: 0000-0003-3677-7252

According to our database¹, Botian Shi authored at least 59 papers between 2019 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

LiCROcc: Teach Radar for Accurate Semantic Occupancy Prediction Using LiDAR and Camera.

IEEE Robotics Autom. Lett., January, 2025

2024

Few-Shot Cross-Domain Object Detection With Instance-Level Prototype-Based Meta-Learning.

IEEE Trans. Circuits Syst. Video Technol., October, 2024

SensorX2Vehicle: Online Sensors-to-Vehicle Rotation Calibration Methods in Road Scenarios.

IEEE Robotics Autom. Lett., 2024

Human-Like Decision Making at Unsignalized Intersections Using Social Value Orientation.

IEEE Intell. Transp. Syst. Mag., 2024

GeoX: Geometric Problem Solving Through Unified Formalized Vision-Language Pre-training.

CoRR, 2024

OmniDocBench: Benchmarking Diverse PDF Document Parsing with Comprehensive Annotations.

CoRR, 2024

Chimera: Improving Generalist Model with Domain-Specific Experts.

CoRR, 2024

Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling.

CoRR, 2024

ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving.

CoRR, 2024

Training-Free Adaptive Diffusion with Bounded Difference Approximation Strategy.

CoRR, 2024

MinerU: An Open-Source Solution for Precise Document Content Extraction.

CoRR, 2024

DreamForge: Motion-Aware Autoregressive Video Generation for Multi-View Driving Scenes.

CoRR, 2024

DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving.

CoRR, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.

CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.

CoRR, 2024

Continuously Learning, Adapting, and Improving: A Dual-Process Approach to Autonomous Driving.

CoRR, 2024

Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond.

CoRR, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites.

CoRR, 2024

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition.

CoRR, 2024

ChartX & ChartVLM: A Versatile Benchmark and Foundation Model for Complicated Chart Reasoning.

CoRR, 2024

OASim: an Open and Adaptive Simulator based on Neural Rendering for Autonomous Driving.

CoRR, 2024

Drive Like a Human: Rethinking Autonomous Driving with Large Language Models.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2024

LimSim++: A Closed-Loop Platform for Deploying Multimodal LLMs in Autonomous Driving.

Proceedings of the IEEE Intelligent Vehicles Symposium, 2024

Realistic Rainy Weather Simulation for LiDARs in CARLA Simulator.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

An Extrinsic Calibration Method between LiDAR and GNSS/INS for Autonomous Driving.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Zero-training LiDAR-Camera Extrinsic Calibration Method Using Segment Anything Model.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

VeloVox: A Low-Cost and Accurate 4D Object Detector with Single-Frame Point Cloud of Livox LiDAR.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

ReSimAD: Zero-Shot 3D Domain Transfer for Autonomous Driving with Source Reconstruction and Target Simulation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Reg-TTA3D: Better Regression Makes Better Test-Time Adaptive 3D Object Detection.

Proceedings of the Computer Vision - ECCV 2024, 2024

2023

Multi-Sensor Fusion and Cooperative Perception for Autonomous Driving: A Review.

IEEE Intell. Transp. Syst. Mag., 2023

Realistic Rainy Weather Simulation for LiDARs in CARLA Simulator.

CoRR, 2023

Towards Knowledge-driven Autonomous Driving.

CoRR, 2023

SceneDM: Scene-level Multi-agent Trajectory Generation with Consistent Diffusion Models.

CoRR, 2023

On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving.

CoRR, 2023

StructChart: Perception, Structuring, Reasoning for Visual Chart Understanding.

CoRR, 2023

SPOT: Scalable 3D Pre-training via Occupancy Prediction for Autonomous Driving.

CoRR, 2023

TrafficMCTS: A Closed-Loop Traffic Flow Generation Framework with Group-Based Monte Carlo Tree Search.

CoRR, 2023

StreetSurf: Extending Multi-view Implicit Surface Reconstruction to Street Views.

CoRR, 2023

AD-PT: Autonomous Driving Pre-Training with Large-scale Point Cloud Dataset.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

RangePerception: Taming LiDAR Range View for Efficient and Accurate 3D Object Detection.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

SUG: Single-dataset Unified Generalization for 3D Point Cloud Classification.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Uni3D: A Unified Baseline for Multi-Dataset 3D Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Bi3D: Bi-Domain Active Learning for Cross-Domain 3D Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

LoGoNet: Towards Accurate 3D Object Detection with Local-to-Global Cross-Modal Fusion.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

LWSIS: LiDAR-Guided Weakly Supervised Instance Segmentation for Autonomous Driving.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

ADAS: A Simple Active-and-Adaptive Baseline for Cross-Domain 3D Semantic Segmentation.

CoRR, 2022

Multi-modal Sensor Fusion for Auto Driving Perception: A Survey.

CoRR, 2022

Learning Cross-Image Object Semantic Relation in Transformer for Few-Shot Fine-Grained Image Classification.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection.

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Hashing based Efficient Inference for Image-Text Matching.

Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020

A Benchmark for Structured Procedural Knowledge Extraction from Cooking Videos.

CoRR, 2020

UniViLM: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation.

CoRR, 2020

Learning Semantic Concepts and Temporal Alignment for Narrated Video Procedural Captioning.

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Functionality Discovery and Prediction of Physical Objects.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Microsoft Concept Graph: Mining Semantic Concepts for Short Text Understanding.

Data Intell., 2019

Knowledge Aware Semantic Concept Expansion for Image-Text Matching.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Dense Procedure Captioning in Narrated Instructional Videos.

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019
