Zhixi Cai

Orcid: 0000-0001-7978-0860

According to our database, Zhixi Cai authored at least 14 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Hi-SLAM: Scaling-up Semantics in SLAM with a Hierarchically Categorical Gaussian Splatting.
CoRR, 2024

NEUSIS: A Compositional Neuro-Symbolic Framework for Autonomous Perception, Reasoning, and Planning in Complex UAV Search Missions.
CoRR, 2024

HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning.
CoRR, 2024

MRAC Track 1: 2nd Workshop on Multimodal, Generative and Responsible Affective Computing.
Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, 2024

1M-Deepfakes Detection Challenge.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Glitch in the matrix: A large scale benchmark for content driven audio-visual forgery detection and localization.
Comput. Vis. Image Underst., November, 2023

AV-Deepfake1M: A Large-Scale LLM-Driven Audio-Visual Deepfake Dataset.
CoRR, 2023

Pavlok-Nudge: A Feedback Mechanism for Atomic Behaviour Modification with Snoring Usecase.
CoRR, 2023

Emolysis: A Multimodal Open-Source Group Emotion Analysis and Visualization Toolkit.
CoRR, 2023

"Glitch in the Matrix!": A Large Scale Benchmark for Content Driven Audio-Visual Forgery Detection and Localization.
CoRR, 2023

MARLIN: Masked Autoencoder for facial video Representation LearnINg.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
Do You Really Mean That? Content Driven Audio-Visual Deepfake Dataset and Multimodal Method for Temporal Forgery Localization.
CoRR, 2022

