FlameGS: Reconstruct flame light field via Gaussian Splatting.
CoRR, 2024
TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document.
CoRR, 2024
Exploring the Capabilities of Large Multimodal Models on Dense Text.
Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30 - September 4, 2024, Proceedings
Monkey: Image Resolution and Text Label are Important Things for Large Multi-Modal Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024