2025

A Systematic Survey of Automatic Prompt Optimization Techniques.

CoRR, February, 2025

Modality mixer exploiting complementary information for multi-modal action recognition.

,

,

Muhammad Adi Nugroho

,

Comput. Vis. Image Underst., 2025

Black-Box Visual Prompt Engineering for Mitigating Object Hallucination in Large Vision Language Models.

Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025

Diffusion Model Patching via Mixture-of-Prompts.

Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024

Parameter Efficient Mamba Tuning via Projector-targeted Diagonal-centric Linear Transformation.

CoRR, 2024

RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs.

CoRR, 2024

Don't Miss the Forest for the Trees: Attentional Vision Calibration for Large Vision Language Models.

CoRR, 2024

Sketch-based Video Object Localization.

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

Denoising Task Routing for Diffusion Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts.

Proceedings of the Computer Vision - ECCV 2024, 2024

Flow-Assisted Motion Learning Network for Weakly-Supervised Group Activity Recognition.

Muhammad Adi Nugroho

,

,

,

,

,

,

Proceedings of the Computer Vision - ECCV 2024, 2024

Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition.

Proceedings of the Computer Vision - ECCV 2024, 2024

HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Tackling the Challenges in Scene Graph Generation With Local-to-Global Interactions.

IEEE Trans. Neural Networks Learn. Syst., December, 2023

Cross-modal alignment and translation for missing modality action recognition.

,

,

,

Muhammad Adi Nugroho

,

Comput. Vis. Image Underst., November, 2023

Modality Mixer for Multi-modal Action Recognition.

,

,

,

Muhammad Adi Nugroho

,

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

AHFu-Net: Align, Hallucinate, and Fuse Network for Missing Multimodal Action Recognition.

Muhammad Adi Nugroho

,

,

,

Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

Multi-modal Social Group Activity Recognition in Panoramic Scene.

,

,

,

,

Muhammad Adi Nugroho

,

Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023

Audio-Visual Glance Network for Efficient Video Recognition.

Muhammad Adi Nugroho

,

,

,

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Good Practices for Missing Modality Robust Action Recognition.

,

,

,

Muhammad Adi Nugroho

,

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

Explore and Match: End-to-End Video Grounding with Transformer.

CoRR, 2022

Temporal Flow Mask Attention for Open-Set Long-Tailed Recognition of Wild Animals in Camera-Trap Images.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

2021

What and When to Look?: Temporal Span Proposal Network for Video Visual Relation Detection.

CoRR, 2021