2025
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis.
CoRR, May, 2025

From Reflection to Perfection: Scaling Inference-Time Optimization for Text-to-Image Diffusion Models via Reflection Tuning.
CoRR, April, 2025

SANA-Sprint: One-Step Diffusion with Continuous-Time Consistency Distillation.
CoRR, March, 2025

DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2025

2024
FiVL: A Framework for Improved Vision-Language Alignment.
CoRR, 2024

A Noise is Worth Diffusion Guidance.
CoRR, 2024

FastRM: An efficient and automatic explainability framework for multimodal generative models.
CoRR, 2024

LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs.
CoRR, 2024

Margin-aware Preference Optimization for Aligning Diffusion Models without Reference.
CoRR, 2024

PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models.
CoRR, 2024

Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss.
CoRR, 2024

Getting it Right: Improving Spatial Consistency in Text-to-Image Models.
Proceedings of Computer Vision - ECCV 2024, 2024

2022
Vision Transformers Are Robust Learners.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Flood Segmentation on Sentinel-1 SAR Imagery with Semi-Supervised Learning.
CoRR, 2021

Fast and Accurate Quantized Camera Scene Detection on Smartphones, Mobile AI 2021 Challenge: Report.
CoRR, 2021

2020
A review of deep learning with special emphasis on architectures, applications and recent trends.
Knowl. Based Syst., 2020

G-SimCLR: Self-Supervised Contrastive Learning with Guided Projection via Pseudo Labelling.
Proceedings of the 20th International Conference on Data Mining Workshops, 2020

2019
A Review of Deep Learning with Special Emphasis on Architectures, Applications and Recent Trends.
CoRR, 2019