2021
TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training.
CoRR, 2020

Hashing-based Non-Maximum Suppression for Crowded Object Detection.
CoRR, 2020

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks.
Proceedings of the European Conference on Computer Vision (ECCV), 2020