2021

TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption.

Zhengyuan Yang

Yijuan Lu

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

VIVO: Visual Vocabulary Pre-Training for Novel Object Captioning.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

VIVO: Surpassing Human Performance in Novel Object Captioning with Visual Vocabulary Pre-Training.

CoRR, 2020

Hashing-based Non-Maximum Suppression for Crowded Object Detection.

CoRR, 2020

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks.

Proceedings of Computer Vision - ECCV 2020, 2020