Jie Lei

Affiliations:
  • Meta AI, Seattle, WA, USA
  • University of North Carolina at Chapel Hill, Department of Computer Science, NC, USA (PhD 2022)


According to our database, Jie Lei authored at least 22 papers between 2017 and 2023.

Bibliography

2023
PERCEIVER-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Vision Transformers are Parameter-Efficient Audio-Visual Learners.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

VindLU: A Recipe for Effective Video-and-Language Pretraining.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Revealing Single Frame Bias for Video-and-Language Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
LoopITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval.
CoRR, 2022

Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

EclipSE: Efficient Long-Range Video Retrieval Using Sight and Sound.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
QVHighlights: Detecting Moments and Highlights in Videos via Natural Language Queries.
CoRR, 2021

VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning.
CoRR, 2021

VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Detecting Moments and Highlights in Videos via Natural Language Queries.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Unifying Vision-and-Language Tasks via Text Generation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

mTVR: Multilingual Moment Retrieval in Videos.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
What is More Likely to Happen Next? Video-and-Language Future Event Prediction.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval.
Proceedings of the Computer Vision - ECCV 2020, 2020

TVQA+: Spatio-Temporal Grounding for Video Question Answering.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2018
TVQA: Localized, Compositional Video Question Answering.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

2017
Weakly Supervised Image Classification with Coarse and Fine Labels.
Proceedings of the 14th Conference on Computer and Robot Vision, 2017

