Vasu Sharma

According to our database, Vasu Sharma authored at least 31 papers between 2015 and 2024.

Collaborative distances:

Dijkstra number of three.
Erdős number of three.

Bibliography

2024

DINOv2: Learning Robust Visual Features without Supervision.

Trans. Mach. Learn. Res., 2024

The Brittleness of AI-Generated Image Watermarking Techniques: Examining Their Robustness Against Visual Paraphrasing Attacks.

Niyar R. Barman, [...], [...], Shashwat Bajpai, Shwetangshu Biswas, [...], [...], [...], [...], [...]

CoRR, 2024

An Introduction to Vision-Language Modeling.

CoRR, 2024

Text Quality-Based Pruning for Efficient Training of Language Models.

[...], [...], Newsha Ardalani, Kushal Tirumala, [...], [...], [...], [...], Armen Aghajanyan, [...], Luke Zettlemoyer

CoRR, 2024

Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM.

Sainbayar Sukhbaatar, [...], [...], [...], Xi Victoria Lin, Baptiste Rozière, [...], [...], [...], [...], [...]

CoRR, 2024

ε-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer.

Jacob Zhiyuan Fang, [...], [...], Robinson Piramuthu

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision Workshops, 2024

Demystifying CLIP Data.

[...], [...], Xiaoqing Ellen Tan, [...], [...], [...], [...], [...], Luke Zettlemoyer, Christoph Feichtenhofer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions.

[...], [...], [...], Mary Williamson, [...], Adriana Romero-Soriano

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

E-ViLM: Efficient Video-Language Model via Masked Video Modeling with Semantic Vector-Quantized Tokenizer.

Jacob Zhiyuan Fang, [...], [...], Robinson Piramuthu

CoRR, 2023

Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning.

CoRR, 2023

Alexa, play with robot: Introducing the First Alexa Prize SimBot Challenge on Embodied AI.

CoRR, 2023

DINOv2: Learning Robust Visual Features without Supervision.

CoRR, 2023

Alexa Arena: A User-Centric Interactive Platform for Embodied AI.

CoRR, 2023

Alexa Arena: A User-Centric Interactive Platform for Embodied AI.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MAViL: Masked Audio-Video Learners.

[...], [...], [...], Chaitanya Ryali, [...], [...], [...], [...], [...], Christoph Feichtenhofer

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Shimmy: Accelerating Inter-Container Communication for the IoT Edge.

Manan Khasgiwale, [...], Shivakant Mishra, Biljith Thadichi, [...], [...]

Proceedings of the IEEE Global Communications Conference, 2023

Flap: Fast Language-Audio Pre-Training.

[...], [...], [...], [...], [...]

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

CH-MARL: A Multimodal Benchmark for Cooperative, Heterogeneous Multi-Agent Reinforcement Learning.

[...], [...], [...], [...], [...], Gaurav S. Sukhatme

CoRR, 2022

PISA: PoIncaré Saliency-Aware Interpolative Augmentation.

[...], [...], [...], [...], [...], [...]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Tweet Based Reach Aware Temporal Attention Network for NFT Valuation.

[...], [...], [...], Atula Tejaswi Neerkaje, [...], Dipanwita Guhathakurta, [...]

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2019

Induced Attention Invariance: Defending VQA Models against Adversarial Attacks.

[...], [...], Louis-Philippe Morency

Proceedings of the Visually Grounded Interaction and Language (ViGIL), 2019

Multimodal Behavioral Markers Exploring Suicidal Intent in Social Media Videos.

Ankit Parag Shah, [...], Vaibhav Vaibhav, Mahmoud Alismail, Louis-Philippe Morency

Proceedings of the International Conference on Multimodal Interaction, 2019

Community Regularization of Visually-Grounded Dialog.

[...], Swaminathan Gurumurthy, [...], [...], Katia P. Sycara

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018

Mind Your Language: Learning Visually Grounded Dialog in a Multi-Agent Setting.

[...], Swaminathan Gurumurthy, [...], Katia P. Sycara

CoRR, 2018

Cyclegen: Cyclic consistency based product review generator from attributes.

[...], [...], [...], [...]

Proceedings of the 11th International Conference on Natural Language Generation, 2018

BioAMA: Towards an End to End BioMedical Question Answering System.

[...], Nitish Kulkarni, Srividya Pranavi, [...], [...], Teruko Mitamura

Proceedings of the BioNLP 2018 workshop, Melbourne, Australia, July 19, 2018, 2018

2017

Segmentation Guided Attention Networks for Visual Question Answering.

[...], [...], [...]

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016

Automatic tagging and retrieval of E-Commerce products based on visual features.

[...], [...]

Proceedings of the Student Research Workshop, 2016

2015

Analyzing Newspaper Crime Reports for Identification of Safe Transit Paths.

[...], Rajat Kulshreshtha, [...], Nishant Agrawal, [...]

Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Image summarization using topic modelling.

[...], [...], Nishant Agrawal, [...], Rajat Kulshreshtha

Proceedings of the 2015 IEEE International Conference on Signal and Image Processing Applications, 2015

A Deep Neural Network based approach for vocal extraction from songs.

Proceedings of the 2015 IEEE International Conference on Signal and Image Processing Applications, 2015
