Understanding Information Storage and Transfer in Multi-Modal Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Distilling Knowledge from Text-to-Image Generative Models Improves Visio-Linguistic Reasoning in CLIP.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision.
Proceedings of the Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, 2024
Strong Baselines for Parameter-Efficient Few-Shot Fine-Tuning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
EditVal: Benchmarking Diffusion Based Text-Guided Image Editing Methods.
CoRR, 2023
Augmenting CLIP with Improved Visio-Linguistic Reasoning.
CoRR, 2023
NP-SemiSeg: When Neural Processes meet Semi-Supervised Semantic Segmentation.
Proceedings of the International Conference on Machine Learning, 2023
Hard-Meta-Dataset++: Towards Understanding Few-Shot Performance on Difficult Tasks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Understanding Personalized Accessibility through Teachable AI: Designing and Evaluating Find My Things for People who are Blind or Low Vision.
Proceedings of the 25th International ACM SIGACCESS Conference on Computers and Accessibility, 2023
NP-Match: When Neural Processes meet Semi-Supervised Learning.
Proceedings of the International Conference on Machine Learning, 2022
Memory Efficient Meta-Learning with Large Images.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
ORBIT: A Real-World Few-Shot Dataset for Teachable Object Recognition.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Disability-first Dataset Creation: Lessons from Constructing a Dataset for Teachable Object Recognition with Blind and Low Vision Data Collectors.
Proceedings of the 23rd International ACM SIGACCESS Conference on Computers and Accessibility, 2021
A Revised Generative Evaluation of Visual Dialogue.
CoRR, 2020
A Dynamic AI System for Extending the Capabilities of Blind People.
Proceedings of the Extended Abstracts of the 2020 CHI Conference on Human Factors in Computing Systems, 2020
Computer vision and natural language processing for people with vision impairment.
PhD thesis, 2019
Visual Dialogue without Vision or Dialogue.
CoRR, 2018
FlipDial: A Generative Model for Two-Way Visual Dialogue.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018
Random forests versus Neural Networks - What's best for camera localization?
Proceedings of the 2017 IEEE International Conference on Robotics and Automation, 2017
Bottom-Up Top-Down Cues for Weakly-Supervised Semantic Segmentation.
Proceedings of the Energy Minimization Methods in Computer Vision and Pattern Recognition, 2017
Random Forests versus Neural Networks - What's Best for Camera Relocalization?
CoRR, 2016
Mining Pixels: Weakly Supervised Semantic Segmentation Using Image Labels.
CoRR, 2016