R. Manmatha

Orcid: 0000-0003-2315-8583

Affiliations:
  • University of Massachusetts Amherst, USA


According to our database, R. Manmatha authored at least 119 papers between 1989 and 2024.

Bibliography

2024
Improving Semantic Segmentation via Efficient Self-Training.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2024

NAVERO: Unlocking Fine-Grained Semantics for Video-Language Compositionality.
CoRR, 2024

Mixed-Query Transformer: A Unified Image Segmentation Architecture.
CoRR, 2024

DEED: Dynamic Early Exit on Decoder for Accelerating Encoder-Decoder Transformer Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Multiple-Question Multiple-Answer Text-VQA.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 2024

ICDAR 2024 Competition on Recognition and VQA on Handwritten Documents.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

DocKD: Knowledge Distillation from LLMs for Open-World Document Understanding Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding.
Proceedings of the Computer Vision - ECCV 2024, 2024

On the Scalability of Diffusion-based Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

No Head Left Behind - Multi-Head Alignment Distillation for Transformers.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

DocFormerv2: Local Features for Document Understanding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
SimCon Loss with Multiple Views for Text Supervised Semantic Segmentation.
CoRR, 2023

DocTr: Document Transformer for Structured Information Extraction in Documents.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

PolyFormer: Referring Image Segmentation as Sequential Polygon Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022
On Calibration of Scene-Text Recognition Models.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

GLASS: Global to Local Attention for Scene-Text Spotting.
Proceedings of the Computer Vision - ECCV 2022, 2022

YORO - Lightweight End to End Visual Grounding.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Towards Weakly-Supervised Text Spotting using a Multi-Task Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

LaTr: Layout-Aware Transformer for Scene-Text VQA.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ResNeSt: Split-Attention Networks.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

2021
Saliency Driven Perceptual Image Compression.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

DocFormer: End-to-End Transformer for Document Understanding.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Sequence-to-Sequence Contrastive Learning for Text Recognition.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
A Comprehensive Study of Deep Video Action Recognition.
CoRR, 2020

Document Visual Question Answering Challenge 2020.
CoRR, 2020

DocVQA: A Dataset for VQA on Document Images.
CoRR, 2020

Improving Semantic Segmentation via Self-Training.
CoRR, 2020

Hierarchical Auto-Regressive Model for Image Compression Incorporating Object Saliency and a Deep Perceptual Loss.
CoRR, 2020

SCATTER: Selective Context Attentional Scene Text Recognizer.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Dependence Models for Searching Text in Document Images.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Human Perceptual Evaluations for Image Compression.
CoRR, 2019

Deep Perceptual Compression.
CoRR, 2019

Searching for Apparel Products from Images in the Wild.
CoRR, 2019

2018
Compressed Video Action Recognition.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
Sampling Matters in Deep Embedding Learning.
Proceedings of the IEEE International Conference on Computer Vision, 2017

2016
Image Annotation using Multi-scale Hypergraph Heat Diffusion Framework.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

Efficient Exploration of Text Regions in Natural Scene Images Using Adaptive Image Sampling.
Proceedings of the Computer Vision - ECCV 2016 Workshops, 2016

Deep Decision Network for Multi-class Image Classification.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

2015
ICMR 2014: 4th ACM International Conference on Multimedia Retrieval.
SIGIR Forum, 2015

Automatic Image Annotation using Deep Learning Representations.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

2014
Special issue on Multimedia Event Detection.
Mach. Vis. Appl., 2014

Large scale document image retrieval by automatic word annotation.
Int. J. Document Anal. Recognit., 2014

Incorporating query-specific feedback into learning-to-rank models.
Proceedings of the 37th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2014

A Hybrid Model for Automatic Image Annotation.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Modeling Concept Dependencies for Event Detection.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Sequential Word Spotting in Historical Handwritten Documents.
Proceedings of the 11th IAPR International Workshop on Document Analysis Systems, 2014

2013
SRI-Sarnoff AURORA System at TRECVID 2013 Multimedia Event Detection and Recounting.
Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Short Text Queries for Video Retrieval Multimedia event Detection at TRECVID 2013.
Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

Creating an Improved Version Using Noisy OCR from Multiple Editions.
Proceedings of the 12th International Conference on Document Analysis and Recognition, 2013

Formulating Action Recognition as a Ranking Problem.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2013

Predicting retweet count using visual cues.
Proceedings of the 22nd ACM International Conference on Information and Knowledge Management, 2013

2012
A Novel Word Spotting Method Based on Recurrent Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

SRI-Sarnoff AURORA System at TRECVID 2012 Multimedia Event Detection and Recounting.
Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012

Finding translations in scanned book collections.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

A framework for manipulating and searching multiple retrieval types.
Proceedings of the 35th International ACM SIGIR conference on research and development in Information Retrieval, 2012

On Influence of Line Segmentation in Efficient Word Segmentation in Old Manuscripts.
Proceedings of the 2012 International Conference on Frontiers in Handwriting Recognition, 2012

An Efficient Framework for Searching Text in Noisy Document Images.
Proceedings of the 10th IAPR International Workshop on Document Analysis Systems, 2012

2011
A Fast Alignment Scheme for Automatic OCR Evaluation of Books.
Proceedings of the 2011 International Conference on Document Analysis and Recognition, 2011

BLSTM Neural Network Based Word Retrieval for Hindi Documents.
Proceedings of the 2011 International Conference on Document Analysis and Recognition, 2011

Partial duplicate detection for large book collections.
Proceedings of the 20th ACM Conference on Information and Knowledge Management, 2011

Mining relational structure from millions of books: position paper.
Proceedings of the 4th ACM Workshop on Online books, 2011

2010
Adapting BLSTM Neural Network Based Keyword Spotting Trained on Modern Data to Historical Documents.
Proceedings of the International Conference on Frontiers in Handwriting Recognition, 2010

Nearest neighbor based collection OCR.
Proceedings of the Ninth IAPR International Workshop on Document Analysis Systems, 2010

Image retrieval using Markov Random Fields and global image features.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

2009
Finding words in alphabet soup: Inference on freeform character recognition for historical scripts.
Pattern Recognit., 2009

Robust Recognition of Documents by Fusing Results of Word Clusters.
Proceedings of the 10th International Conference on Document Analysis and Recognition, 2009

2008
Document Image Analysis and Recognition.
Wiley Encyclopedia of Computer Science and Engineering, 2008

Distributed image search in camera sensor networks.
Proceedings of the 6th International Conference on Embedded Networked Sensor Systems, 2008

A discrete direct retrieval model for image and video retrieval.
Proceedings of the 7th ACM International Conference on Image and Video Retrieval, 2008

2007
Word spotting for historical documents.
Int. J. Document Anal. Recognit., 2007

Further explorations in text alignment with handwritten documents.
Int. J. Document Anal. Recognit., 2007

Efficient Search in Document Image Collections.
Proceedings of the Computer Vision, 2007

2006
A hierarchical, HMM-based automatic evaluation of OCR accuracy for a digital library of books.
Proceedings of the ACM/IEEE Joint Conference on Digital Libraries, 2006

Exploring the Use of Conditional Random Field Models and HMMs for Historical Handwritten Document Recognition.
Proceedings of the Second International Workshop on Document Image Analysis for Libraries (DIAL 2006), 2006

Aligning Transcripts to Automatically Segmented Handwritten Manuscripts.
Proceedings of the Document Analysis Systems VII, 7th International Workshop, 2006

2005
Multimedia information retrieval: workshop report.
SIGIR Forum, 2005

A Scale Space Approach for Automatically Segmenting Words from Historical Handwritten Documents.
IEEE Trans. Pattern Anal. Mach. Intell., 2005

Boosted decision trees for word recognition in handwritten document retrieval.
SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005

Joint visual-text modeling for automatic retrieval of multimedia documents.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Classification Models for Historical Manuscript Recognition.
Proceedings of the Eighth International Conference on Document Analysis and Recognition (ICDAR 2005), 29 August, 2005

Combining text and audio-visual features in video indexing.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Learning Shapes for Image Classification and Retrieval.
Proceedings of the Image and Video Retrieval, 4th International Conference, 2005

2004
A search engine for historical manuscript images.
SIGIR 2004: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2004

Statistical models for automatic video annotation and retrieval.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Holistic Word Recognition for Handwritten Historical Documents.
Proceedings of the 1st International Workshop on Document Image Analysis for Libraries (DIAL 2004), 2004

Text Alignment with Handwritten Documents.
Proceedings of the 1st International Workshop on Document Image Analysis for Libraries (DIAL 2004), 2004

Multiple Bernoulli Relevance Models for Image and Video Annotation.
Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2004), with CD-ROM, 27 June, 2004

An Inference Network Approach to Image Retrieval.
Proceedings of the Image and Video Retrieval: Third International Conference, 2004

Using Maximum Entropy for Automatic Image Annotation.
Proceedings of the Image and Video Retrieval: Third International Conference, 2004

2003
Challenges in information retrieval and language modeling: report of a workshop held at the center for intelligent information retrieval, University of Massachusetts Amherst, September 2002.
SIGIR Forum, 2003

Automatic image annotation and retrieval using cross-media relevance models.
SIGIR 2003: Proceedings of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, July 28, 2003

A Model for Learning the Semantics of Pictures.
Advances in Neural Information Processing Systems 16, 2003

Mobile Distributed Information Retrieval for Highly-Partitioned Networks.
Proceedings of the 11th IEEE International Conference on Network Protocols (ICNP 2003), 2003

Features for Word Spotting in Historical Manuscripts.
Proceedings of the 7th International Conference on Document Analysis and Recognition (ICDAR 2003), 2003

Word Image Matching Using Dynamic Time Warping.
Proceedings of the 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2003), 2003

2002
A critical examination of TDT's cost function.
SIGIR 2002: Proceedings of the 25th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2002

2001
Modeling Score Distributions for Combining the Outputs of Search Engines.
SIGIR 2001: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2001

Automatic Segmentation and Indexing in a Database of Bird Images.
Proceedings of the Eighth International Conference On Computer Vision (ICCV-01), Vancouver, British Columbia, Canada, July 7-14, 2001

1999
Indexing and Retrieval, SIGIR'99 Workshop Summary.
SIGIR Forum, 1999

TextFinder: An Automatic System to Detect and Recognize Text In Images.
IEEE Trans. Pattern Anal. Mach. Intell., 1999

Indexing Flower Patent Images Using Domain Knowledge.
IEEE Intell. Syst., 1999

Scale Space Technique for Word Segmentation in Handwritten Documents.
Proceedings of the Scale-Space Theories in Computer Vision, 1999

1998
Multimedia Indexing and Retrieval, Summary Report.
SIGIR Forum, 1998

On computing global similarity in images.
Proceedings of the Fourth IEEE Workshop on Applications of Computer Vision, 1998

Indexing flowers by color names using domain knowledge-driven segmentation.
Proceedings of the Fourth IEEE Workshop on Applications of Computer Vision, 1998

Retrieving Images by Appearance.
Proceedings of the Sixth International Conference on Computer Vision (ICCV-98), 1998

Computing local and global similarity in images.
Proceedings of the Human Vision and Electronic Imaging III, 1998

Document image cleanup and binarization.
Proceedings of the Document Recognition V, San Jose, CA, USA, January 24, 1998

1997
Image Retrieval by Appearance.
SIGIR '97: Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 1997

Syntactic characterization of appearance and its application to image retrieval.
Proceedings of the Human Vision and Electronic Imaging II, 1997

Finding Text in Images.
Proceedings of the 2nd ACM International Conference on Digital Libraries, 1997

1996
Image Retrieval Using Scale-Space Matching.
Proceedings of the Computer Vision, 1996

Indexing Handwriting Using Word Matching.
Proceedings of the 1st ACM International Conference on Digital Libraries, 1996

Word Spotting: A New Approach to Indexing Handwriting.
Proceedings of the 1996 Conference on Computer Vision and Pattern Recognition (CVPR '96), 1996

1994
Measuring the Affine Transform Using Gaussian Filters.
Proceedings of the Computer Vision, 1994

A framework for recovering affine transforms using points, lines or image brightnesses.
Proceedings of the Conference on Computer Vision and Pattern Recognition, 1994

1993
Extracting affine deformations from image patches. I. Finding scale and rotation.
Proceedings of the Conference on Computer Vision and Pattern Recognition, 1993

1989
A data set for quantitative motion analysis.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1989

