Nicola Messina

Orcid: 0000-0003-3011-2487

According to our database, Nicola Messina authored at least 56 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Cascaded transformer-based networks for Wikipedia large-scale image-caption matching.
Multim. Tools Appl., July, 2024

VISIONE Feature Repository for VBS: Multi-Modal Features and Detected Objects from MVK Dataset.
Dataset, January, 2024

VISIONE Feature Repository for VBS: Multi-Modal Features and Detected Objects from VBSLHE Dataset.
Dataset, January, 2024

Towards Identity-Aware Cross-Modal Retrieval: a Dataset and a Baseline.
CoRR, 2024

Maybe you are looking for CroQS: Cross-modal Query Suggestion for Text-to-Image Retrieval.
CoRR, 2024

Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation.
CoRR, 2024

Mind the Prompt: A Novel Benchmark for Prompt-based Class-Agnostic Counting.
CoRR, 2024

Joint-Dataset Learning and Cross-Consistent Regularization for Text-to-Motion Retrieval.
CoRR, 2024

Is CLIP the main roadblock for fine-grained open-world perception?
CoRR, 2024

Evaluating Performance and Trends in Interactive Video Retrieval: Insights From the 12th VBS Competition.
IEEE Access, 2024

VISIONE 5.0: Enhanced User Interface and AI Models for VBS2024.
Proceedings of the MultiMedia Modeling - 30th International Conference, 2024

Will VISIONE Remain Competitive in Lifelog Image Search?
Proceedings of the 7th Annual ACM Workshop on the Lifelog Search Challenge, 2024

The Devil is in the Fine-Grained Details: Evaluating Open-Vocabulary Object Detectors for Fine-Grained Understanding.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
Interactive video retrieval in the age of effective joint embedding deep models: lessons from the 11th VBS.
Multim. Syst., December, 2023

VISIONE Feature Repository for VBS: Multi-Modal Features and Detected Objects from VBSLHE Dataset.
Dataset, October, 2023

VISIONE Feature Repository for VBS: Multi-Modal Features and Detected Objects from MVK Dataset.
Dataset, September, 2023

VISIONE Feature Repository for VBS: Multi-Modal Features and Detected Objects from V3C1+V3C2 Dataset.
Dataset, July, 2023

CrowdSim2: An Open Synthetic Benchmark for Object Detectors.
Proceedings of the 18th International Joint Conference on Computer Vision, 2023

Text-to-Motion Retrieval: Towards Joint Understanding of Human Motion Data and Natural Language.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

VISIONE at Video Browser Showdown 2023.
Proceedings of the MultiMedia Modeling - 29th International Conference, 2023

Improving Query and Assessment Quality in Text-Based Interactive Video Retrieval Evaluation.
Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

VISIONE: A Large-Scale Video Retrieval System with Advanced Search Functionalities.
Proceedings of the 2023 ACM International Conference on Multimedia Retrieval, 2023

AIMH Lab Approaches for Deepfake Detection.
Proceedings of the Italia Intelligenza Artificiale, 2023

AIMH Lab 2022 Activities for Vision.
Proceedings of the Italia Intelligenza Artificiale, 2023

An Optimized Pipeline for Image-Based Localization in Museums from Egocentric Images.
Proceedings of the Image Analysis and Processing - ICIAP 2023, 2023

MC-GTA: A Synthetic Benchmark for Multi-Camera Vehicle Tracking.
Proceedings of the Image Analysis and Processing - ICIAP 2023, 2023

Development of a Realistic Crowd Simulation Environment for Fine-Grained Validation of People Tracking Methods.
Proceedings of the 18th International Joint Conference on Computer Vision, 2023

VISIONE for newbies: an easier-to-use video retrieval system.
Proceedings of the 20th International Conference on Content-based Multimedia Indexing, 2023

2022
COCO, LVIS, Open Images V4 classes mapping.
Dataset, October, 2022

Bus Violence: a large-scale benchmark for video violence detection in public transport.
Dataset, September, 2022

Relational Learning in Computer Vision.
PhD thesis, 2022

Bus Violence: An Open Benchmark for Video Violence Detection on Public Transport.
Sensors, 2022

The Face Deepfake Detection Challenge.
J. Imaging, 2022

Deep learning for structural health monitoring: An application to heritage structures.
CoRR, 2022

Transformer-Based Multi-modal Proposal and Re-Rank for Wikipedia Image-Caption Matching.
CoRR, 2022

VISIONE at Video Browser Showdown 2022.
Proceedings of the MultiMedia Modeling - 28th International Conference, 2022

A Spatio-Temporal Attentive Network for Video-Based Crowd Counting.
Proceedings of the IEEE Symposium on Computers and Communications, 2022

Towards Unsupervised Machine Learning Approaches for Knowledge Graphs.
Proceedings of the 18th Italian Research Conference on Digital Libraries, 2022

Recurrent Vision Transformer for Solving Visual Reasoning Problems.
Proceedings of the Image Analysis and Processing - ICIAP 2022, 2022

Combining EfficientNet and Vision Transformers for Video Deepfake Detection.
Proceedings of the Image Analysis and Processing - ICIAP 2022, 2022

ALADIN: Distilling Fine-grained Alignment Scores for Efficient Image-Text Matching and Retrieval.
Proceedings of the CBMI 2022: International Conference on Content-based Multimedia Indexing, Graz, Austria, September 14, 2022

2021
Fine-Grained Visual Textual Alignment for Cross-Modal Retrieval Using Transformer Encoders.
ACM Trans. Multim. Comput. Commun. Appl., 2021

Solving the same-different task with convolutional neural networks.
Pattern Recognit. Lett., 2021

Generative Adversarial Networks for Astronomical Images Generation.
CoRR, 2021

AIMH at SemEval-2021 Task 6: Multimodal Classification Using an Ensemble of Transformer Models.
Proceedings of the 15th International Workshop on Semantic Evaluation, 2021

VISIONE at Video Browser Showdown 2021.
Proceedings of the MultiMedia Modeling - 27th International Conference, 2021

Towards Efficient Cross-Modal Visual Textual Retrieval using Transformer-Encoder Deep Features.
Proceedings of the 18th International Conference on Content-Based Multimedia Indexing, 2021

2020
Virtual to Real Adaptation of Pedestrian Detectors.
Sensors, 2020

Learning visual features for relational CBIR.
Int. J. Multim. Inf. Retr., 2020

Virtual to Real adaptation of Pedestrian Detectors for Smart Cities.
CoRR, 2020

Relational Visual-Textual Information Retrieval.
Proceedings of the Similarity Search and Applications - 13th International Conference, 2020

Re-implementing and Extending Relation Network for R-CBIR.
Proceedings of the Digital Libraries: The Era of Big Data and Data Science, 2020

Transformer Reasoning Network for Image-Text Matching and Retrieval.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019
Learning Pedestrian Detection from Virtual Worlds.
Proceedings of the Image Analysis and Processing - ICIAP 2019, 2019

Testing Deep Neural Networks on the Same-Different Task.
Proceedings of the 2019 International Conference on Content-Based Multimedia Indexing, 2019

2018
Learning Relationship-Aware Visual Features.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

