Xirong Li

Orcid: 0000-0002-0220-8310

Affiliations:
  • Renmin University of China, Beijing, China
  • University of Amsterdam, The Netherlands (PhD 2012)


According to our database, Xirong Li authored at least 158 papers between 2005 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
The performance of a deep learning system in assisting junior ophthalmologists in diagnosing 13 major fundus diseases: a prospective multi-center clinical trial.
npj Digit. Medicine, 2024

Beyond Coarse-Grained Matching in Video-Text Retrieval.
CoRR, 2024

Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions.
CoRR, 2024

Video to Music Moment Retrieval.
CoRR, 2024

D&M: Enriching E-commerce Videos with Sound Effects by Key Moment Detection and SFX Matching.
CoRR, 2024

ASR-enhanced Multimodal Representation Learning for Cross-Domain Product Retrieval.
CoRR, 2024

PhD: A Prompted Visual Hallucination Evaluation Dataset.
CoRR, 2024

Cliprerank: An Extremely Simple Method For Improving Ad-Hoc Video Search.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Holistic Features are Almost Sufficient for Text-to-Video Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Tackling Long Code Search with Splitting, Encoding, and Aggregating.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024

2023
Bias oriented unbiased data augmentation for cross-bias representation learning.
Multim. Syst., April, 2023

MVSS-Net: Multi-View Multi-Scale Supervised Networks for Image Manipulation Detection.
IEEE Trans. Pattern Anal. Mach. Intell., March, 2023

Adaptive Fusion of Radiomics and Deep Features for Lung Adenocarcinoma Subtype Recognition.
CoRR, 2023

TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval.
CoRR, 2023

Cross-domain Collaborative Learning for Recognizing Multiple Retinal Diseases from Wide-Field Fundus Images.
CoRR, 2023

Revisiting Code Search in a Two-Stage Paradigm.
Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, 2023

ChinaOpen: A Dataset for Open-world Multimodal Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

SAFL-Net: Semantic-Agnostic Feature Learning Network with Auxiliary Plugins for Image Manipulation Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Geometrized Transformer for Self-Supervised Homography Estimation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Towards Making a Trojan-Horse Attack on Text-to-Image Retrieval.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Supervised Domain Adaptation for Recognizing Retinal Diseases from Wide-Field Fundus Images.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2023

2022
Learning Two-Stream CNN for Multi-Modal Age-Related Macular Degeneration Categorization.
IEEE J. Biomed. Health Informatics, 2022

Reading-Strategy Inspired Visual Representation Learning for Text-to-Video Retrieval.
IEEE Trans. Circuits Syst. Video Technol., 2022

3D Object Detection for Autonomous Driving: A Survey.
Pattern Recognit., 2022

BADet: Boundary-Aware 3D Object Detection from Point Clouds.
Pattern Recognit., 2022

Dual Encoding for Video Retrieval by Text.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Renmin University of China at TRECVID 2022: Improving Video Search by Feature Fusion and Negation Understanding.
CoRR, 2022

Long Code for Code Search.
CoRR, 2022

Co-Teaching for Unsupervised Domain Adaptation and Expansion.
CoRR, 2022

Targeted Trojan-Horse Attacks on Language-based Image Retrieval.
CoRR, 2022

Learn to Understand Negation in Video Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Partially Relevant Video Retrieval.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Lesion Localization in OCT by Semi-Supervised Object Detection.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

Fundus Photograph Defect Repair Algorithm Based on Portable Camera Empty Shot.
Proceedings of the Ophthalmic Medical Image Analysis - 9th International Workshop, 2022

Semi-supervised Learning for Nerve Segmentation in Corneal Confocal Microscope Photography.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2022, 2022

Template Mask Based Image Fusion Built-in Algorithm for Wide Field Fundus Cameras.
Proceedings of the Ophthalmic Medical Image Analysis - 9th International Workshop, 2022

Semi-supervised Keypoint Detector and Descriptor for Retinal Image Matching.
Proceedings of the Computer Vision - ECCV 2022, 2022

Lightweight Attentional Feature Fusion: A New Baseline for Text-to-Video Retrieval.
Proceedings of the Computer Vision - ECCV 2022, 2022

Deepfake Network Architecture Attribution.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

DRAG: Dynamic Region-Aware GCN for Privacy-Leaking Image Detection.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Unsupervised Domain Expansion for Visual Categorization.
ACM Trans. Multim. Comput. Commun. Appl., 2021

SEA: Sentence Encoder Assembly for Video Retrieval by Textual Queries.
IEEE Trans. Multim., 2021

Feature Re-Learning with Data Augmentation for Video Relevance Prediction.
IEEE Trans. Knowl. Data Eng., 2021

Detecting Adversarial Image Examples in Deep Neural Networks with Adaptive Noise Reduction.
IEEE Trans. Dependable Secur. Comput., 2021

Lightweight Attentional Feature Fusion for Video Retrieval by Text.
CoRR, 2021

Learning to Disentangle GAN Fingerprint for Fake Image Attribution.
CoRR, 2021

Boundary-Aware 3D Object Detection from Point Clouds.
CoRR, 2021

Mining Dual Emotion for Fake News Detection.
Proceedings of the WWW '21: The Web Conference 2021, 2021

Classifier Belief Optimization for Visual Categorization.
Proceedings of the MultiMedia Modeling - 27th International Conference, 2021

Multi-Modal Multi-Instance Learning for Retinal Disease Recognition.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Multi-Level Visual Representation with Semantic-Reinforced Learning for Video Captioning.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Improving Fake News Detection by Using an Entity-enhanced Framework to Fuse Diverse Multimodal Clues.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

What Matters for Ad-hoc Video Search? A Large-scale Evaluation on TRECVID.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Image Manipulation Detection by Multi-View Multi-Scale Supervision.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Article Reranking by Memory-Enhanced Key Sentence Matching for Detecting Previously Fact-Checked Claims.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Hybrid Space Learning for Language-based Video Retrieval.
CoRR, 2020

iCap: Interative Image Captioning with Predictive Text.
CoRR, 2020

Renmin University of China at TRECVID 2020: Sentence Encoder Assembly for Ad-hoc Video Search.
Proceedings of the 2020 TREC Video Retrieval Evaluation, 2020

AttenNet: Deep Attention Based Retinal Disease Classification in OCT Images.
Proceedings of the MultiMedia Modeling - 26th International Conference, 2020

Towards annotation-free evaluation of cross-lingual image captioning.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020

A W2VV++ Case Study with Automated and Interactive Text-to-Video Retrieval.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

iCap: Interactive Image Captioning with Predictive Text.
Proceedings of the 2020 on International Conference on Multimedia Retrieval, 2020

High-Order Attention Networks for Medical Image Segmentation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2020, 2020

A GAN-based Domain Adaptation Method for Glaucoma Diagnosis.
Proceedings of the 2020 International Joint Conference on Neural Networks, 2020

Learn to Segment Retinal Lesions and Beyond.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Deep Multiple Instance Learning with Spatial Attention for ROP Case Classification, Instance Selection and Abnormality Localization.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

2019
COCO-CN for Cross-Lingual Image Tagging, Captioning, and Retrieval.
IEEE Trans. Multim., 2019

Hierarchical Attention Networks for Medical Image Segmentation.
CoRR, 2019

An automatic particle picking method based on Generative Adversarial Network.
Commun. Inf. Syst., 2019

A Coarse-to-fine Cascading Model for Cataract Nuclear Segmentation in Slit-lamp Photographs.
Proceedings of the 2019 IEEE Visual Communications and Image Processing, 2019

Renmin University of China and Zhejiang Gongshang University at TRECVID 2019: Learn to Search and Describe Videos.
Proceedings of the 2019 TREC Video Retrieval Evaluation, 2019

Four Models for Automatic Recognition of Left and Right Eye in Fundus Images.
Proceedings of the MultiMedia Modeling - 25th International Conference, 2019

Exploring Content-based Video Relevance for Video Click-Through Rate Prediction.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

W2VV++: Fully Deep Learning for Ad-hoc Video Search.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Joint Localization of Optic Disc and Fovea in Ultra-widefield Fundus Images.
Proceedings of the Machine Learning in Medical Imaging - 10th International Workshop, 2019

Fully Deep Learning for Slit-Lamp Photo Based Nuclear Cataract Grading.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Fovea Localization in Fundus Photographs by Faster R-CNN with Physiological Prior.
Proceedings of the Ophthalmic Medical Image Analysis - 6th International Workshop, 2019

Two-Stream CNN with Loose Pair Training for Multi-modal AMD Categorization.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Dual Encoding for Zero-Example Video Retrieval.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Oval Shape Constraint based Optic Disc and Cup Segmentation in Fundus Photographs.
Proceedings of the 30th British Machine Vision Conference 2019, 2019

2018
Cross-Media Similarity Evaluation for Web Image Retrieval in the Wild.
IEEE Trans. Multim., 2018

Predicting Visual Features From Text for Image and Video Caption Retrieval.
IEEE Trans. Multim., 2018

Dual Dense Encoding for Zero-Example Video Retrieval.
CoRR, 2018

Automatic Rumor Detection on Microblogs: A Survey.
CoRR, 2018

COCO-CN for Cross-Lingual Image Tagging, Captioning and Retrieval.
CoRR, 2018

Renmin University of China and Zhejiang Gongshang University at TRECVID 2018: Deep Cross-Modal Embeddings for Video-Text Retrieval.
Proceedings of the 2018 TREC Video Retrieval Evaluation, 2018

Imagination Based Sample Construction for Zero-Shot Learning.
Proceedings of the 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018

Dissimilarity Representation Learning for Generalized Zero-Shot Recognition.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Feature Re-Learning with Data Augmentation for Content-based Video Recommendation.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Deep Text Classification Can be Fooled.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Cross-Class Sample Synthesis for Zero-shot Learning.
Proceedings of the British Machine Vision Conference 2018, 2018

Laser Scar Detection in Fundus Images Using Convolutional Neural Networks.
Proceedings of the Computer Vision - ACCV 2018, 2018

2017
Tag relevance fusion for social image retrieval.
Multim. Syst., 2017

Detecting Adversarial Examples in Deep Networks with Adaptive Noise Reduction.
CoRR, 2017

University of Amsterdam and Renmin University at TRECVID 2017: Searching Video, Detecting Events and Describing Video.
Proceedings of the 2017 TREC Video Retrieval Evaluation, 2017

Fluency-Guided Cross-Lingual Image Captioning.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Harvesting Deep Models for Cross-Lingual Image Annotation.
Proceedings of the 15th International Workshop on Content-Based Multimedia Indexing, 2017

2016
TagBook: A Semantic Video Representation Without Supervision for Event Detection.
IEEE Trans. Multim., 2016

Socializing the Semantic Gap: A Comparative Survey on Image Tag Assignment, Refinement, and Retrieval.
ACM Comput. Surv., 2016

Word2VisualVec: Cross-Media Retrieval by Visual Feature Prediction.
CoRR, 2016

University of Amsterdam and Renmin University at TRECVID 2016: Searching Video, Detecting Events and Describing Video.
Proceedings of the 2016 TREC Video Retrieval Evaluation, 2016

Improving Image Captioning by Concept-Based Sentence Reranking.
Proceedings of the Advances in Multimedia Information Processing - PCM 2016, 2016

Detecting Violence in Video using Subclasses.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Early Embedding and Late Reranking for Video Captioning.
Proceedings of the 2016 ACM Conference on Multimedia Conference, 2016

Adding Chinese Captions to Images.
Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016

2015
Tag Features for Geo-Aware Image Classification.
IEEE Trans. Multim., 2015

Best practices for learning video concept detectors from social media examples.
Multim. Tools Appl., 2015

Speech Emotion Recognition Based on Acoustic Features (original title: 基于声学特征的语言情感识别).
Computer Science (计算机科学), 2015

Zero-shot Image Tagging by Hierarchical Semantic Embedding.
Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2015

Image Tag Assignment, Refinement and Retrieval.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Image Retrieval by Cross-Media Relevance Fusion.
Proceedings of the 23rd Annual ACM Conference on Multimedia Conference, MM '15, Brisbane, Australia, October 26, 2015

Music Positioning and Annotation For Television Videos.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Semantic Concept Annotation For User Generated Videos Using Soundtracks.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

RUCMM at MediaEval 2015 Affective Impact of Movies Task: Fusion of Audio and Visual Cues.
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015

Towards structured semantic embedding of multimedia.
Proceedings of the 7th International Conference on Internet Multimedia Computing and Service, 2015

Detecting semantic concepts in consumer videos using audio.
Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

RUC-Tencent at ImageCLEF 2015: Concept Detection, Localization and Sentence Generation.
Proceedings of the Working Notes of CLEF 2015, 2015

2014
A guided Hopfield evolutionary algorithm with local search for maximum clique problem.
Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Semantic Concept Annotation of Consumer Videos at Frame-Level Using Audio.
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Adaptive Tag Selection for Image Annotation.
Proceedings of the Advances in Multimedia Information Processing - PCM 2014, 2014

Source Separation Improves Music Emotion Recognition.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Few-Example Video Event Retrieval using Tag Propagation.
Proceedings of the International Conference on Multimedia Retrieval, 2014

Speech emotion classification using acoustic features.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Building geo-aware tag features for image classification.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2014

Structure Perturbation Optimization for Hopfield-Type Neural Networks.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2014, 2014

Renmin University of China at ImageCLEF 2014 Scalable Concept Image Annotation.
Proceedings of the Working Notes for CLEF 2014 Conference, 2014

2013
Bootstrapping Visual Categorization With Relevant Negatives.
IEEE Trans. Multim., 2013

Cross-Codebook Image Classification.
Proceedings of the Advances in Multimedia Information Processing - PCM 2013, 2013

Classifying tag relevance with relevant positive and negative examples.
Proceedings of the ACM Multimedia Conference, 2013

A Novel Hybrid SCH-ABC Approach for the Frequency Assignment Problem.
Proceedings of the Neural Information Processing - 20th International Conference, 2013

SCH-EGA: An Efficient Hybrid Algorithm for the Frequency Assignment Problem.
Proceedings of the Engineering Applications of Neural Networks, 2013

Renmin University of China at ImageCLEF 2013 Scalable Concept Image Annotation.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013

Evaluating sources and strategies for learning video concepts from social media.
Proceedings of the 11th International Workshop on Content-Based Multimedia Indexing, 2013

2012
Harvesting Social Images for Bi-Concept Search.
IEEE Trans. Multim., 2012

Fusing Heterogeneous Information for Social Image Retrieval.
Proceedings of the Web-Age Information Management, 2012

Fusing concept detection and geo context for visual search.
Proceedings of the International Conference on Multimedia Retrieval, 2012

2011
The MediaMill TRECVID 2011 Semantic Video Search Engine.
Proceedings of the 2011 TREC Video Retrieval Evaluation, 2011


Personalizing automated image annotation using cross-entropy.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Image search 2.0.
Proceedings of the 19th International Conference on Multimedia 2011, Scottsdale, AZ, USA, November 28, 2011

Social negative bootstrapping for visual categorization.
Proceedings of the 1st International Conference on Multimedia Retrieval, 2011

2010
Unsupervised multi-feature tag relevance learning for social image retrieval.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

2009
Learning Social Tag Relevance by Neighbor Voting.
IEEE Trans. Multim., 2009

Query representation by structured concept threads with application to interactive video retrieval.
J. Vis. Commun. Image Represent., 2009

Visual categorization with negative examples for free.
Proceedings of the 17th International Conference on Multimedia 2009, 2009

Annotating images by harnessing worldwide user-tagged photos.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2009

Annotating Images by Mining Image Search.
Proceedings of the Semantic Mining Technologies for Multimedia Databases, 2009

2008
Annotating Images by Mining Image Search Results.
IEEE Trans. Pattern Anal. Mach. Intell., 2008


Learning tag relevance by neighbor voting for social image retrieval.
Proceedings of the 1st ACM SIGMM International Conference on Multimedia Information Retrieval, 2008

2007
Mapping Query to Semantic Concepts: Leveraging Semantic Indices for Automatic and Interactive Video Retrieval.
Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

The importance of query-concept-mapping for automatic video retrieval.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

SBIA: search-based image annotation by leveraging web-scale images.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

Video search in concept subspace: a text-like paradigm.
Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

Video retrieval with multi-modal features.
Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

2006
Intelligent Multimedia Group of Tsinghua University at TRECVID 2006.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

Image annotation by large-scale content-based image retrieval.
Proceedings of the 14th ACM International Conference on Multimedia, 2006

2005
Tsinghua University at TRECVID 2005.
Proceedings of the 2005 TREC Video Retrieval Evaluation, 2005

