Zhihao Fan

Orcid: 0000-0002-9910-7937

According to our database, Zhihao Fan authored at least 49 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Unifying Structure Reasoning and Language Pre-Training for Complex Reasoning Tasks.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Qwen2 Technical Report.
CoRR, 2024

From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking.
CoRR, 2024

MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results.
CoRR, 2024

Benchmark Self-Evolving: A Multi-Agent Framework for Dynamic LLM Evaluation.
CoRR, 2024

AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis.
CoRR, 2024

DELAN: Dual-Level Alignment for Vision-and-Language Navigation by Cross-Modal Contrastive Learning.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

A flexible speller based on time-space frequency conversion SSVEP stimulation paradigm under dry electrode.
Frontiers Comput. Neurosci., February, 2023

ReForm-Eval: Evaluating Large Vision Language Models via Unified Re-Formulation of Task-Oriented Benchmarks.
CoRR, 2023

Unifying Structure Reasoning and Language Model Pre-training for Complex Reasoning.
CoRR, 2023

AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DANDELION: An ASV Deployed Micro-Profiler Array for Air-Sea Observation.
IROS, 2023

Topic-Aware Modeling for Unsupervised Extractive Summarization.
Proceedings of the International Joint Conference on Neural Networks, 2023

Text Generation with Diffusion Language Models: A Pre-training Approach with Continuous Paragraph Denoise.
Proceedings of the International Conference on Machine Learning, 2023

OTST: A Two-Phase Framework for Joint Denoising and Remosaicing in RGBW CFA.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Query Structure Modeling for Inductive Logical Reasoning Over Knowledge Graphs.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Unifying Cross-Lingual and Cross-Modal Modeling Towards Weakly Supervised Multilingual Vision-Language Pre-training.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

GJTD-LR: A Trainable Grouped Joint Tensor Dictionary With Low-Rank Prior for Single Hyperspectral Image Super-Resolution.
IEEE Trans. Geosci. Remote. Sens., 2022

GENIE: Large Scale Pre-training for Text Generation with Diffusion Model.
CoRR, 2022

A Unified Continuous Learning Framework for Multi-modal Knowledge Discovery and Pre-training.
CoRR, 2022

MVP: Multi-Stage Vision-Language Pre-Training via Multi-Level Semantic Alignment.
CoRR, 2022

Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022

MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022

DRAGONFLY: a UAV Rapidly Deployed Micro-Profiler Array for Underwater Thermocline Observation.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

Residual Feature Distillation Channel Spatial Attention Network for ISP on Smartphone.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Learning to Joint Remosaic and Denoise in Quad Bayer CFA via Universal Multi-scale Channel Attention Network.
Proceedings of the Computer Vision - ECCV 2022 Workshops, 2022

Locate Then Ask: Interpretable Stepwise Reasoning for Multi-hop Question Answering.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Contextual Fine-to-Coarse Distillation for Coarse-grained Response Selection in Open-Domain Conversations.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

SGUNet: Style-guided UNet for adversely conditioned fundus image super-resolution.
Neurocomputing, 2021

Fusion of multi-source retinal fundus images via automatic registration for clinical diagnosis.
Neurocomputing, 2021

Negative Sample is Negative in Its Own Way: Tailoring Negative Sentences for Image-Text Retrieval.
CoRR, 2021

Mask Attention Networks: Rethinking and Strengthen Transformer.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

TCIC: Theme Concepts Learning Cross Language and Vision for Image Captioning.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

An Unsupervised Sampling Approach for Image-Sentence Matching Using Document-level Structural Information.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

PathQG: Neural Question Generation from Facts.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Single Fundus Image Super-Resolution Via Cascaded Channel-Wise Attention Network.
Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2020

An Enhanced Knowledge Injection Model for Commonsense Generation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

Reconstruction of 3D Retina from Multi-viewed Stereo Fundus Images via Dynamic Registration.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

Bridging by Word: Image Grounded Vocabulary Construction for Visual Captioning.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

A Multi-Agent Communication Framework for Question-Worthy Phrase Extraction and Question Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

ISCLAB at SemEval-2018 Task 1: UIR-Miner for Affect in Tweets.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

A Question Type Driven Framework to Diversify Visual Question Generation.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators.
Proceedings of the 27th International Conference on Computational Linguistics, 2018
