Conghui He

Orcid: 0000-0001-8697-695X

According to our database, Conghui He authored at least 88 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Exploring the user guidance for more accurate building segmentation from high-resolution remote sensing images.
Int. J. Appl. Earth Obs. Geoinformation, February, 2024

DropQueries: A Simple Way to Discover Comprehensive Segment Representations.
IEEE Trans. Multim., 2024

Weakly Supervised 3-D Building Reconstruction From Monocular Remote Sensing Images.
IEEE Trans. Geosci. Remote. Sens., 2024

Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction.
CoRR, 2024

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models.
CoRR, 2024

PyramidDrop: Accelerating Your Large Vision-Language Models via Pyramid Visual Redundancy Reduction.
CoRR, 2024

DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception.
CoRR, 2024

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models.
CoRR, 2024

Multi-Agent Collaborative Data Selection for Efficient LLM Pretraining.
CoRR, 2024

Utilize the Flow before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning.
CoRR, 2024

Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models.
CoRR, 2024

MinerU: An Open-Source Solution for Precise Document Content Extraction.
CoRR, 2024

Harnessing Diversity for Important Data Selection in Pretraining Large Language Models.
CoRR, 2024

CDM: A Reliable Metric for Fair and Accurate Formula Recognition Evaluation.
CoRR, 2024

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios.
CoRR, 2024

CrossViewDiff: A Cross-View Diffusion Model for Satellite-to-Street View Synthesis.
CoRR, 2024

Fine-Grained Building Function Recognition from Street-View Images via Geometry-Aware Semi-Supervised Learning.
CoRR, 2024

SkyDiffusion: Street-to-Satellite Image Synthesis with Diffusion Models and BEV Paradigm.
CoRR, 2024

Synth-Empathy: Towards High-Quality Synthetic Empathy Data.
CoRR, 2024

SynthVLM: High-Efficiency and High-Quality Synthetic Data for Vision Language Models.
CoRR, 2024

OpenDataLab: Empowering General Artificial Intelligence with Open Datasets.
CoRR, 2024

Navigating the Data Trading Crossroads: An Interdisciplinary Survey.
CoRR, 2024

InternLM-XComposer-2.5: A Versatile Large Vision Language Model Supporting Long-Contextual Input and Output.
CoRR, 2024

KeyVideoLLM: Towards Large-scale Video Keyframe Selection.
CoRR, 2024

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models.
CoRR, 2024

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text.
CoRR, 2024

DSDL: Data Set Description Language for Bridging Modalities and Tasks in AI Data.
CoRR, 2024

A Survey of Multimodal Large Language Model from A Data-centric Perspective.
CoRR, 2024

FoundaBench: Evaluating Chinese Fundamental Knowledge Capabilities of Large Language Models.
CoRR, 2024

How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites.
CoRR, 2024

UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition.
CoRR, 2024

InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD.
CoRR, 2024

H2RSVLM: Towards Helpful and Honest Remote Sensing Large Vision Language Model.
CoRR, 2024

InternLM2 Technical Report.
CoRR, 2024

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.
CoRR, 2024

LOCR: Location-Guided Transformer for Optical Character Recognition.
CoRR, 2024

WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset.
CoRR, 2024

SongComposer: A Large Language Model for Lyric and Melody Composition in Song Generation.
CoRR, 2024

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models.
CoRR, 2024

InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model.
CoRR, 2024

SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

LOCR: Location-Guided Transformer for Optical Character Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

LongWanjuan: Towards Systematic Measurement for Long Text Quality.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-Training.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Cross-View Image Geo-Localization with Panorama-BEV Co-retrieval Network.
Proceedings of the Computer Vision - ECCV 2024, 2024

MMBench: Is Your Multi-modal Model an All-Around Player?
Proceedings of the Computer Vision - ECCV 2024, 2024

Parrot Captions Teach CLIP to Spot Text.
Proceedings of the Computer Vision - ECCV 2024, 2024

ShareGPT4V: Improving Large Multi-modal Models with Better Captions.
Proceedings of the Computer Vision - ECCV 2024, 2024

SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

VIGC: Visual Instruction Generation and Correction.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
Parrot Captions Teach CLIP to Spot Text.
CoRR, 2023

Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization.
CoRR, 2023

InternLM-XComposer: A Vision-Language Large Model for Advanced Text-image Comprehension and Composition.
CoRR, 2023

MiChao-HuaFen 1.0: A Specialized Pre-trained Corpus Dataset for Domain-specific Large Models.
CoRR, 2023

MLLM-DataEngine: An Iterative Refinement Approach for MLLM.
CoRR, 2023

WanJuan: A Comprehensive Multimodal Dataset for Advancing English and Chinese Large Models.
CoRR, 2023

LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model.
CoRR, 2023

V3Det: Vast Vocabulary Visual Detection Dataset.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

OmniCity: Omnipotent City Understanding with Multi-Level and Multi-View Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Think Twice before Driving: Towards Scalable Decoders for End-to-End Autonomous Driving.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SEPT: Towards Scalable and Efficient Visual Pre-training.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Unified Interactive Image Matting.
CoRR, 2022

PersFormer: 3D Lane Detection via Perspective Transformer and the OpenLane Benchmark.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
INTERN: A New Learning Paradigm Towards General Vision.
CoRR, 2021

Influence Selection for Active Learning.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

3D Building Reconstruction from Monocular Remote Sensing Images.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Joint Semantic-geometric Learning for Polygonal Building Segmentation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
FLAVA: Find, Localize, Adjust and Verify to Annotate LiDAR-based Point Clouds.
Proceedings of the UIST '20 Adjunct: The 33rd Annual ACM Symposium on User Interface Software and Technology, 2020

2019
Optimizing Finite Volume Method Solvers on Nvidia GPUs.
IEEE Trans. Parallel Distributed Syst., 2019

Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data.
Remote. Sens., 2019

A Real-Time Tree Crown Detection Approach for Large-Scale Remote Sensing Images on FPGAs.
Remote. Sens., 2019

Finding Mutual X at WeChat-Scale Social Network in Ten Minutes.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019

2018
Simulating the Wenchuan earthquake with accurate surface topography on Sunway TaihuLight.
Proceedings of the International Conference for High Performance Computing, 2018

Semantic Segmentation Based Building Extraction Method Using Multi-Source GIS Map Datasets and Satellite Imagery.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight.
Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017
A Fully-Pipelined Hardware Design for Gaussian Mixture Models.
IEEE Trans. Computers, 2017

18.9-Pflops nonlinear earthquake simulation on Sunway TaihuLight: enabling depiction of 18-Hz and 8-meter scenarios.
Proceedings of the International Conference for High Performance Computing, 2017

An FPGA-based tree crown detection approach for remote sensing images.
Proceedings of the International Conference on Field Programmable Technology, 2017

Exploring the potential of reconfigurable platforms for order book update.
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017

Accelerating Financial Market Server through Hybrid List Design (Abstract Only).
Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2017

A Nanosecond-Level Hybrid Table Design for Financial Market Data Generators.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

2016
A time-space domain stereo finite difference method for 3D scalar wave propagation.
Comput. Geosci., 2016

Refactoring and optimizing the community atmosphere model (CAM) on the sunway taihulight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2016

2014
Global-Scale Associations of Vegetation Phenology with Rainfall and Temperature at a High Spatio-Temporal Resolution.
Remote. Sens., 2014

