Yutong Lu

ORCID: 0000-0001-5315-3375

According to our database, Yutong Lu authored at least 194 papers between 2005 and 2024.

Bibliography

2024
Sophisticated Orchestrating Concurrent DLRM Training on CPU/GPU Platform.
IEEE Trans. Parallel Distributed Syst., November, 2024

TensorMap: A Deep RL-Based Tensor Mapping Framework for Spatial Accelerators.
IEEE Trans. Computers, August, 2024

Deciphering cell types by integrating scATAC-seq data with genome sequences.
Nat. Comput. Sci., April, 2024

SAIH: A Scalable Evaluation Methodology for Understanding AI Performance Trend on HPC Systems.
J. Comput. Sci. Technol., March, 2024

Exploring low-resource medical image classification with weakly supervised prompt learning.
Pattern Recognit., 2024

Topo: Towards a fine-grained topological data processing framework on Tianhe-3 supercomputer.
J. Parallel Distributed Comput., 2024

UNR: Unified Notifiable RMA Library for HPC.
CoRR, 2024

Solving Partial Differential Equations in Different Domains by Operator Learning method Based on Boundary Integral Equations.
CoRR, 2024

Intensive Vision-guided Network for Radiology Report Generation.
CoRR, 2024

HTDcr: a job execution framework for high-throughput computing on supercomputers.
Sci. China Inf. Sci., 2024

Multi-omic analysis tools for microbial metabolites prediction.
Briefings Bioinform., 2024

Extreme-scale Direct Numerical Simulation of Incompressible Turbulence on the Heterogeneous Many-core System.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

MixPert: Optimizing Mixed-Precision Floating-Point Emulation on GPU Integer Tensor Cores.
Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, 2024

Galaxy: A Resource-Efficient Collaborative Edge AI System for In-situ Transformer Inference.
Proceedings of the IEEE INFOCOM 2024, 2024

Equivariant Diffusion for Crystal Structure Prediction.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Efficient Coupling Streaming AI and Ensemble Simulations on HPC Clusters.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

Communication-Efficient Model Parallelism for Distributed In-Situ Transformer Inference.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Welcome Message from the IEEE Cluster 2024 Program Chairs.
Proceedings of the IEEE International Conference on Cluster Computing, 2024

2023
Improving Computation and Memory Efficiency for Real-world Transformer Inference on GPUs.
ACM Trans. Archit. Code Optim., December, 2023

Securing the Ethereum from Smart Ponzi Schemes: Identification Using Static Features.
ACM Trans. Softw. Eng. Methodol., September, 2023

Hierarchical Model Parallelism for Optimizing Inference on Many-core Processor via Decoupled 3D-CNN Structure.
ACM Trans. Archit. Code Optim., September, 2023

Optimizing massively parallel sparse matrix computing on ARM many-core processor.
Parallel Comput., September, 2023

PSLT: A Light-Weight Vision Transformer With Ladder Self-Attention and Progressive Shift.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023

ShockSurv: A machine learning model to accurately predict 28-day mortality for septic shock patients in the intensive care unit.
Biomed. Signal Process. Control., September, 2023

Full-Stack Optimizing Transformer Inference on ARM Many-Core CPU.
IEEE Trans. Parallel Distributed Syst., July, 2023

A Transformer-Based Framework for Parameter Learning of a Land Surface Hydrological Process Model.
Remote. Sens., July, 2023

A parallel structured banded DC algorithm for symmetric eigenvalue problems.
CCF Trans. High Perform. Comput., June, 2023

Identifying B-cell epitopes using AlphaFold2 predicted structures and pretrained language model.
Bioinform., April, 2023

Identifying spatial domain by adapting transcriptomics with histology through contrastive learning.
Briefings Bioinform., March, 2023

VSTH: a user-friendly web server for structure-based virtual screening on Tianhe-2.
Bioinform., January, 2023

Hybrid MPI and CUDA paralleled finite volume unstructured CFD simulations on a multi-GPU system.
Future Gener. Comput. Syst., 2023

Sparkles: Unlocking Chats Across Multiple Images for Multimodal Instruction-Following Models.
CoRR, 2023

Form 10-K Itemization.
CoRR, 2023

Crystal Structure Prediction by Joint Equivariant Diffusion.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LocLoc: Low-level Cues and Local-area Guides for Weakly Supervised Object Localization.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

GRAP: Group-level Resource Allocation Policy for Reconfigurable Dragonfly Network in HPC.
Proceedings of the 37th International Conference on Supercomputing, 2023

KeSCo: Compiler-based Kernel Scheduling for Multi-task GPU Applications.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

MixRec: Orchestrating Concurrent Recommendation Model Training on CPU-GPU platform.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

PEPC: Parallel and Extensible PC Implementation for Causal Structure Learning.
Proceedings of the 7th International Conference on High Performance Compilation, 2023

Accelerating Inference of 3D-CNN on ARM Many-core CPU via Hierarchical Model Partition.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Accurately Identifying Muscle-Invasive Bladder Cancer from MRI via Weakly Supervised Learning.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2023

FinD: Fine-grained Dynamic Task Scheduling with Lightweight Threads on Many-core Processors.
Proceedings of the IEEE Intl Conf on Parallel & Distributed Processing with Applications, 2023

Enhancing Multi-physics Coupling on ARM Many-Core Cluster.
Proceedings of the Advanced Parallel Processing Technologies, 2023

2022
Design and Simulation of Content-Aware Hybrid DRAM-PCM Memory System.
IEEE Trans. Parallel Distributed Syst., 2022

Optimizing data query performance of Bi-cluster for large-scale scientific data in supercomputers.
J. Supercomput., 2022

To Improve Prediction of Binding Residues With DNA, RNA, Carbohydrate, and Peptide Via Multi-Task Deep Neural Networks.
IEEE ACM Trans. Comput. Biol. Bioinform., 2022

Optimizing small channel 3D convolution on GPU with tensor core.
Parallel Comput., 2022

Quantitative evaluation of explainable graph neural networks for molecular property prediction.
Patterns, 2022

iProX in 2021: connecting proteomics data sharing with big data.
Nucleic Acids Res., 2022

Interpretable Graph Transformer Network for Predicting Adsorption Isotherms of Metal-Organic Frameworks.
J. Chem. Inf. Model., 2022

Enhancing Distributed In-Situ CNN Inference in the Internet of Things.
IEEE Internet Things J., 2022

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models.
CoRR, 2022

A tail-tolerant cloud storage scheduling based on precise periodicity detection.
CCF Trans. High Perform. Comput., 2022

A parameter-free deep embedded clustering method for single-cell RNA-seq data.
Briefings Bioinform., 2022

Spatial transcriptomics prediction from histology jointly through Transformer and graph neural networks.
Briefings Bioinform., 2022

A robust and scalable graph neural network for accurate single-cell classification.
Briefings Bioinform., 2022

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

RollBin: reducing code-size via loop rerolling at binary level.
Proceedings of the LCTES '22: 23rd ACM SIGPLAN/SIGBED International Conference on Languages, 2022

Optimized MPI collective algorithms for dragonfly topology.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Handling heavy-tailed input of transformer inference on GPUs.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022

Characterizing and Optimizing Transformer Inference on ARM Many-core Processor.
Proceedings of the 51st International Conference on Parallel Processing, 2022

Exploiting data locality in memory for ORAM to reduce memory access overheads.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

moTuner: a compiler-based auto-tuning approach for mixed-precision operators.
Proceedings of the CF '22: 19th ACM International Conference on Computing Frontiers, Turin, Italy, May 17, 2022

RAISE: Efficient GPU Resource Management via Hybrid Scheduling.
Proceedings of the 22nd IEEE International Symposium on Cluster, 2022

2021
A Location-Based Factorization Machine Model for Web Service QoS Prediction.
IEEE Trans. Serv. Comput., 2021

A Parallel Structured Divide-and-Conquer Algorithm for Symmetric Tridiagonal Eigenvalue Problems.
IEEE Trans. Parallel Distributed Syst., 2021

Model Parallelism Optimization for Distributed Inference Via Decoupled CNN Structure.
IEEE Trans. Parallel Distributed Syst., 2021

A GPU-Accelerated In-Memory Metadata Management Scheme for Large-Scale Parallel File Systems.
J. Comput. Sci. Technol., 2021

Krill: a compiler and runtime system for concurrent graph processing.
Proceedings of the International Conference for High Performance Computing, 2021

DeepPE: Emulating Parameterization in Numerical Weather Forecast Model Through Bidirectional Network.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Applied Data Science Track, 2021

Unifying Multimodal Transformer for Bi-directional Image and Text Generation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

A Picture is Worth a Thousand Words: A Unified System for Diverse Captions and Rich Images Generation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Be Specific, Be Clear: Bridging Machine and Human Captions by Scene-Guided Transformer.
Proceedings of the MMPT@ICMR2021: Proceedings of the 2021 Workshop on Multi-Modal Pre-Training for Multimedia Understanding, 2021

A Fine-grained Optimization to Winograd Convolution Based on Micro-architectural Features of CPU.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

Optimizing Massively Parallel Winograd Convolution on ARM Processor.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

Self-Motivated Communication Agent for Real-World Vision-Dialog Navigation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

DGAT-onco: A differential analysis method to detect oncogenes by integrating functional information of mutations.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2021

Multi-Layer Networks for Ensemble Precipitation Forecasts Postprocessing.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Memory-Efficient and Skew-Tolerant MapReduce Over MPI for Supercomputing Systems.
IEEE Trans. Parallel Distributed Syst., 2020

High-Scalable Collaborated Parallel Framework for Large-Scale Molecular Dynamic Simulation on Tianhe-2 Supercomputer.
IEEE ACM Trans. Comput. Biol. Bioinform., 2020

Validation of Fully Automatic Quantitative Software for Finger Joint Space Narrowing Progression for Rheumatoid Arthritis Patients.
J. Digit. Imaging, 2020

Design and Implementation of the Tianhe-2 Data Storage and Management System.
J. Comput. Sci. Technol., 2020

Accurately Predicting Mutation-Caused Stability Changes from Protein Sequences Using Extreme Gradient Boosting.
J. Chem. Inf. Model., 2020

To Improve Protein Sequence Profile Prediction through Image Captioning on Pairwise Residue Distance Map.
J. Chem. Inf. Model., 2020

SPOT-Fold: Fragment-Free Protein Structure Prediction Guided by Predicted Backbone Structure and Contact Map.
J. Comput. Chem., 2020

An Efficient Method for Training Deep Learning Networks Distributed.
IEICE Trans. Inf. Syst., 2020

A parallel generator of non-Hermitian matrices computed from given spectra.
Concurr. Comput. Pract. Exp., 2020

UniIndex: An index and query middleware for parallel file systems.
Concurr. Comput. Pract. Exp., 2020

HasFS: optimizing file system consistency mechanism on NVM-based hybrid storage architecture.
Clust. Comput., 2020

Improving the efficiency of HPC data movement on container-based virtual cluster.
CCF Trans. High Perform. Comput., 2020

Accurate prediction of genome-wide RNA secondary structure profile based on extreme gradient boosting.
Bioinform., 2020

Traveling the token world: A graph analysis of Ethereum ERC20 token ecosystem.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020

Re-evaluation of Atomic Operations and Graph Coloring for Unstructured Finite Volume GPU Simulations.
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020

Pacon: Improving Scalability and Efficiency of Metadata Service through Partial Consistency.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Communicative Representation Learning on Attributed Molecular Graphs.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Phishing Scam Detection on Ethereum: Towards Financial Security for Blockchain Ecosystem.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Multimodal Brain MRI Translation Focused on Lesions.
Proceedings of the ICMLC 2020: 2020 12th International Conference on Machine Learning and Computing, 2020

Synthesis of Registered Multimodal Medical Images with Lesions.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2020, 2020

An Efficient Approach to Vectorize the Hybrid Breadth-First Search.
Proceedings of the 22nd IEEE International Conference on High Performance Computing and Communications; 18th IEEE International Conference on Smart City; 6th IEEE International Conference on Data Science and Systems, 2020

Accurately Clustering Single-cell RNA-seq data by Capturing Structural Relations between Cells through Graph Convolutional Network.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

An End-to-end Oxford Nanopore Basecaller Using Convolution-augmented Transformer.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

2019
A novel in situ compression method for CFD data based on generative adversarial network.
J. Vis., 2019

QBMG: quasi-biogenic molecule generator with deep recurrent neural network.
J. Cheminformatics, 2019

DLIGAND2: an improved knowledge-based energy function for protein-ligand interactions using the distance-scaled, finite, ideal-gas reference state.
J. Cheminformatics, 2019

An efficient real-time data collection framework on petascale systems.
Neurocomputing, 2019

Tiered data management system: Accelerating data processing on HPC systems.
Future Gener. Comput. Syst., 2019

Paving the way for China exascale computing.
CCF Trans. High Perform. Comput., 2019

Capability for Multi-Core and Many-Core Memory Systems: A Case-Study With Xeon Processors.
IEEE Access, 2019

Optimizing Data Placement on Hierarchical Storage Architecture via Machine Learning.
Proceedings of the Network and Parallel Computing, 2019

Decoupling Localization and Classification in Single Shot Temporal Action Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2019

An Efficient and Flexible Metadata Management Layer for Local File Systems.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019

Bi-Cluster: A High-Performance Data Query Framework for Large-Scale Scientific Data.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

An Efficient Data Collection Module on Petascale Systems.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

IndexIt: Enhancing Data Locating Services for Parallel File Systems.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

An Active and Deep Semantic Matching Framework for Query Rewrite in E-Commercial Search Engine.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019

2018
mSNP: A Massively Parallel Algorithm for Large-Scale SNP Detection.
IEEE Trans. Parallel Distributed Syst., 2018

Petascale scramjet combustion simulation on the Tianhe-2 heterogeneous supercomputer.
Parallel Comput., 2018

Erratum to: ONFS: a hierarchical hybrid file system based on memory, SSD, and HDD for high performance computers.
Frontiers Inf. Technol. Electron. Eng., 2018

Will supercomputers be super-data and super-AI machines?
Commun. ACM, 2018

Efficient computation of motif discovery on Intel Many Integrated Core (MIC) Architecture.
BMC Bioinform., 2018

Graph Analytics on Manycore Memory Systems.
IEEE Access, 2018

SingleCaffe: An Efficient Framework for Deep Learning on a Single Node.
IEEE Access, 2018

Experiences of Converging Big Data Analytics Frameworks with High Performance Computing Systems.
Proceedings of the Supercomputing Frontiers - 4th Asian Conference, 2018

Mimir+: An Optimized Framework of MapReduce on Heterogeneous High-Performance Computing System.
Proceedings of the Network and Parallel Computing, 2018

On the Power of Combiner Optimizations in MapReduce Over MPI Workflows.
Proceedings of the 24th IEEE International Conference on Parallel and Distributed Systems, 2018

A Low Communication Overhead Breadth-First Search Based on Global Bitmap.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

DistForest: A Parallel Random Forest Training Framework Based on Supercomputer.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

Accelerating Scientific Workflows with Tiered Data Management System.
Proceedings of the 20th IEEE International Conference on High Performance Computing and Communications; 16th IEEE International Conference on Smart City; 4th IEEE International Conference on Data Science and Systems, 2018

2017
ONFS: a hierarchical hybrid file system based on memory, SSD, and HDD for high performance computers.
Frontiers Inf. Technol. Electron. Eng., 2017

Multiple Sequence Alignment Based on a Suffix Tree and Center-Star Strategy: A Linear Method for Multiple Nucleotide Sequence Alignment on Spark Parallel Framework.
J. Comput. Biol., 2017

PTS: a pharmaceutical target seeker.
Database J. Biol. Databases Curation, 2017

Fast Truss Decomposition in Memory.
Proceedings of the Security, Privacy, and Anonymity in Computation, Communication, and Storage, 2017

Customized Filesystem with Dynamic Stripe Strategies on Lustre-Based Hadoop.
Proceedings of the Parallel Architecture, Algorithm and Programming, 2017

Mimir: Memory-Efficient and Scalable MapReduce for Large Supercomputing Systems.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

Bloomfish: A Highly Scalable Distributed K-mer Counting Framework.
Proceedings of the 23rd IEEE International Conference on Parallel and Distributed Systems, 2017

A Data Mining Method for Potential Fire Hazard Analysis of Urban Buildings based on Bayesian Network.
Proceedings of the 2nd International Conference on Intelligent Information Processing, 2017

UGSD: Scalable and Efficient Metadata Management for EB-Scale File Systems.
Proceedings of the International Conference on Compute and Data Analysis, 2017

Accelerating Redis with RDMA Over InfiniBand.
Proceedings of the Data Mining and Big Data - Second International Conference, 2017

Review on HDD-Based, SSD-Based and Hybrid Key-Value Stores.
Proceedings of the 15th IEEE Intl Conf on Dependable, 2017

mD3DOCKxb: An Ultra-Scalable CPU-MIC Coordinated Virtual Screening Framework.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
Workload Partitioning for Accelerating Applications on Heterogeneous Platforms.
IEEE Trans. Parallel Distributed Syst., 2016

Me-CLOCK: A Memory-Efficient Framework to Implement Replacement Policies for Large Caches.
IEEE Trans. Computers, 2016

623 Tflop/s HPCG run on Tianhe-2: Leveraging millions of hybrid cores.
Int. J. High Perform. Comput. Appl., 2016

RFS: An LSM-Tree-Based File System for Enhanced Microdata Performance.
IEICE Trans. Inf. Syst., 2016

Distributed and Scalable Directory Service in a Parallel File System.
IEICE Trans. Inf. Syst., 2016

A Configuration Management Study to Fast Massive Writing for Distributed NoSQL System.
IEICE Trans. Inf. Syst., 2016

Using Dynamic Granularity Strategy to Accelerate Unbalanced Tree Search.
Proceedings of the 2016 IEEE Trustcom/BigDataSE/ISPA, 2016

The Gyrokinetic Particle Simulation of Fusion Plasmas on Tianhe-2 Supercomputer.
Proceedings of the 7th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2016

Persistence and Recovery for In-Memory NoSQL Services: A Measurement Study.
Proceedings of the IEEE International Conference on Web Services, 2016

Accelerating the Simulation of Thermal Convection in the Earth's Outer Core on Tianhe-2.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

masFS: File System Based on Memory and SSD in Compute Nodes for High Performance Computers.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

mAMBER: A CPU/MIC collaborated parallel framework for AMBER on Tianhe-2 supercomputer.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2016

2015
Ultra-Scalable CPU-MIC Acceleration of Mesoscale Atmospheric Modeling on Tianhe-2.
IEEE Trans. Computers, 2015

Neo-hetergeneous Programming and Parallelized Optimization of a Human Genome Re-sequencing Analysis Software Pipeline on TH-2 Supercomputer.
Supercomput. Front. Innov., 2015

High Performance Interconnect Network for Tianhe System.
J. Comput. Sci. Technol., 2015

CoGA: Extension of GA on Heterogeneous System.
Proceedings of the 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015

BWTCP: A Parallel Method for Constructing BWT in Large Collection of Genomic Reads.
Proceedings of the High Performance Computing - 30th International Conference, 2015

Large-Scale Neo-Heterogeneous Programming and Optimization of SNP Detection on Tianhe-2.
Proceedings of the High Performance Computing - 30th International Conference, 2015

A theoretical analysis of lifespan impact on flash memory imposed by erasure code.
Proceedings of the 10th IEEE International Conference on Networking, 2015

Thresholds modification strategy of wayside supercapacitor storage considering DC substation characteristics.
Proceedings of the IECON 2015, 2015

Performance Evaluation of HPGMG on Tianhe-2: Early Experience.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2015

The Challenge of Scaling Genome Big Data Analysis Software on TH-2 Supercomputer.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

HAGP: A Hub-Centric Asynchronous Graph Processing Framework for Scale-Free Graph.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

mD3DOCKxb: A Deep Parallel Optimized Software for Molecular Docking with Intel Xeon Phi Coprocessors.
Proceedings of the 15th IEEE/ACM International Symposium on Cluster, 2015

2014
A hybrid memory built by SSD and DRAM to support in-memory Big Data analytics.
Knowl. Inf. Syst., 2014

Using the Intel Many Integrated Core to accelerate graph traversal.
Int. J. High Perform. Comput. Appl., 2014

Hybrid hierarchy storage system in MilkyWay-2 supercomputer.
Frontiers Comput. Sci., 2014

MilkyWay-2 supercomputer: system and application.
Frontiers Comput. Sci., 2014

Efficient Shared-Memory Implementation of High-Performance Conjugate Gradient Benchmark and its Application to Unstructured Matrices.
Proceedings of the International Conference for High Performance Computing, 2014

Enabling and Scaling a Global Shallow-Water Atmospheric Model on Tianhe-2.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Scalability-Centric HPC System Design.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Improving performance by matching imbalanced workloads with heterogeneous platforms.
Proceedings of the 2014 International Conference on Supercomputing, 2014

Physically based parallel ray tracer for the Metropolis light transport algorithm on the Tianhe-2 supercomputer.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Accelerating HPCG on Tianhe-2: A hybrid CPU-MIC algorithm.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

Optimizing and Scaling HPCG on Tianhe-2: Early Experience.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

Ensemble based data stream mining with recalling and forgetting mechanisms.
Proceedings of the 11th International Conference on Fuzzy Systems and Knowledge Discovery, 2014

2013
A peta-scalable CPU-GPU algorithm for global atmospheric simulations.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2013

Using MIC to Accelerate a Typical Data-Intensive Application: The Breadth-first Search.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

NR-MPI: A Non-stop and Fault Resilient MPI.
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Somersault cloud: Toward a cloud-of-clouds service for personal backup.
Proceedings of the International Conference on Computing, Networking and Communications, 2013

2012
Tianhe-1A Interconnect and Message-Passing Services.
IEEE Micro, 2012

Wukong: A cloud-oriented file service for mobile Internet devices.
J. Parallel Distributed Comput., 2012

EaSync: A Transparent File Synchronization Service across Multiple Machines.
Proceedings of the Network and Parallel Computing, 9th IFIP International Conference, 2012

RAIC: Redundant Array of Inexpensive Cloud.
Proceedings of the Convergence and Hybrid Information Technology, 2012

A Power-aware Job Scheduling Algorithm.
Proceedings of the 2012 International Conference on Cloud and Service Computing, 2012

2011
Implementation and Evaluation of Network Interface and Message Passing Services for TianHe-1A Supercomputer.
Proceedings of the IEEE 19th Annual Symposium on High Performance Interconnects, 2011

2010
Wukong: Toward a Cloud-Oriented File Service for Mobile Devices.
Proceedings of the 2010 IEEE International Conference on Services Computing, 2010

2009
A Distributed file system framework for transparent accessing heterogeneous storage services.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Heterogeneity Issues and Supports in MPI Implementations: An Overview.
Proceedings of the Eighth International Conference on Grid and Cooperative Computing, 2009

2008
Scalable Resource Management System for High Productive Computing.
Proceedings of the Third ChinaGrid Annual Conference, ChinaGrid 2008, Dunhuang, Gansu, 2008

2006
A New Heartbeat Mechanism for Large-Scale Cluster.
Proceedings of the Advanced Web and Network Technologies, and Applications, 2006

2005
MCRM System: CIM-Based Multiple Clusters Manager.
Proceedings of the Parallel and Distributed Processing and Applications, 2005

