Long Zheng

Orcid: 0000-0001-7903-2061

Affiliations:
  • Huazhong University of Science and Technology, School of Computer Science and Technology, Wuhan, China (PhD 2016)


According to our database, Long Zheng authored at least 74 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ARCHER: a ReRAM-based accelerator for compressed recommendation systems.
Frontiers Comput. Sci., October, 2024

L-FNNG: Accelerating Large-Scale KNN Graph Construction on CPU-FPGA Heterogeneous Platform.
ACM Trans. Reconfigurable Technol. Syst., September, 2024

CPSAA: Accelerating Sparse Attention Using Crossbar-Based Processing-In-Memory Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., June, 2024

PhGraph: A High-Performance ReRAM-Based Accelerator for Hypergraph Applications.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2024

An Efficient GCNs Accelerator Using 3D-Stacked Processing-in-Memory Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2024

A heterogeneous 3-D stacked PIM accelerator for GCN-based recommender systems.
CCF Trans. High Perform. Comput., April, 2024

Minimal Context-Switching Data Race Detection with Dataflow Tracking.
J. Comput. Sci. Technol., March, 2024

Towards High-Performance Graph Processing: From a Hardware/Software Co-Design Perspective.
J. Comput. Sci. Technol., March, 2024

A Scalable, Efficient, and Robust Dynamic Memory Management Library for HLS-based FPGAs.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Enabling Efficient Large Recommendation Model Training with Near CXL Memory Processing.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024

High-Performance and Resource-Efficient Dynamic Memory Management in High-Level Synthesis.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

SpaHet: A Software/Hardware Co-design for Accelerating Heterogeneous-Sparsity based Sparse Matrix Multiplication.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Towards Redundancy-Free Recommendation Model Training via Reusable-aware Near-Memory Processing.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

2023
Accelerating Loop-Oriented RTL Simulation With Code Instrumentation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., December, 2023

Accelerating Graph Convolutional Networks Through a PIM-Accelerated Approach.
IEEE Trans. Computers, September, 2023

PDAS: Improving network pruning based on Progressive Differentiable Architecture Search for DNNs.
Future Gener. Comput. Syst., 2023

Cyclosa: Redundancy-Free Graph Pattern Mining via Set Dataflow.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

Accelerating Personalized Recommendation with Cross-level Near-Memory Processing.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

MetaNMP: Leveraging Cartesian-Like Product to Accelerate HGNNs with Near-Memory Processing.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

GraphMetaP: Efficient MetaPath Generation for Dynamic Heterogeneous Graph Models.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

AFaVS: Accurate Yet Fast Version Switching for Graph Processing Systems.
Proceedings of the 39th IEEE International Conference on Data Engineering, 2023

SMOG: Accelerating Subgraph Matching on GPUs.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2023

FNNG: A High-Performance FPGA-based Accelerator for K-Nearest Neighbor Graph Construction.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

MeG²: In-Memory Acceleration for Genome Graphs Analysis.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

2022
A Flexible Yet Efficient DNN Pruning Approach for Crossbar-Based Processing-in-Memory Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

ReaDy: A ReRAM-Based Processing-in-Memory Accelerator for Dynamic Graph Convolutional Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

An Effective 2-Dimension Graph Partitioning for Work Stealing Assisted Graph Processing on Multi-FPGAs.
IEEE Trans. Big Data, 2022

ReCSA: a dedicated sort accelerator using ReRAM-based content addressable memory.
Frontiers Comput. Sci., 2022

GraphFly: Efficient Asynchronous Streaming Graphs Processing via Dependency-Flow.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

A Data-Centric Accelerator for High-Performance Hypergraph Processing.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

A General Offloading Approach for Near-DRAM Processing-In-Memory Architectures.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022

An Efficient Graph Accelerator with Distributed On-Chip Memory Hierarchy.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2022

Towards Fast GPU-based Sparse DNN Inference: A Hybrid Compute Model.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

Accelerating Sparse Deep Neural Network Inference Using GPU Tensor Cores.
Proceedings of the IEEE High Performance Extreme Computing Conference, 2022

ScalaGraph: A Scalable Accelerator for Massively Parallel Graph Processing.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

Hardware-Accelerated Hypergraph Processing with Chain-Driven Scheduling.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

Accelerating Graph Convolutional Networks Using Crossbar-based Processing-In-Memory Architectures.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

ReSMA: accelerating approximate string matching using ReRAM-based content addressable memory.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Efficient Graph Processing with Invalid Update Filtration.
IEEE Trans. Big Data, 2021

FDGLib: A Communication Library for Efficient Large-Scale Graph Processing in FPGA-Accelerated Data Centers.
J. Comput. Sci. Technol., 2021

Editorial for the special issue on high performance distributed computing.
CCF Trans. High Perform. Comput., 2021

Fast Sparse Deep Neural Network Inference with Flexible SpMM Optimization Space Exploration.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

Productive High-Performance k-Truss Decomposition on GPU Using Linear Algebra.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

GraSU: A Fast Graph Update Library for FPGA-based Dynamic Graph Processing.
Proceedings of the FPGA '21: The 2021 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, Virtual Event, USA, February 28, 2021

SumPA: Efficient Pattern-Centric Graph Mining with Pattern Abstraction.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
ReSQM: Accelerating Database Operations Using ReRAM-Based Content Addressable Memory.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

A Conflict-free Scheduler for High-performance Graph Processing on Multi-pipeline FPGAs.
ACM Trans. Archit. Code Optim., 2020

Efficient FPGA-based graph processing with hybrid pull-push computational model.
Frontiers Comput. Sci., 2020

Dynamic cluster strategy for hierarchical rollback-recovery protocols in MPI HPC applications.
Concurr. Comput. Pract. Exp., 2020

Effective runtime scheduling for high-performance graph processing on heterogeneous dataflow architecture.
CCF Trans. High Perform. Comput., 2020

ReGra: Accelerating Graph Traversal Applications Using ReRAM With Lower Communication Cost.
IEEE Access, 2020

Scaph: Scalable GPU-Accelerated Graph Processing with Value-Driven Differential Scheduling.
Proceedings of the 2020 USENIX Annual Technical Conference, 2020

A Locality-Aware Energy-Efficient Accelerator for Graph Mining Applications.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

A Heterogeneous PIM Hardware-Software Co-Design for Energy-Efficient Graph Processing.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

Spara: An Energy-Efficient ReRAM-Based Accelerator for Sparse Graph Analytics Applications.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019
Efficient Time-Evolving Stream Processing at Scale.
IEEE Trans. Parallel Distributed Syst., 2019

Supporting Superpages and Lightweight Page Migration in Hybrid Memory Systems.
ACM Trans. Archit. Code Optim., 2019

A Survey on Graph Processing Accelerators: Challenges and Opportunities.
J. Comput. Sci. Technol., 2019

Fast Triangle Counting on GPU.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

RAGra: Leveraging Monolithic 3D ReRAM for Massively-Parallel Graph Processing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

2018
Scalable Data Race Detection for Lock-Intensive Programs with Pending Period Representation.
IEEE Trans. Parallel Distributed Syst., 2018

DigHR: precise dynamic detection of hidden races with weak causal relation analysis.
J. Supercomput., 2018

Efficient and Scalable Graph Parallel Processing With Symbolic Execution.
ACM Trans. Archit. Code Optim., 2018

Scalable concurrency debugging with distributed graph processing.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

An efficient graph accelerator with parallel data conflict management.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

Towards concurrency race debugging: an integrated approach for constraint solving and dynamic slicing.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Exploiting the Parallelism Between Conflicting Critical Sections with Partial Reversion.
IEEE Trans. Parallel Distributed Syst., 2017

Hardware/software cooperative caching for hybrid DRAM/NVM memory architectures.
Proceedings of the International Conference on Supercomputing, 2017

Towards Dataflow-Based Graph Accelerator.
Proceedings of the 37th IEEE International Conference on Distributed Computing Systems, 2017

2016
A Performance Debugging Framework for Unnecessary Lock Contentions with Record/Replay Techniques.
IEEE Trans. Parallel Distributed Syst., 2016

Automatic Security Bug Classification: A Compile-Time Approach.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016

2015
Understanding and identifying latent data races cross-thread interleaving.
Frontiers Comput. Sci., 2015

On performance debugging of unnecessary lock contentions on multicore processors: a replay-based approach.
Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2015

2014
esDMT: Efficient and scalable deterministic multithreading through memory isolation.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

