Lei Wang

Orcid: 0000-0001-6909-9561

Affiliations:
  • Chinese Academy of Sciences, Institute of Computing Technology, State Key Laboratory of Computer Architecture, Beijing, China


According to our database, Lei Wang authored at least 138 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
XGNN: Boosting Multi-GPU GNN Training via Global GNN Memory Store.
Proc. VLDB Endow., January, 2024

Could Bibliometrics Reveal Top Science and Technology Achievements and Researchers? The Case for Evaluatology-based Science and Technology Evaluation.
CoRR, 2024

Bridging the Gap Between Domain-specific Frameworks and Multiple Hardware Devices.
CoRR, 2024

Evaluatology: The Science and Engineering of Evaluation.
CoRR, 2024

GraphScope Flex: LEGO-like Graph Computing Stack.
Proceedings of the Companion of the 2024 International Conference on Management of Data, 2024

2023
Quantifying Resource Contention of Co-located Workloads with the System-level Entropy.
ACM Trans. Archit. Code Optim., March, 2023

Vineyard: Optimizing Data Sharing in Data-Intensive Analytics.
Proc. ACM Manag. Data, 2023

GraphScope Flex: LEGO-like Graph Computing Stack.
CoRR, 2023

IterLara: A Turing Complete Algebra for Big Data, AI, Scientific Computing, and Database.
CoRR, 2023

WPC: Whole-picture Workload Characterization.
CoRR, 2023

DCNetBench: Scaleable Data Center Network Benchmarking.
CoRR, 2023

NHtapDB: Native HTAP Databases.
CoRR, 2023

Legion: Automatically Pushing the Envelope of Multi-GPU System for Billion-Scale GNN Training.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

Bridging the Gap between Relational OLTP and Graph-based OLAP.
Proceedings of the 2023 USENIX Annual Technical Conference, 2023

CMLCompiler: A Unified Compiler for Classical Machine Learning.
Proceedings of the 37th International Conference on Supercomputing, 2023

Exploiting Contrastive Learning and Numerical Evidence for Confusing Legal Judgment Prediction.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Cross-Layer Profiling of IoTBench.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2023

A Linear Combination-Based Method to Construct Proxy Benchmarks for Big Data Workloads.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2023

Does AI for Science Need Another ImageNet or Totally Different Benchmarks? A Case Study of Machine Learning Force Fields.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2023

2022
Quality at the Tail.
CoRR, 2022

ToL: A Tensor of List-Based Unified Computation Model.
CoRR, 2022

High fusion computers: The IoTs, edges, data centers, and humans-in-the-loop as a computer.
CoRR, 2022

A systematic study on benchmarking AI inference accelerators.
CCF Trans. High Perform. Comput., 2022

OLxPBench: Real-time, Semantically Consistent, and Domain-specific are Essential in Benchmarking, Designing, and Implementing HTAP Systems.
Proceedings of the 38th IEEE International Conference on Data Engineering, 2022

GNNLab: a factored system for sample-based GNN training over GPUs.
Proceedings of the EuroSys '22: Seventeenth European Conference on Computer Systems, Rennes, France, April 5, 2022

EAIBench: An Energy Efficiency Benchmark for AI Training.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2022

2021
GraphScope: A One-Stop Large Graph Processing System.
Proc. VLDB Endow., 2021

GraphScope: A Unified Engine For Big Graph Processing.
Proc. VLDB Endow., 2021

Shift-and-Balance Attention.
CoRR, 2021

HPC AI500: Representative, Repeatable and Simple HPC AI Benchmarking.
CoRR, 2021

WPC: Whole-Picture Workload Characterization Across Intermediate Representation, ISA, and Microarchitecture.
IEEE Comput. Archit. Lett., 2021

AIBench Training: Balanced Industry-Standard AI Training Benchmarking.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021

Finet: Using Fine-grained Batch Normalization to Train Light-weight Neural Networks.
Proceedings of the International Joint Conference on Neural Networks, 2021

FlexGraph: a flexible and efficient distributed framework for GNN training.
Proceedings of the EuroSys '21: Sixteenth European Conference on Computer Systems, 2021

HPC AI500 V2.0: The Methodology, Tools, and Metrics for Benchmarking HPC AI Systems.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

AIBench Scenario: Scenario-Distilling AI Benchmarking.
Proceedings of the 30th International Conference on Parallel Architectures and Compilation Techniques, 2021

2020
HPC AI500: The Methodology, Tools, Roofline Performance Models, and Metrics for Benchmarking HPC AI Systems.
CoRR, 2020

Comparison and Benchmarking of AI Models and Frameworks on Mobile Devices.
CoRR, 2020

AIBench: Scenario-distilling AI Benchmarking.
CoRR, 2020

AIBench: An Industry Standard AI Benchmark Suite from Internet Services.
CoRR, 2020

Extended Batch Normalization.
CoRR, 2020

AIBench: An Agile Domain-specific Benchmarking Methodology and an AI Benchmark Suite.
CoRR, 2020

Referee: A Pattern-Guided Approach for Auto Design in Compiler-Based Analyzers.
Proceedings of the 27th IEEE International Conference on Software Analysis, 2020

OStoreBench: Benchmarking Distributed Object Storage Systems Using Real-World Application Scenarios.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2020

2019
Understanding Processors Design Decisions for Data Analytics in Homogeneous Data Centers.
IEEE Trans. Big Data, 2019

HybridTune: Spatio-Temporal Performance Data Correlation for Performance Diagnosis of Big Data Systems.
J. Comput. Sci. Technol., 2019

BenchCouncil's View on Benchmarking AI and Other Emerging Workloads.
CoRR, 2019

AIBench: An Industry Standard Internet Service AI Benchmark Suite.
CoRR, 2019

HPC AI500: A Benchmark Suite for HPC AI Systems.
CoRR, 2019

A Semantic-based Medical Image Fusion Approach.
CoRR, 2019

XOS: An Application-Defined Operating System for Data Center Servers.
CoRR, 2019

Landscape of Big Medical Data: A Pragmatic Survey on Prioritized Tasks.
IEEE Access, 2019

Performance-Boosting Sparsification of the IFDS Algorithm with Applications to Taint Analysis.
Proceedings of the 34th IEEE/ACM International Conference on Automated Software Engineering, 2019

BOPS, A New Computation-Centric Metric for Datacenter Computing.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2019

Anomaly Analysis and Diagnosis for Co-located Datacenter Workloads in the Alibaba Cluster.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2019

2018
NVM Streaker: a fast and reconfigurable performance simulator for non-volatile memory-based memory architecture.
J. Supercomput., 2018

Clustering Residential Electricity Load Curves via Community Detection in Network.
CoRR, 2018

Anomaly Analysis for Co-located Datacenter Workloads in the Alibaba Cluster.
CoRR, 2018

BigDataBench: A Dwarf-based Big Data and AI Benchmark Suite.
CoRR, 2018

A veracity preserving model for synthesizing scalable electricity load profiles.
CoRR, 2018

Big Data Dwarfs: Towards Fully Understanding Big Data Analytics Workloads.
CoRR, 2018

BOPS, Not FLOPS! A New Metric, Measuring Tool, and Roofline Performance Model For Datacenter Computing.
CoRR, 2018

Lazygraph: lazy data coherency for replicas in distributed graph-parallel computation.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018

Understanding and detecting evolution-induced compatibility issues in Android apps.
Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, 2018

Data Motif-based Proxy Benchmarks for Big Data and AI Workloads.
Proceedings of the 2018 IEEE International Symposium on Workload Characterization, 2018

Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2018, 2018

May-happen-in-parallel analysis with static vector clocks.
Proceedings of the 2018 International Symposium on Code Generation and Optimization, 2018

Online anomaly detection framework for spark systems via stage-task behavior modeling.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

XOS: An Application-Defined Operating System for Datacenter Computing.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

Deep Convolutional Neural Networks for Log Event Classification on Distributed Cluster Systems.
Proceedings of the IEEE International Conference on Big Data (IEEE BigData 2018), 2018

DCMIX: Generating Mixed Workloads for the Cloud Data Center.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018

AIoT Bench: Towards Comprehensive Benchmarking Mobile and Embedded Device Intelligence.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018

HPC AI500: A Benchmark Suite for HPC AI Systems.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018

Edge AIBench: Towards Comprehensive End-to-End Edge Computing Benchmarking.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018

AIBench: Towards Scalable and Comprehensive Datacenter AI Benchmarking.
Proceedings of the Benchmarking, Measuring, and Optimizing, 2018

Data motifs: a lens towards fully understanding big data and AI workloads.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
Understanding Big Data Analytics Workloads on Modern Processors.
IEEE Trans. Parallel Distributed Syst., 2017

HybridTune: Spatio-temporal Data and Model Driven Performance Diagnosis for Big Data Systems.
CoRR, 2017

A Dwarf-based Scalable Big Data Benchmarking Methodology.
CoRR, 2017

Cosine Normalization: Using Cosine Similarity Instead of Dot Product in Neural Networks.
CoRR, 2017

Performance and energy efficiency of big data systems: characterization, implication and improvement.
Proceedings of the 6th International Conference on Software and Computer Applications, 2017

Towards memory and computation efficient graph processing on spark.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

CloudMix: Generating Diverse and Reducible Workloads for Cloud Systems.
Proceedings of the 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), 2017

2016
On Horizontal Decomposition of the Operating System.
CoRR, 2016

10-millisecond Computing.
CoRR, 2016

Articulation points guided redundancy elimination for betweenness centrality.
Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2016

Characterization and architectural implications of big data workloads.
Proceedings of the 2016 IEEE International Symposium on Performance Analysis of Systems and Software, 2016

Understanding Data Analytics Workloads on Intel(R) Xeon Phi(R).
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

BDTUne: Hierarchical correlation-based performance analysis and rule-based diagnosis for big data systems.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016

2015
WiseThrottling: a new asynchronous task scheduler for mitigating I/O bottleneck in large-scale datacenter servers.
J. Supercomput., 2015

Characterization and Architectural Implications of Big Data Workloads.
CoRR, 2015

Understanding Big Data Analytic Workloads on Modern Processors.
CoRR, 2015

BigDataBench-MT: A Benchmark Tool for Generating Realistic Mixed Data Center Workloads.
CoRR, 2015

Benchmarking Big Data Systems: State-of-the-Art and Future Directions.
CoRR, 2015

Identifying Dwarfs Workloads in Big Data Analytics.
CoRR, 2015

Characterizing Data Analytics Workloads on Intel Xeon Phi.
Proceedings of the 2015 IEEE International Symposium on Workload Characterization, 2015

BigDataBench-MT: A Benchmark Tool for Generating Realistic Mixed Data Center Workloads.
Proceedings of the Big Data Benchmarks, Performance Optimization, and Emerging Hardware, 2015

2014
Dynamic I/O-Aware Scheduling for Batch-Mode Applications on Chip Multiprocessor Systems of Cluster Platforms.
J. Comput. Sci. Technol., 2014

Characterizing and subsetting big data workloads.
Proceedings of the 2014 IEEE International Symposium on Workload Characterization, 2014

BigDataBench: A big data benchmark suite from internet services.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

BigOP: Generating Comprehensive Big Data Workloads as a Benchmarking Framework.
Proceedings of the Database Systems for Advanced Applications, 2014

A collaborative divide-and-conquer K-means clustering algorithm for processing large data.
Proceedings of the Computing Frontiers Conference, CF'14, 2014

2013
Cost-Aware Cooperative Resource Provisioning for Heterogeneous Workloads in Data Centers.
IEEE Trans. Computers, 2013

BigDataBench: a Big Data Benchmark Suite from Web Search Engines.
CoRR, 2013

BDGS: A Scalable Big Data Generator Suite in Big Data Benchmarking.
Proceedings of the Advancing Big Data Benchmarks, 2013

Characterizing data analysis workloads in data centers.
Proceedings of the IEEE International Symposium on Workload Characterization, 2013

CloudRank-V: A Desktop Cloud Benchmark with Complex Workloads.
Proceedings of the 10th IEEE International Conference on High Performance Computing and Communications & 2013 IEEE International Conference on Embedded and Ubiquitous Computing, 2013

2012
In Cloud, Can Scientific Communities Benefit from the Economies of Scale?
IEEE Trans. Parallel Distributed Syst., 2012

Precise, Scalable, and Online Request Tracing for Multitier Services of Black Boxes.
IEEE Trans. Parallel Distributed Syst., 2012

Extendable pattern-oriented optimization directives.
ACM Trans. Archit. Code Optim., 2012

CloudRank-D: benchmarking and ranking cloud computing systems for data processing applications.
Frontiers Comput. Sci., 2012

The Implications of Diverse Applications and Scalable Data Sets in Benchmarking Big Data Systems.
Proceedings of the Specifying Big Data Benchmarks, 2012

High Volume Throughput Computing: Identifying and Characterizing Throughput Oriented Workloads in Data Centers.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

A Highly Parallel Reuse Distance Analysis Algorithm on GPUs.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011
Automatic performance debugging of SPMD-style parallel programs.
J. Parallel Distributed Comput., 2011

Dacoop: Accelerating Data-Iterative Applications on Map/Reduce Cluster.
Proceedings of the 12th International Conference on Parallel and Distributed Computing, 2011

Automatic Library Generation for BLAS3 on GPUs.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Characterization of real workloads of web search engines.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

2010
Landing Stencil Code on Godson-T.
J. Comput. Sci. Technol., 2010

PowerTracer: Tracing requests in multi-tier services to save cluster power consumption.
CoRR, 2010

Precise, Scalable and Online Request Tracing for Multi-tier Services of Black Boxes.
CoRR, 2010

PhoenixCloud: Provisioning Resources for Heterogeneous Workloads in Cloud Computing.
CoRR, 2010

Scalable Group Management in Large-Scale Virtualized Clusters.
CoRR, 2010

PhoenixCloud: Provisioning Resources for Heterogeneous Cloud Workloads.
CoRR, 2010

An adaptive task creation strategy for work-stealing scheduling.
Proceedings of the CGO 2010, 2010

2009
Phoenix Cloud: Consolidating Heterogeneous Workloads of Large Organizations on Cloud Computing Platforms.
CoRR, 2009

In cloud, do MTC or HTC service providers benefit from the economies of scale?
Proceedings of the 2nd Workshop on Many-Task Computing on Grids and Supercomputers, 2009

Precise request tracing and performance debugging for multi-tier services of black boxes.
Proceedings of the 2009 IEEE/IFIP International Conference on Dependable Systems and Networks, 2009

Detecting and Eliminating Potential Violations of Sequential Consistency for Concurrent C/C++ Programs.
Proceedings of the CGO 2009, 2009

2008
Exploiting idle register classes for fast spill destination.
Proceedings of the 22nd Annual International Conference on Supercomputing, 2008

A Performance Model for Domino Mail Server.
Proceedings of the International Conference on Computer Science and Software Engineering, 2008

2007
The design methodology of Phoenix cluster system software stack.
Proceedings of the CHINA HPC 2007, 2007

Grid Unit: A Self-Managing Building Block for Grid System.
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, 2007

A layered design methodology of cluster system stack.
Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

2006
Design Patterns of Scalable Cluster System Software.
Proceedings of the Seventh International Conference on Parallel and Distributed Computing, 2006

Easy and reliable cluster management: the self-management experience of Fire Phoenix.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

PhoenixG: A Unified Management Framework for Industrial Information Grid.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

A Failure-Aware Scheduling Strategy in Large-Scale Cluster System.
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

