Zhibin Yu

ORCID: 0000-0001-8067-9612

Affiliations:
  • Chinese Academy of Sciences, Shenzhen Institute of Advanced Technology, Cloud Computing Center, China
  • Huazhong University of Science and Technology, China (PhD 2008)


According to our database, Zhibin Yu authored at least 71 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Satisfying Energy-Efficiency Constraints for Mobile Systems.
IEEE Trans. Mob. Comput., December, 2024

Falic: An FPGA-Based Multi-Scalar Multiplication Accelerator for Zero-Knowledge Proof.
IEEE Trans. Computers, December, 2024

TIE: Fast Experiment-Driven ML-Based Configuration Tuning for In-Memory Data Analytics.
IEEE Trans. Computers, May, 2024

Global-State Aware Automatic NUMA Balancing.
Proceedings of the 15th Asia-Pacific Symposium on Internetware, 2024

Guser: A GPGPU Power Stressmark Generator.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

WASP: Workload-Aware Self-Replicating Page-Tables for NUMA Servers.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
Resource scheduling techniques in cloud from a view of coordination: a holistic survey.
Frontiers Inf. Technol. Electron. Eng., January, 2023

PAC: Preference-Aware Co-location Scheduling on Heterogeneous NUMA Architectures To Improve Resource Utilization.
Proceedings of the 37th International Conference on Supercomputing, 2023

Accelerating path tracing rendering with Multi-GPU in Blender cycles.
Proceedings of the 25th International Conference on Advanced Communication Technology, 2023

2022
OSC: An Online Self-Configuring Big Data Framework for Optimization of QoS.
IEEE Trans. Computers, 2022

SOCA-DOM: A Mobile System-on-Chip Array System for Analyzing Big Data on the Move.
J. Comput. Sci. Technol., 2022

LOCAT: Low-Overhead Online Configuration Auto-Tuning of Spark SQL Applications [Extended Version].
CoRR, 2022

LOCAT: Low-Overhead Online Configuration Auto-Tuning of Spark SQL Applications.
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022

2021
GML: Efficiently Auto-Tuning Flink's Configurations Via Guided Machine Learning.
IEEE Trans. Parallel Distributed Syst., 2021

Democratic learning: hardware/software co-design for lightweight blockchain-secured on-device machine learning.
J. Syst. Archit., 2021

EnTiered-ReRAM: An Enhanced Low Latency and Energy Efficient TLC Crossbar ReRAM Architecture.
IEEE Access, 2021

Treator: a Fast Centralized Cluster Scheduling at Scale Based on B+ Tree and BSP.
Proceedings of the 2021 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Big Data & Cloud Computing, Sustainable Computing & Communications, Social Computing & Networking (ISPA/BDCloud/SocialCom/SustainCom), New York City, NY, USA, September 30, 2021

QIHE: Quantifying the Importance of Hardware Events with Respect to Performance of Mobile Processors.
Proceedings of the ICBDC 2021: 6th International Conference on Big Data and Computing, Shenzhen, China, May 22, 2021

CNN-DMA: A Predictable and Scalable Direct Memory Access Engine for Convolutional Neural Network with Sliding-window Filtering.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

OR-ML: Enhancing Reliability for Machine Learning Accelerator with Opportunistic Redundancy.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2021

Improving system latency of AI accelerator with on-chip pipelined activation preprocessing and multi-mode batch inference.
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

Para: Harvesting CPU time fragments in Big Data Analytics.
Proceedings of the 14th IEEE International Conference on Cloud Computing, 2021

2020
COPA: Highly Cost-Effective Power Back-Up for Green Datacenters.
IEEE Trans. Parallel Distributed Syst., 2020

Thread-Level Locking for SIMT Architectures.
IEEE Trans. Parallel Distributed Syst., 2020

vMobiDesk: Desktop Virtualization for Mobile Operating Systems.
IEEE Access, 2020

Accelerating Atrous Convolution with Fetch-and-Jump Architecture for Activation Positioning.
Proceedings of the 2020 IEEE International Conference on Integrated Circuits, 2020

BBS: Micro-Architecture Benchmarking Blockchain Systems through Machine Learning and Fuzzy Set.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

On the Auto-Tuning of Elastic-search based on Machine Learning.
Proceedings of the CCRIS 2020: International Conference on Control, 2020

2019
MiC: Multi-level Characterization and Optimization of GPGPU Kernels.
ACM J. Emerg. Technol. Comput. Syst., 2019

Green-Up: under-provisioning power backup infrastructure for green datacenters.
CCF Trans. High Perform. Comput., 2019

Accelerating Compact Convolutional Neural Networks with Multi-threaded Data Streaming.
Proceedings of the 2019 IEEE Computer Society Annual Symposium on VLSI, 2019

Adaptive memory-side last-level GPU caching.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

TPShare: a time-space sharing scheduling abstraction for shared cloud via vertical labels.
Proceedings of the 46th International Symposium on Computer Architecture, 2019

SLFAG: Scan Line Fill Algorithm for PCB Image Rasterization Based on GPGPU.
Proceedings of the 4th International Conference on Big Data and Computing, 2019

SMHC: A Synthetic Metric for Heterogeneous Resources in Cloud Computing.
Proceedings of the 4th International Conference on Big Data and Computing, 2019

2018
MIA: Metric Importance Analysis for Big Data Workload Characterization.
IEEE Trans. Parallel Distributed Syst., 2018

QIG: Quantifying the Importance and Interaction of GPGPU Architecture Parameters.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

Configuring in-memory cluster computing using random forest.
Future Gener. Comput. Syst., 2018

CounterMiner: Mining Big Performance Data from Hardware Counters.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

The Elasticity and Plasticity in Semi-Containerized Co-locating Cloud Workload: a View from Alibaba Trace.
Proceedings of the ACM Symposium on Cloud Computing, 2018

Datasize-Aware High Dimensional Configurations Auto-Tuning of In-Memory Cluster Computing.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

2017
ATH: Auto-Tuning HBase's Configuration via Ensemble Learning.
IEEE Access, 2017

MEST: A Model-Driven Efficient Searching Approach for MapReduce Self-Tuning.
IEEE Access, 2017

BACM: Barrier-Aware Cache Management for Irregular Memory-Intensive GPGPU Workloads.
Proceedings of the 2017 IEEE International Conference on Computer Design, 2017

An Experimental Comparison Between Genetic Algorithm and Particle Swarm Optimization in Spark Performance Tuning.
Proceedings of the first Workshop on Emerging Technologies for software-defined and reconfigurable hardware-accelerated Cloud Datacenters, 2017

POSTER: BACM: Barrier-Aware Cache Management for Irregular Memory-Intensive GPGPU Workloads.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
RFHOC: A Random-Forest Approach to Auto-Tuning Hadoop's Configuration.
IEEE Trans. Parallel Distributed Syst., 2016

ShenZhen transportation system (SZTS): a novel big data benchmark suite.
J. Supercomput., 2016

Two-Level Hybrid Sampled Simulation of Multithreaded Applications.
ACM Trans. Archit. Code Optim., 2016

QIM: Quantifying Hyperparameter Importance for Deep Learning.
Proceedings of the Network and Parallel Computing, 2016

Barrier-Aware Warp Scheduling for Throughput Processors.
Proceedings of the 2016 International Conference on Supercomputing, 2016

Thread Similarity Matrix: Visualizing Branch Divergence in GPGPU Programs.
Proceedings of the 45th International Conference on Parallel Processing, 2016

Performance Modeling for Spark Using SVM.
Proceedings of the 7th International Conference on Cloud Computing and Big Data, 2016

2015
GPGPU-MiniBench: Accelerating GPGPU Micro-Architecture Simulation.
IEEE Trans. Computers, 2015

SZTS: A Novel Big Data Transportation System Benchmark Suite.
Proceedings of the 44th International Conference on Parallel Processing, 2015

Shorter On-Line Warmup for Sampled Simulation of Multi-threaded Applications.
Proceedings of the 44th International Conference on Parallel Processing, 2015

2014
A comparative study on resource allocation and energy efficient job scheduling strategies in large-scale parallel computing systems.
Clust. Comput., 2014

2013
PCantorSim: Accelerating parallel architecture simulation through fractal-based sampling.
ACM Trans. Archit. Code Optim., 2013

Accelerating GPGPU architecture simulation.
Proceedings of the ACM SIGMETRICS / International Conference on Measurement and Modeling of Computer Systems, 2013

Application-Aware Workload Consolidation to Minimize Both Energy Consumption and Network Load in Cloud Environments.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

A characterization of big data benchmarks.
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013

2012
FractalMRC: Online Cache Miss Rate Curve Prediction on Commodity Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011
MT-BTRIMER: A master-slave multi-threaded dynamic binary translator.
Comput. Syst. Sci. Eng., 2011

Hierarchically characterizing CUDA program behavior.
Proceedings of the 2011 IEEE International Symposium on Workload Characterization, 2011

2010
CantorSim: Simplifying Acceleration of Micro-architecture Simulations.
Proceedings of the MASCOTS 2010, 2010

MT-BTRIMER: A Master-Slave Multi-threaded Dynamic Binary Translator.
Proceedings of the Fifth International Conference on Frontier of Computer Science and Technology, 2010

System-level max power (SYMPO): a systematic approach for escalating system-level power consumption using synthetic benchmarks.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2009
Simple and fast micro-architecture simulation: a trisection cantor fractal approach.
SIGMETRICS Perform. Evaluation Rev., 2009

TSS: Applying two-stage sampling in micro-architecture simulations.
Proceedings of the 17th Annual Meeting of the IEEE/ACM International Symposium on Modelling, 2009

2008
Identifying Classes via Cognitive Approach in Object-Oriented System.
Proceedings of the PACIIA 2008, 2008

An Evaluation of Two-Stage Systematic Sampling in Micro-Architecture Simulation.
Proceedings of the Third ChinaGrid Annual Conference, ChinaGrid 2008, Dunhuang, Gansu, 2008

