Guihai Yan

ORCID: 0000-0002-1254-3278

According to our database, Guihai Yan authored at least 72 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Satisfying Energy-Efficiency Constraints for Mobile Systems.
IEEE Trans. Mob. Comput., December, 2024

Monocular 3D Multi-Person Pose Estimation for On-Site Joint Flexion Assessment: A Case of Extreme Knee Flexion Detection.
Sensors, October, 2024

DPU-Direct: Unleashing Remote Accelerators via Enhanced RDMA for Disaggregated Datacenters.
IEEE Trans. Computers, August, 2024

AMST: Accelerating Large-Scale Graph Minimum Spanning Tree Computation on FPGA.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Efficient RNIC Cache Side-Channel Attack Detection Through DPU-Driven Architecture.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

Athena: Add More Intelligence to RMT-Based Network Data Plane with Low-Bit Quantization.
Proceedings of the Euro-Par 2024: Parallel Processing, 2024

PHD: Parallel Huffman Decoder on FPGA for Extreme Performance and Energy Efficiency.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Co-Via: A Video Frame Interpolation Accelerator Exploiting Codec Information Reuse.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

TianMen: a DPU-based storage network offloading structure for disaggregated datacenters.
Proceedings of the 2024 ACM Symposium on Cloud Computing, 2024

2023
DOE: database offloading engine for accelerating SQL processing.
Distributed Parallel Databases, September, 2023

FlatProxy: A DPU-centric Service Mesh Architecture for Hyperscale Cloud-native Application.
CoRR, 2023

BitColor: Accelerating Large-Scale Graph Coloring on FPGA with Parallel Bit-Wise Engines.
Proceedings of the 52nd International Conference on Parallel Processing, 2023

Optimize the TX Architecture of RDMA NIC for Performance Isolation in the Cloud Environment.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

KPU-SQL: Kernel Processing Unit for High-Performance SQL Acceleration.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

M2VT: A Multi-Output Encoder Accelerator for Multiple-Way Video Transcoding.
Proceedings of the Great Lakes Symposium on VLSI 2023, 2023

Co-ViSu: a Video Super-Resolution Accelerator Exploiting Codec Information Reuse.
Proceedings of the 33rd International Conference on Field-Programmable Logic and Applications, 2023

Built-in Fault-Tolerant Computing Paradigm for Resilient Large-Scale Chip Design - A Self-Test, Self-Diagnosis, and Self-Repair-Based Approach
Springer, ISBN: 978-981-19-8550-8, 2023

2022
Portrait: A holistic computation and bandwidth balanced performance evaluation model for heterogeneous systems.
Sustain. Comput. Informatics Syst., 2022

DOE: Database Offloading Engine for Accelerating SQL Processing.
Proceedings of the 38th IEEE International Conference on Data Engineering Workshops, 2022

Using Psychophysics to Guide Power Adaptation for Input Methods on Mobile Architectures.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

2021
ShuntFlowPlus: An Efficient and Scalable Dataflow Accelerator Architecture for Stream Applications.
ACM J. Emerg. Technol. Comput. Syst., 2021

2020
BZIP: A compact data memory system for UTXO-based blockchains.
J. Syst. Archit., 2020

A Quantitative Exploration of Collaborative Pruning and Approximation Computing Towards Energy Efficient Neural Networks.
IEEE Des. Test, 2020

2019
SynergyFlow: An Elastic Accelerator Architecture Supporting Batch Processing of Large-Scale Deep Neural Networks.
ACM Trans. Design Autom. Electr. Syst., 2019

ShuttleNoC: Power-Adaptable Communication Infrastructure for Many-Core Processors.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Promoting the Harmony between Sparsity and Regularity: A Relaxed Synchronous Architecture for Convolutional Neural Networks.
IEEE Trans. Computers, 2019

SqueezeFlow: A Sparse CNN Accelerator Exploiting Concise Convolution Rules.
IEEE Trans. Computers, 2019

MLA: Machine Learning Adaptation for Realtime Streaming Financial Applications.
Proceedings of the Tenth International Green and Sustainable Computing Conference, 2019

ShuntFlow: An Efficient and Scalable Dataflow Accelerator Architecture for Streaming Applications.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

TNPU: an efficient accelerator architecture for training convolutional neural networks.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018
AdaFlow: Aggressive Convolutional Neural Networks Approximation by Leveraging the Input Variability.
J. Low Power Electron., 2018

Optimizing Memory Efficiency for Deep Convolutional Neural Network Accelerators.
J. Low Power Electron., 2018

CPicker: Leveraging Performance-Equivalent Configurations to Improve Data Center Energy Efficiency.
J. Comput. Sci. Technol., 2018

Joint Design of Training and Hardware Towards Efficient and Accuracy-Scalable Neural Network Inference.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2018

Fault tolerance on-chip: a reliable computing paradigm using self-test, self-diagnosis, and self-repair (3S) approach.
Sci. China Inf. Sci., 2018

AxTrain: Hardware-Oriented Neural Network Training for Approximate Inference.
Proceedings of the International Symposium on Low Power Electronics and Design, 2018

Tetris: re-architecting convolutional neural network computation for machine learning accelerators.
Proceedings of the International Conference on Computer-Aided Design, 2018

SmartShuttle: Optimizing off-chip memory accesses for deep learning accelerators.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

CCR: A concise convolution rule for sparse neural network accelerators.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

RiskCap: Minimizing Effort of Error Regulation for Approximate Computing.
Proceedings of the 27th IEEE Asian Test Symposium, 2018

2017
PowerTrader: Enforcing Autonomous Power Management for Future Large-Scale Many-Core Processors.
IEEE Trans. Multi Scale Comput. Syst., 2017

Exploiting the Potential of Computation Reuse Through Approximate Computing.
IEEE Trans. Multi Scale Comput. Syst., 2017

FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

ApproxEye: Enabling approximate computation reuse for microrobotic computer vision.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
EcoUp: Towards Economical Datacenter Upgrading.
IEEE Trans. Parallel Distributed Syst., 2016

CoreRank: Redeeming "Sick Silicon" by Dynamically Quantifying Core-Level Healthy Condition.
IEEE Trans. Computers, 2016

An Analytical Framework for Estimating Scale-Out and Scale-Up Power Efficiency of Heterogeneous Manycores.
IEEE Trans. Computers, 2016

Wide Operational Range Processor Power Delivery Design for Both Super-Threshold Voltage and Near-Threshold Voltage Computing.
J. Comput. Sci. Technol., 2016

PowerCap: Leverage Performance-Equivalent Resource Configurations for power capping.
Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

ACR: Enabling computation reuse for approximate computing.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015
RISO: Enforce Noninterfered Performance With Relaxed Network-on-Chip Isolation in Many-Core Cloud Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2015

ShuttleNoC: Boosting on-chip communication efficiency by enabling localized power adaptation.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014
Orchestrator: Guarding Against Voltage Emergencies in Multithreaded Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2014

SmartCap: Using Machine Learning for Power Adaptation of Smartphone's Application Processor.
ACM Trans. Design Autom. Electr. Syst., 2014

SuperRange: Wide operational range power delivery design for both STV and NTV computing.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

On-Chip Delay Sensor for Environments with Large Temperature Fluctuations.
Proceedings of the 23rd IEEE Asian Test Symposium, 2014

Amphisbaena: Modeling two orthogonal ways to hunt on heterogeneous many-cores.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013
SmartCap: user experience-oriented power adaptation for smartphone's application processor.
Proceedings of the Design, Automation and Test in Europe, 2013

Orchestrator: a low-cost solution to reduce voltage emergencies for multi-threaded applications.
Proceedings of the Design, Automation and Test in Europe, 2013

RISO: relaxed network-on-chip isolation for cloud processors.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012
AgileRegulator: A hybrid voltage regulator scheme redeeming dark silicon for power efficiency in a multicore architecture.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

2011
SVFD: A Versatile Online Fault Detection Scheme via Checking of Stability Violation.
IEEE Trans. Very Large Scale Integr. Syst., 2011

MicroFix: Using timing interpolation and delay sensors for power reduction.
ACM Trans. Design Autom. Electr. Syst., 2011

ReviveNet: A Self-Adaptive Architecture for Improving Lifetime Reliability via Localized Timing Adaptation.
IEEE Trans. Computers, 2011

Online timing variation tolerance for digital integrated circuits.
Proceedings of the 2011 IEEE International Test Conference, 2011

2010
Performance-asymmetry-aware scheduling for Chip Multiprocessors with static core coupling.
J. Syst. Archit., 2010

Leveraging the core-level complementary effects of PVT variations to reduce timing emergencies in multi-core processors.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

2009
Variation-Aware Scheduling for Chip Multiprocessors with Thread Level Redundancy.
Proceedings of the 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing, 2009

MicroFix: exploiting path-grained timing adaptability for improving power-performance efficiency.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

A unified online Fault Detection scheme via checking of Stability Violation.
Proceedings of the Design, Automation and Test in Europe, 2009

M-IVC: Using Multiple Input Vectors to Minimize Aging-Induced Delay.
Proceedings of the Eighteenth Asian Test Symposium, 2009

2008
BAT: Performance-Driven Crosstalk Mitigation Based on Bus-Grouping Asynchronous Transmission.
IEICE Trans. Electron., 2008

