Ying Wang

Orcid: 0000-0001-5172-4736

Affiliations:
  • Chinese Academy of Sciences, State Key Laboratory of Computer Architecture, Beijing, China


According to our database, Ying Wang authored at least 219 papers between 2010 and 2024.

Bibliography

2024
A Task-Adaptive In-Situ ReRAM Computing for Graph Convolutional Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., September, 2024

An Energy-Efficient In-Memory Accelerator for Graph Construction and Updating.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., June, 2024

An Automatic Neural Network Architecture-and-Quantization Joint Optimization Framework for Efficient Model Inference.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2024

Real-Time Robust Video Object Detection System Against Physical-World Adversarial Attacks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., January, 2024

A Deep Reinforcement Learning-Based Preemptive Approach for Cost-Aware Cloud Job Scheduling.
IEEE Trans. Sustain. Comput., 2024

Discovering Hierarchical Multi-Instance Business Processes From Event Logs.
IEEE Trans. Serv. Comput., 2024

COMET: Towards Partical W4A4KV4 LLMs Serving.
CoRR, 2024

SuperEncoder: Towards Universal Neural Approximate Quantum State Preparation.
CoRR, 2024

Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation.
CoRR, 2024

Approximate data mapping in refresh-free DRAM for energy-efficient computing in modern mobile systems.
Comput. Commun., 2024

HyQA: Hybrid Near-Data Processing Platform for Embedding Based Question Answering System.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Bit-Trimmer: Ineffectual Bit-Operation Removal for CLM Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

GPACE: An Energy-Efficient PQ-Based GCN Accelerator with Redundancy Reduction.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Drift: Leveraging Distribution-based Dynamic Precision Quantization for Efficient Deep Neural Network Acceleration.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Chiplever: Towards Effortless Extension of Chiplet-based System for FHE.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

PrimePar: Efficient Spatial-temporal Tensor Partitioning for Large Transformer Model Training.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

CIM-MLC: A Multi-level Compilation Stack for Computing-In-Memory Accelerators.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Chipletizer: Repartitioning SoCs for Cost-Effective Chiplet Integration.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

APoX: Accelerate Graph-Based Deep Point Cloud Analysis via Adaptive Graph Construction.
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

Chaos: Function Granularity Runtime Address Layout Space Randomization for Kernel Module.
Proceedings of the 15th ACM SIGOPS Asia-Pacific Workshop on Systems, 2024

2023
Soft Error Reliability Analysis of Vision Transformers.
IEEE Trans. Very Large Scale Integr. Syst., December, 2023

Toward Network-Aware Query Execution Systems in Large Datacenters.
IEEE Trans. Netw. Serv. Manag., December, 2023

Exploring Winograd Convolution for Cost-Effective Neural Network Fault Tolerance.
IEEE Trans. Very Large Scale Integr. Syst., November, 2023

Statistical Modeling of Soft Error Influence on Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

Accelerating Deformable Convolution Networks with Dynamic and Irregular Memory Accesses.
ACM Trans. Design Autom. Electr. Syst., July, 2023

A Coordinated Model Pruning and Mapping Framework for RRAM-Based DNN Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2023

Network Pruning for Bit-Serial Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2023

Variation Enhanced Attacks Against RRAM-Based Neuromorphic Computing System.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2023

A Survey of Non-Volatile Main Memory File Systems.
J. Comput. Sci. Technol., April, 2023

S² Loop: A Lightweight Spectral-Spatio Loop Closure Detector for Resource-Constrained Platforms.
IEEE Robotics Autom. Lett., March, 2023

On-Line Fault Protection for ReRAM-Based Neural Networks.
IEEE Trans. Computers, February, 2023

An Energy-Efficient Computing-in-Memory (CiM) Scheme Using Field-Free Spin-Orbit Torque (SOT) Magnetic RAMs.
IEEE Trans. Emerg. Top. Comput., 2023

A Framework for Neural Network Architecture and Compile Co-optimization.
ACM Trans. Embed. Comput. Syst., 2023

Optimus: An Operator Fusion Framework for Deep Neural Networks.
ACM Trans. Embed. Comput. Syst., 2023

IVP: An Intelligent Video Processing Architecture for Video Streaming.
IEEE Trans. Computers, 2023

Cross-Layer Optimization for Fault-Tolerant Deep Learning.
CoRR, 2023

MRFI: An Open Source Multi-Resolution Fault Injection Framework for Neural Network Processing.
CoRR, 2023

ChipGPT: How far are we from natural language hardware design.
CoRR, 2023

ApproxABFT: Approximate Algorithm-Based Fault Tolerance for Vision Transformers.
CoRR, 2023

Reliability Analysis of Vision Transformers.
CoRR, 2023

Communication-aware Quantization for Deep Learning Inference Parallelization on Chiplet-based Accelerators.
Proceedings of the 29th IEEE International Conference on Parallel and Distributed Systems, 2023

PANG: A Pattern-Aware GCN Accelerator for Universal Graphs.
Proceedings of the 41st IEEE International Conference on Computer Design, 2023

Full State Quantum Circuit Simulation Beyond Memory Limit.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

DeepBurning-MixQ: An Open Source Mixed-Precision Neural Network Accelerator Design Framework for FPGAs.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

Efficient Supernet Training Using Path Parallelism.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

CTA: Hardware-Software Co-design for Compressed Token Attention Mechanism.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

ENASA: Towards Edge Neural Architecture Search based on CIM acceleration.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Layer-Puzzle: Allocating and Scheduling Multi-task on Multi-core NPUs by Using Layer Heterogeneity.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

APPEND: Rethinking ASIP Synthesis in the Era of AI.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

AmgR: Algebraic Multigrid Accelerated on ReRAM.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Adversarial Testing: A Novel On-Line Testing Method for Deep Learning Processors.
Proceedings of the 32nd IEEE Asian Test Symposium, 2023

Occamy: Elastically Sharing a SIMD Co-processor across Multiple CPU Cores.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Deep Learning Compiler Optimization on Multi-Chiplet Architecture.
Proceedings of the 5th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2023

2022
Taming Process Variations in CNFET for Efficient Last-Level Cache Design.
IEEE Trans. Very Large Scale Integr. Syst., 2022

MPC-CSAS: Multi-Party Computation for Real-Time Privacy-Preserving Speed Advisory Systems.
IEEE Trans. Intell. Transp. Syst., 2022

A Low-Cost FPGA Implementation of Spiking Extreme Learning Machine With On-Chip Reward-Modulated STDP Learning.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

An Efficient Deep Learning Accelerator Architecture for Compressed Video Analysis.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

A Fast Precision Tuning Solution for Always-On DNN Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

HyCA: A Hybrid Computing Architecture for Fault-Tolerant Deep Learning.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

An Automated Quantization Framework for High-Utilization RRAM-Based PIM.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Saving Energy of RRAM-Based Neural Accelerator Through State-Aware Computing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Amphis: Managing Reconfigurable Processor Architectures With Generative Adversarial Learning.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

CAP: Communication-Aware Automated Parallelization for Deep Learning Inference on CMP Architectures.
IEEE Trans. Computers, 2022

Olympus: Reaching Memory-Optimality on DNN Processors.
IEEE Trans. Computers, 2022

TripleBrain: A Compact Neuromorphic Hardware Core With Fast On-Chip Self-Organizing and Reinforcement Spike-Timing Dependent Plasticity.
IEEE Trans. Biomed. Circuits Syst., 2022

Dadu-SV: Accelerate Stereo Vision Processing on NPU.
IEEE Embed. Syst. Lett., 2022

Real-Time Robust Video Object Detection System Against Physical-World Adversarial Attacks.
CoRR, 2022

Survey on chiplets: interface, interconnect and integration methodology.
CCF Trans. High Perform. Comput., 2022

Cognitive SSD+: a deep learning engine for energy-efficient unstructured data retrieval.
CCF Trans. High Perform. Comput., 2022

LINAC: A Spatially Linear Accelerator for Convolutional Neural Networks.
IEEE Comput. Archit. Lett., 2022

DeepBurning-SEG: Generating DNN Accelerators of Segment-Grained Pipeline Architecture.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Canopy: A CNFET-based Process Variation Aware Systolic DNN Accelerator.
Proceedings of the ISLPED '22: ACM/IEEE International Symposium on Low Power Electronics and Design, Boston, MA, USA, August 1, 2022

Security Threat to the Robustness of RRAM-based Neuromorphic Computing System.
Proceedings of the IEEE International Symposium on Smart Electronic Systems, 2022

Reexamining CGRA Memory Sub-system for Higher Memory Utilization and Performance.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

GIA: A Reusable General Interposer Architecture for Agile Chiplet Integration.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

MOCCA: A Process Variation Tolerant Systolic DNN Accelerator using CNFETs in Monolithic 3D.
Proceedings of the GLSVLSI '22: Great Lakes Symposium on VLSI 2022, Irvine CA USA, June 6, 2022

NoCeption: A Fast PPA Prediction Framework for Network-on-Chips Using Graph Neural Network.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

Winograd convolution: a perspective from fault tolerance.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

VStore: in-storage graph based vector search accelerator.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Processing-in-SRAM acceleration for ultra-low power visual 3D perception.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

InfoX: an energy-efficient ReRAM accelerator design with information-lossless low-bit ADCs.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Reliability Evaluation and Analysis of FPGA-Based Neural Network Acceleration System.
IEEE Trans. Very Large Scale Integr. Syst., 2021

R2F: A Remote Retraining Framework for AIoT Processors With Computing Errors.
IEEE Trans. Very Large Scale Integr. Syst., 2021

Network-Aware Locality Scheduling for Distributed Data Operators in Data Centers.
IEEE Trans. Parallel Distributed Syst., 2021

A NAND-SPIN-Based Magnetic ADC.
IEEE Trans. Circuits Syst. II Express Briefs, 2021

Dadu-Eye: A 5.3 TOPS/W, 30 fps/1080p High Accuracy Stereo Vision Accelerator.
IEEE Trans. Circuits Syst. I Regul. Pap., 2021

An Edge 3D CNN Accelerator for Low-Power Activity Recognition.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

EnGN: A High-Throughput and Energy-Efficient Accelerator for Large Graph Neural Networks.
IEEE Trans. Computers, 2021

CompSNN: A lightweight spiking neural network based on spatiotemporally compressive spike features.
Neurocomputing, 2021

Enhancing the security of memory in cloud infrastructure through in-phase change memory data randomisation.
IET Comput. Digit. Tech., 2021

Taming Process Variations in CNFET for Efficient Last Level Cache Design.
CoRR, 2021

Energy-Efficient Accelerator Design for Deformable Convolution Networks.
CoRR, 2021

To cloud or not to cloud: an on-line scheduler for dynamic privacy-protection of deep learning workload on edge devices.
CCF Trans. High Perform. Comput., 2021

Special Session - Test for AI Chips: from DFT to On-line Testing.
Proceedings of the 39th IEEE VLSI Test Symposium, 2021

GLIST: Towards In-Storage Graph Learning.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

CHaNAS: coordinated search for network architecture and scheduling policy.
Proceedings of the LCTES '21: 22nd ACM SIGPLAN/SIGBED International Conference on Languages, 2021

Optimus: towards optimal layer-fusion on deep learning processors.
Proceedings of the LCTES '21: 22nd ACM SIGPLAN/SIGBED International Conference on Languages, 2021

NASA: Accelerating Neural Network Design with a NAS Processor.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

PicoVO: A Lightweight RGB-D Visual Odometry Targeting Resource-Constrained IoT Devices.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

MT-DLA: An Efficient Multi-Task Deep Learning Accelerator Design.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Tenet: A Neural Network Model Extraction Attack in Multi-core Architecture.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

RECOIN: A Low-Power Processing-in-ReRAM Architecture for Deformable Convolution.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

Network-on-Interposer Design for Agile Neural-Network Processor Chip Customization.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

PixelSieve: Towards Efficient Activity Analysis From Compressed Video Streams.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

ASBP: Automatic Structured Bit-Pruning for RRAM-based NN Accelerator.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

TARe: Task-Adaptive in-situ ReRAM Computing for Graph Learning.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

GCiM: A Near-Data Processing Accelerator for Graph Construction.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

An Intelligent Video Processing Architecture for Edge-cloud Video Streaming.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

VADER: Leveraging the Natural Variation of Hardware to Enhance Adversarial Attack.
Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

Exploiting Memristors for Neuromorphic Reinforcement Learning.
Proceedings of the 3rd IEEE International Conference on Artificial Intelligence Circuits and Systems, 2021

2020
Bulkyflip: A NAND-SPIN-Based Last-Level Cache With Bandwidth-Oriented Write Management Policy.
IEEE Trans. Circuits Syst. I Regul. Pap., 2020

Field-Free 3T2SOT MRAM for Non-Volatile Cache Memories.
IEEE Trans. Circuits Syst., 2020

A Novel High Performance and Energy Efficient NUCA Architecture for STT-MRAM LLCs With Thermal Consideration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Accelerating Generative Neural Networks on Unmodified Deep Learning Processors - A Software Approach.
IEEE Trans. Computers, 2020

A High-Speed Low-Cost VLSI System Capable of On-Chip Online Learning for Dynamic Vision Sensor Data Classification.
Sensors, 2020

Write Back Energy Optimization for STT-MRAM-based Last-level Cache with Data Pattern Characterization.
ACM J. Emerg. Technol. Comput. Syst., 2020

Special Session - Emerging Memristor Based Memory and CIM Architecture: Test, Repair and Yield Analysis.
Proceedings of the 38th IEEE VLSI Test Symposium, 2020

Linear Symmetric Quantization of Neural Networks for Low-precision Integer Hardware.
Proceedings of the 8th International Conference on Learning Representations, 2020

A Hybrid Computing Architecture for Fault-tolerant Deep Learning Accelerators.
Proceedings of the 38th IEEE International Conference on Computer Design, 2020

A Many-Core Accelerator Design for On-Chip Deep Reinforcement Learning.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

DeepBurning-GL: an Automated Framework for Generating Graph Neural Network Accelerators.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

HitM: High-Throughput ReRAM-based PIM for Multi-Modal Neural Networks.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

Multi-task Scheduling for PIM-based Heterogeneous Computing System.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Towards Best-effort Approximation: Applying NAS to General-purpose Approximate Computing.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

You Only Search Once: A Fast Automation Framework for Single-Stage DNN/Accelerator Co-design.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

CNT-Cache: an Energy-Efficient Carbon Nanotube Cache with Adaptive Encoding.
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020

BitPruner: Network Pruning for Bit-serial Accelerators.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

An Efficient Deep Learning Accelerator for Compressed Video Analysis.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

RaQu: An automatic high-utilization CNN quantization and mapping framework for general-purpose RRAM Accelerator.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Towards State-Aware Computation in ReRAM Neural Networks.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Search-free Accelerator for Sparse Convolutional Neural Networks.
Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

Persistent Fault Analysis of Neural Networks on FPGA-based Acceleration System.
Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020

2019
An Adaptive Thermal-Aware ECC Scheme for Reliable STT-MRAM LLC Design.
IEEE Trans. Very Large Scale Integr. Syst., 2019

Cluster Restoration-Based Trace Signal Selection for Post-Silicon Debug.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

A QoS-QoR Aware CNN Accelerator Design Approach.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Load-balancing distributed outer joins through operator decomposition.
J. Parallel Distributed Comput., 2019

MV-Net: Toward Real-Time Deep Learning on Mobile GPGPU Systems.
ACM J. Emerg. Technol. Comput. Syst., 2019

Thread: Towards fine-grained precision reconfiguration in variable-precision neural network accelerator.
IEICE Electron. Express, 2019

Correlation of Gut Microbiome Between ASD Children and Mothers and Potential Biomarkers for Risk Assessment.
Genom. Proteom. Bioinform., 2019

PIMSim: A Flexible and Detailed Processing-in-Memory Simulator.
IEEE Comput. Archit. Lett., 2019

Modeling of Mixing Uniformity for Food With Special Medicinal Purposes Based on Chinese Herbal Medicine.
IEEE Access, 2019

Leveraging Memory PUFs and PIM-based encryption to secure edge deep learning systems.
Proceedings of the 37th IEEE VLSI Test Symposium, 2019

Cognitive SSD: A Deep Learning Engine for In-Storage Data Retrieval.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Squeezing the Last MHz for CNN Acceleration on FPGAs.
Proceedings of the IEEE International Test Conference in Asia, 2019

OBFS: OpenCL Based BFS Optimizations on Software Programmable FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2019

RRAMedy: Protecting ReRAM-Based Neural Network from Permanent and Soft Faults During Its Lifetime.
Proceedings of the 37th IEEE International Conference on Computer Design, 2019

ACG-Engine: An Inference Accelerator for Content Generative Neural Networks.
Proceedings of the International Conference on Computer-Aided Design, 2019

An Agile Precision-Tunable CNN Accelerator based on ReRAM.
Proceedings of the International Conference on Computer-Aided Design, 2019

InS-DLA: An In-SSD Deep Learning Accelerator for Near-Data Processing.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019

Learn-to-Scale: Parallelizing Deep Learning Inference on Chip Multiprocessor Architecture.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Systolic Cube: A Spatial 3D CNN Accelerator Architecture for Low Power Video Analysis.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A None-Sparse Inference Accelerator that Distills and Reuses the Computation Redundancy in CNNs.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Exploring emerging CNFET for efficient last level cache design.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

CuckooPIM: an efficient and less-blocking coherence mechanism for processing-in-memory systems.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

Addressing the issue of processing element under-utilization in general-purpose systolic deep learning accelerators.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

P³M: a PIM-based neural network model protection scheme for deep learning accelerator.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

Simulate-the-hardware: training accurate binarized neural networks for low-precision neural accelerators.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

Resilient Neural Network Training for Accelerators with Computing Errors.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

2018
A Case of On-Chip Memory Subsystem Design for Low-Power CNN Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

A Low Overhead In-Network Data Compressor for the Memory Hierarchy of Chip Multiprocessors.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

On Trace Buffer Reuse-Based Trigger Generation in Post-Silicon Debug.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

DimRouter: A Multi-Mode Router Architecture for Higher Energy-Proportionality of On-Chip Networks.
J. Comput. Sci. Technol., 2018

Fault tolerance on-chip: a reliable computing paradigm using self-test, self-diagnosis, and self-repair (3S) approach.
Sci. China Inf. Sci., 2018

Exploiting Lightweight Statistical Learning for Event-Based Vision Processing.
IEEE Access, 2018

Lightweight Timing Channel Protection for Shared DRAM Controller.
Proceedings of the IEEE International Test Conference, 2018

Leveraging DRAM Refresh to Protect the Memory Timing Channel of Cloud Chip Multi-processors.
Proceedings of the IEEE International Test Conference in Asia, 2018

MTTF-Aware Reliability Task Scheduling for PIM-Based Heterogeneous Computing System.
Proceedings of the IEEE International Test Conference in Asia, 2018

NEAR: A Novel Energy Aware Replacement Policy for STT-MRAM LLCs.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2018

FCN-engine: accelerating deconvolutional layers in classic CNN processors.
Proceedings of the International Conference on Computer-Aided Design, 2018

Caching or Not: Rethinking Virtual File System for Non-Volatile Main Memory.
Proceedings of the 10th USENIX Workshop on Hot Topics in Storage and File Systems, 2018

A retrospective evaluation of energy-efficient object detection solutions on embedded devices.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Dadu-P: a scalable accelerator for robot motion planning in a dynamic environment.
Proceedings of the 55th Annual Design Automation Conference, 2018

XORiM: A case of in-memory bit-comparator implementation and its performance implications.
Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

PIMCH: Cooperative memory prefetching in processing-in-memory architecture.
Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017
Resilience-Aware Frequency Tuning for Neural-Network-Based Approximate Computing Chips.
IEEE Trans. Very Large Scale Integr. Syst., 2017

STT-RAM Buffer Design for Precision-Tunable General-Purpose Neural Network Accelerator.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Retention-Aware DRAM Assembly and Repair for Future FGR Memories.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Power-Utility-Driven Write Management for MLC PCM.
ACM J. Emerg. Technol. Comput. Syst., 2017

Flip-flop clustering based trace signal selection for post-silicon debug.
Proceedings of the 35th IEEE VLSI Test Symposium, 2017

A Coflow-Based Co-Optimization Framework for High-Performance Data Analytics.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Thermosiphon: A thermal aware NUCA architecture for write energy reduction of the STT-MRAM based LLCs.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

Cross-program design space exploration by ensemble transfer learning.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

Real-Time Meets Approximate Computing: An Elastic CNN Inference Accelerator with Adaptive Trade-off between QoS and QoR.
Proceedings of the 54th Annual Design Automation Conference, 2017

Dadu: Accelerating Inverse Kinematics for High-DOF Robots.
Proceedings of the 54th Annual Design Automation Conference, 2017

Selective off-loading to Memory: Task Partitioning and Mapping for PIM-enabled Heterogeneous Systems.
Proceedings of the Computing Frontiers Conference, 2017

Test and Reliability of Emerging Non-volatile Memories.
Proceedings of the 26th IEEE Asian Test Symposium, 2017

CNN-based object detection solutions for embedded heterogeneous multicore SoCs.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

ApproxPIM: Exploiting realistic 3D-stacked DRAM for energy-efficient processing in-memory.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

BoDNoC: Providing bandwidth-on-demand interconnection for multi-granularity memory systems.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
PSI Conscious Write Scheduling: Architectural Support for Reliable Power Delivery in 3-D Die-Stacked PCM.
IEEE Trans. Very Large Scale Integr. Syst., 2016

VANUCA: Enabling Near-Threshold Voltage Operation in Large-Capacity Cache.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Enhanced Wear-Rate Leveling for PRAM Lifetime Improvement Considering Process Variation.
IEEE Trans. Very Large Scale Integr. Syst., 2016

Re-architecting the on-chip memory sub-system of machine-learning accelerator for embedded devices.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

Frequency scheduling for resilient chip multi-processors operating at Near Threshold Voltage.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

DeepBurning: automatic generation of FPGA-based learning accelerators for the neural network family.
Proceedings of the 53rd Annual Design Automation Conference, 2016

DISCO: a low overhead in-network data compressor for energy-efficient chip multi-processors.
Proceedings of the 53rd Annual Design Automation Conference, 2016

C-brain: a deep learning accelerator that tames the diversity of CNNs through adaptive data-level parallelization.
Proceedings of the 53rd Annual Design Automation Conference, 2016

2015
Data Remapping for Static NUCA in Degradable Chip Multiprocessors.
IEEE Trans. Very Large Scale Integr. Syst., 2015

Economizing TSV Resources in 3-D Network-on-Chip Design.
IEEE Trans. Very Large Scale Integr. Syst., 2015

RISO: Enforce Noninterfered Performance With Relaxed Network-on-Chip Isolation in Many-Core Cloud Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2015

A signal degradation reduction method for memristor ratioed logic (MRL) gates.
IEICE Electron. Express, 2015

An architecture-level cache simulation framework supporting advanced PMA STT-MRAM.
Proceedings of the 2015 IEEE/ACM International Symposium on Nanoscale Architectures, 2015

A Similarity Based Circuit Partitioning and Trimming Method to Defend against Hardware Trojans.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

A case of precision-tunable STT-RAM memory design for approximate neural network.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

Retraining-based timing error mitigation for hardware neural networks.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

RADAR: a case for retention-aware DRAM assembly and repair in future FGR DRAM memory.
Proceedings of the 52nd Annual Design Automation Conference, 2015

ProPRAM: exploiting the transparent logic resources in non-volatile memory for near data computing.
Proceedings of the 52nd Annual Design Automation Conference, 2015

TURO: A lightweight turn-guided routing scheme for 3D NoCs.
Proceedings of the 2015 IEEE Symposium in Low-Power and High-Speed Chips, 2015

TWiN: A Turn-Guided Reliable Routing Scheme for Wireless 3D NoCs.
Proceedings of the 24th IEEE Asian Test Symposium, 2015

A Lightweight Timing Channel Protection for Shared Memory Controllers.
Proceedings of the 24th IEEE Asian Test Symposium, 2015

ShuttleNoC: Boosting on-chip communication efficiency by enabling localized power adaptation.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014
Reinventing Memory System Design for Many-Accelerator Architecture.
J. Comput. Sci. Technol., 2014

Data-aware DRAM refresh to squeeze the margin of retention time in hybrid memory cube.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

A low power DRAM refresh control scheme for 3D memory cube.
Proceedings of the 2014 IEEE Symposium on Low-Power and High-Speed Chips, 2014

2013
Enabling Near-Threshold Voltage(NTV) operation in Multi-VDD cache for power reduction.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

2011
Flex memory: Exploiting and managing abundant off-chip optical bandwidth.
Proceedings of the Design, Automation and Test in Europe, 2011

Wear rate leveling: lifetime enhancement of PRAM with endurance variation.
Proceedings of the 48th Design Automation Conference, 2011

2010
Address Remapping for Static NUCA in NoC-Based Degradable Chip-Multiprocessors.
Proceedings of the 16th IEEE Pacific Rim International Symposium on Dependable Computing, 2010

