Yinhe Han

ACM Trans. Design Autom. Electr. Syst., January, 2024

DaDu-E: Rethinking the Role of Large Language Model in Robotic Computing Pipeline.

CoRR, 2024

PIMCOMP: An End-to-End DNN Compiler for Processing-In-Memory Accelerators.

CoRR, 2024

COMET: Towards Partical W4A4KV4 LLMs Serving.

CoRR, 2024

BitQ: Tailoring Block Floating Point Precision for Improved DNN Efficiency on Resource-Constrained Devices.

CoRR, 2024

KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems.

CoRR, 2024

SuperEncoder: Towards Universal Neural Approximate Quantum State Preparation.

CoRR, 2024

Natural language is not enough: Benchmarking multi-modal generative AI for Verilog generation.

CoRR, 2024

Corki: Enabling Real-time Embodied AI Robots via Algorithm-Architecture Co-Design.

CoRR, 2024

JANM-IK: Jacobian Argumented Nelder-Mead Algorithm for Inverse Kinematics and its Hardware Acceleration.

IEEE Comput. Archit. Lett., 2024

TMiner: A Vertex-Based Task Scheduling Architecture for Graph Pattern Mining.

Zerun Li

Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

Accelerating Frequency-domain Convolutional Neural Networks Inference using FPGAs.

Proceedings of the IEEE International Symposium on Circuits and Systems, 2024

HEX-SIM: Evaluating Multi-modal Large Language Models on Multi-chiplet NPUs.

Proceedings of the IEEE International Symposium on Workload Characterization, 2024

Depth-NeuS: Neural Implicit Surfaces Learning for Multi-view Reconstruction Based on Depth Information Optimization.

Proceedings of the Advanced Intelligent Computing Technology and Applications, 2024

AceMiner: Accelerating Graph Pattern Matching using PIM with Optimized Cache System.

Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

MemSort: In-Memory Sorting Architecture.

Proceedings of the 42nd IEEE International Conference on Computer Design, 2024

PIMSIM-NN: An ISA-based Simulation Framework for Processing-in-Memory Accelerators.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

PIMSYN: Synthesizing Processing-in-Memory CNN Accelerators.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

GPACE: An Energy-Efficient PQ-Based GCN Accelerator with Redundancy Reduction.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Drift: Leveraging Distribution-based Dynamic Precision Quantization for Efficient Deep Neural Network Acceleration.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Chiplever: Towards Effortless Extension of Chiplet-based System for FHE.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

Data is all you need: Finetuning LLMs for Chip Design via an Automated design-data augmentation framework.

Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

PrimePar: Efficient Spatial-temporal Tensor Partitioning for Large Transformer Model Training.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

ACES: Accelerating Sparse Matrix Multiplication with Adaptive Execution Flow and Concurrency-Aware Cache Optimizations.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

ORIANNA: An Accelerator Generation Framework for Optimization-based Robotic Applications.

Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

Chipletizer: Repartitioning SoCs for Cost-Effective Chiplet Integration.

Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024

2023

Frequency-Domain Inference Acceleration for Convolutional Neural Networks Using ReRAMs.

IEEE Trans. Parallel Distributed Syst., December, 2023

FeCrypto: Instruction Set Architecture for Cryptographic Algorithms Based on FeFET-Based In-Memory Computing.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., September, 2023

IVP: An Intelligent Video Processing Architecture for Video Streaming.

IEEE Trans. Computers, 2023

RBDCore: Robot Rigid Body Dynamics Accelerator with Multifunctional Pipelines.

CoRR, 2023

ChipGPT: How far are we from natural language hardware design.

CoRR, 2023

Depth-NeuS: Neural Implicit Surfaces Learning for Multi-view Reconstruction Based on Depth Information Optimization.

CoRR, 2023

Dadu-RBD: Robot Rigid Body Dynamics Accelerator with Multifunctional Pipelines.

Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

Hardware-Software Co-Design for Content-Based Sparse Attention.

Proceedings of the 41st IEEE International Conference on Computer Design, 2023

PANG: A Pattern-Aware GCN Accelerator for Universal Graphs.

Proceedings of the 41st IEEE International Conference on Computer Design, 2023

Full State Quantum Circuit Simulation Beyond Memory Limit.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

LIM-GEN: A Data-Guided Framework for Automated Generation of Heterogeneous Logic-in-Memory Architecture.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

Meltrix: A RRAM-Based Polymorphic Architecture Enhanced by Function Synthesis.

Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

CTA: Hardware-Software Co-design for Compressed Token Attention Mechanism.

Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

ENASA: Towards Edge Neural Architecture Search based on CIM acceleration.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

Layer-Puzzle: Allocating and Scheduling Multi-task on Multi-core NPUs by Using Layer Heterogeneity.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023

FSPA: An FeFET-based Sparse Matrix-Dense Vector Multiplication Accelerator.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

PIMCOMP: A Universal Compilation Framework for Crossbar-based PIM DNN Accelerators.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

APPEND: Rethinking ASIP Synthesis in the Era of AI.

Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Accelerating Convolutional Neural Networks in Frequency Domain via Kernel-Sharing Approach.

Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

2022

Load Balance Guaranteed Vehicle-to-Vehicle Computation Offloading for Min-Max Fairness in VANETs.

IEEE Trans. Intell. Transp. Syst., 2022

Toward Efficient Computing for Robotics: From a Circuit and System View.

IEEE Trans. Circuits Syst. II Express Briefs, 2022

Re-FeMAT: A Reconfigurable Multifunctional FeFET-Based Memory Architecture.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Search-Free Inference Acceleration for Sparse Convolutional Neural Networks.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Amphis: Managing Reconfigurable Processor Architectures With Generative Adversarial Learning.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Neural-PIM: Efficient Processing-In-Memory With Neural Approximation of Peripherals.

IEEE Trans. Computers, 2022

Reconfiguration algorithms for synchronous communication on switch based degradable arrays.

Thambipillai Srikanthan

Parallel Comput., 2022

STC-NAS: Fast neural architecture search with source-target consistency.

Neurocomputing, 2022

Fast and High-Accuracy Approximate MAC Unit Design for CNN Computing.

IEEE Embed. Syst. Lett., 2022

Dadu-SV: Accelerate Stereo Vision Processing on NPU.

IEEE Embed. Syst. Lett., 2022

Survey on chiplets: interface, interconnect and integration methodology.

CCF Trans. High Perform. Comput., 2022

LINAC: A Spatially Linear Accelerator for Convolutional Neural Networks.

IEEE Comput. Archit. Lett., 2022

DeepBurning-SEG: Generating DNN Accelerators of Segment-Grained Pipeline Architecture.

Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

Closing the Dynamics Gap via Adversarial and Reinforcement Learning for High-Speed Racing.

Proceedings of the International Joint Conference on Neural Networks, 2022

Searching for BurgerFormer with Micro-Meso-Macro Space Design.

Proceedings of the International Conference on Machine Learning, 2022

AGNAS: Attention-Guided Micro and Macro-Architecture Search.

Proceedings of the International Conference on Machine Learning, 2022

GIA: A Reusable General Interposer Architecture for Agile Chiplet Integration.

Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

PME: Processing-in-memory Masking and Encoding for Secure NVM.

Zhiwen Xie

Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

AIC-Bench: Workload Selection Methodology for Benchmarking AI Chips.

Zhenyu Quan

P3S: A High Accuracy Probabilistic Prediction Processing System for CNN Acceleration.

Proceedings of the GLSVLSI '22: Great Lakes Symposium on VLSI 2022, Irvine CA USA, June 6, 2022

Energy-Efficient In-SRAM Accumulation for CMOS-based CNN Accelerators.

Wanqian Li

Proceedings of the GLSVLSI '22: Great Lakes Symposium on VLSI 2022, Irvine CA USA, June 6, 2022

GraphRing: an HMC-ring based graph processing framework with optimized data movement.

Zerun Li

Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

FeMIC: Multi-Operands in-Memory Computing Based on FeFETs.

Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

Optimal Data Allocation for Graph Processing in Processing-in-Memory Systems.

Zerun Li

Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

Solving Least-Squares Fitting in $O(1)$ Using RRAM-based Computing-in-Memory Technique.

Proceedings of the 27th Asia and South Pacific Design Automation Conference, 2022

2021

Design and Analysis of Energy-Efficient Dynamic Range Approximate Logarithmic Multipliers for Machine Learning.

IEEE Trans. Sustain. Comput., 2021

Defect Analysis and Parallel Testing for 3D Hybrid CMOS-Memristor Memory.

IEEE Trans. Emerg. Top. Comput., 2021

Integrating Two Logics Into One Crossbar Array for Logic Gate Design.

IEEE Trans. Circuits Syst. II Express Briefs, 2021

Dadu-Eye: A 5.3 TOPS/W, 30 fps/1080p High Accuracy Stereo Vision Accelerator.

IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Fault Modeling and Efficient Testing of Memristor-Based Memory.

Krishnendu Chakrabarty

IEEE Trans. Circuits Syst. I Regul. Pap., 2021

Breaking the von Neumann bottleneck: architecture-level processing-in-memory technology.

Sci. China Inf. Sci., 2021

RRAM-based Analog In-Memory Computing: Invited Paper.

Tao Song

Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2021

CoPIM: A Concurrency-aware PIM Workload Offloading Architecture for Graph Applications.

Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2021

KFS-LIO: Key-Feature Selection for Lightweight Lidar Inertial Odometry.

Proceedings of the IEEE International Conference on Robotics and Automation, 2021

Eliminating Iterations of Iterative Methods: Solving Large-Scale Sparse Linear System in $O(1)$ with RRAM-based In-Memory Accelerator.

Tao Song

Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

BRAHMS: Beyond Conventional RRAM-based Neural Network Accelerators Using Hybrid Analog Memory System.

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

F3D: Accelerating 3D Convolutional Neural Networks in Frequency Space Using ReRAM.

Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

FePIM: Contention-Free In-Memory Computing Based on Ferroelectric Field-Effect Transistors.

Yuping Wu

Proceedings of the ASPDAC '21: 26th Asia and South Pacific Design Automation Conference, 2021

DDSAS: Dynamic and Differentiable Space-Architecture Search.

Proceedings of the Asian Conference on Machine Learning, 2021

2020

Architecting Effectual Computation for Machine Learning Accelerators.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Swallow: A Versatile Accelerator for Sparse Neural Networks.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Two-Stage Safe Reinforcement Learning for High-Speed Autonomous Racing.

Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics, 2020

DaDu Series - Fast and Efficient Robot Accelerators.

Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

Communication Lower Bound in Convolution Accelerators.

Yu Wang

Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Accelerating RRT Motion Planning Using TCAM.

Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

TUPIM: A Transparent and Universal Processing-in-Memory Architecture for Unmodified Binaries.

Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Dadu-CD: Fast and Efficient Processing-in-Memory Accelerator for Collision Detection.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

An Efficient Deep Learning Accelerator for Compressed Video Analysis.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

PIM-Prune: Fine-Grain DCNN Pruning for Crossbar-Based Process-In-Memory Architecture.

Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Search-free Accelerator for Sparse Convolutional Neural Networks.

Proceedings of the 25th Asia and South Pacific Design Automation Conference, 2020

GNN-PIM: A Processing-in-Memory Architecture for Graph Neural Networks.

Proceedings of the Advanced Computer Architecture - 13th Conference, 2020

2019

moDNN: Memory Optimal Deep Neural Network Training on Graphics Processing Units.

IEEE Trans. Parallel Distributed Syst., 2019

Power and Area Efficient FPGA Building Blocks Based on Ferroelectric FETs.

IEEE Trans. Circuits Syst. I Regul. Pap., 2019

Thread: Towards fine-grained precision reconfiguration in variable-precision neural network accelerator.

IEICE Electron. Express, 2019

Accelerating DNN-based 3D point cloud processing for mobile computing.

Sci. China Inf. Sci., 2019

PIMSim: A Flexible and Detailed Processing-in-Memory Simulator.

IEEE Comput. Archit. Lett., 2019

China Test Conference (CTC) - Extending the Global Test Forum to China.

Huawei Li

Proceedings of the IEEE International Test Conference, 2019

Implementation of Parametric Hardware Trojan in FPGA.

Proceedings of the IEEE International Test Conference in Asia, 2019

FeMAT: Exploring In-Memory Processing in Multifunctional FeFET-Based Memory Array.

Xiaoyu Zhang

Proceedings of the 37th IEEE International Conference on Computer Design, 2019

ACG-Engine: An Inference Accelerator for Content Generative Neural Networks.

Proceedings of the International Conference on Computer-Aided Design, 2019

System-level hardware failure prediction using deep learning.

Xiaoyi Sun

Krishnendu Chakrabarty

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Merging Everything (ME): A Unified FPGA Architecture Based on Logic-in-Memory Techniques.

Proceedings of the 56th Annual Design Automation Conference 2019, 2019

CuckooPIM: an efficient and less-blocking coherence mechanism for processing-in-memory systems.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

Addressing the issue of processing element under-utilization in general-purpose systolic deep learning accelerators.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

Simulate-the-hardware: training accurate binarized neural networks for low-precision neural accelerators.

Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

2018

A Low Overhead In-Network Data Compressor for the Memory Hierarchy of Chip Multiprocessors.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

RT3D: Real-Time 3-D Vehicle Detection in LiDAR Point Cloud for Autonomous Driving.

IEEE Robotics Autom. Lett., 2018

CPicker: Leveraging Performance-Equivalent Configurations to Improve Data Center Energy Efficiency.

J. Comput. Sci. Technol., 2018

DimRouter: A Multi-Mode Router Architecture for Higher Energy-Proportionality of On-Chip Networks.

Shiqi Lian

Ying Wang

J. Comput. Sci. Technol., 2018

See and Think: Disentangling Semantic Scene Completion.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

A retrospective evaluation of energy-efficient object detection solutions on embedded devices.

Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

Dadu-P: a scalable accelerator for robot motion planning in a dynamic environment.

Proceedings of the 55th Annual Design Automation Conference, 2018

PIMCH: Cooperative memory prefetching in processing-in-memory architecture.

Proceedings of the 23rd Asia and South Pacific Design Automation Conference, 2018

2017

STT-RAM Buffer Design for Precision-Tunable General-Purpose Neural Network Accelerator.

IEEE Trans. Very Large Scale Integr. Syst., 2017

PowerTrader: Enforcing Autonomous Power Management for Future Large-Scale Many-Core Processors.

IEEE Trans. Multi Scale Comput. Syst., 2017

Exploiting the Potential of Computation Reuse Through Approximate Computing.

IEEE Trans. Multi Scale Comput. Syst., 2017

Retention-Aware DRAM Assembly and Repair for Future FGR Memories.

IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

FlexFlow: A Flexible Dataflow Accelerator Architecture for Convolutional Neural Networks.

Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Dadu: Accelerating Inverse Kinematics for High-DOF Robots.

Proceedings of the 54th Annual Design Automation Conference, 2017

CNN-based object detection solutions for embedded heterogeneous multicore SoCs.

Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

BoDNoC: Providing bandwidth-on-demand interconnection for multi-granularity memory systems.

Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

ApproxEye: Enabling approximate computation reuse for microrobotic computer vision.

Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016

PSI Conscious Write Scheduling: Architectural Support for Reliable Power Delivery in 3-D Die-Stacked PCM.

IEEE Trans. Very Large Scale Integr. Syst., 2016

VANUCA: Enabling Near-Threshold Voltage Operation in Large-Capacity Cache.

IEEE Trans. Very Large Scale Integr. Syst., 2016

Enhanced Wear-Rate Leveling for PRAM Lifetime Improvement Considering Process Variation.

IEEE Trans. Very Large Scale Integr. Syst., 2016

EcoUp: Towards Economical Datacenter Upgrading.

IEEE Trans. Parallel Distributed Syst., 2016

A Cost-Effective Energy Optimization Framework of Multicore SoCs Based on Dynamically Reconfigurable Voltage-Frequency Islands.

ACM Trans. Design Autom. Electr. Syst., 2016

An Analytical Framework for Estimating Scale-Out and Scale-Up Power Efficiency of Heterogeneous Manycores.

IEEE Trans. Computers, 2016

Statistical energy optimization on voltage-frequency island based MPSoCs in the presence of process variations.

Song Jin

Songwei Pei

Microelectron. J., 2016

Wide Operational Range Processor Power Delivery Design for Both Super-Threshold Voltage and Near-Threshold Voltage Computing.

J. Comput. Sci. Technol., 2016

PowerCap: Leverage Performance-Equivalent Resource Configurations for power capping.

Proceedings of the Seventh International Green and Sustainable Computing Conference, 2016

DeepBurning: automatic generation of FPGA-based learning accelerators for the neural network family.

Proceedings of the 53rd Annual Design Automation Conference, 2016

DISCO: a low overhead in-network data compressor for energy-efficient chip multi-processors.

Proceedings of the 53rd Annual Design Automation Conference, 2016

C-brain: a deep learning accelerator that tames the diversity of CNNs through adaptive data-level parallelization.

Proceedings of the 53rd Annual Design Automation Conference, 2016

ACR: Enabling computation reuse for approximate computing.

Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015

Data Remapping for Static NUCA in Degradable Chip Multiprocessors.

IEEE Trans. Very Large Scale Integr. Syst., 2015

Economizing TSV Resources in 3-D Network-on-Chip Design.

IEEE Trans. Very Large Scale Integr. Syst., 2015

RISO: Enforce Noninterfered Performance With Relaxed Network-on-Chip Isolation in Many-Core Cloud Processors.

IEEE Trans. Very Large Scale Integr. Syst., 2015

A signal degradation reduction method for memristor ratioed logic (MRL) gates.

IEICE Electron. Express, 2015

On optimizing system energy of multi-core SoCs based on dynamically reconfigurable voltage-frequency island.

Proceedings of the VLSI Design, Automation and Test, 2015

A case of precision-tunable STT-RAM memory design for approximate neural network.

Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

RADAR: a case for retention-aware DRAM assembly and repair in future FGR DRAM memory.

Proceedings of the 52nd Annual Design Automation Conference, 2015

ProPRAM: exploiting the transparent logic resources in non-volatile memory for near data computing.

Proceedings of the 52nd Annual Design Automation Conference, 2015

ShuttleNoC: Boosting on-chip communication efficiency by enabling localized power adaptation.

Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014

Test-Quality Optimization for Variable $n$-Detections of Transition Faults.

IEEE Trans. Very Large Scale Integr. Syst., 2014

ZoneDefense: A Fault-Tolerant Routing for 2-D Meshes Without Virtual Channels.

IEEE Trans. Very Large Scale Integr. Syst., 2014

SmartCap: Using Machine Learning for Power Adaptation of Smartphone's Application Processor.

ACM Trans. Design Autom. Electr. Syst., 2014

Reinventing Memory System Design for Many-Accelerator Architecture.

J. Comput. Sci. Technol., 2014

A General-Purpose Many-Accelerator Architecture Based on Dataflow Graph Clustering of Applications.

J. Comput. Sci. Technol., 2014

Data-aware DRAM refresh to squeeze the margin of retention time in hybrid memory cube.

Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

SuperRange: Wide operational range power delivery design for both STV and NTV computing.

Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2014

A low power DRAM refresh control scheme for 3D memory cube.

Ying Wang

Huawei Li

Proceedings of the 2014 IEEE Symposium on Low-Power and High-Speed Chips, 2014

Amphisbaena: Modeling two orthogonal ways to hunt on heterogeneous many-cores.

Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

Variation-aware statistical energy optimization on voltage-frequency island based MPSoCs under performance yield constraints.

Song Jin

Songwei Pei

Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013

Unified Capture Scheme for Small Delay Defect Detection and Aging Prediction.

IEEE Trans. Very Large Scale Integr. Syst., 2013

Thermal-Constrained Task Allocation for Interconnect Energy Reduction in 3-D Homogeneous MPSoCs.

IEEE Trans. Very Large Scale Integr. Syst., 2013

RevivePath: Resilient Network-on-Chip Design Through Data Path Salvaging of Router.

J. Comput. Sci. Technol., 2013

TSV Minimization for Circuit-Partitioned 3D SoC Test Wrapper Design.

J. Comput. Sci. Technol., 2013

On predicting NBTI-induced circuit aging by isolating leakage change.

Proceedings of the International Symposium on Quality Electronic Design, 2013

Enabling Near-Threshold Voltage (NTV) operation in Multi-VDD cache for power reduction.

Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

SmartCap: user experience-oriented power adaptation for smartphone's application processor.

Proceedings of the Design, Automation and Test in Europe, 2013

RISO: relaxed network-on-chip isolation for cloud processors.

Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012

M-IVC: Applying multiple input vectors to co-optimize aging and leakage.

Song Jin

Microelectron. J., 2012

AgileRegulator: A hybrid voltage regulator scheme redeeming dark silicon for power efficiency in a multicore architecture.

Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012

A clustering-based scheme for concurrent trace in debugging NoC-based multicore systems.

Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

2011

SVFD: A Versatile Online Fault Detection Scheme via Checking of Stability Violation.

Guihai Yan

IEEE Trans. Very Large Scale Integr. Syst., 2011

MicroFix: Using timing interpolation and delay sensors for power reduction.

ACM Trans. Design Autom. Electr. Syst., 2011

ReviveNet: A Self-Adaptive Architecture for Improving Lifetime Reliability via Localized Timing Adaptation.

Guihai Yan

IEEE Trans. Computers, 2011

Statistical lifetime reliability optimization considering joint effect of process variation and aging.

Integr., 2011

A New Multiple-Round Dimension-Order Routing for Networks-on-Chip.

IEICE Trans. Inf. Syst., 2011

An abacus turn model for time/space-efficient reconfigurable routing.

Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Flex memory: Exploiting and managing abundant off-chip optical bandwidth.

Proceedings of the Design, Automation and Test in Europe, 2011

Eliminating data invalidation in debugging multiple-clock chips.

Jianliang Gao

Proceedings of the Design, Automation and Test in Europe, 2011

Wear rate leveling: lifetime enhancement of PRAM with endurance variation.

Proceedings of the 48th Design Automation Conference, 2011

Wrapper Chain Design for Testing TSVs Minimization in Circuit-Partitioned 3D SoC.

Proceedings of the 20th IEEE Asian Test Symposium, 2011

A resilient on-chip router design through data path salvaging.

Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

Vertical interconnects squeezing in symmetric 3D mesh Network-on-Chip.

Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

2010

Performance-asymmetry-aware scheduling for Chip Multiprocessors with static core coupling.

J. Syst. Archit., 2010

Extended Selective Encoding of Scan Slices for Reducing Test Data and Test Power.

Jun Liu

IEICE Trans. Inf. Syst., 2010

A Novel Post-Silicon Debug Mechanism Based on Suspect Window.

Jianliang Gao

IEICE Trans. Inf. Syst., 2010

Address Remapping for Static NUCA in NoC-Based Degradable Chip-Multiprocessors.

Proceedings of the 16th IEEE Pacific Rim International Symposium on Dependable Computing, 2010

nGFSIM: A GPU-based fault simulator for 1-to-n detection and its applications.

Proceedings of the 2011 IEEE International Test Conference, 2010

Leveraging the core-level complementary effects of PVT variations to reduce timing emergencies in multi-core processors.

Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Performance-asymmetry-aware topology virtualization for defect-tolerant NoC-based many-core processors.

Proceedings of the Design, Automation and Test in Europe, 2010

Accelerating Lightpath setup via broadcasting in binary-tree waveguide in Optical NoCs.

Proceedings of the Design, Automation and Test in Europe, 2010

P$^{2}$CLRAF: An Pre- and Post-Silicon Cooperated Circuit Lifetime Reliability Analysis Framework.

[BibT_eX]

[DOI]

Proceedings of the 19th IEEE Asian Test Symposium, 2010

2009

On Topology Reconfiguration for Defect-Tolerant NoC-Based Homogeneous Manycore Systems.

[BibT_eX]

[DOI]

IEEE Trans. Very Large Scale Integr. Syst., 2009

A New Post-Silicon Debug Approach Based on Suspect Window.

[BibT_eX]

[DOI]

Jianliang Gao

Proceedings of the 27th IEEE VLSI Test Symposium, 2009

A New Multiple-Round DOR Routing for 2D Network-on-Chip Meshes.

[BibT_eX]

[DOI]

Proceedings of the 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing, 2009

Variation-Aware Scheduling for Chip Multiprocessors with Thread Level Redundancy.

[BibT_eX]

[DOI]

Proceedings of the 2009 15th IEEE Pacific Rim International Symposium on Dependable Computing, 2009

MicroFix: exploiting path-grained timing adaptability for improving power-performance efficiency.

[BibT_eX]

[DOI]

Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

A unified online Fault Detection scheme via checking of Stability Violation.

[BibT_eX]

[DOI]

Guihai Yan

Proceedings of the Design, Automation and Test in Europe, 2009

A Scalable Scan Architecture for Godson-3 Multicore Microprocessor.

[BibT_eX]

[DOI]

Proceedings of the Eighteenth Asian Test Symposium, 2009

M-IVC: Using Multiple Input Vectors to Minimize Aging-Induced Delay.

[BibT_eX]

[DOI]

Proceedings of the Eighteenth Asian Test Symposium, 2009

2008

BAT: Performance-Driven Crosstalk Mitigation Based on Bus-Grouping Asynchronous Transmission.

[BibT_eX]

[DOI]

IEICE Trans. Electron., 2008

Defect Tolerance in Homogeneous Manycore Processors Using Core-Level Redundancy with Unified Topology.

[BibT_eX]

[DOI]

Proceedings of the Design, Automation and Test in Europe, 2008

2007

Embedded Test Decompressor to Reduce the Required Channels and Vector Memory of Tester for Complex Processor Circuit.

[BibT_eX]

[DOI]

IEEE Trans. Very Large Scale Integr. Syst., 2007

Leakage Current Optimization Techniques During Test Based on Don't Care Bits Assignment.

[BibT_eX]

[DOI]

J. Comput. Sci. Technol., 2007

Frequency Analysis Method for Propagation of Transient Errors in Combinational Logic.

[BibT_eX]

[DOI]

Shaohua Lei

Proceedings of the 16th Asian Test Symposium, 2007

2006

Embedded test resource for SoC to reduce required tester channels based on advanced convolutional codes.

[BibT_eX]

[DOI]

IEEE Trans. Instrum. Meas., 2006

Compression/Scan Co-design for Reducing Test Data Volume, Scan-in Power Dissipation, and Test Application Time.

[BibT_eX]

[DOI]

IEICE Trans. Inf. Syst., 2006

Response compaction for system-on-a-chip based on advanced convolutional codes.

[BibT_eX]

[DOI]

Sci. China Ser. F Inf. Sci., 2006

An on-chip combinational decompressor for reducing test data volume.

[BibT_eX]

[DOI]

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Fast Packet Classification using Group Bit Vector.

[BibT_eX]

[DOI]

Proceedings of the Global Telecommunications Conference, 2006. GLOBECOM '06, San Francisco, CA, USA, 27 November, 2006

Test data compression based on clustered random access scan.

[BibT_eX]

[DOI]

Proceedings of the 15th Asian Test Symposium, 2006

2005

Test Resource Partitioning Based on Efficient Response Compaction for Test Time and Tester Channels Reduction.

[BibT_eX]

[DOI]

J. Comput. Sci. Technol., 2005

Wrapper Scan Chains Design for Rapid and Low Power Testing of Embedded Cores.

[BibT_eX]

[DOI]

IEICE Trans. Inf. Syst., 2005

SoC Leakage Power Reduction Algorithm by Input Vector Control.

[BibT_eX]

[DOI]

Proceedings of the 2005 International Symposium on System-on-Chip, 2005

Using MUXs Network to Hide Bunches of Scan Chains.

[BibT_eX]

[DOI]

Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005

Validation analysis and test flow optimization of VLSI chip.

[BibT_eX]

[DOI]

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

Deterministic and low power BIST based on scan slice overlapping.

[BibT_eX]

[DOI]

Ji Li

Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

Scan Data Volume Reduction Using Periodically Alterable MUXs Decompressor.

[BibT_eX]

[DOI]

Shivakumar Swaminathan

Yu Hu

Anshuman Chandra

Proceedings of the 14th Asian Test Symposium (ATS 2005), 2005

Theoretic analysis and enhanced X-tolerance of test response compact based on convolutional code.

[BibT_eX]

[DOI]

Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

2004

Response Compaction for Test Time and Test Pins Reduction Based on Advanced Convolutional Codes.

[BibT_eX]

[DOI]

Proceedings of the 19th IEEE International Symposium on Defect and Fault-Tolerance in VLSI Systems (DFT 2004), 2004

Simultaneous Reduction of Test Data Volume and Testing Power for Scan-Based Test.

[BibT_eX]