Yuan Xie

ORCID: 0000-0003-2093-1788

Affiliations:
  • Alibaba DAMO Academy
  • University of California at Santa Barbara, CA, USA (former)
  • Pennsylvania State University, University Park, PA, USA (2003 - 2013)
  • Princeton University, Princeton, NJ, USA (PhD 2002)


According to our database, Yuan Xie authored at least 559 papers between 2000 and 2024.

Bibliography

2024
Optimizing NVMe Storage for Large-Scale Deployment: Key Technologies and Strategies in Alibaba Cloud.
IEEE Micro, 2024

A Comprehensive Survey on GNN Characterization.
CoRR, 2024

EVT: Accelerating Deep Learning Training with Epilogue Visitor Tree.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
A Comprehensive Survey on Distributed Training of Graph Neural Networks.
Proc. IEEE, December, 2023

MNSIM 2.0: A Behavior-Level Modeling Tool for Processing-In-Memory Architectures.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

Efficient Super-Resolution System With Block-Wise Hybridization and Quantized Winograd on FPGA.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., November, 2023

Accelerating Distributed GNN Training by Codes.
IEEE Trans. Parallel Distributed Syst., September, 2023

Comprehensive SNN Compression Using ADMM Optimization and Activity Regularization.
IEEE Trans. Neural Networks Learn. Syst., June, 2023

Exploring Adversarial Attack in Spiking Neural Networks With Spike-Compatible Gradient.
IEEE Trans. Neural Networks Learn. Syst., May, 2023

IronMan-Pro: Multiobjective Design Space Exploration in HLS via Reinforcement Learning and Graph Neural Network-Based Modeling.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., March, 2023

SPCIM: Sparsity-Balanced Practical CIM Accelerator With Optimized Spatial-Temporal Multi-Macro Utilization.
IEEE Trans. Circuits Syst. I Regul. Pap., January, 2023

SDP: Co-Designing Algorithm, Dataflow, and Architecture for In-SRAM Sparse NN Acceleration.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

SPG: Structure-Private Graph Database via SqueezePIR.
Proc. VLDB Endow., 2023

E-Booster: A Field-Programmable Gate Array-Based Accelerator for Secure Tree Boosting Using Additively Homomorphic Encryption.
IEEE Micro, 2023

ReDCIM: Reconfigurable Digital Computing-In-Memory Processor With Unified FP/INT Pipeline for Cloud AI Acceleration.
IEEE J. Solid State Circuits, 2023

TranCIM: Full-Digital Bitline-Transpose CIM-based Sparse Transformer Accelerator With Pipeline/Parallel Reconfigurable Modes.
IEEE J. Solid State Circuits, 2023

A Survey of Machine Learning for Computer Architecture and Systems.
ACM Comput. Surv., 2023

NP-Hardness of Tensor Network Contraction Ordering.
CoRR, 2023

NPS: A Framework for Accurate Program Sampling Using Graph Neural Network.
CoRR, 2023

High-performance and Scalable Software-based NVMe Virtualization Mechanism with I/O Queues Passthrough.
CoRR, 2023

Dynamic N:M Fine-Grained Structured Sparse Attention Mechanism.
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023

ALCOP: Automatic Load-Compute Pipelining in Deep Learning Compiler for AI-GPUs.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

DF-GAS: a Distributed FPGA-as-a-Service Architecture towards Billion-Scale Graph-based Approximate Nearest Neighbor Search.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

TT-GNN: Efficient On-Chip Graph Neural Network Training via Embedding Reformation and Hardware Optimization.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

RM-STC: Row-Merge Dataflow Inspired GPU Sparse Tensor Core for Energy-Efficient Sparse Acceleration.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

ArchExplorer: Microarchitecture Exploration Via Bottleneck Analysis.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

ECSSD: Hardware/Data Layout Co-Designed In-Storage-Computing Architecture for Extreme Classification.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

Klotski: DNN Model Orchestration Framework for Dataflow Architecture Accelerators.
Proceedings of the IEEE/ACM International Conference on Computer Aided Design, 2023

Gamora: Graph Learning based Symbolic Reasoning for Large-Scale Boolean Networks.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

HBP: Hierarchically Balanced Pruning and Accelerator Co-Design for Efficient DNN Inference.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

CHAM: A Customized Homomorphic Encryption Accelerator for Fast Matrix-Vector Product.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Spada: Accelerating Sparse Matrix Multiplication with Adaptive Dataflow.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

2022
STPAcc: Structural TI-Based Pruning for Accelerating Distance-Related Algorithms on CPU-FPGA Platforms.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Hardware-Enabled Efficient Data Processing With Tensor-Train Decomposition.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

H2Learn: High-Efficiency Learning Accelerator for High-Accuracy Spiking Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Rubik: A Hierarchical Architecture for Efficient Graph Neural Network Training.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Efficient Processing of Sparse Tensor Decomposition via Unified Abstraction and PE-Interactive Architecture.
IEEE Trans. Computers, 2022

Multi-Node Acceleration for Large-Scale GCNs.
IEEE Trans. Computers, 2022

Dynamic Sparse Attention for Scalable Transformer Acceleration.
IEEE Trans. Computers, 2022

A Systematic View of Model Leakage Risks in Deep Neural Network Systems.
IEEE Trans. Computers, 2022

HEDA: Multi-Attribute Unbounded Aggregation over Homomorphically Encrypted Database.
Proc. VLDB Endow., 2022

A Comprehensive and Modularized Statistical Framework for Gradient Norm Equality in Deep Neural Networks.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

SaARSP: An Architecture for Systolic-Array Acceleration of Recurrent Spiking Neural Networks.
ACM J. Emerg. Technol. Comput. Syst., 2022

EPQuant: A Graph Neural Network compression approach based on product quantization.
Neurocomputing, 2022

Enabling Data Movement and Computation Pipelining in Deep Learning Compiler.
CoRR, 2022

MILAN: Masked Image Pretraining on Language Assisted Representation.
CoRR, 2022

Cost-Aware Exploration for Chiplet-Based Architecture with Advanced Packaging Technologies.
CoRR, 2022

The Spike Gating Flow: A Hierarchical Structure Based Spiking Neural Network for Online Gesture Recognition.
CoRR, 2022

Heuristic Adaptability to Input Dynamics for SpMM on GPUs.
CoRR, 2022

Hybrid Graph Models for Logic Optimization via Spatio-Temporal Information.
CoRR, 2022

Characterizing and Understanding HGNNs on GPUs.
IEEE Comput. Archit. Lett., 2022

MPU-Sim: A Simulator for In-DRAM Near-Bank Processing Architectures.
IEEE Comput. Archit. Lett., 2022

Practical Near-Data-Processing Architecture for Large-Scale Distributed Graph Neural Network.
IEEE Access, 2022

Accelerating CPU-Based Sparse General Matrix Multiplication With Binary Row Merging.
IEEE Access, 2022

OpSparse: A Highly Optimized Framework for Sparse General Matrix Multiplication on GPUs.
IEEE Access, 2022

Faith: An Efficient Framework for Transformer Verification on GPUs.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

Toward Robust Spiking Neural Network Against Adversarial Perturbation.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Understanding GNN Computational Graph: A Coordinated Computation, IO, and Memory Perspective.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

AutoComm: A Framework for Enabling Efficient Communication in Distributed Quantum Programs.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

BEACON: Scalable Near-Data-Processing Accelerators for Genome Analysis near Memory Pool with the CXL Support.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022

COMB-MCM: Computing-on-Memory-Boundary NN Processor with Bipolar Bitwise Sparsity Optimization for Scalable Multi-Chiplet-Module Edge Machine Learning.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

A 28nm 15.59µJ/Token Full-Digital Bitline-Transpose CIM-Based Sparse Transformer Accelerator with Pipeline/Parallel Reconfigurable Modes.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

A 28nm 29.2TFLOPS/W BF16 and 36.5TOPS/W INT8 Reconfigurable Digital CIM Processor with Unified FP/INT Pipeline and Bitwise In-Memory Booth Multiplication for Cloud Deep Learning Acceleration.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

184QPS/W 64Mb/mm² 3D Logic-to-DRAM Hybrid Bonding with Process-Near-Memory Engine for Recommendation System.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

A synthesis framework for stitching surface code with superconducting quantum devices.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

INSPIRE: in-storage private information retrieval via protocol and architecture co-design.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Hyperscale FPGA-as-a-service architecture for large-scale distributed graph neural network.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

DIMMining: pruning-efficient and parallel graph mining on near-memory-computing.
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022

Survey on Graph Neural Network Acceleration: An Algorithmic Perspective.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting.
Proceedings of the 34th IEEE International Conference on Tools with Artificial Intelligence, 2022

Predicting the Output Structure of Sparse Matrix Multiplication with Sampled Compression Ratio.
Proceedings of the 28th IEEE International Conference on Parallel and Distributed Systems, 2022

Effective Model Sparsification by Scheduled Grow-and-Prune Methods.
Proceedings of the Tenth International Conference on Learning Representations, 2022

AI-assisted Synthesis in Next Generation EDA: Promises, Challenges, and Prospects.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

2022 ICCAD CAD Contest Problem C: Microarchitecture Design Space Exploration.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

Accelerating Spatiotemporal Supervised Training of Large-Scale Spiking Neural Networks on GPU.
Proceedings of the 2022 Design, Automation & Test in Europe Conference & Exhibition, 2022

High-level synthesis performance prediction using GNNs: benchmarking, modeling, and advancing.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Alleviating datapath conflicts and design centralization in graph analytics acceleration.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Shfl-BW: accelerating deep neural network inference with tensor-core aware weight pruning.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

Heuristic adaptability to input dynamics for SpMM on GPUs.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

CHEX: CHannel EXploration for CNN Model Compression.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A one-for-all and o(v log(v))-cost solution for parallel merge style operations on sorted key-value arrays.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

DOTA: detect and omit weak attentions for scalable transformer acceleration.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

Paulihedral: a generalized block-wise compiler optimization framework for Quantum simulation kernels.
Proceedings of the ASPLOS '22: 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Lausanne, Switzerland, 28 February 2022, 2022

LOSTIN: Logic Optimization via Spatio-Temporal Information with Hybrid Graph Models.
Proceedings of the 33rd IEEE International Conference on Application-specific Systems, 2022

2021
Core Placement Optimization for Multi-chip Many-core Neural Network Systems with Reinforcement Learning.
ACM Trans. Design Autom. Electr. Syst., 2021

Effective and Efficient Batch Normalization Using a Few Uncorrelated Data for Statistics Estimation.
IEEE Trans. Neural Networks Learn. Syst., 2021

A Novel, Efficient Implementation of a Local Binary Convolutional Neural Network.
IEEE Trans. Circuits Syst. II Express Briefs, 2021

Rescuing RRAM-Based Computing From Static and Dynamic Faults.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

Practical Attacks on Deep Neural Networks by Memory Trojaning.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

DLUX: A LUT-Based Near-Bank Accelerator for Data Center Deep Learning Training Workloads.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

Fast Search of the Optimal Contraction Sequence in Tensor Networks.
IEEE J. Sel. Top. Signal Process., 2021

Erratum to "Evolver: a Deep Learning Processor With On-Device Quantization-Voltage-Frequency Tuning".
IEEE J. Solid State Circuits, 2021

Evolver: A Deep Learning Processor With On-Device Quantization-Voltage-Frequency Tuning.
IEEE J. Solid State Circuits, 2021

Training and inference for integer-based semantic segmentation network.
Neurocomputing, 2021

Tensor train decomposition for solving large-scale linear equations.
Neurocomputing, 2021

Recap of the 39th Edition of the International Conference on Computer-Aided Design (ICCAD 2020).
IEEE Des. Test, 2021

Load-balanced Gather-scatter Patterns for Sparse Deep Neural Networks.
CoRR, 2021

Towards Efficient Ansatz Architecture for Variational Quantum Algorithms.
CoRR, 2021

Mapping Surface Code to Superconducting Quantum Processors.
CoRR, 2021

QECV: Quantum Error Correction Verification.
CoRR, 2021

Mitigating Noise-Induced Gradient Vanishing in Variational Quantum Algorithm Training.
CoRR, 2021

Transformer Acceleration with Dynamic Sparse Attention.
CoRR, 2021

Program-to-Circuit: Exploiting GNNs for Program Representation and Circuit Translation.
CoRR, 2021

Efficient Sparse Matrix Kernels based on Adaptive Workload-Balancing and Parallel-Reduction.
CoRR, 2021

MPU: Towards Bandwidth-abundant SIMT Processor via Near-bank Computing.
CoRR, 2021

A Case for 3D Integrated System Design for Neuromorphic Computing & AI Applications.
CoRR, 2021

Π-RT: A Runtime Framework to Enable Energy-Efficient Real-Time Robotic Vision Applications on Heterogeneous Architectures.
Computer, 2021

Hardware Acceleration for GCNs via Bidirectional Fusion.
IEEE Comput. Archit. Lett., 2021

Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

Efficient tensor core-based GPU kernels for structured sparsity under reduced precision.
Proceedings of the International Conference for High Performance Computing, 2021

EGEMM-TC: accelerating scientific computing on tensor cores with extended precision.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

GNNAdvisor: An Adaptive and Efficient Runtime System for GNN Acceleration on GPUs.
Proceedings of the 15th USENIX Symposium on Operating Systems Design and Implementation, 2021

On the Co-Design of Quantum Software and Hardware.
Proceedings of the NANOCOM '21: The Eighth Annual ACM International Conference on Nanoscale Computing and Communication, Virtual Event, Italy, September 7, 2021

ENMC: Extreme Near-Memory Classification via Approximate Screening.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Improving Streaming Graph Processing Performance using Input Knowledge.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021

Overcoming the Memory Hierarchy Inefficiencies in Graph Processing Applications.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Brain-Inspired Computing: Adventure from Beyond CMOS Technologies to Beyond von Neumann Architectures ICCAD Special Session Paper.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

SpaceA: Sparse Matrix Vector Multiplication on Processing-in-Memory Accelerator.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

NeuroMeter: An Integrated Power, Area, and Timing Modeling Framework for Machine Learning Accelerators Industry Track Paper.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

IRONMAN: GNN-assisted Design Space Exploration in High-Level Synthesis via Reinforcement Learning.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021

SEALing Neural Network Models in Encrypted Deep Learning Accelerators.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

TiAcc: Triangle-inequality based Hardware Accelerator for K-means on FPGAs.
Proceedings of the 21st IEEE/ACM International Symposium on Cluster, 2021

2020
SemiMap: A Semi-Folded Convolution Mapping for Speed-Overhead Balance on Crossbars.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Crane: Mitigating Accelerator Under-utilization Caused by Sparsity Irregularities in CNNs.
IEEE Trans. Computers, 2020

NNBench-X: A Benchmarking Methodology for Neural Network Accelerator Designs.
ACM Trans. Archit. Code Optim., 2020

Scanning the Issue.
Proc. IEEE, 2020

Model Compression and Hardware Acceleration for Neural Networks: A Comprehensive Survey.
Proc. IEEE, 2020

Projection-based runtime assertions for testing and debugging Quantum programs.
Proc. ACM Program. Lang., 2020

Training high-performance and large-scale deep neural networks with full 8-bit integers.
Neural Networks, 2020

Comparing SNNs and RNNs on neuromorphic vision datasets: Similarities and differences.
Neural Networks, 2020

Rethinking the performance comparison between SNNS and ANNS.
Neural Networks, 2020

Tianjic: A Unified and Scalable Chip Bridging Spike-Based and Continuous Neural Computation.
IEEE J. Solid State Circuits, 2020

A Case for 3D Integrated System Design for Neuromorphic Computing and AI Applications.
Int. J. Semantic Comput., 2020

Rubik: A Hierarchical Architecture for Efficient Graph Learning.
CoRR, 2020

SEALing Neural Network Models in Secure Deep Learning Accelerators.
CoRR, 2020

GNNAdvisor: An Efficient Runtime System for GNN Acceleration on GPUs.
CoRR, 2020

Computation on Sparse Neural Networks: an Inspiration for Future Hardware.
CoRR, 2020

Memristor Hardware-Friendly Reinforcement Learning.
CoRR, 2020

Characterizing and Understanding GCNs on GPU.
IEEE Comput. Archit. Lett., 2020

NMTSim: Transaction-Command Based Simulator for New Memory Technology Devices.
IEEE Comput. Archit. Lett., 2020

DUET: Boosting Deep Neural Network Efficiency on Dual-Module Architecture.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

SAGA-Bench: Software and Hardware Characterization of Streaming Graph Analytics Workloads.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020

SmartExchange: Trading Higher-cost Memory Storage/Access for Lower-cost Computation.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Timely: Pushing Data Movements And Interfaces In Pim Accelerators Towards Local And In Time Domain.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

iPIM: Programmable In-Memory Image Processing Accelerator Using Near-Bank Architecture.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Xuantie-910: A Commercial Multi-Core 12-Stage Pipeline Out-of-Order 64-bit High Performance RISC-V Processor with Vector Extension: Industrial Product.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Boosting Deep Neural Network Efficiency with Dual-Module Inference.
Proceedings of the 37th International Conference on Machine Learning, 2020

NEST: DIMM based Near-Data-Processing Accelerator for K-mer Counting.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

fuseGNN: Accelerating Graph Convolutional Neural Network Training on GPGPU.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

HyGCN: A GCN Accelerator with Hybrid Architecture.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Fulcrum: A Simplified Control and Access Mechanism Toward Flexible and Practical In-Situ Accelerators.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

Xuantie-910: Innovating Cloud and Edge Computing by RISC-V.
Proceedings of the IEEE Hot Chips 32 Symposium, 2020

MNSIM 2.0: A Behavior-Level Modeling Tool for Memristor-based Neuromorphic Computing Systems.
Proceedings of the GLSVLSI '20: Great Lakes Symposium on VLSI 2020, 2020

Taming Unstructured Sparsity on GPUs via Latency-Aware Optimization.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

INVITED: Computation on Sparse Neural Networks and its Implications for Future Hardware.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Eliminating Redundant Computation in Noisy Quantum Computing Simulation.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

Towards Efficient Superconducting Quantum Processor Architecture Design.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

DeepSniffer: A DNN Model Extraction Framework Based on Learning Architectural Hints.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

DARB: A Density-Adaptive Regular-Block Pruning for Deep Neural Networks.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
PXNOR-BNN: In/With Spin-Orbit Torque MRAM Preset-XNOR Operation-Based Binary Neural Networks.
IEEE Trans. Very Large Scale Integr. Syst., 2019

DASM: Data-Streaming-Based Computing in Nonvolatile Memory Architecture for Embedded System.
IEEE Trans. Very Large Scale Integr. Syst., 2019

Parana: A Parallel Neural Architecture Considering Thermal Problem of 3D Stacked Memory.
IEEE Trans. Parallel Distributed Syst., 2019

L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., 2019

GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

TIME: A Training-in-Memory Architecture for RRAM-Based Deep Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

Emerging Memory Technologies.
IEEE Micro, 2019

Network-on-Chip Design Guidelines for Monolithic 3-D Integration.
IEEE Micro, 2019

Poq: Projection-based Runtime Assertions for Debugging on a Quantum Computer.
CoRR, 2019

DARB: A Density-Aware Regular-Block Pruning for Deep Neural Networks.
CoRR, 2019

AccD: A Compiler-based Framework for Accelerating Distance-related Algorithms on CPU-FPGA Platforms.
CoRR, 2019

SANQ: A Simulation Framework for Architecting Noisy Intermediate-Scale Quantum Computing System.
CoRR, 2019

Neural Network Model Extraction Attacks in Edge Devices by Hearing Architectural Hints.
CoRR, 2019

QGAN: Quantized Generative Adversarial Networks.
CoRR, 2019

A Secure and Persistent Memory System for Non-volatile Memory.
CoRR, 2019

NNBench-X: Benchmarking and Understanding Neural Network Workloads for Accelerator Designs.
IEEE Comput. Archit. Lett., 2019

Power Profiling of Modern Die-Stacked Memory.
IEEE Comput. Archit. Lett., 2019

CRISP: Center for Research on Intelligent Storage and Processing-in-Memory.
Proceedings of the International Symposium on VLSI Design, Automation and Test, 2019

Investigation of Cost-Optimal Network-on-Chip for Passive and Active Interposer Systems.
Proceedings of the 21st ACM/IEEE International Workshop on System Level Interconnect Prediction, 2019

SuperMem: Enabling Application-transparent Secure Persistent Memory with Low Overheads.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Sparse Tensor Core: Algorithm and Hardware Co-Design for Vector-wise Sparse Neural Networks on Modern GPUs.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Alleviating Irregularity in Graph Analytics Acceleration: a Hardware/Software Co-Design Approach.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

MEDAL: Scalable DIMM based Near Data Processing Accelerator for DNA Seeding Algorithm.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Balancing Memory Accesses for Energy-Efficient Graph Analytics Accelerators.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

Dynamic Sparse Graph for Efficient Deep Learning.
Proceedings of the 7th International Conference on Learning Representations, 2019

Analysis and Optimization of the Memory Hierarchy for Graph Processing Workloads.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

Ouroboros: An Inference Engine for Deep Learning Based TTS on Embedded Devices.
Proceedings of the 2019 IEEE Hot Chips 31 Symposium (HCS), 2019

CNNWire: Boosting Convolutional Neural Network with Winograd on ReRAM based Accelerators.
Proceedings of the 2019 on Great Lakes Symposium on VLSI, 2019

Memory Trojan Attack on Neural Network Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Near-Data Acceleration of Privacy-Preserving Biomarker Search with 3D-Stacked Memory.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

CORN: In-Buffer Computing for Binary Neural Network.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2019

Memory-Bound Proof-of-Work Acceleration for Blockchain Applications.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Efficient System Architecture in the Era of Monolithic 3D: Dynamic Inter-tier Interconnect and Processing-in-Memory.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

Tackling the Qubit Mapping Problem for NISQ-Era Quantum Devices.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture.
Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, 2019

Learning the sparsity for ReRAM: mapping and pruning sparse neural network for ReRAM based accelerator.
Proceedings of the 24th Asia and South Pacific Design Automation Conference, 2019

Direct Training for Spiking Neural Networks: Faster, Larger, Better.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Performance Evaluation and Optimization of HBM-Enabled GPU for Data-Intensive Applications.
IEEE Trans. Very Large Scale Integr. Syst., 2018

An Adaptive 3T-3MTJ Memory Cell Design for STT-MRAM-Based LLCs.
IEEE Trans. Very Large Scale Integr. Syst., 2018

Securing Emerging Nonvolatile Main Memory With Fast and Energy-Efficient AES In-Memory Implementation.
IEEE Trans. Very Large Scale Integr. Syst., 2018

Mitigating BTI-Induced Degradation in STT-MRAM Sensing Schemes.
IEEE Trans. Very Large Scale Integr. Syst., 2018

An Instruction Set Architecture for Machine Learning.
ACM Trans. Comput. Syst., 2018

MNSIM: Simulation Platform for Memristor-Based Neuromorphic Computing System.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

IAA: Incidental Approximate Architectures for Extremely Energy-Constrained Energy Harvesting Scenarios using IoT Nonvolatile Processors.
IEEE Micro, 2018

Die Stacking Is Happening.
IEEE Micro, 2018

Stuck-at Fault Tolerance in RRAM Computing Systems.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2018

Batch Normalization Sampling.
CoRR, 2018

In-memory multiplication engine with SOT-MRAM based stochastic computing.
CoRR, 2018

Structurally Sparsified Backward Propagation for Faster Long Short-Term Memory Training.
CoRR, 2018

L1-Norm Batch Normalization for Efficient Training of Deep Neural Networks.
CoRR, 2018

PIRT: A Runtime Framework to Enable Energy-Efficient Real-Time Robotic Applications on Heterogeneous Architectures.
CoRR, 2018

Bridging the Gap Between Neural Networks and Neuromorphic Hardware with A Neural Network Compiler.
CoRR, 2018

Exploring Core and Cache Hierarchy Bottlenecks in Graph Processing Workloads.
IEEE Comput. Archit. Lett., 2018

Crossbar-Aware Neural Network Pruning.
IEEE Access, 2018

HitNet: Hybrid Ternary Recurrent Neural Network.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

TETRIS: TilE-matching the TRemendous Irregular Sparsity.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

SCOPE: A Stochastic Computing Engine for DRAM-Based In-Situ Accelerator.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Persistence Parallelism Optimization: A Holistic Approach from Memory Bus to RDMA Network.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

GraphIA: an in-situ accelerator for large-scale graph processing.
Proceedings of the International Symposium on Memory Systems, 2018

AIM: Fast and energy-efficient AES in-memory implementation for emerging non-volatile main memory.
Proceedings of the 2018 Design, Automation & Test in Europe Conference & Exhibition, 2018

SNrram: an efficient sparse neural network computation architecture based on resistive random-access memory.
Proceedings of the 55th Annual Design Automation Conference, 2018

RADAR: a 3D-reRAM based DNA alignment accelerator architecture.
Proceedings of the 55th Annual Design Automation Conference, 2018

Packet pump: overcoming network bottleneck in on-chip interconnects for GPGPUs.
Proceedings of the 55th Annual Design Automation Conference, 2018

NEOFog: Nonvolatility-Exploiting Optimizations for Fog Computing.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

Bridge the Gap between Neural Networks and Neuromorphic Hardware with a Neural Network Compiler.
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018

Cost-efficient 3D Integration to Hinder Reverse Engineering During and After Manufacturing.
Proceedings of the Asian Hardware Oriented Security and Trust Symposium, 2018

2017
Thermomechanical Stress-Aware Management for 3-D IC Designs.
IEEE Trans. Very Large Scale Integr. Syst., 2017

Dynamic Power and Energy Management for Energy Harvesting Nonvolatile Processor Systems.
ACM Trans. Embed. Comput. Syst., 2017

DLAU: A Scalable Deep Learning Accelerator Unit on FPGA.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Software-Hardware Codesign for Efficient Neural Network Acceleration.
IEEE Micro, 2017

Overview of 3-D Architecture Design Opportunities and Techniques.
IEEE Des. Test, 2017

Incidental computing on IoT nonvolatile processors.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

DRISA: a DRAM-based reconfigurable in-situ accelerator.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017

There and Back Again: Optimizing the Interconnect in Networks of Memory Cubes.
Proceedings of the 44th Annual International Symposium on Computer Architecture, 2017

Cost-effective design of scalable high-performance systems using active and passive interposers.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

PRESCOTT: Preset-based cross-point architecture for spin-orbit-torque magnetic random access memory.
Proceedings of the 2017 IEEE/ACM International Conference on Computer-Aided Design, 2017

Security Threats and Countermeasures in Three-Dimensional Integrated Circuits.
Proceedings of the Great Lakes Symposium on VLSI 2017, 2017

TIME: A Training-in-memory Architecture for Memristor-based Deep Neural Networks.
Proceedings of the 54th Annual Design Automation Conference, 2017

Spendthrift: Machine learning based resource and frequency scaling for ambient energy harvesting nonvolatile processors.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

Building energy-efficient multi-level cell STT-RAM caches with data compression.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

Computation-oriented fault-tolerance schemes for RRAM computing systems.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

POSTER: Bridge the Gap Between Neural Networks and Neuromorphic Hardware.
Proceedings of the 26th International Conference on Parallel Architectures and Compilation Techniques, 2017

2016
Hybrid Drowsy SRAM and STT-RAM Buffer Designs for Dark-Silicon-Aware NoC.
IEEE Trans. Very Large Scale Integr. Syst., 2016

TSocket: Thermal Sustainable Power Budgeting.
ACM Trans. Design Autom. Electr. Syst., 2016

Adapting B+-Tree for Emerging Nonvolatile Memory-Based Main Memory.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2016

Nonvolatile Processor Architectures: Efficient, Reliable Progress with Unstable Power.
IEEE Micro, 2016

BACH: A Bandwidth-Aware Hybrid Cache Hierarchy Design with Nonvolatile Memories.
J. Comput. Sci. Technol., 2016

CNNLab: a Novel Parallel Framework for Neural Networks using GPU and FPGA-a Practical Study with Trade-off Analysis.
CoRR, 2016

Redesigning software and systems for non-volatile processors on self-powered devices.
Proceedings of the 2016 IFIP/IEEE International Conference on Very Large Scale Integration, 2016

Building a Low Latency, Highly Associative DRAM Cache with the Buffered Way Predictor.
Proceedings of the 28th International Symposium on Computer Architecture and High Performance Computing, 2016

OSCAR: Orchestrating STT-RAM cache traffic for heterogeneous CPU-GPU architectures.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

A unified memory network architecture for in-memory computing in commodity servers.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

NEUTRAMS: Neural network transformation and co-design under neuromorphic hardware constraints.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016

A Real-Time and Energy-Efficient Implementation of Difference-of-Gaussian with Flexible Thin-Film Transistors.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

Cost and Thermal Analysis of High-Performance 2.5D and 3D Integrated Circuit Design Space.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2016

Mellow Writes: Extending Lifetime in Resistive Memories through Selective Slow Write Backs.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Cambricon: An Instruction Set Architecture for Neural Networks.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

PRIME: A Novel Processing-in-Memory Architecture for Neural Network Computation in ReRAM-Based Main Memory.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

LAP: Loop-Block Aware Inclusion Properties for Energy-Efficient Asymmetric Last Level Caches.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Thermal-aware 3D design for side-channel information leakage.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

Scalable memory fabric for silicon interposer-based multi-core systems.
Proceedings of the 34th IEEE International Conference on Computer Design, 2016

ODESY: a novel 3T-3MTJ cell design with optimized area DEnsity, scalability and latencY.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

Cost analysis and cost-driven IP reuse methodology for SoC design based on 2.5D/3D integration.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

NVSim-CAM: a circuit-level simulator for emerging nonvolatile memory based content-addressable memory.
Proceedings of the 35th International Conference on Computer-Aided Design, 2016

Leveraging 3D Technologies for Hardware Security: Opportunities and Challenges.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

MNSIM: Simulation platform for memristor-based neuromorphic computing system.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016

Fine-granularity tile-level parallelism in non-volatile memory architecture with two-dimensional bank subdivision.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Pinatubo: a processing-in-memory architecture for bulk bitwise operations in emerging non-volatile memories.
Proceedings of the 53rd Annual Design Automation Conference, 2016

NVSim-VXs: an improved NVSim for variation aware STT-RAM simulation.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Neural network transformation under hardware constraints.
Proceedings of the 2016 International Conference on Compilers, 2016

Architecture design with STT-RAM: Opportunities and challenges.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015
Die-stacking Architecture.
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01747-6, 2015

Whitespace-Aware TSV Arrangement in 3-D Clock Tree Synthesis.
IEEE Trans. Very Large Scale Integr. Syst., 2015

Impact of Cell Failure on Reliable Cross-Point Resistive Memory Design.
ACM Trans. Design Autom. Electr. Syst., 2015

Adaptive Burst-Writes (ABW): Memory Requests Scheduling to Reduce Write-Induced Interference.
ACM Trans. Design Autom. Electr. Syst., 2015

Introduction to the Special Issue on Reliable, Resilient, and Robust Design of Circuits and Systems.
ACM Trans. Design Autom. Electr. Syst., 2015

Impact of Write Pulse and Process Variation on 22 nm FinFET-Based STT-RAM Design: A Device-Architecture Co-Optimization Approach.
IEEE Trans. Multi Scale Comput. Syst., 2015

Buri: Scaling Big-Memory Computing with Hardware-Based Memory Expansion.
ACM Trans. Archit. Code Optim., 2015

EECache: A Comprehensive Study on the Architectural Design for Energy-Efficient Last-Level Caches in Chip Multiprocessors.
ACM Trans. Archit. Code Optim., 2015

Nonvolatile Processor Architecture Exploration for Energy-Harvesting Applications.
IEEE Micro, 2015

A Write-Aware STTRAM-Based Register File Architecture for GPGPU.
ACM J. Emerg. Technol. Comput. Syst., 2015

Memory and Storage System Design with Nonvolatile Memory Technologies.
IPSJ Trans. Syst. LSI Des. Methodol., 2015

NVMain 2.0: A User-Friendly Memory Simulator to Model (Non-)Volatile Memory Systems.
IEEE Comput. Archit. Lett., 2015

Leveraging nonvolatility for architecture design with emerging NVM.
Proceedings of the IEEE Non-Volatile Memory System and Applications Symposium, 2015

Using Multiple-Input NEMS for Parallel A/D Conversion and Image Processing.
Proceedings of the 2015 IEEE Computer Society Annual Symposium on VLSI, 2015

Exploring memory controller configurations for many-core systems with 3D stacked DRAMs.
Proceedings of the Sixteenth International Symposium on Quality Electronic Design, 2015

Leveraging emerging nonvolatile memory in high-level synthesis with loop transformations.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

History-Assisted Adaptive-Granularity Caches (HAAG$) for High Performance 3D DRAM Architectures.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015

Dynamic Machine Learning Based Matching of Nonvolatile Processor Microarchitecture to Harvested Energy Profile.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2015

Overcoming the challenges of crossbar resistive memory architectures.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Architecture exploration for ambient energy harvesting nonvolatile processors.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Energy Efficient RRAM Spiking Neural Network for Real Time Classification.
Proceedings of the 25th edition on Great Lakes Symposium on VLSI, GLVLSI 2015, Pittsburgh, PA, USA, May 20, 2015

DESTINY: a tool for modeling emerging 3D NVM and eDRAM caches.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

DimNoC: a dim silicon approach towards power-efficient on-chip network.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Ambient energy harvesting nonvolatile processors: from circuit to system.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Core vs. uncore: the heart of darkness.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Self-powered wearable sensor node: Challenges and opportunities.
Proceedings of the 2015 International Conference on Compilers, 2015

Heterogeneous architecture design with emerging 3D and non-volatile memory technologies.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

Modeling framework for cross-point resistive memory design emphasizing reliability and variability issues.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

Nonvolatile memory allocation and hierarchy optimization for high-level synthesis.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

2014
Editorial: ACM Transactions on Design Automation of Electronics Systems and Beyond.
ACM Trans. Design Autom. Electr. Syst., 2014

Optimizing the NoC Slack Through Voltage and Frequency Scaling in Hard Real-Time Embedded Systems.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

PS3-RAM: A Fast Portable and Scalable Statistical STT-RAM Reliability/Energy Analysis Method.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Endurance-aware cache line management for non-volatile caches.
ACM Trans. Archit. Code Optim., 2014

Building and Optimizing MRAM-Based Commodity Memories.
ACM Trans. Archit. Code Optim., 2014

Preventing STT-RAM Last-Level Caches from Port Obstruction.
ACM Trans. Archit. Code Optim., 2014

Testable cross-power domain interface (CPDI) circuit design in monolithic 3D technology.
ACM J. Emerg. Technol. Comput. Syst., 2014

On-Chip Hybrid Power Supply System for Wireless Sensor Nodes.
ACM J. Emerg. Technol. Comput. Syst., 2014

An Embedded Co-AdaBoost based construction of software document relation coupled resource spaces for cyber-physical society.
Future Gener. Comput. Syst., 2014

Exploration of Electrical and Novel Optical Chip-to-Chip Interconnects.
IEEE Des. Test, 2014

Compact models and model standard for 2.5D and 3D integration.
Proceedings of the ACM/IEEE International Workshop on System Level Interconnect Prediction, 2014

FIRM: Fair and High-Performance Memory Control for Persistent Memory Systems.
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014

Independently-Controlled-Gate FinFET 6T SRAM Cell Design for Leakage Current Reduction and Enhanced Read Access Speed.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2014

Efficient region-aware P/G TSV planning for 3D ICs.
Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014

Building energy-efficient multi-level cell STT-MRAM based cache through dynamic data-resistance encoding.
Proceedings of the Fifteenth International Symposium on Quality Electronic Design, 2014

Enabling high-performance LPDDRx-compatible MRAM.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Making B+-tree efficient in PCM-based main memory.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

EECache: exploiting design choices in energy-efficient last-level caches for chip multiprocessors.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Half-DRAM: A high-bandwidth and low-power DRAM architecture from the rethinking of fine-grained activation.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

ProactiveDRAM: A DRAM-initiated retention management scheme.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

Architecting 3D vertical resistive memory for next-generation storage systems.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

Using multi-level cell STT-RAM for fast and energy-efficient local checkpointing.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2014

CREAM: A Concurrent-Refresh-Aware DRAM Memory architecture.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Adaptive placement and migration policy for an STT-RAM-based hybrid cache.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

TSV power supply array electromigration lifetime analysis in 3D ICs.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

3D-SWIFT: a high-performance 3D-stacked wide IO DRAM.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

Reliability-aware cross-point resistive memory design.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

NoC-Sprinting: Interconnect for Fine-Grained Sprinting in the Dark Silicon Era.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Design Methodologies for 3D Mixed Signal Integrated Circuits: a Practical 12-bit SAR ADC Design Case.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Thermal-Sustainable Power Budgeting for Dynamic Threading.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Run-Time Technique for Simultaneous Aging and Power Optimization in GPGPUs.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

3DLAT: TSV-based 3D ICs crosstalk minimization utilizing Less Adjacent Transition code.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

NoΔ: Leveraging delta compression for end-to-end memory access in NoC based multicores.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

Modeling and design analysis of 3D vertical resistive memory - A low cost cross-point architecture.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

SwimmingLane: A composite approach to mitigate voltage droop effects in 3D power delivery network.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

Designing vertical bandwidth reconfigurable 3D NoCs for many core systems.
Proceedings of the 2014 International 3D Systems Integration Conference, 2014

A cost benefit analysis: The impact of defect clustering on the necessity of pre-bond tests.
Proceedings of the 2014 International 3D Systems Integration Conference, 2014

2013
Guest Editorial.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Through Silicon Via Aware Design Planning for Thermally Efficient 3-D Integrated Circuits.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

Optimizing GPU energy efficiency with 3D die-stacking graphics memory and reconfigurable memory interface.
ACM Trans. Archit. Code Optim., 2013

WADE: Writeback-aware dynamic cache management for NVM-based main memory system.
ACM Trans. Archit. Code Optim., 2013

A circuit-architecture co-optimization framework for exploring nonvolatile memory hierarchies.
ACM Trans. Archit. Code Optim., 2013

Exploring the vulnerability of CMPs to soft errors with 3D stacked nonvolatile memory.
ACM J. Emerg. Technol. Comput. Syst., 2013

A Synthesis Algorithm for Reconfigurable Single-Electron Transistor Arrays.
ACM J. Emerg. Technol. Comput. Syst., 2013

Thermal-aware P/G TSV planning for IR drop reduction in 3D ICs.
Integr., 2013

Evaluation and mitigation of performance degradation under random telegraph noise for digital circuits.
IET Circuits Devices Syst., 2013

Assessment of Circuit Optimization Techniques Under NBTI.
IEEE Des. Test, 2013

Kiln: closing the performance gap between systems with and without persistence support.
Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture, 2013

Cost-driven 3D design optimization with metal layer reduction technique.
Proceedings of the International Symposium on Quality Electronic Design, 2013

CPDI: Cross-power-domain interface circuit design in monolithic 3D technology.
Proceedings of the International Symposium on Quality Electronic Design, 2013

TSV-aware topology generation for 3D Clock Tree Synthesis.
Proceedings of the International Symposium on Quality Electronic Design, 2013

A circuit-architecture co-optimization framework for evaluating emerging memory hierarchies.
Proceedings of the 2012 IEEE International Symposium on Performance Analysis of Systems & Software, 2013

Designing scratchpad memory architecture with emerging STT-RAM memory technologies.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Lazy Precharge: An overhead-free method to reduce precharge overhead for memory parallelism improvement of DRAM system.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Low power multi-level-cell resistive memory design with incomplete data mapping.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Design of cross-point metal-oxide ReRAM emphasizing reliability and cost.
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2013

i2WAP: Improving non-volatile cache lifetime by reducing inter- and intra-set write variations.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

TS-Router: On maximizing the Quality-of-Allocation in the On-Chip Network.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

Thermomechanical stress-aware management for 3D IC designs.
Proceedings of the Design, Automation and Test in Europe, 2013

Future memory and interconnect technologies.
Proceedings of the Design, Automation and Test in Europe, 2013

OAP: an obstruction-aware cache management policy for STT-RAM last-level caches.
Proceedings of the Design, Automation and Test in Europe, 2013

Designing energy-efficient NoC for real-time embedded systems through slack optimization.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

Understanding the trade-offs in multi-level cell ReRAM memory design.
Proceedings of the 50th Annual Design Automation Conference 2013, 2013

2012
Low-Power Design of Emerging Memory Technologies.
Proceedings of the Handbook of Energy-Aware and Green Computing - Two Volume Set., 2012

Electrical Characterization for Intertier Connections and Timing Analysis for 3-D ICs.
IEEE Trans. Very Large Scale Integr. Syst., 2012

Performance/Thermal-Aware Design of 3D-Stacked L2 Caches for CMPs.
ACM Trans. Design Autom. Electr. Syst., 2012

Power Analysis Attack Resistance Engineering by Dynamic Voltage and Frequency Scaling.
ACM Trans. Embed. Comput. Syst., 2012

NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2012

Parametric Yield-Driven Resource Binding in High-Level Synthesis with Multi-Vth/Vdd Library and Device Sizing.
J. Electr. Comput. Eng., 2012

ESL Design Methodology.
J. Electr. Comput. Eng., 2012

An Embedded Co-AdaBoost and Its Application in Classification of Software Document Relation.
Proceedings of the Eighth International Conference on Semantics, Knowledge and Grids, 2012

MAGE: adaptive granularity and ECC for resilient and power efficient memory systems.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

NVMain: An Architectural-Level Main Memory Simulator for Emerging Non-volatile Memories.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2012

Temporal Performance Degradation under RTN: Evaluation and Mitigation for Nanoscale Circuits.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2012

Energy-efficient GPU design with reconfigurable in-package graphics memory.
Proceedings of the International Symposium on Low Power Electronics and Design, 2012

Design trade-offs for high density cross-point resistive memory.
Proceedings of the International Symposium on Low Power Electronics and Design, 2012

Optimizing bandwidth and power of graphics memory with hybrid memory technologies and adaptive data migration.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Mitigating electromigration of power supply networks using bidirectional current stress.
Proceedings of the Great Lakes Symposium on VLSI 2012, 2012

Modeling and design exploration of FBDRAM as on-chip memory.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

3DHLS: Incorporating high-level synthesis in physical planning of three-dimensional (3D) ICs.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

PS3-RAM: a fast portable and scalable statistical STT-RAM reliability analysis method.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

Point and discard: a hard-error-tolerant architecture for non-volatile last level caches.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

Cache revive: architecting volatile STT-RAM caches for enhanced performance in CMPs.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

Yield-aware time-efficient testing and self-fixing design for TSV-based 3D ICs.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

Low power memristor-based ReRAM design with Error Correcting Code.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

Thermal-aware power network design for IR drop reduction in 3D ICs.
Proceedings of the 17th Asia and South Pacific Design Automation Conference, 2012

2011
Influence of Stacked 3D Memory/Cache Architectures on GPUs.
Proceedings of the 3D Integration for NoC-based SoC Architectures, 2011

Leakage Power and Circuit Aging Cooptimization by Gate Replacement Techniques.
IEEE Trans. Very Large Scale Integr. Syst., 2011

Soft Error Rate Analysis for Combinational Logic Using an Accurate Electrical Masking Model.
IEEE Trans. Dependable Secur. Comput., 2011

Temperature-Aware NBTI Modeling and the Impact of Standby Leakage Reduction Techniques on Circuit Performance Degradation.
IEEE Trans. Dependable Secur. Comput., 2011

Variation-Aware Task and Communication Mapping for MPSoC Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

Hybrid checkpointing using emerging nonvolatile memories for future exascale systems.
ACM Trans. Archit. Code Optim., 2011

Editorial- three-dimensional integrated circuits design.
IET Comput. Digit. Tech., 2011

Stacking magnetic random access memory atop microprocessors: an architecture-level evaluation.
IET Comput. Digit. Tech., 2011

Three-dimensional Integrated Circuits: Design, EDA, and Architecture.
Found. Trends Electron. Des. Autom., 2011

Exploiting Heterogeneity for Energy Efficiency in Chip Multiprocessors.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

Modeling, Architecture, and Applications for Emerging Memory Technologies.
IEEE Des. Test Comput., 2011

Impact of Circuit Degradation on FPGA Design Security.
Proceedings of the IEEE Computer Society Annual Symposium on VLSI, 2011

Analysis and mitigation of lateral thermal blockage effect of through-silicon-via in 3D IC designs.
Proceedings of the 2011 International Symposium on Low Power Electronics and Design, 2011

Moguls: a model to explore the memory hierarchy for bandwidth improvements.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

F2BFLY: an on-chip free-space optical network with wavelength-switching.
Proceedings of the 25th International Conference on Supercomputing, 2011, Tucson, AZ, USA, May 31, 2011

Energy-efficient multi-level cell phase-change memory system with data encoding.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Exploring the vulnerability of CMPs to soft errors with 3D stacked non-volatile memory.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Bandwidth-aware reconfigurable cache design with hybrid memory technologies.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011

Device-architecture co-optimization of STT-RAM based memory for low power embedded systems.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011

MorphCache: A Reconfigurable Adaptive Multi-level Cache hierarchy.
Proceedings of the 17th International Conference on High-Performance Computer Architecture (HPCA-17 2011), 2011

Enabling architectural innovations using non-volatile memory.
Proceedings of the 21st ACM Great Lakes Symposium on VLSI 2010, 2011

An energy-efficient 3D CMP design with fine-grained voltage scaling.
Proceedings of the Design, Automation and Test in Europe, 2011

Design implications of memristor-based RRAM cross-point structures.
Proceedings of the Design, Automation and Test in Europe, 2011

Automated mapping for reconfigurable single-electron transistor arrays.
Proceedings of the 48th Design Automation Conference, 2011

System-level design space exploration for three-dimensional (3D) SoCs.
Proceedings of the 9th International Conference on Hardware/Software Codesign and System Synthesis, 2011

A frequent-value based PRAM memory architecture.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

Enabling quality-of-service in nanophotonic network-on-chip.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

On-chip hybrid power supply system for wireless sensor nodes.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

AdaMS: Adaptive MLC/SLC phase-change memory design for file storage.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

2010
Total Power Optimization for Combinational Logic Using Genetic Algorithms.
J. Signal Process. Syst., 2010

Variable-Latency Adder (VL-Adder) Designs for Low Power and NBTI Tolerance.
IEEE Trans. Very Large Scale Integr. Syst., 2010

Fabrication Cost Analysis and Cost-Aware Design Space Exploration for 3-D ICs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2010

Design exploration of hybrid caches with disparate memory technologies.
ACM Trans. Archit. Code Optim., 2010

Test-access mechanism optimization for core-based three-dimensional SOCs.
Microelectron. J., 2010

3D Stacked Microprocessor: Are We There Yet?
IEEE Micro, 2010

Processor Architecture Design Using 3D Integration Technology.
Proceedings of the VLSI Design 2010: 23rd International Conference on VLSI Design, 2010

Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support.
Proceedings of the Conference on High Performance Computing Networking, 2010

LOFT: A High Performance Network-on-Chip Providing Quality-of-Service Support.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Modeling TSV open defects in 3D-stacked DRAM.
Proceedings of the 2011 IEEE International Test Conference, 2010

Low-power dual-element memristor based memory design.
Proceedings of the 2010 International Symposium on Low Power Electronics and Design, 2010

3D-nonFAR: three-dimensional non-volatile FPGA architecture using phase change memory.
Proceedings of the 2010 International Symposium on Low Power Electronics and Design, 2010

Evaluation of using inductive/capacitive-coupling vertical interconnects in 3D network-on-chip.
Proceedings of the 2010 International Conference on Computer-Aided Design, 2010

Cost-effective integration of three-dimensional (3D) ICs emphasizing testing cost analysis.
Proceedings of the 2010 International Conference on Computer-Aided Design, 2010

A Hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Energy- and endurance-aware design of phase change memory caches.
Proceedings of the Design, Automation and Test in Europe, 2010

Cost-aware three-dimensional (3D) many-core multiprocessor design.
Proceedings of the 47th Design Automation Conference, 2010

Cost-driven 3D integration with interconnect layers.
Proceedings of the 47th Design Automation Conference, 2010

Impact of process variations on emerging memristor.
Proceedings of the 47th Design Automation Conference, 2010

A customized design of DRAM controller for on-chip 3D DRAM stacking.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2010

Energy and performance driven circuit design for emerging phase-change memory.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

Three-dimensional integrated circuits (3D IC) floorplan and power/ground network co-synthesis.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

Parametric yield driven resource binding in behavioral synthesis with multi-Vth/Vdd library.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

Minimizing leakage power in aging-bounded high-level synthesis with design time multi-Vth assignment.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

Architectural benefits and design challenges for three-dimensional integrated circuits.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems, 2010

A 3D SoC design for H.264 application with on-chip DRAM stacking.
Proceedings of the IEEE International Conference on 3D System Integration, 2010

3D memory stacking for fast checkpointing/restore applications.
Proceedings of the IEEE International Conference on 3D System Integration, 2010

2009
Modeling Soft Errors at the Device and Logic Levels for Combinational Circuits.
IEEE Trans. Dependable Secur. Comput., 2009

Process-Variation-Aware Adaptive Cache Architecture and Management.
IEEE Trans. Computers, 2009

Scan-chain design and optimization for three-dimensional integrated circuits.
ACM J. Emerg. Technol. Comput. Syst., 2009

New-Age: A Negative Bias Temperature Instability-Estimation Framework for Microarchitectural Components.
Int. J. Parallel Program., 2009

Temperature-Aware NBTI Modeling Techniques in Digital Circuits.
IEICE Trans. Electron., 2009

Statistical High-Level Synthesis under Process Variability.
IEEE Des. Test Comput., 2009

Guest Editors' Introduction: Opportunities and Challenges of 3D Integration.
IEEE Des. Test Comput., 2009

Leveraging 3D PCRAM technologies to reduce checkpoint overhead for future exascale systems.
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009

Networks-on-chip in emerging interconnect paradigms: Advantages and challenges.
Proceedings of the Third International Symposium on Networks-on-Chips, 2009

Power and area reduction using carbon nanotube bundle interconnect in global clock tree distribution network.
Proceedings of the 2009 IEEE/ACM International Symposium on Nanoscale Architectures, 2009

On the efficacy of input Vector Control to mitigate NBTI effects and leakage power.
Proceedings of the 10th International Symposium on Quality of Electronic Design (ISQED 2009), 2009

NBTI-aware statistical circuit delay assessment.
Proceedings of the 10th International Symposium on Quality of Electronic Design (ISQED 2009), 2009

Exploration of 3D stacked L2 cache design for high performance and efficient thermal control.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

Emerging technologies and their impact on system design.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

Hybrid cache architecture with disparate memory technologies.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009

Test-wrapper optimization for embedded cores in TSV-based three-dimensional SOCs.
Proceedings of the 27th International Conference on Computer Design, 2009

3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis.
Proceedings of the 27th International Conference on Computer Design, 2009

Intrinsic NBTI-variability aware statistical pipeline performance assessment and tuning.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

PCRAMsim: System-level performance, energy, and area modeling for Phase-Change RAM.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

A novel architecture of the 3D stacked MRAM L2 cache for CMPs.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

Power and performance of read-write aware Hybrid Caches with non-volatile memories.
Proceedings of the Design, Automation and Test in Europe, 2009

Gate replacement techniques for simultaneous leakage and aging optimization.
Proceedings of the Design, Automation and Test in Europe, 2009

CheckerCore: enhancing an FPGA soft core to capture worst-case execution times.
Proceedings of the 2009 International Conference on Compilers, 2009

Variation-aware resource sharing and binding in behavioral synthesis.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

A criticality-driven microarchitectural three dimensional (3D) floorplanner.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

System-level cost analysis and design exploration for three-dimensional integrated circuits (3D ICs).
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

A framework for estimating NBTI degradation of microarchitectural components.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

Tolerating process variations in high-level synthesis using transparent latches.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

3D optical networks-on-chip (NoC) for multiprocessor systems-on-chip (MPSoC).
Proceedings of the IEEE International Conference on 3D System Integration, 2009

Arithmetic unit design using 180nm TSV-based 3D stacking technology.
Proceedings of the IEEE International Conference on 3D System Integration, 2009

Investigation and comparison of thermal distribution in synchronous and asynchronous 3D ICs.
Proceedings of the IEEE International Conference on 3D System Integration, 2009

2008
Case Study of Reliability-Aware and Low-Power Design.
IEEE Trans. Very Large Scale Integr. Syst., 2008

Design Space Exploration for 3-D Cache.
IEEE Trans. Very Large Scale Integr. Syst., 2008

Toward Increasing FPGA Lifetime.
IEEE Trans. Dependable Secur. Comput., 2008

Editorial: Special issue on 3D integrated circuits and microarchitectures.
ACM J. Emerg. Technol. Comput. Syst., 2008

Power optimization for FinFET-based circuits using genetic algorithms.
Proceedings of the 21st Annual IEEE International SoC Conference, SoCC 2008, 2008

ILP-based scheme for timing variation-aware scheduling and resource binding.
Proceedings of the 21st Annual IEEE International SoC Conference, SoCC 2008, 2008

Two-dimensional crosstalk avoidance codes.
Proceedings of the IEEE Workshop on Signal Processing Systems, 2008

Thermal-aware Design Considerations for Application-Specific Instruction Set Processor.
Proceedings of the IEEE Symposium on Application Specific Processors, 2008

Test-Access Solutions for Three-Dimensional SOCs.
Proceedings of the 2008 IEEE International Test Conference, 2008

Hierarchical Soft Error Estimation Tool (HSEET).
Proceedings of the 9th International Symposium on Quality of Electronic Design (ISQED 2008), 2008

MIRA: A Multi-layered On-Chip Interconnect Router Architecture.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Embedded Multi-Processor System-on-chip (MPSoC) design considering process variations.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Comparative analysis of NBTI effects on low power and high performance flip-flops.
Proceedings of the 26th International Conference on Computer Design, 2008

Thermal-aware reliability analysis for platform FPGAs.
Proceedings of the 2008 International Conference on Computer-Aided Design, 2008

A low-power phase change memory based hybrid cache architecture.
Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008

Technology, CAD tools, and designs for emerging 3D integration technology.
Proceedings of the 18th ACM Great Lakes Symposium on VLSI 2008, 2008

A Variation Aware High Level Synthesis Framework.
Proceedings of the Design, Automation and Test in Europe, 2008

Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement.
Proceedings of the 45th Design Automation Conference, 2008

Variability-driven module selection with joint design time optimization and post-silicon tuning.
Proceedings of the 13th Asia South Pacific Design Automation Conference, 2008

2007
Reliability-aware Co-synthesis for Embedded Systems.
J. VLSI Signal Process., 2007

Code Decompression Unit Design for VLIW Embedded Processors.
IEEE Trans. Very Large Scale Integr. Syst., 2007

Code Compression for VLIW Embedded Systems Using a Self-Generating Table.
IEEE Trans. Very Large Scale Integr. Syst., 2007

Processor Design in 3D Die-Stacking Technologies.
IEEE Micro, 2007

On-chip bus thermal analysis and optimisation.
IET Comput. Digit. Tech., 2007

Soft Error Rate Analysis for Combinational Logic Using An Accurate Electrical Masking Model.
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007

Architecting Microprocessor Components in 3D Design Space.
Proceedings of the 20th International Conference on VLSI Design (VLSI Design 2007), 2007

A Novel Gate-Level NBTI Delay Degradation Model with Stacking Effect.
Proceedings of the Integrated Circuit and System Design. Power and Timing Modeling, 2007

Collaborative VLSI-CAD Instruction in the Digital Sandbox.
Proceedings of the IEEE International Conference on Microelectronic Systems Education, 2007

Variation Impact on SER of Combinational Circuits.
Proceedings of the 8th International Symposium on Quality of Electronic Design (ISQED 2007), 2007

Variation Analysis of CAM Cells.
Proceedings of the 8th International Symposium on Quality of Electronic Design (ISQED 2007), 2007

Modeling of PMOS NBTI Effect Considering Temperature Variation.
Proceedings of the 8th International Symposium on Quality of Electronic Design (ISQED 2007), 2007

A novel dimensionally-decomposed router for on-chip communication in 3D architectures.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007

Scan chain design for three-dimensional integrated circuits (3D ICs).
Proceedings of the 25th International Conference on Computer Design, 2007

FPGA routing architecture analysis under variations.
Proceedings of the 25th International Conference on Computer Design, 2007

Variation-aware task allocation and scheduling for MPSoC.
Proceedings of the 2007 International Conference on Computer-Aided Design, 2007

Temperature-aware NBTI modeling and the impact of input vector control on performance degradation.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

A novel criticality computation method in statistical timing analysis.
Proceedings of the 2007 Design, Automation and Test in Europe Conference and Exposition, 2007

2006
Temperature-Aware Task Allocation and Scheduling for Embedded Multiprocessor Systems-on-Chip (MPSoC) Design.
J. VLSI Signal Process., 2006

Code Compression for Embedded VLIW Processors Using Variable-to-Fixed Coding.
IEEE Trans. Very Large Scale Integr. Syst., 2006

Design space exploration for 3D architectures.
ACM J. Emerg. Technol. Comput. Syst., 2006

Reliability Concerns in Embedded System Designs.
Computer, 2006

A Hybrid SoC Interconnect with Dynamic TDMA-Based Transaction-Less Buses and On-Chip Networks.
Proceedings of the 19th International Conference on VLSI Design (VLSI Design 2006), 2006

SEAT-LA: A Soft Error Analysis Tool for Combinational Logic.
Proceedings of the 19th International Conference on VLSI Design (VLSI Design 2006), 2006

Analysis of Subthreshold FinFET Circuits for Ultra-Low Power Design.
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006

Crosstalk-Aware Energy Efficient Encoding for Instruction Bus through Code Compression.
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006

Modeling the Impact of Process Variation on Critical Charge Distribution.
Proceedings of the 2006 IEEE International SOC Conference, Austin, Texas, USA, 2006

Reliability-Aware SOC Voltage Islands Partition and Floorplan.
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006

Dependability Analysis of Nano-scale FinFET circuits.
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006

Delay and Energy Efficient Data Transmission for On-Chip Buses.
Proceedings of the 2006 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2006), 2006

Interconnect and Thermal-aware Floorplanning for 3D Microprocessors.
Proceedings of the 7th International Symposium on Quality of Electronic Design (ISQED 2006), 2006

Design and Management of 3D Chip Multiprocessors Using Network-in-Memory.
Proceedings of the 33rd International Symposium on Computer Architecture (ISCA 2006), 2006

Guaranteeing performance yield in high-level synthesis.
Proceedings of the 2006 International Conference on Computer-Aided Design, 2006

On-chip bus thermal analysis and optimization.
Proceedings of the Conference on Design, Automation and Test in Europe, 2006

FLAW: FPGA lifetime awareness.
Proceedings of the 43rd Design Automation Conference, 2006

Optimal topology exploration for application-specific 3D architectures.
Proceedings of the 2006 Conference on Asia South Pacific Design Automation: ASP-DAC 2006, 2006

Leakage Optimized DECAP Design for FPGAs.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems 2006, 2006

2005
Accurate Stacking Effect Macro-Modeling of Leakage Power in Sub-100nm Circuits.
Proceedings of the 18th International Conference on VLSI Design (VLSI Design 2005), 2005

Influence of Leakage Reduction Techniques on Delay/Leakage Uncertainty.
Proceedings of the 18th International Conference on VLSI Design (VLSI Design 2005), 2005

Adaptive Power Management in Software Radios Using Resolution Adaptive Analog to Digital Converters.
Proceedings of the 2005 IEEE Computer Society Annual Symposium on VLSI (ISVLSI 2005), 2005

An ILP Formulation for Reliability-Oriented High-Level Synthesis.
Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005

Reliability-Centric Hardware/Software Co-Design.
Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005

Thermal-Aware Floorplanning Using Genetic Algorithms.
Proceedings of the 6th International Symposium on Quality of Electronic Design (ISQED 2005), 2005

Three-Dimensional Cache Design Exploration Using 3DCacti.
Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

Temperature-Sensitive Loop Parallelization for Chip Multiprocessors.
Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

Temperature-Aware Voltage Islands Architecting in System-on-Chip Design.
Proceedings of the 23rd International Conference on Computer Design (ICCD 2005), 2005

Power Attack Resistant Cryptosystem Design: A Dynamic Voltage and Frequency Switching Approach.
Proceedings of the 2005 Design, 2005

Leakage-Aware Interconnect for On-Chip Network.
Proceedings of the 2005 Design, 2005

Reliability-Centric High-Level Synthesis.
Proceedings of the 2005 Design, 2005

Thermal-Aware Task Allocation and Scheduling for Embedded Systems.
Proceedings of the 2005 Design, 2005

Low-leakage robust SRAM cell design for sub-100nm technologies.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

Designing reliable circuit in the presence of soft errors.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

FD-HGAC: a hybrid heuristic/genetic algorithm hardware/software co-synthesis framework with fault detection.
Proceedings of the 2005 Conference on Asia South Pacific Design Automation, 2005

2004
The Effect of Threshold Voltages on the Soft Error Rate.
Proceedings of the 5th International Symposium on Quality of Electronic Design (ISQED 2004), 2004

Thermal-Aware IP Virtualization and Placement for Networks-on-Chip Architecture.
Proceedings of the 22nd IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD 2004), 2004

Improving soft-error tolerance of FPGA configuration bits.
Proceedings of the 2004 International Conference on Computer-Aided Design, 2004

Design of a nanosensor array architecture.
Proceedings of the 14th ACM Great Lakes Symposium on VLSI 2004, 2004

LZW-Based Code Compression for VLIW Embedded Systems.
Proceedings of the 2004 Design, 2004

2003
Augmenting Platform-Based Design with Synthesis Tools.
J. Circuits Syst. Comput., 2003

Effect of Power Optimizations on Soft Error Rate.
Proceedings of the VLSI-SOC: From Systems to Chips, 2003

Code Compression Using Variable-to-fixed Coding Based on Arithmetic Coding.
Proceedings of the 2003 Data Compression Conference (DCC 2003), 2003

Profile-Driven Selective Code Compression.
Proceedings of the 2003 Design, 2003

2002
Code Compression for VLIW Processors Using Variable-to-Fixed Coding.
Proceedings of the 15th International Symposium on System Synthesis (ISSS 2002), 2002

2001
A code decompression architecture for VLIW processors.
Proceedings of the 34th Annual International Symposium on Microarchitecture, 2001

Code Compression for VLIW Processors.
Proceedings of the Data Compression Conference, 2001

Allocation and scheduling of conditional task graph in hardware/software co-synthesis.
Proceedings of the Conference on Design, Automation and Test in Europe, 2001

2000
Co-synthesis with custom ASICs.
Proceedings of ASP-DAC 2000, 2000

