Guangyu Sun

ORCID: 0000-0002-7315-6589

Affiliations:
  • Peking University, Center for Energy-efficient Computing and Applications, Beijing, China
  • Pennsylvania State University, Department of Computer Science and Engineering, University Park, PA, USA


According to our database, Guangyu Sun authored at least 173 papers between 2008 and 2024.

Bibliography

2024
CoMN: Algorithm-Hardware Co-Design Platform for Nonvolatile Memory-Based Convolutional Neural Network Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., July, 2024

OriGen: Enhancing RTL Code Generation with Code-to-Code Augmentation and Self-Reflection.
CoRR, 2024

Theseus: Towards High-Efficiency Wafer-Scale Chip Design Space Exploration for Large Language Models.
CoRR, 2024

Toward CXL-Native Memory Tiering via Device-Side Profiling.
CoRR, 2024

The Dawn of AI-Native EDA: Promises and Challenges of Large Circuit Models.
CoRR, 2024

LLM Inference Unveiled: Survey and Roofline Model Insights.
CoRR, 2024

NeoMem: Hardware/Software Co-Design for CXL-Native Memory Tiering.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024

34.3 A 22nm 64kb Lightning-Like Hybrid Computing-in-Memory Macro with a Compressed Adder Tree and Analog-Storage Quantizers for Transformer and CNNs.
Proceedings of the IEEE International Solid-State Circuits Conference, 2024

StreamPIM: Streaming Matrix Computation in Racetrack Memory.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024

Algorithm-Hardware Co-Design for Energy-Efficient A/D Conversion in ReRAM-Based Accelerators.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024

Oltron: Algorithm-Hardware Co-design for Outlier-Aware Quantization of LLMs with Inter-/Intra-Layer Adaptation.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

A Software-Hardware Co-design Solution for 3D Inner Structure Reconstruction.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

SpecPIM: Accelerating Speculative Inference on PIM-Enabled System via Architecture-Dataflow Co-Exploration.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

PIM-DL: Expanding the Applicability of Commodity DRAM-PIMs for Deep Learning via Algorithm-System Co-Optimization.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024

2023
FD-CNN: A Frequency-Domain FPGA Acceleration Scheme for CNN-Based Image-Processing Applications.
ACM Trans. Embed. Comput. Syst., November, 2023

COPPER: a combinatorial optimization problem solver with processing-in-memory architecture.
Frontiers Inf. Technol. Electron. Eng., May, 2023

Sky-Sorter: A Processing-in-Memory Architecture for Large-Scale Sorting.
IEEE Trans. Computers, February, 2023

Energon: Toward Efficient Acceleration of Transformers Using Dynamic Sparse Attention.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models.
CoRR, 2023

RPTQ: Reorder-based Post-training Quantization for Large Language Models.
CoRR, 2023

Benchmarking the Reliability of Post-training Quantization: a Particular Focus on Worst-case Performance.
CoRR, 2023

TileFlow: A Framework for Modeling Fusion Dataflow via Tree-based Analysis.
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023

DIMM-Link: Enabling Efficient Inter-DIMM Communication for Near-Memory Processing.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

NMExplorer: An Efficient Exploration Framework for DIMM-based Near-Memory Tensor Reduction.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Polaris: Enhancing CXL-based Memory Expanders with Memory-side Prefetching.
Proceedings of the Advanced Parallel Processing Technologies, 2023

Caffeine: Towards Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
Proceedings of the ACM Turing Award Celebration Conference - China 2023, 2023

2022
Editorial for the special issue on memory architectures and systems for modern applications.
CCF Trans. High Perform. Comput., December, 2022

The Case for FPGA-Based Edge Computing.
IEEE Trans. Mob. Comput., 2022

PIMulator-NN: An Event-Driven, Cross-Level Simulation Framework for Processing-In-Memory-Based Neural Network Accelerators.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Flatfish: A Reinforcement Learning Approach for Application-Aware Address Mapping.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

A Survey of Trustworthy Graph Learning: Reliability, Explainability, and Privacy Protection.
CoRR, 2022

PetS: A Unified Framework for Parameter-Efficient Transformers Serving.
Proceedings of the 2022 USENIX Annual Technical Conference, 2022

GNNSampler: Bridging the Gap Between Sampling Algorithms of GNN and Hardware.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2022

Latency-aware Spatial-wise Dynamic Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

An 82nW 0.53pJ/SOP Clock-Free Spiking Neural Network with 40µs Latency for AIoT Wake-Up Functions Using Ultimate-Event-Driven Bionic Architecture and Computing-in-Memory Technique.
Proceedings of the IEEE International Solid-State Circuits Conference, 2022

Enabling High-Quality Uncertainty Quantification in a PIM Designed for Bayesian Neural Network.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

PTQ4ViT: Post-training Quantization for Vision Transformers with Twin Uniform Quantization.
Proceedings of the Computer Vision - ECCV 2022, 2022

Tailor: removing redundant operations in memristive analog neural network accelerators.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

A Mapping Model of SNNs to Neuromorphic Hardware.
Proceedings of the 4th IEEE International Conference on Artificial Intelligence Circuits and Systems, 2022

GNNear: Accelerating Full-Batch Training of Graph Neural Networks with near-Memory Processing.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022

2021
STAR: Synthesis of Stateful Logic in RRAM Targeting High Area Utilization.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2021

CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers.
ACM Trans. Archit. Code Optim., 2021

Area Efficient Pattern Representation of Binary Neural Networks on RRAM.
J. Comput. Sci. Technol., 2021

PTQ4ViT: Post-Training Quantization Framework for Vision Transformers.
CoRR, 2021

GCNear: A Hybrid Architecture for Efficient GCN Training with Near-Memory Processing.
CoRR, 2021

Energon: Towards Efficient Acceleration of Transformers Using Dynamic Sparse Attention.
CoRR, 2021

PTQ-SL: Exploring the Sub-layerwise Post-training Quantization.
CoRR, 2021

METRO: A Software-Hardware Co-Design of Interconnections for Spatial DNN Accelerators.
CoRR, 2021

Agatha: Smart Contract for DNN Computation.
CoRR, 2021

NAS4RRAM: neural network architecture search for inference on RRAM-based accelerators.
Sci. China Inf. Sci., 2021

PipeZK: Accelerating Zero-Knowledge Proof with a Pipelined Architecture.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

SSR: A Skeleton-based Synthesis Flow for Hybrid Processing-in-RRAM Modes.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Rapid Configuration of Asynchronous Recurrent Neural Networks for ASIC Implementations.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021

BlockGNN: Towards Efficient GNN Acceleration Using Block-Circulant Weight Matrices.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Reconfigurable ASIC Implementation of Asynchronous Recurrent Neural Networks.
Proceedings of the 27th IEEE International Symposium on Asynchronous Circuits and Systems, 2021

2020
Fork Path: Batching ORAM Requests to Remove Redundant Memory Accesses.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2020

Crane: Mitigating Accelerator Under-utilization Caused by Sparsity Irregularities in CNNs.
IEEE Trans. Computers, 2020

Bigflow: A General Optimization Layer for Distributed Computing Frameworks.
J. Comput. Sci. Technol., 2020

Preface.
J. Comput. Sci. Technol., 2020

Customizing Trusted AI Accelerators for Efficient Privacy-Preserving Machine Learning.
CoRR, 2020

ENAS4D: Efficient Multi-stage CNN Architecture Search for Dynamic Inference.
CoRR, 2020

Edge-Stream: a Stream Processing Approach for Distributed Applications on a Hierarchical Edge-computing System.
Proceedings of the 5th IEEE/ACM Symposium on Edge Computing, 2020

MobiLattice: A Depth-wise DCNN Accelerator with Hybrid Digital/Analog Nonvolatile Processing-In-Memory Block.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

SaFace: Towards Scenario-aware Face Recognition via Edge Computing System.
Proceedings of the 3rd USENIX Workshop on Hot Topics in Edge Computing, 2020

S2DNAS: Transforming Static CNN Model for Dynamic Inference via Neural Architecture Search.
Proceedings of the Computer Vision - ECCV 2020, 2020

Hardware-assisted Service Live Migration in Resource-limited Edge Computing Systems.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

GNN-PIM: A Processing-in-Memory Architecture for Graph Neural Networks.
Proceedings of the Advanced Computer Architecture - 13th Conference, 2020

Characterizing Membership Privacy in Stochastic Gradient Langevin Dynamics.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

GraphH: A Processing-in-Memory Architecture for Large-Scale Graph Processing.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2019

RC-NVM: Dual-Addressing Non-Volatile Memory Architecture Supporting Both Row and Column Memory Accesses.
IEEE Trans. Computers, 2019

EdgeFlow: Open-Source Multi-layer Data Flow Processing in Edge Computing for 5G and Beyond.
IEEE Netw., 2019

Joint Task Assignment, Transmission, and Computing Resource Allocation in Multilayer Mobile Edge Computing Systems.
IEEE Internet Things J., 2019

Generalization in Generative Adversarial Networks: A Novel Perspective from Privacy Protection.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

BAYHENN: Combining Bayesian Deep Learning and Homomorphic Encryption for Secure DNN Inference.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Accelerate service live migration in resource-limited edge computing systems.
Proceedings of the 4th ACM/IEEE Symposium on Edge Computing, 2019

ZUMA: Enabling Direct Insertion/Deletion Operations with Emerging Skyrmion Racetrack Memory.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

P3SGD: Patient Privacy Preserving SGD for Regularizing Deep CNNs in Pathological Image Classification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Parallel Stateful Logic in RRAM: Theoretical Analysis and Arithmetic Design.
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

G2C: A Generator-to-Classifier Framework Integrating Multi-Stained Visual Cues for Pathological Glomerulus Classification.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Optimizing Cache Bypassing and Warp Scheduling for GPUs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2018

CRAT: Enabling Coordinated Register Allocation and Thread-Level Parallelism Optimization for GPUs.
IEEE Trans. Computers, 2018

V-PIM: An Analytical Overhead Model for Processing-in-Memory Architectures.
Proceedings of the IEEE 7th Non-Volatile Memory Systems and Applications Symposium, 2018

Shadow Block: Accelerating ORAM Accesses with Data Duplication.
Proceedings of the 51st Annual IEEE/ACM International Symposium on Microarchitecture, 2018

Path Prefetching: Accelerating Index Searches for In-Memory Databases.
Proceedings of the 36th IEEE International Conference on Computer Design, 2018

PM3: Power Modeling and Power Management for Processing-in-Memory.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

RC-NVM: Enabling Symmetric Row and Column Memory Accesses for In-memory Databases.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018

2017
Data Backup Optimization for Nonvolatile SRAM in Energy Harvesting Sensor Nodes.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2017

Pseudo-Differential Sensing Framework for STT-MRAM: A Cross-Layer Perspective.
IEEE Trans. Computers, 2017

SRPGAN: Perceptual Generative Adversarial Network for Single Image Super Resolution.
CoRR, 2017

Reducing Overfitting in Deep Convolutional Neural Networks Using Redundancy Regularizer.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2017, 2017

FP-DNN: An Automated Framework for Mapping Deep Neural Networks onto FPGAs with RTL-HLS Hybrid Templates.
Proceedings of the 25th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2017

Protect non-volatile memory from wear-out attack based on timing difference of row buffer hit/miss.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

SPMS: Strand based persistent memory system.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2017

Toss-up Wear Leveling: Protecting Phase-Change Memories from Inconsistent Write Patterns.
Proceedings of the 54th Annual Design Automation Conference, 2017

FPGA-based accelerator for long short-term memory recurrent neural networks.
Proceedings of the 22nd Asia and South Pacific Design Automation Conference, 2017

2016
Perspectives of Racetrack Memory for Large-Capacity On-Chip Memory: From Device to System.
IEEE Trans. Circuits Syst. I Regul. Pap., 2016

Statistical Cache Bypassing for Non-Volatile Memory.
IEEE Trans. Computers, 2016

Accelerate context switch by racetrack-SRAM hybrid cells.
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2016

np-ECC: Nonadjacent position error correction code for racetrack memory.
Proceedings of the IEEE/ACM International Symposium on Nanoscale Architectures, 2016

Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster.
Proceedings of the 2016 International Symposium on Low Power Electronics and Design, 2016

NXgraph: An efficient graph processing system on a single machine.
Proceedings of the 32nd IEEE International Conference on Data Engineering, 2016

The Applications of NVM Technology in Hardware Security.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

Exploring Main Memory Design Based on Racetrack Memory Technology.
Proceedings of the 26th edition on Great Lakes Symposium on VLSI, 2016

PDS: pseudo-differential sensing scheme for STT-MRAM.
Proceedings of the 53rd Annual Design Automation Conference, 2016

Pin Tumbler Lock: A shift based encryption mechanism for racetrack memory.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

A novel PUF based on cell error rate distribution of STT-RAM.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

Performance-centric register file design for GPUs using racetrack memory.
Proceedings of the 21st Asia and South Pacific Design Automation Conference, 2016

2015
An Efficient Compiler Framework for Cache Bypassing on GPUs.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2015

Exploring data placement in racetrack memory based scratchpad memory.
Proceedings of the IEEE Non-Volatile Memory System and Applications Symposium, 2015

An architecture-level cache simulation framework supporting advanced PMA STT-MRAM.
Proceedings of the 2015 IEEE/ACM International Symposium on Nanoscale Architectures, 2015

Atlas: Baidu's key-value storage system for cloud data.
Proceedings of the IEEE 31st Symposium on Mass Storage Systems and Technologies, 2015

Fork path: improving efficiency of ORAM by removing redundant memory accesses.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Enabling coordinated register allocation and thread-level parallelism optimization for GPUs.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Leveraging emerging nonvolatile memory in high-level synthesis with loop transformations.
Proceedings of the IEEE/ACM International Symposium on Low Power Electronics and Design, 2015

Perspectives of racetrack memory based on current-induced domain wall motion: From device to system.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

Hi-fi playback: tolerating position errors in shift operations of racetrack memory.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

Coordinated static and dynamic cache bypassing for GPUs.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015

Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks.
Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015

From device to system: cross-layer design exploration of racetrack memory.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

An energy efficient backup scheme with low inrush current for nonvolatile SRAM in energy harvesting sensor nodes.
Proceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition, 2015

A STT-RAM-based low-power hybrid register file for GPGPUs.
Proceedings of the 52nd Annual Design Automation Conference, 2015

Quantitative modeling of racetrack memory, a tradeoff among area, performance, and power.
Proceedings of the 20th Asia and South Pacific Design Automation Conference, 2015

InterFS: An Interplanted Distributed File System to Improve Storage Utilization.
Proceedings of the 6th Asia-Pacific Workshop on Systems, 2015

Improving Memory Access Performance of In-Memory Key-Value Store Using Data Prefetching Techniques.
Proceedings of the Advanced Parallel Processing Technologies, 2015

2014
GRT: A Reconfigurable SDR Platform with High Performance and Usability.
SIGARCH Comput. Archit. News, 2014

SBAC: a statistics based cache bypassing method for asymmetric-access caches.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Rapid design space exploration of two-level unified caches.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

Half-DRAM: A high-bandwidth and low-power DRAM architecture from the rethinking of fine-grained activation.
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014

CREAM: A Concurrent-Refresh-Aware DRAM Memory architecture.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

Adaptive placement and migration policy for an STT-RAM-based hybrid cache.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

3D-SWIFT: a high-performance 3D-stacked wide IO DRAM.
Proceedings of the Great Lakes Symposium on VLSI 2014, GLSVLSI '14, Houston, TX, USA - May 21, 2014

An efficient design and implementation of LSM-tree based key-value store on open-channel SSD.
Proceedings of the Ninth Eurosys Conference 2014, 2014

NoC-Sprinting: Interconnect for Fine-Grained Sprinting in the Dark Silicon Era.
Proceedings of the 51st Annual Design Automation Conference 2014, 2014

Prefetching techniques for STT-RAM based last-level cache in CMP systems.
Proceedings of the 19th Asia and South Pacific Design Automation Conference, 2014

2013
Optimizing GPU energy efficiency with 3D die-stacking graphics memory and reconfigurable memory interface.
ACM Trans. Archit. Code Optim., 2013

Exploring the vulnerability of CMPs to soft errors with 3D stacked nonvolatile memory.
ACM J. Emerg. Technol. Comput. Syst., 2013

Active SSD design for energy-efficiency improvement of web-scale data analysis.
Proceedings of the International Symposium on Low Power Electronics and Design (ISLPED), 2013

Designing scratchpad memory architecture with emerging STT-RAM memory technologies.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Lazy Precharge: An overhead-free method to reduce precharge overhead for memory parallelism improvement of DRAM system.
Proceedings of the 2013 IEEE 31st International Conference on Computer Design, 2013

Asymmetric-access aware optimization for STT-RAM caches with process variations.
Proceedings of the Great Lakes Symposium on VLSI 2013 (part of ECRC), 2013

An efficient run-time encryption scheme for non-volatile main memory.
Proceedings of the International Conference on Compilers, 2013

A Probabilistic Data Replacement Strategy for Flash-Based Hybrid Storage System.
Proceedings of the Web Technologies and Applications - 15th Asia-Pacific Web Conference, 2013

2012
Performance/Thermal-Aware Design of 3D-Stacked L2 Caches for CMPs.
ACM Trans. Design Autom. Electr. Syst., 2012

Energy-efficient GPU design with reconfigurable in-package graphics memory.
Proceedings of the International Symposium on Low Power Electronics and Design, 2012

Improving energy efficiency of write-asymmetric memories by log style write.
Proceedings of the International Symposium on Low Power Electronics and Design, 2012

Multi-level cell STT-RAM: Is it realistic or just a dream?
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Modeling and design exploration of FBDRAM as on-chip memory.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

3DHLS: Incorporating high-level synthesis in physical planning of three-dimensional (3D) ICs.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

2011
Influence of Stacked 3D Memory/Cache Architectures on GPUs.
Proceedings of the 3D Integration for NoC-based SoC Architectures, 2011

Three-dimensional Integrated Circuits: Design, EDA, and Architecture.
Found. Trends Electron. Des. Autom., 2011

Exploiting Heterogeneity for Energy Efficiency in Chip Multiprocessors.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

Moguls: a model to explore the memory hierarchy for bandwidth improvements.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Architecting on-chip interconnects for stacked 3D STT-RAM caches in CMPs.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Energy-efficient multi-level cell phase-change memory system with data encoding.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Exploring the vulnerability of CMPs to soft errors with 3D stacked non-volatile memory.
Proceedings of the IEEE 29th International Conference on Computer Design, 2011

Enabling architectural innovations using non-volatile memory.
Proceedings of the 21st ACM Great Lakes Symposium on VLSI 2010, 2011

Emerging non-volatile memories: opportunities and challenges.
Proceedings of the 9th International Conference on Hardware/Software Codesign and System Synthesis, 2011

A frequent-value based PRAM memory architecture.
Proceedings of the 16th Asia South Pacific Design Automation Conference, 2011

Using NEM relay to improve 3DIC cost efficiency.
Proceedings of the 2011 IEEE International 3D Systems Integration Conference (3DIC), Osaka, Japan, January 31, 2011

Fabrication cost analysis for 2D, 2.5D, and 3D IC designs.
Proceedings of the 2011 IEEE International 3D Systems Integration Conference (3DIC), Osaka, Japan, January 31, 2011

2010
Variable-Latency Adder (VL-Adder) Designs for Low Power and NBTI Tolerance.
IEEE Trans. Very Large Scale Integr. Syst., 2010

A Hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement.
Proceedings of the 16th International Conference on High-Performance Computer Architecture (HPCA-16 2010), 2010

Energy- and endurance-aware design of phase change memory caches.
Proceedings of the Design, Automation and Test in Europe, 2010

Cost-driven 3D integration with interconnect layers.
Proceedings of the 47th Design Automation Conference, 2010

2009
Exploration of 3D stacked L2 cache design for high performance and efficient thermal control.
Proceedings of the 2009 International Symposium on Low Power Electronics and Design, 2009

3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis.
Proceedings of the 27th International Conference on Computer Design, 2009

A novel architecture of the 3D stacked MRAM L2 cache for CMPs.
Proceedings of the 15th International Conference on High-Performance Computer Architecture (HPCA-15 2009), 2009

A criticality-driven microarchitectural three dimensional (3D) floorplanner.
Proceedings of the 14th Asia South Pacific Design Automation Conference, 2009

Arithmetic unit design using 180nm TSV-based 3D stacking technology.
Proceedings of the IEEE International Conference on 3D System Integration, 2009

2008
Thermal-aware Design Considerations for Application-Specific Instruction Set Processor.
Proceedings of the IEEE Symposium on Application Specific Processors, 2008

A Variation Aware High Level Synthesis Framework.
Proceedings of the Design, Automation and Test in Europe, 2008

Circuit and microarchitecture evaluation of 3D stacking magnetic RAM (MRAM) as a universal memory replacement.
Proceedings of the 45th Design Automation Conference, 2008

