Shih-Lien Lu

According to our database, Shih-Lien Lu authored at least 90 papers between 1988 and 2021.

Awards

IEEE Fellow

IEEE Fellow 2016, "For contributions to low-voltage microarchitecture and approximate computing".

Bibliography

2021
O2MD²: A New Post-Quantum Cryptosystem With One-to-Many Distributed Key Management Based on Prime Modulo Double Encapsulation.
IEEE Access, 2021

2020
CompAcc: Efficient Hardware Realization for Processing Compressed Neural Networks Using Accumulator Arrays.
Proceedings of the IEEE Asian Solid-State Circuits Conference, 2020

2019
Guest Editors' Introduction: Intelligent Resource-Constrained Sensor Nodes.
IEEE Des. Test, 2019

A Reliable, Low-Cost, Low-Energy Physically Unclonable Function Circuit Through Effective Filtering.
Proceedings of the International Symposium on VLSI Design, Automation and Test, 2019

A High Performance, Low Energy, Compact Masked 128-Bit AES in 22nm CMOS Technology.
Proceedings of the International Symposium on VLSI Design, Automation and Test, 2019

2018
A foundry's view of hardware security.
Proceedings of the 2018 International Symposium on VLSI Design, 2018

A FeFET Based Processing-In-Memory Architecture for Solving Distributed Least-Square Optimizations.
Proceedings of the 76th Device Research Conference, 2018

2017
23.9 An 8-channel 4.5Gb 180GB/s 18ns-row-latency RAM for the last level cache.
Proceedings of the 2017 IEEE International Solid-State Circuits Conference, 2017

2016
DRAM Refresh Mechanisms, Penalties, and Trade-Offs.
IEEE Trans. Computers, 2016

Runahead Cache Misses Using Bloom Filter.
Proceedings of the 17th International Conference on Parallel and Distributed Computing, 2016

Small cache lookaside table for fast DRAM cache access.
Proceedings of the 35th IEEE International Performance Computing and Communications Conference, 2016

2015
A computer designed half Gb 16-channel 819Gb/s high-bandwidth and 10ns low-latency DRAM for 3D stacked memory devices using TSVs.
Proceedings of the Symposium on VLSI Circuits, 2015

Evaluation methods of computer memory system.
Proceedings of the VLSI Design, Automation and Test, 2015

Improving DRAM latency with dynamic asymmetric subarray.
Proceedings of the 48th International Symposium on Microarchitecture, 2015

Bringing Modern Hierarchical Memory Systems Into Focus: A study of architecture and workload factors on system performance.
Proceedings of the 2015 International Symposium on Memory Systems, 2015

Flexible auto-refresh: enabling scalable and energy-efficient DRAM refresh reductions.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014
DArT: A Component-Based DRAM Area, Power, and Timing Modeling Tool.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2014

Recycled Error Bits: Energy-Efficient Architectural Support for Floating Point Accuracy.
Proceedings of the International Conference for High Performance Computing, 2014

Author retrospective for bloom filtering cache misses for accurate data speculation and prefetching.
Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014

Remapping NUCA: Improving NUCA Cache's Power Efficiency.
Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Sandbox Prefetching: Safe run-time evaluation of aggressive prefetchers.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

2013
Reducing cache and TLB power by exploiting memory region and privilege level semantics.
J. Syst. Archit., 2013

Recycled Error Bits: Energy-Efficient Architectural Support for Higher Precision Floating Point.
CoRR, 2013

Guided Region-Based GPU Scheduling: Utilizing Multi-thread Parallelism to Hide Memory Latency.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013

Guided multiple hashing: Achieving near perfect balance for fast routing lookup.
Proceedings of the 2013 21st IEEE International Conference on Network Protocols, 2013

Technology comparison for large last-level caches (L³Cs): Low-leakage SRAM, low write-energy STT-RAM, and refresh-optimized eDRAM.
Proceedings of the 19th IEEE International Symposium on High Performance Computer Architecture, 2013

2012
Direct Compare of Information Coded With Error-Correcting Codes.
IEEE Trans. Very Large Scale Integr. Syst., 2012

Reducing L1 caches power by exploiting software semantics.
Proceedings of the International Symposium on Low Power Electronics and Design, 2012

Scaling the "Memory Wall": Designer track.
Proceedings of the 2012 IEEE/ACM International Conference on Computer-Aided Design, 2012

Design for test and reliability in ultimate CMOS.
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012

2011
All-Digital Circuit-Level Dynamic Variation Monitor for Silicon Debug and Adaptive Clock Control.
IEEE Trans. Circuits Syst. I Regul. Pap., 2011

Automatic Pipelining From Transactional Datapath Specifications.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2011

Adaptive Cache Design to Enable Reliable Low-Voltage Operation.
IEEE Trans. Computers, 2011

Tunable Replica Bits for Dynamic Variation Tolerance in 8T SRAM Arrays.
IEEE J. Solid State Circuits, 2011

A 45 nm Resilient Microprocessor Core for Dynamic Variation Tolerance.
IEEE J. Solid State Circuits, 2011

Error Detection and Correction in Microprocessor Core and Memory Due to Fast Dynamic Voltage Droops.
IEEE J. Emerg. Sel. Topics Circuits Syst., 2011

Integrating formal verification and high-level processor pipeline synthesis.
Proceedings of the IEEE 9th Symposium on Application Specific Processors, 2011

Energy-efficient cache design using variable-strength error-correcting codes.
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011

Distributed hardware matcher framework for SoC survivability.
Proceedings of the Design, Automation and Test in Europe, 2011

2010
A 45nm resilient and adaptive microprocessor core for dynamic variation tolerance.
Proceedings of the IEEE International Solid-State Circuits Conference, 2010

PVT-and-aging adaptive wordline boosting for 8T SRAM power reduction.
Proceedings of the IEEE International Solid-State Circuits Conference, 2010

Resilient microprocessor design for high performance & energy efficiency.
Proceedings of the 2010 International Symposium on Low Power Electronics and Design, 2010

Reducing cache power with low-cost, multi-bit error-correcting codes.
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010

Automatic multithreaded pipeline synthesis from transactional datapath specifications.
Proceedings of the 47th Design Automation Conference, 2010

Dynamic variation monitor for measuring the impact of voltage droops on microprocessor clock frequency.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2010

Resilient design in scaled CMOS for energy efficiency.
Proceedings of the 15th Asia South Pacific Design Automation Conference, 2010

2009
Trading Off Cache Capacity for Low-Voltage Operation.
IEEE Micro, 2009

2 GHz 2 Mb 2T Gain Cell Memory Macro With 128 GBytes/sec Bandwidth in a 65 nm Logic Process Technology.
IEEE J. Solid State Circuits, 2009

Energy-Efficient and Metastability-Immune Resilient Circuits for Dynamic Variation Tolerance.
IEEE J. Solid State Circuits, 2009

Improving cache lifetime reliability at ultra-low voltages.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009

Low power adaptive pipeline based on instruction isolation.
Proceedings of the 10th International Symposium on Quality of Electronic Design (ISQED 2009), 2009

Resilient circuits - Enabling energy-efficient performance and reliability.
Proceedings of the 2009 International Conference on Computer-Aided Design, 2009

Circuit techniques for dynamic variation tolerance.
Proceedings of the 46th Design Automation Conference, 2009

Content Addressable Memory for Low-Power and High-Performance Applications.
Proceedings of the CSIE 2009, 2009 WRI World Congress on Computer Science and Information Engineering, March 31, 2009

2008
Carry Logic.
Proceedings of the Wiley Encyclopedia of Computer Science and Engineering, 2008

Active Cache Emulator.
IEEE Trans. Very Large Scale Integr. Syst., 2008

A Desktop Computer with a Reconfigurable Pentium®.
ACM Trans. Reconfigurable Technol. Syst., 2008

2GHz 2Mb 2T Gain-Cell Memory Macro with 128GB/s Bandwidth in a 65nm Logic Process.
Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

Energy-Efficient and Metastability-Immune Timing-Error Detection and Instruction-Replay-Based Recovery Circuits for Dynamic-Variation Tolerance.
Proceedings of the 2008 IEEE International Solid-State Circuits Conference, 2008

Trading off Cache Capacity for Reliability to Enable Low Voltage Operation.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008

Novel FPGA based Haar classifier face detection algorithm acceleration.
Proceedings of the FPL 2008, 2008

2007
RAMP: Research Accelerator for Multiple Processors.
IEEE Micro, 2007

Fine-Grained Redundancy in Adders.
Proceedings of the 8th International Symposium on Quality of Electronic Design (ISQED 2007), 2007

Improving the reliability of on-chip data caches under process variations.
Proceedings of the 25th International Conference on Computer Design, 2007

An FPGA Approach to Quantifying Coherence Traffic Efficiency on Multiprocessor Systems.
Proceedings of the FPL 2007, 2007

An FPGA-based Pentium in a complete desktop system.
Proceedings of the ACM/SIGDA 15th International Symposium on Field Programmable Gate Arrays, 2007

2006
Asymmetric Clustering using a Register Cache.
J. Instr. Level Parallelism, 2006

Research accelerator for multiple processors.
Proceedings of the 2006 IEEE Hot Chips 18 Symposium (HCS), 2006

Design, implementation, and verification of active cache emulator (ACE).
Proceedings of the ACM/SIGDA 14th International Symposium on Field Programmable Gate Arrays, 2006

2005
Characterization of L3 cache behavior of SPECjAppServer2002 and TPC-C.
Proceedings of the 19th Annual International Conference on Supercomputing, 2005

A 10Mbit, 15GBytes/sec bandwidth 1T DRAM chip with planar MOS storage capacitor in an unmodified 150nm logic process for high-density on-chip memory applications.
Proceedings of the 31st European Solid-State Circuits Conference, 2005

Improving branch prediction accuracy with parallel conservative correctors.
Proceedings of the Second Conference on Computing Frontiers, 2005

2004
Speeding Up Processing with Approximation Circuits.
Computer, 2004

Efficient Victim Mechanism on Sector Cache Organization.
Proceedings of the Advances in Computer Systems Architecture, 9th Asia-Pacific Conference, 2004

2003
A 4.5-GHz 130-nm 32-KB L0 cache with a leakage-tolerant self reverse-bias bitline scheme.
IEEE J. Solid State Circuits, 2003

Hardware-based Pointer Data Prefetcher.
Proceedings of the 21st International Conference on Computer Design (ICCD 2003), 2003

Implementation of HW$im - A Real-Time Configurable Cache Simulator.
Proceedings of the Field Programmable Logic and Application, 13th International Conference, 2003

2002
Dynamic addressing memory arrays with physical locality.
Proceedings of the 35th Annual International Symposium on Microarchitecture, 2002

Bloom filtering cache misses for accurate data speculation and prefetching.
Proceedings of the 16th international conference on Supercomputing, 2002

Ditto Processor.
Proceedings of the 2002 International Conference on Dependable Systems and Networks (DSN 2002), 2002

2001
Coming challenges in microarchitecture and architecture.
Proc. IEEE, 2001

2000
Performance improvement with circuit-level speculation.
Proceedings of the 33rd Annual IEEE/ACM International Symposium on Microarchitecture, 2000

1998
Non-Stalling CounterFlow Architecture.
Proceedings of the Fourth International Symposium on High-Performance Computer Architecture, Las Vegas, Nevada, USA, January 31, 1998

1997
Advances of the Counterflow Pipeline Microarchitecture.
Proceedings of the 3rd IEEE Symposium on High-Performance Computer Architecture (HPCA '97), 1997

1996
Efficient arithmetic using self-timing.
IEEE Trans. Very Large Scale Integr. Syst., 1996

1995
Implementation of micropipelines in enable/disable CMOS differential logic.
IEEE Trans. Very Large Scale Integr. Syst., 1995

Design of a static MIMD data flow processor using micropipelines.
IEEE Trans. Very Large Scale Integr. Syst., 1995

1988
Implementation of iterative networks with CMOS differential logic.
IEEE J. Solid State Circuits, August, 1988

A safe single-phase clocking scheme for CMOS circuits.
IEEE J. Solid State Circuits, February, 1988

Device and circuit simulation interface for an integrated VLSI design environment.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1988

