Lee-Sup Kim

Orcid: 0000-0001-9585-4591

According to our database, Lee-Sup Kim authored at least 198 papers between 1989 and 2024.

Bibliography

2024
ToEx: Accelerating Generation Stage of Transformer-Based Language Models via Token-Adaptive Early Exit.
IEEE Trans. Computers, September, 2024

ADC-Free ReRAM-Based In-Situ Accelerator for Energy-Efficient Binary Neural Networks.
IEEE Trans. Computers, February, 2024

Accelerating Deep Reinforcement Learning via Phase-Level Parallelism for Robotics Applications.
IEEE Comput. Archit. Lett., 2024

Token-Picker: Accelerating Attention in Text Generation with Minimized Memory Transfer via Probability Estimation.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024

2023
MGen: A Framework for Energy-Efficient In-ReRAM Acceleration of Multi-Task BERT.
IEEE Trans. Computers, November, 2023

Fault-Free: A Framework for Analysis and Mitigation of Stuck-at-Fault on Realistic ReRAM-Based DNN Accelerators.
IEEE Trans. Computers, July, 2023

Accelerating On-Device DNN Training Workloads via Runtime Convergence Monitor.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., May, 2023

Energy-Efficient CNN Personalized Training by Adaptive Data Reformation.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2023

OptimStore: In-Storage Optimization of Large Scale DNNs with On-Die Processing.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023

2022
Quantization-Error-Robust Deep Neural Network for Embedded Accelerators.
IEEE Trans. Circuits Syst. II Express Briefs, 2022

A Framework for Accelerating Transformer-Based Language Model on ReRAM-Based Architecture.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2022

Rare Computing: Removing Redundant Multiplications From Sparse and Repetitive Data in Deep Neural Networks.
IEEE Trans. Computers, 2022

S-FLASH: A NAND Flash-Based Deep Neural Network Accelerator Exploiting Bit-Level Sparsity.
IEEE Trans. Computers, 2022

EGCN: An Efficient GCN Accelerator for Minimizing Off-Chip Memory Access.
IEEE Trans. Computers, 2022

A Deep Neural Network Training Architecture With Inference-Aware Heterogeneous Data-Type.
IEEE Trans. Computers, 2022

Re²fresh: A Framework for Mitigating Read Disturbance in ReRAM-Based DNN Accelerators.
Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, 2022

Algorithm/architecture co-design for energy-efficient acceleration of multi-task DNN.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Amnesiac DRAM: A Proactive Defense Mechanism Against Cold Boot Attacks.
IEEE Trans. Computers, 2021

Deferred Dropout: An Algorithm-Hardware Co-Design DNN Training Method Provisioning Consistent High Activation Sparsity.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

A Framework for Area-efficient Multi-task BERT Execution on ReRAM-based Accelerators.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

A Convergence Monitoring Method for DNN Training of On-Device Task Adaptation.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2021

Optimizing ADC Utilization through Value-Aware Bypass in ReRAM-based DNN Accelerator.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

Fault-free: A Fault-resilient Deep Neural Network Accelerator based on Realistic ReRAM Devices.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021

2020
An Energy-Efficient Deep Convolutional Neural Network Inference Processor With Enhanced Output Stationary Dataflow in 65-nm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2020

A 10.8 Gb/s Quarter-Rate 1 FIR 1 IIR Direct DFE With Non-Time-Overlapping Data Generation for 4:1 CMOS Clockless Multiplexer.
IEEE Trans. Circuits Syst. II Express Briefs, 2020

CREMON: Cryptography Embedded on the Convolutional Neural Network Accelerator.
IEEE Trans. Circuits Syst., 2020

An Energy-Efficient Deep Convolutional Neural Network Training Accelerator for In Situ Personalization on Smart Devices.
IEEE J. Solid State Circuits, 2020

A Thermal-aware Optimization Framework for ReRAM-based Deep Neural Network Acceleration.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

A Pragmatic Approach to On-device Incremental Learning System with Selective Weight Updates.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020

2019
A 0.9-V 12-Gb/s Two-FIR Tap Direct DFE With Feedback-Signal Common-Mode Control.
IEEE Trans. Very Large Scale Integr. Syst., 2019

A 12 Gb/s 1.59 mW/Gb/s Input-Data-Jitter-Tolerant Injection-Type CDR With Super-Harmonic Injection-Locking in 65-nm CMOS.
IEEE Trans. Circuits Syst. II Express Briefs, 2019

Sparse-Insertion Write Cache to Mitigate Write Disturbance Errors in Phase Change Memory.
IEEE Trans. Computers, 2019

DC-PCM: Mitigating PCM Write Disturbance with Low Performance Overhead by Using Detection Cells.
IEEE Trans. Computers, 2019

A 0.87 V 12.5 Gb/s Clock-Path Feedback Equalization Receiver with Unfixed Tap Weighting Property in 65 nm CMOS.
Proceedings of the 2019 Symposium on VLSI Circuits, Kyoto, Japan, June 9-14, 2019, 2019

Compressing Sparse Ternary Weight Convolutional Neural Networks for Efficient Hardware Acceleration.
Proceedings of the 2019 IEEE/ACM International Symposium on Low Power Electronics and Design, 2019

A PVT-robust Customized 4T Embedded DRAM Cell Array for Accelerating Binary Neural Networks.
Proceedings of the International Conference on Computer-Aided Design, 2019

An Energy-efficient Processing-in-memory Architecture for Long Short Term Memory in Spin Orbit Torque MRAM.
Proceedings of the International Conference on Computer-Aided Design, 2019

eSRCNN: A Framework for Optimizing Super-Resolution Tasks on Diverse Embedded CNN Accelerators.
Proceedings of the International Conference on Computer-Aided Design, 2019

NAND-Net: Minimizing Computational Complexity of In-Memory Processing for Binary Neural Networks.
Proceedings of the 25th IEEE International Symposium on High Performance Computer Architecture, 2019

An Optimized Design Technique of Low-bit Neural Network Training for Personalization on IoT Devices.
Proceedings of the 56th Annual Design Automation Conference 2019, 2019

A 47.4µJ/epoch Trainable Deep Convolutional Neural Network Accelerator for In-Situ Personalization on Smart Devices.
Proceedings of the IEEE Asian Solid-State Circuits Conference, 2019

2018
A 0.65-V, 11.2-Gb/s Power Noise Tolerant Source-Synchronous Injection-Locked Receiver With Direct DTLB DFE.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

A 10-Gb/s Reference-Less Baud-Rate CDR for Low Power Consumption With the Direct Feedback Method.
IEEE Trans. Circuits Syst. II Express Briefs, 2018

Elaborate Refresh: A Fine Granularity Retention Management for Deep Submicron DRAMs.
IEEE Trans. Computers, 2018

TrainWare: A Memory Optimized Weight Update Architecture for On-Device Convolutional Neural Network Training.
Proceedings of the International Symposium on Low Power Electronics and Design, 2018

NID: processing binary convolutional neural network in commodity DRAM.
Proceedings of the International Conference on Computer-Aided Design, 2018

2017
In-DRAM Data Initialization.
IEEE Trans. Very Large Scale Integr. Syst., 2017

A 5-Gb/s Digital Clock and Data Recovery Circuit With Reduced DCO Supply Noise Sensitivity Utilizing Coupling Network.
IEEE Trans. Very Large Scale Integr. Syst., 2017

An Input Data and Power Noise Inducing Clock Jitter Tolerant Reference-Less Digital CDR for LCD Intra-Panel Interface.
IEEE Trans. Circuits Syst. I Regul. Pap., 2017

Energy-Efficient Design of Processing Element for Convolutional Neural Network.
IEEE Trans. Circuits Syst. II Express Briefs, 2017

Rank-Level Parallelism in DRAM.
IEEE Trans. Computers, 2017

Bank-Group Level Parallelism.
IEEE Trans. Computers, 2017

Refresh-Aware Write Recovery Memory Controller.
IEEE Trans. Computers, 2017

Hardware-Centric Vision Processing for Mobile IoT Environment Exploiting Approximate Graph Cut in Resistor Grid.
Proceedings of the 2017 IEEE Winter Conference on Applications of Computer Vision, 2017

SENIN: An energy-efficient sparse neuromorphic system with on-chip learning.
Proceedings of the 2017 IEEE/ACM International Symposium on Low Power Electronics and Design, 2017

A Kernel Decomposition Architecture for Binary-weight Convolutional Neural Networks.
Proceedings of the 54th Annual Design Automation Conference, 2017

2016
A 5-Gb/s 2.67-mW/Gb/s Digital Clock and Data Recovery With Hybrid Dithering Using a Time-Dithered Delta-Sigma Modulator.
IEEE Trans. Very Large Scale Integr. Syst., 2016

A 21-Gbit/s 1.63-pJ/bit Adaptive CTLE and One-Tap DFE With Single Loop Spectrum Balancing Method.
IEEE Trans. Very Large Scale Integr. Syst., 2016

A Vision Processor With a Unified Interest-Point Detection and Matching Hardware for Accelerating a Stereo-Matching Algorithm.
IEEE Trans. Circuits Syst. Video Technol., 2016

A 21%-Jitter-Improved Self-Aligned Dividerless Injection-Locked PLL With a VCO Control Voltage Ripple-Compensated Phase Detector.
IEEE Trans. Circuits Syst. II Express Briefs, 2016

A 10-Gb/s 0.71-pJ/bit Forwarded-Clock Receiver Tolerant to High-Frequency Jitter in 65-nm CMOS.
IEEE Trans. Circuits Syst. II Express Briefs, 2016

DRAM-Latency Optimization Inspired by Relationship between Row-Access Time and Refresh Timing.
IEEE Trans. Computers, 2016

Q-DRAM: Quick-Access DRAM with Decoupled Restoring from Row-Activation.
IEEE Trans. Computers, 2016

14.6 A 1.42TOPS/W deep convolutional neural network recognition processor for intelligent IoE systems.
Proceedings of the 2016 IEEE International Solid-State Circuits Conference, 2016

Crosstalk avoidance code for direct pass-through architecture.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2016

Energy Efficient Data Encoding in DRAM Channels Exploiting Data Value Similarity.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

2015
An 11.5 Gb/s 1/4th Baud-Rate CTLE and Two-Tap DFE With Boosted High Frequency Gain in 110-nm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2015

A Forwarded Clock Receiver Based on Injection-Locked Oscillator With AC-Coupled Clock Multiplication Unit in 0.13 µm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2015

A 9.6-Gb/s 1.22-mW/Gb/s Data-Jitter Mixing Forwarded-Clock Receiver in 65-nm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2015

A 9.6 Gb/s 0.96 mW/Gb/s Forwarded Clock Receiver With High Jitter Tolerance Using Mixing Cell Integrated Injection-Locked Oscillator.
IEEE Trans. Circuits Syst. I Regul. Pap., 2015

Hybrid Temperature Sensor Network for Area-Efficient On-Chip Thermal Map Sensing.
IEEE J. Solid State Circuits, 2015

An integrated time register and arithmetic circuit with combined operation for time-domain signal processing.
Proceedings of the 2015 IEEE International Symposium on Circuits and Systems, 2015

Multiple clone row DRAM: a low latency and area optimized DRAM.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

An injection locked PLL for power supply variation robustness using negative phase shift phenomenon of injection locked frequency divider.
Proceedings of the 2015 IEEE Custom Integrated Circuits Conference, 2015

2014
A Quarter-Rate Forwarded Clock Receiver Based on ILO With Low Jitter Tracking Bandwidth Variation Using Phase Shifting Phenomenon in 65 nm CMOS.
IEEE Trans. Circuits Syst. I Regul. Pap., 2014

A 5 Gbps 1.6 mW/Gbps/CH Adaptive Crosstalk Cancellation Scheme With Reference-less Digital Calibration and Switched Termination Resistors for Single-Ended Parallel Interface.
IEEE Trans. Circuits Syst. I Regul. Pap., 2014

A Forwarded-Clock Receiver With Constant and Wide-Range Jitter-Tracking Bandwidth.
IEEE Trans. Circuits Syst. II Express Briefs, 2014

An area-efficient on-chip temperature sensor with nonlinearity compensation using injection-locked oscillator (ILO).
Proceedings of the IEEE International Symposium on Circuits and Systems, 2014

Timing error masking by exploiting operand value locality in SIMD architecture.
Proceedings of the 32nd IEEE International Conference on Computer Design, 2014

NUAT: A non-uniform access time memory controller.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014

2013
A Unified Graphics and Vision Processor With a 0.89 µW/fps Pose Estimation Engine for Augmented Reality.
IEEE Trans. Very Large Scale Integr. Syst., 2013

A 182 mW 94.3 f/s in Full HD Pattern-Matching Based Image Recognition Accelerator for an Embedded Vision System in 0.13-µm CMOS Technology.
IEEE Trans. Circuits Syst. Video Technol., 2013

A LOG-Induced SSN-Tolerant Transceiver for On-Chip Interconnects in COG-Packaged Source Driver IC for TFT-LCD.
IEEE Trans. Circuits Syst. II Express Briefs, 2013

A Reconfigurable SIMT Processor for Mobile Ray Tracing With Contention Reduction in Shared Memory.
IEEE Trans. Circuits Syst. I Regul. Pap., 2013

A 6.5-Gb/s 1-mW/Gb/s/CH Simple Capacitive Crosstalk Compensator in a 130-nm Process.
IEEE Trans. Circuits Syst. II Express Briefs, 2013

PowerField: A Probabilistic Approach for Temperature-to-Power Conversion Based on Markov Random Field Theory.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013

A 1 mJ/Frame Unified Media Application Processor With Dynamic Analog-Digital Mode Reconfiguration for Embedded 3D-Media Contents Processing.
IEEE J. Solid State Circuits, 2013

An 8Gb/s 0.65mW/Gb/s forwarded-clock receiver using an ILO with dual feedback loop and quadrature injection scheme.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

All-digital hybrid temperature sensor network for dense thermal monitoring.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

A 7 mW 2.5 GHz spread spectrum clock generator using switch-controlled injection-locked oscillator.
Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

2012
An Adaptive Equalizer With the Capacitance Multiplication for DisplayPort Main Link in 0.18-µm CMOS.
IEEE Trans. Very Large Scale Integr. Syst., 2012

A Mobile 3-D Display Processor With A Bandwidth-Saving Subdivider.
IEEE Trans. Very Large Scale Integr. Syst., 2012

Homogeneous Stream Processors With Embedded Special Function Units for High-Utilization Programmable Shaders.
IEEE Trans. Very Large Scale Integr. Syst., 2012

A Reconfigurable Heterogeneous Multimedia Processor for IC-Stacking on Si-Interposer.
IEEE Trans. Circuits Syst. Video Technol., 2012

A 5.4-Gb/s Clock and Data Recovery Circuit Using Seamless Loop Transition Scheme With Minimal Phase Noise Degradation.
IEEE Trans. Circuits Syst. I Regul. Pap., 2012

A 5.4/2.7/1.62-Gb/s Receiver for DisplayPort Version 1.2 With Multi-Rate Operation Scheme.
IEEE Trans. Circuits Syst. I Regul. Pap., 2012

MRTP: Mobile Ray Tracing Processor With Reconfigurable Stream Multi-Processors for High Datapath Utilization.
IEEE J. Solid State Circuits, 2012

1.22mW/Gb/s 9.6Gb/s data jitter mixing forwarded-clock receiver robust against power noise with 1.92ns latency mismatch between data and clock in 65nm CMOS.
Proceedings of the Symposium on VLSI Circuits, 2012

A 20 Gbps 1-tap decision feedback equalizer with unfixed tap coefficient.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

PowerField: a transient temperature-to-power technique based on Markov random field theory.
Proceedings of the 49th Annual Design Automation Conference 2012, 2012

A 1mJ/frame unified media application processor with a 179.7pJ mixed-mode feature extraction engine for embedded 3D-media contents processing.
Proceedings of the IEEE 2012 Custom Integrated Circuits Conference, 2012

2011
A Dual-Shader 3-D Graphics Processor With Fast 4-D Vector Inner Product Units and Power-Aware Texture Cache.
IEEE Trans. Very Large Scale Integr. Syst., 2011

A Memory-Efficient Unified Early Z-Test.
IEEE Trans. Vis. Comput. Graph., 2011

A Spread Spectrum Clock Generator for DisplayPort Main Link.
IEEE Trans. Circuits Syst. II Express Briefs, 2011

A 100 MHz-to-1 GHz Fast-Lock Synchronous Clock Generator With DCC for Mobile Applications.
IEEE Trans. Circuits Syst. II Express Briefs, 2011

A 275mW heterogeneous multimedia processor for IC-stacking on Si-interposer.
Proceedings of the IEEE International Solid-State Circuits Conference, 2011

Area-efficient dynamic thermal management unit using MDLL with shared DLL scheme for many-core processors.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

A 5.4 Gb/s clock and data recovery circuit using the seamless loop transition scheme without phase noise degradation.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2011), 2011

A 7.4 Gb/s forwarded clock receiver based on first-harmonic injection-locked oscillator using AC coupled clock multiplication unit in 0.13µm CMOS.
Proceedings of the 2011 IEEE Custom Integrated Circuits Conference, 2011

2010
A Data-Pattern-Tolerant Adaptive Equalizer Using the Spectrum Balancing Method.
IEEE Trans. Circuits Syst. II Express Briefs, 2010

Correction on "A 5-Gb/s/pin Transceiver for DDR Memory Interface With a Crosstalk Suppression Scheme" [Aug 09 2222-2232].
IEEE J. Solid State Circuits, 2010

A 116 fps/74 mW Heterogeneous 3D-Media Processor for 3-D Display Applications.
IEEE J. Solid State Circuits, 2010

A graphics and vision unified processor with 0.89µW/fps pose estimation engine for augmented reality.
Proceedings of the IEEE International Solid-State Circuits Conference, 2010

An area efficient asynchronous gated ring oscillator TDC with minimum GRO stages.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

A high resolution metastability-independent two-step gated ring oscillator TDC with enhanced noise shaping.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Reconfigurable mobile stream processor for ray tracing.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2010

100MHz-to-1GHz open-loop ADDLL with fast lock-time for mobile applications.
Proceedings of the IEEE Custom Integrated Circuits Conference, 2010

2009
A 186-Mvertices/s 161-mW Floating-Point Vertex Processor With Optimized Datapath and Vertex Caches.
IEEE Trans. Very Large Scale Integr. Syst., 2009

A Floating-Point Unit for 4D Vector Inner Product with Reduced Latency.
IEEE Trans. Computers, 2009

A 5-Gb/s/pin Transceiver for DDR Memory Interface With a Crosstalk Suppression Scheme.
IEEE J. Solid State Circuits, 2009

A DLL With Jitter Reduction Techniques and Quadrature Phase Generation for DRAM Interfaces.
IEEE J. Solid State Circuits, 2009

A 0.13-µm CMOS 6 Gb/s/pin Memory Transceiver Using Pseudo-Differential Signaling for Removing Common-Mode Noise Due to SSN.
IEEE J. Solid State Circuits, 2009

Shader-based tessellation to save memory bandwidth in a mobile multimedia processor.
Comput. Graph., 2009

A 6Gb/s/pin pseudo-differential signaling using common-mode noise rejection techniques without reference signal for DRAM interfaces.
Proceedings of the IEEE International Solid-State Circuits Conference, 2009

A Spread Spectrum Clock Generator with Spread Ratio Error Reduction Scheme for DisplayPort Main Link.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

Bank-partition and Multi-fetch Scheme for Floating-point Special Function units in Multi-core Systems.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2009), 2009

2008
An Area Efficient Early Z-Test Method for 3-D Graphics Rendering Hardware.
IEEE Trans. Circuits Syst. I Regul. Pap., 2008

A 36 fps SXGA 3-D Display Processor Embedding a Programmable 3-D Graphics Rendering Engine.
IEEE J. Solid State Circuits, 2008

A 20 Gb/s 1:4 DEMUX Without Inductors and Low-Power Divide-by-2 Circuit in 0.13 µm CMOS Technology.
IEEE J. Solid State Circuits, 2008

Noise Robust Motion Refinement for Motion Compensated Noise Reduction.
IEICE Trans. Inf. Syst., 2008

Area-efficient pixel rasterization and texture coordinate interpolation.
Comput. Graph., 2008

Clipping-ratio-independent 3D graphics clipping engine by dual-thread algorithm.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

High speed serial interface for mobile LCD driver IC.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2008), 2008

A 3D graphics processor with fast 4D vector inner product units and power aware texture cache.
Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

A 5-Gb/s/pin transceiver for DDR memory interface with a crosstalk suppression scheme.
Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

Tessellation-enabled shader for a bandwidth-limited 3D graphics engine.
Proceedings of the IEEE 2008 Custom Integrated Circuits Conference, 2008

2007
An Energy-Efficient Mobile Vertex Processor With Multithread Expanded VLIW Architecture and Vertex Caches.
IEEE J. Solid State Circuits, 2007

Binary Motion Estimation with Hybrid Distortion Measure.
IEICE Trans. Inf. Syst., 2007

A 36fps SXGA 3D Display Processor with a Programmable 3D Graphics Rendering Engine.
Proceedings of the 2007 IEEE International Solid-State Circuits Conference, 2007

A DLL with Jitter-Reduction Techniques for DRAM Interfaces.
Proceedings of the 2007 IEEE International Solid-State Circuits Conference, 2007

Triangle-Level Depth Filter Method for Bandwidth Reduction in 3D Graphics Hardware.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2007), 2007

A 186Mvertices/s 161mW Floating-Point Vertex Processor for Mobile Graphics Systems.
Proceedings of the IEEE 2007 Custom Integrated Circuits Conference, 2007

2006
A low-power ROM using single charge-sharing capacitor and hierarchical bit line.
IEEE Trans. Very Large Scale Integr. Syst., 2006

A cost-effective VLSI architecture for anisotropic texture filtering in limited memory bandwidth.
IEEE Trans. Very Large Scale Integr. Syst., 2006

An SoC with 1.3 Gtexels/s 3-D graphics full pipeline for consumer applications.
IEEE J. Solid State Circuits, 2006

A 120Mvertices/s multi-threaded VLIW vertex processor for mobile multimedia applications.
Proceedings of the 2006 IEEE International Solid State Circuits Conference, 2006

A 20Gb/s 1:4 DEMUX without inductors in 0.13µm CMOS.
Proceedings of the 2006 IEEE International Solid State Circuits Conference, 2006

A low power SoC bus with low-leakage and low-swing technique.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

A 0.18µm CMOS 10Gb/s 1:4 DEMUX using replica-bias circuits for optical receiver.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Charge-pump reducing current mismatch in DLLs and PLLs.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

Vertex cache of programmable geometry processor for mobile multimedia application.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

An efficient texture cache for programmable vertex shaders.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2006), 2006

2005
A Method to Generate Soft Shadows Using a Layered Depth Image and Warping.
IEEE Trans. Vis. Comput. Graph., 2005

A low-power CAM using pulsed NAND-NOR match-line and charge-recycling search-line driver.
IEEE J. Solid State Circuits, 2005

A low-power SRAM using hierarchical bit line and local sense amplifiers.
IEEE J. Solid State Circuits, 2005

A 250-MHz-2-GHz wide-range delay-locked loop.
IEEE J. Solid State Circuits, 2005

An Efficient Memory Address Converter for SoC-based 3D Graphics System.
J. Circuits Syst. Comput., 2005

Geometry engine architecture with early backface culling hardware.
Comput. Graph., 2005

A 33.2M vertices/sec programmable geometry engine for multimedia embedded systems.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

A 3-way SIMD engine for programmable triangle setup in embedded 3D graphics hardware.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

An 11M-triangles/sec 3D graphics clipping engine for triangle primitives.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2005), 2005

A 500MHz DLL with second order duty cycle corrector for low jitter.
Proceedings of the IEEE 2005 Custom Integrated Circuits Conference, 2005

2004
An 800-MHz low-power direct digital frequency synthesizer with an on-chip D/A converter.
IEEE J. Solid State Circuits, 2004

A high-resolution synchronous mirror delay using successive approximation register.
IEEE J. Solid State Circuits, 2004

An Efficient Fragment Processing Technique in A-Buffer Implementation.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2004

An adaptive spatial filter for early depth test.
Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

An error pattern ROM compression method for continuous data.
Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

A high performance low power dynamic PLA with conditional evaluation scheme.
Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

Division-free rasterizer for perspective-correct texture filtering.
Proceedings of the 2004 International Symposium on Circuits and Systems, 2004

Adaptive Selection of an Index in a Texture Cache.
Proceedings of the 22nd IEEE International Conference on Computer Design: VLSI in Computers & Processors (ICCD 2004), 2004

A 250MHz-2GHz wide range delay-locked loop.
Proceedings of the IEEE 2004 Custom Integrated Circuits Conference, 2004

2003
A low-power charge-recycling ROM architecture.
IEEE Trans. Very Large Scale Integr. Syst., 2003

Winscale: an image-scaling algorithm using an area pixel model.
IEEE Trans. Circuits Syst. Video Technol., 2003

A low-power ROM using charge recycling and charge sharing techniques.
IEEE J. Solid State Circuits, 2003

A clock delayed sleep mode domino logic for wide dynamic OR gate.
Proceedings of the 2003 International Symposium on Low Power Electronics and Design, 2003

A hierarchical depth buffer for minimizing memory bandwidth in 3D rendering engine: Depth Filter.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

A low power charge sharing ROM using dummy bit lines.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

A hardware-like high-level language based environment for 3D graphics architecture exploration.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

A PN triangle generation unit for fast and simple tessellation hardware.
Proceedings of the 2003 International Symposium on Circuits and Systems, 2003

2002
A high speed direct digital frequency synthesizer using a low power pipelined parallel accumulator.
Proceedings of the 2002 International Symposium on Circuits and Systems, 2002

A 1.67 GHz 32-bit pipelined carry-select adder using the complementary scheme.
Proceedings of the 2002 International Symposium on Circuits and Systems, 2002

A ROM compression method for continuous data.
Proceedings of the IEEE 2002 Custom Integrated Circuits Conference, 2002

2001
A hardware cost minimized fast Phong shader.
IEEE Trans. Very Large Scale Integr. Syst., 2001

An advanced contrast enhancement using partially overlapped sub-block histogram equalization.
IEEE Trans. Circuits Syst. Video Technol., 2001

A high-speed pattern decoder in MPEG-4 padding block hardware accelerator.
Proceedings of the 2001 International Symposium on Circuits and Systems, 2001

A low power carry select adder with reduced area.
Proceedings of the 2001 International Symposium on Circuits and Systems, 2001

Design trade-off in merged DRAM logic for video signal processing.
Proceedings of the 2001 International Symposium on Circuits and Systems, 2001

SPAF: Sub-texel Precision Anisotropic Filtering.
Proceedings of the 2001 ACM SIGGRAPH/EUROGRAPHICS Workshop on Graphics Hardware, 2001

2000
A real-time wavelet vector quantization algorithm and its VLSI architecture.
IEEE Trans. Circuits Syst. Video Technol., 2000

A programmable 3.2-GOPS merged DRAM logic for video signal processing.
IEEE Trans. Circuits Syst. Video Technol., 2000

Comments on "New dynamic flip-flops for high-speed dual-modulus prescaler".
IEEE J. Solid State Circuits, 2000

SPARP: a single pass antialiased rasterization processor.
Comput. Graph., 2000

Single-Pass Full-Screen Hardware Accelerated Antialiasing.
Proceedings of the 2000 ACM SIGGRAPH/EUROGRAPHICS Workshop on Graphics Hardware, 2000

A Memory Architecture with 4-Address Configurations for Video Signal Processing.
Proceedings of the 2000 Design, 2000

1998
A minimized hardware architecture of fast Phong shader using Taylor series approximation in 3D graphics.
Proceedings of the International Conference on Computer Design: VLSI in Computers and Processors, 1998

1997
A new 4-2 adder and booth selector for low power MAC unit.
Proceedings of the 1997 International Symposium on Low Power Electronics and Design, 1997

1994
A 200 MHz 13 mm² 2-D DCT macrocell using sense-amplifying pipeline flip-flop scheme.
IEEE J. Solid State Circuits, December, 1994

1989
Modeling of the distributed gate RC effect in MOSFET's.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 1989

