Jae W. Lee

ORCID: 0000-0002-4266-4919

Affiliations:
  • Seoul National University, College of Engineering, Korea
  • MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, USA (former)


According to our database, Jae W. Lee authored at least 93 papers between 2002 and 2024.

Bibliography

2024
An LSM Tree Augmented with B+ Tree on Nonvolatile Memory.
ACM Trans. Storage, 2024

A Quantitative Analysis of State Space Model-Based Large Language Model: Study of Hungry Hungry Hippos.
IEEE Comput. Archit. Lett., 2024

Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
A 4-bit 4.5-ns-Latency Pseudo-ReRAM Computing-In-Memory Macro With Self Error-Correcting DTC-Based WL Drivers and 6-bit CDAC-Less Column ADCs Having Ultra-Narrow Pitch.
IEEE Trans. Circuits Syst. II Express Briefs, September, 2023

MaPHeA: A Framework for Lightweight Memory Hierarchy-aware Profile-guided Heap Allocation.
ACM Trans. Embed. Comput. Syst., 2023

WALTZ: Leveraging Zone Append to Tighten the Tail Latency of LSM Tree on ZNS SSD.
Proc. VLDB Endow., 2023

Special Issue on Top Picks From the 2022 Computer Architecture Conferences.
IEEE Micro, 2023

DRAM Translation Layer: Software-Transparent DRAM Power Savings for Disaggregated Memory.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023

FlowKV: A Semantic-Aware Store for Large-Scale State Management of Stream Processing Engines.
Proceedings of the Eighteenth European Conference on Computer Systems, 2023

A Memory-Efficient Edge Inference Accelerator with XOR-based Model Compression.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023

Liquid: Mix-and-Match Multiple Image Formats to Balance DNN Training Pipeline.
Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems, 2023

Not All Neighbors Matter: Point Distribution-Aware Pruning for 3D Point Cloud.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
An Energy-Efficient DRAM Cache Architecture for Mobile Platforms With PCM-Based Main Memory.
ACM Trans. Embed. Comput. Syst., 2022

Architecting a Flash-Based Storage System for Low-Cost Inference of Extreme-Scale DNNs.
IEEE Trans. Computers, 2022

Ginex: SSD-enabled Billion-scale Graph Neural Network Training on a Single Machine via Provably Optimal In-memory Caching.
Proc. VLDB Endow., 2022

Layerweaver+: A QoS-Aware Layer-Wise DNN Scheduler for Multi-Tenant Neural Processing Units.
IEICE Trans. Inf. Syst., 2022

ULPPACK: Fast Sub-8-bit Matrix Multiply on Commodity SIMD Hardware.
Proceedings of the Fifth Conference on Machine Learning and Systems, 2022

ANNA: Specialized Architecture for Approximate Nearest Neighbor Search.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

Mithril: Cooperative Row Hammer Protection on Commodity DRAM Leveraging Managed Refresh.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022

A 40nm 5.6TOPS/W 239GOPS/mm² Self-Attention Processor with Sign Random Projection-based Approximation.
Proceedings of the 48th IEEE European Solid State Circuits Conference, 2022

A Hardware Automated Domain-Specific Flash Memory System for Emerging Applications.
Proceedings of the International Conference on Electronics, Information, and Communication, 2022

L3: Accelerator-Friendly Lossless Image Format for High-Resolution, High-Throughput DNN Training.
Proceedings of the Computer Vision - ECCV 2022, 2022

Effective zero compression on ReRAM-based sparse DNN accelerators.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022

2021
Accelerating Genomic Data Analytics With Composable Hardware Acceleration Framework.
IEEE Micro, 2021

An 8-bit Ring-Amplifier Based Mixed-Signal MAC Circuit With Full Digital Interface and Variable Accumulation Length.
IEEE Access, 2021

ASAP: Fast Mobile Application Switch via Adaptive Prepaging.
Proceedings of the 2021 USENIX Annual Technical Conference, 2021

MaPHeA: a lightweight memory hierarchy-aware profile-guided heap allocation framework.
Proceedings of the LCTES '21: 22nd ACM SIGPLAN/SIGBED International Conference on Languages, 2021

ELSA: Hardware-Software Co-design for Efficient, Lightweight Self-Attention Mechanism in Neural Networks.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

BOSS: Bandwidth-Optimized Search Accelerator for Storage-Class Memory.
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021

Layerweaver: Maximizing Resource Utilization of Neural Processing Units via Layer-Wise Scheduling.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021

Behemoth: A Flash-centric Training Accelerator for Extreme-scale DNNs.
Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

FlashNeuron: SSD-Enabled Large-Batch Training of Very Deep Neural Networks.
Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

Message from the General Chair.
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021

MERCI: efficient embedding reduction on commodity hardware via sub-query memoization.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021

2020
Graphene: Strong yet Lightweight Row Hammer Protection.
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020

A Case for Hardware-Based Demand Paging.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

A Specialized Architecture for Object Serialization with Applications to Big Data Analytics.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Genesis: A Hardware Acceleration Framework for Genomic Data Analysis.
Proceedings of the 47th ACM/IEEE Annual International Symposium on Computer Architecture, 2020

Unlocking Wordline-level Parallelism for Fast Inference on RRAM-based DNN Accelerator.
Proceedings of the IEEE/ACM International Conference On Computer Aided Design, 2020

A³: Accelerating Attention Mechanisms in Neural Networks with Approximation.
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020

IIU: Specialized Architecture for Inverted Index Search.
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020

2019
SSDStreamer: Specializing I/O Stack for Large-Scale Machine Learning.
IEEE Micro, 2019

Eager Memory Management for In-Memory Data Analytics.
IEICE Trans. Inf. Syst., 2019

Asynchronous I/O Stack: A Low-latency Kernel I/O Stack for Ultra-Low Latency SSDs.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Practical Erase Suspension for Modern Low-latency SSDs.
Proceedings of the 2019 USENIX Annual Technical Conference, 2019

Charon: Specialized Near-Memory Processing Architecture for Clearing Dead Objects in Memory.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019

Enforcing Last-Level Cache Partitioning through Memory Virtual Channels.
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019

2018
Erratum: Energy-efficient heterogeneous memory system for mobile platforms [IEICE Electronics Express Vol. 14 (2017) No. 24 pp. 20171002].
IEICE Electron. Express, 2018

Bandwidth-aware DRAM page migration for heterogeneous mobile memory systems.
Proceedings of the IEEE International Conference on Consumer Electronics, 2018

A portable, automatic data quantizer for deep neural networks.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018

2017
On the Performance of Beam Division Nonorthogonal Multiple Access for FDD-Based Large-Scale Multi-User MIMO Systems.
IEEE Trans. Wirel. Commun., 2017

Energy-efficient heterogeneous memory system for mobile platforms.
IEICE Electron. Express, 2017

DRAM architecture for efficient data lifetime management.
IEICE Electron. Express, 2017

Evaluation of Performance Unfairness in NUMA System Architecture.
IEEE Comput. Archit. Lett., 2017

SALAD: Achieving Symmetric Access Latency with Asymmetric DRAM Architecture.
IEEE Comput. Archit. Lett., 2017

SOUP-N-SALAD: Allocation-Oblivious Access Latency Reduction with Asymmetric DRAM Microarchitectures.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017

Context-Aware Memory Profiling for Speculative Parallelism.
Proceedings of the 24th IEEE International Conference on High Performance Computing, 2017

Constructive Multi-User Interference for Symbol-Level Link Adaptation: MMSE Approach.
Proceedings of the 2017 IEEE Globecom Workshops, Singapore, December 4-8, 2017, 2017

Jointly optimizing task granularity and concurrency for in-memory mapreduce frameworks.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

History-Based Arbitration for Fairness in Processor-Interconnect of NUMA Servers.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

Typed Architectures: Architectural Support for Lightweight Scripting.
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017

2016
Workload-Aware Optimal Power Allocation on Single-Chip Heterogeneous Processors.
IEEE Trans. Parallel Distributed Syst., 2016

An eDRAM-Based Approximate Register File for GPUs.
IEEE Des. Test, 2016

Short-Circuit Dispatch: Accelerating Virtual Machine Interpreters on Embedded Processors.
Proceedings of the 43rd ACM/IEEE Annual International Symposium on Computer Architecture, 2016

Efficient footprint caching for Tagless DRAM Caches.
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016

Speculatively Exploiting Cross-Invocation Parallelism.
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016

2015
A neural network accelerator for mobile application processors.
IEEE Trans. Consumer Electron., 2015

JAWS: a JavaScript framework for adaptive CPU-GPU work sharing.
Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2015

A fully associative, tagless DRAM cache.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015

2014
Low-Overhead Network-on-Chip Support for Location-Oblivious Task Placement.
IEEE Trans. Computers, 2014

Efficient CPU-GPU work sharing for data-parallel JavaScript workloads.
Proceedings of the 23rd International World Wide Web Conference, 2014

Microbank: Architecting Through-Silicon Interposer-Based Main Memory Systems.
Proceedings of the International Conference for High Performance Computing, 2014

QPR.js: a runtime framework for QoS-aware power optimization for parallel JavaScript programs.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

eDRAM-based tiered-reliability memory with applications to low-power frame buffers.
Proceedings of the International Symposium on Low Power Electronics and Design, 2014

Security Vulnerability in Processor-Interconnect Router Design.
Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, 2014

2013
Practical speculative parallelization of variable-length decompression algorithms.
Proceedings of the SIGPLAN/SIGBED Conference on Languages, 2013

Reducing memory access latency with asymmetric DRAM bank organizations.
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013

Practical automatic loop specialization.
Proceedings of the Architectural Support for Programming Languages and Operating Systems, 2013

2012
Globally Synchronized Frames for guaranteed quality-of-service in on-chip networks.
J. Parallel Distributed Comput., 2012

DAFT: Decoupled Acyclic Fault Tolerance.
Int. J. Parallel Program., 2012

Parcae: a system for flexible parallel execution.
Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation, 2012

Runtime asynchronous fault tolerance via speculation.
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

Automatic speculative DOALL for clusters.
Proceedings of the 10th Annual IEEE/ACM International Symposium on Code Generation and Optimization, 2012

From sequential programming to flexible parallel execution.
Proceedings of the 15th International Conference on Compilers, 2012

2011
Parallelism orchestration using DoPE: the degree of parallelism executive.
Proceedings of the 32nd ACM SIGPLAN Conference on Programming Language Design and Implementation, 2011

2010
Probabilistic Distance-Based Arbitration: Providing Equality of Service for Many-Core CMPs.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Scalable Speculative Parallelization on Commodity Clusters.
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010

Approximating age-based arbitration in on-chip networks.
Proceedings of the 19th International Conference on Parallel Architectures and Compilation Techniques, 2010

2007
Continual hashing for efficient fine-grain state inconsistency detection.
Proceedings of the 25th International Conference on Computer Design, 2007

2006
METERG: Measurement-Based End-to-End Performance Estimation Technique in QoS-Capable Multiprocessors.
Proceedings of the 12th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2006), 2006

2005
Extracting secret keys from integrated circuits.
IEEE Trans. Very Large Scale Integr. Syst., 2005

2004
Secure program execution via dynamic information flow tracking.
Proceedings of the 11th International Conference on Architectural Support for Programming Languages and Operating Systems, 2004

2002
The Raw Microprocessor: A Computational Fabric for Software Circuits and General-Purpose Programs.
IEEE Micro, 2002

