Jung Ho Ahn
ORCID: 0000-0003-1733-1394
Affiliations:
- Seoul National University, Korea
- Stanford University, USA (PhD, 2007)
According to our database, Jung Ho Ahn authored at least 123 papers between 2003 and 2024.
Bibliography
2024
IEEE Trans. Computers, October, 2024
Duplex: A Device for Large Language Models with Mixture of Experts, Grouped Query Attention, and Continuous Batching.
CoRR, 2024
HyPHEN: A Hybrid Packing Method and Its Optimizations for Homomorphic Encryption-Based Neural Networks.
IEEE Access, 2024
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
DRAMScope: Uncovering DRAM Microarchitecture and Characteristics by Issuing Memory Commands.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Proceedings of the 38th ACM International Conference on Supercomputing, 2024
IDT: Intelligent Data Placement for Multi-tiered Main Memory with Reinforcement Learning.
Proceedings of the 33rd International Symposium on High-Performance Parallel and Distributed Computing, 2024
An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
AttAcc! Unleashing the Power of PIM for Batched Transformer-based Generative Model Inference.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
2023
MaPHeA: A Framework for Lightweight Memory Hierarchy-aware Profile-guided Heap Allocation.
ACM Trans. Embed. Comput. Syst., 2023
High-precision RNS-CKKS on fixed but smaller word-size architectures: theory and application.
IACR Cryptol. ePrint Arch., 2023
NeuJeans: Private Neural Network Inference with Joint Optimization of Convolution and Bootstrapping.
CoRR, 2023
Toward Practical Privacy-Preserving Convolutional Neural Networks Exploiting Fully Homomorphic Encryption.
CoRR, 2023
CoRR, 2023
HyPHEN: A Hybrid Packing Method and Optimizations for Homomorphic Encryption-Based Neural Networks.
CoRR, 2023
X-ray: Discovering DRAM Internal Structure and Error Characteristics by Issuing Memory Commands.
IEEE Comput. Archit. Lett., 2023
IEEE Comput. Archit. Lett., 2023
A Hardware-Friendly Tiled Singular-Value Decomposition-Based Matrix Multiplication for Transformer-Based Models.
IEEE Comput. Archit. Lett., 2023
Unleashing the Potential of PIM: Accelerating Large Batched Inference of Transformer-Based Generative Models.
IEEE Comput. Archit. Lett., 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
SHARP: A Short-Word Hierarchical Accelerator for Robust and Practical Fully Homomorphic Encryption.
Proceedings of the 50th Annual International Symposium on Computer Architecture, 2023
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
2022
MVP: An Efficient CNN Accelerator with Matrix, Vector, and Processing-Near-Memory Units.
ACM Trans. Design Autom. Electr. Syst., 2022
Future Scaling of Memory Hierarchy for Tensor Cores and Eliminating Redundant Shared Memory Traffic Using Inter-Warp Multicasting.
IEEE Trans. Computers, 2022
AESPA: Accuracy Preserving Low-degree Polynomial Activation for Fast Private Inference.
CoRR, 2022
GraNDe: Near-Data Processing Architecture With Adaptive Matrix Mapping for Graph Convolutional Networks.
IEEE Comput. Archit. Lett., 2022
ARK: Fully Homomorphic Encryption Accelerator with Runtime Data Generation and Inter-Operation Key Reuse.
Proceedings of the 55th IEEE/ACM International Symposium on Microarchitecture, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the IEEE International Symposium on Workload Characterization, 2022
Proceedings of the IEEE International Symposium on Workload Characterization, 2022
Mithril: Cooperative Row Hammer Protection on Commodity DRAM Leveraging Managed Refresh.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2022
2021
Over 100x Faster Bootstrapping in Fully Homomorphic Encryption through Memory-centric Optimization with GPUs.
IACR Trans. Cryptogr. Hardw. Embed. Syst., 2021
IEEE Comput. Archit. Lett., 2021
Accelerating Fully Homomorphic Encryption Through Architecture-Centric Analysis and Optimization.
IEEE Access, 2021
TRiM: Enhancing Processor-Memory Interfaces with Scalable Tensor Reduction in Memory.
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
MaPHeA: a lightweight memory hierarchy-aware profile-guided heap allocation framework.
Proceedings of the LCTES '21: 22nd ACM SIGPLAN/SIGBED International Conference on Languages, 2021
Accelerating Fully Homomorphic Encryption Through Microarchitecture-Aware Analysis and Optimization.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2021
BCD deduplication: effective memory compression using partial cache-line deduplication.
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
2020
MViD: Sparse Matrix-Vector Multiplication in Mobile DRAM for Accelerating Recurrent Neural Networks.
IEEE Trans. Computers, 2020
HEAAN Demystified: Accelerating Fully Homomorphic Encryption Through Architecture-centric Analysis and Optimization.
CoRR, 2020
CAT-TWO: Counter-Based Adaptive Tree, Time Window Optimized for DRAM Row-Hammer Prevention.
IEEE Access, 2020
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Accelerating Number Theoretic Transformations for Bootstrappable Homomorphic Encryption on GPUs.
Proceedings of the IEEE International Symposium on Workload Characterization, 2020
2019
Proceedings of the Second Conference on Machine Learning and Systems, SysML 2019, 2019
Proceedings of the 46th International Symposium on Computer Architecture, 2019
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019
2018
IEEE Comput. Archit. Lett., 2018
Partitioning Compute Units in CNN Acceleration for Statistical Memory Traffic Shaping.
IEEE Comput. Archit. Lett., 2018
IEEE Access, 2018
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2018
3D-Xpath: high-density managed DRAM architecture with cost-effective alternative paths for memory transactions.
Proceedings of the 27th International Conference on Parallel Architectures and Compilation Techniques, 2018
2017
IEEE Trans. Very Large Scale Integr. Syst., 2017
Selective DRAM cache bypassing for improving bandwidth on DRAM/NVM hybrid main memory systems.
IEICE Electron. Express, 2017
IEEE Comput. Archit. Lett., 2017
IEEE Comput. Archit. Lett., 2017
Understanding power-performance relationship of energy-efficient modern DRAM devices.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Work as a team or individual: Characterizing the system-level impacts of main memory partitioning.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
SOUP-N-SALAD: Allocation-Oblivious Access Latency Reduction with Asymmetric DRAM Microarchitectures.
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017
2016
Full-Stack Architecting to Achieve a Billion-Requests-Per-Second Throughput on a Single Key-Value Store Server Platform.
ACM Trans. Comput. Syst., 2016
Near-DRAM Acceleration with Single-ISA Heterogeneous Processing in Standard Memory Modules.
IEEE Micro, 2016
IEEE Micro, 2016
IEICE Electron. Express, 2016
IEEE Comput. Archit. Lett., 2016
Chameleon: Versatile and practical near-DRAM acceleration architecture for large memory systems.
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the 34th IEEE International Conference on Computer Design, 2016
Buffered compares: Excavating the hidden parallelism inside DRAM architectures with lightweight logic.
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016
Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, 2016
2015
CIDR: A Cache Inspired Area-Efficient DRAM Resilience Architecture against Permanent Faults.
IEEE Comput. Archit. Lett., 2015
IEEE Comput. Archit. Lett., 2015
Architecting to achieve a billion requests per second throughput on a single key-value store server platform.
Proceedings of the 42nd Annual International Symposium on Computer Architecture, 2015
History-Assisted Adaptive-Granularity Caches (HAAG$) for High Performance 3D DRAM Architectures.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015
Alloy: Parallel-serial memory channel architecture for single-chip heterogeneous processor systems.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
NDA: Near-DRAM acceleration architecture leveraging commodity DRAM devices and standard memory modules.
Proceedings of the 21st IEEE International Symposium on High Performance Computer Architecture, 2015
2014
Proceedings of the International Conference for High Performance Computing, 2014
Proceedings of the ACM/IEEE 41st International Symposium on Computer Architecture, 2014
2013
IEEE Trans. Very Large Scale Integr. Syst., 2013
MAEPER: Matching Access and Error Patterns With Error-Free Resource for Low Vcc L1 Cache.
IEEE Trans. Very Large Scale Integr. Syst., 2013
Mapping and Scheduling of Tasks and Communications on Many-Core SoC Under Local Memory Constraint.
IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., 2013
The McPAT Framework for Multicore and Manycore Architectures: Simultaneously Modeling Power, Area, and Timing.
ACM Trans. Archit. Code Optim., 2013
ACM Trans. Archit. Code Optim., 2013
McSimA+: A manycore simulator with application-level+ simulation and detailed microarchitecture modeling.
Proceedings of the 2013 IEEE International Symposium on Performance Analysis of Systems & Software, 2013
Proceedings of the 40th Annual International Symposium on Computer Architecture, 2013
Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, 2013
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013
2012
ACM Trans. Archit. Code Optim., 2012
Proceedings of the SC Conference on High Performance Computing Networking, 2012
Network within a network approach to create a scalable high-radix router microarchitecture.
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012
Proceedings of the 2012 Design, Automation & Test in Europe Conference & Exhibition, 2012
2011
Proceedings of the International SoC Design Conference, 2011
Proceedings of the 38th International Symposium on Computer Architecture (ISCA 2011), 2011
CACTI-P: Architecture-level modeling for SRAM-based structures with advanced leakage reduction techniques.
Proceedings of the 2011 IEEE/ACM International Conference on Computer-Aided Design, 2011
A quantitative analysis of performance benefits of 3D die stacking on mobile and embedded SoC.
Proceedings of the Design, Automation and Test in Europe, 2011
Matching cache access behavior and bit error pattern for high performance low Vcc L1 cache.
Proceedings of the 48th Design Automation Conference, 2011
Proceedings of the Low Power Networks-on-Chip., 2011
2010
Proceedings of the 2010 International Symposium on Low Power Electronics and Design, 2010
2009
Multicore DIMM: an Energy Efficient Memory Module with Independently Controlled DRAMs.
IEEE Comput. Archit. Lett., 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
2008
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008
A Comprehensive Memory Modeling Tool and Its Application to the Design and Analysis of Future Memory Hierarchies.
Proceedings of the 35th International Symposium on Computer Architecture (ISCA 2008), 2008
Proceedings of the 16th Annual IEEE Symposium on High Performance Interconnects (HOTI 2008), 2008
2007
Proceedings of the 21st Annual International Conference on Supercomputing, 2007
Tradeoff between data-, instruction-, and thread-level parallelism in stream processors.
Proceedings of the 21st Annual International Conference on Supercomputing, 2007
2006
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006
2005
Proceedings of the 11th International Conference on High-Performance Computer Architecture (HPCA-11 2005), 2005
2004
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004
Proceedings of the 31st International Symposium on Computer Architecture (ISCA 2004), 2004
Proceedings of the 10th International Conference on High-Performance Computer Architecture (HPCA-10 2004), 2004
2003
Proceedings of the ACM/IEEE SC2003 Conference on High Performance Networking and Computing, 2003