Hyesoon Kim
Orcid: 0000-0002-6061-7825
According to our database, Hyesoon Kim authored at least 164 papers between 2004 and 2024.
Bibliography
2024
ACM Trans. Design Autom. Electr. Syst., 2024
Quantifying CO₂ Emission Reduction Through Spatial Partitioning in Deep Learning Recommendation System Workloads.
IEEE Micro, 2024
Unleashing CPU Potential for Executing GPU Programs Through Compiler/Runtime Optimizations.
Proceedings of the 57th IEEE/ACM International Symposium on Microarchitecture, 2024
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
Comparative Analysis of Executing GPU Applications on FPGA: HLS vs. Soft GPU Approaches.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the IEEE International Symposium on Workload Characterization, 2024
Proceedings of the 36th IEEE Hot Chips Symposium, 2024
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2024
Proceedings of the 33rd ACM SIGPLAN International Conference on Compiler Construction, 2024
2023
Proc. VLDB Endow., November, 2023
IEEE Comput. Archit. Lett., 2023
IEEE Comput. Archit. Lett., 2023
Proceedings of the Practice and Experience in Advanced Research Computing, 2023
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
Proceedings of the 28th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2023
Proceedings of the International Symposium on Memory Systems, 2023
Proceedings of the IEEE International Symposium on Multimedia, 2023
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Spica: Exploring FPGA Optimizations to Enable an Efficient SpMV Implementation for Computations at Edge.
Proceedings of the IEEE International Conference on Edge Computing and Communications, 2023
Proceedings of the IEEE International Conference on Edge Computing and Communications, 2023
Reducing Inference Latency with Concurrent Architectures for Image Recognition at Edge.
Proceedings of the IEEE International Conference on Edge Computing and Communications, 2023
Proceedings of the IEEE International Conference on Edge Computing and Communications, 2023
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
2022
ACM Trans. Archit. Code Optim., 2022
Proceedings of the SIGMOD '22: International Conference on Management of Data, Philadelphia, PA, USA, June 12, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
Proceedings of the 2022 IEEE Hot Chips 34 Symposium, 2022
Proceedings of the 32nd International Conference on Field-Programmable Logic and Applications, 2022
2021
Efficiently Solving Partial Differential Equations in a Partially Reconfigurable Specialized Hardware.
IEEE Trans. Computers, 2021
Creating Robust Deep Neural Networks With Coded Distributed Computing for IoT Systems.
CoRR, 2021
THIA: Accelerating Video Analytics using Early Inference and Fine-Grained Query Planning.
CoRR, 2021
IEEE Comput. Archit. Lett., 2021
Proceedings of the MICRO '21: 54th Annual IEEE/ACM International Symposium on Microarchitecture, 2021
Copernicus: Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads.
Proceedings of the IEEE International Symposium on Workload Characterization, 2021
FAFNIR: Accelerating Sparse Gathering by Using Efficient Near-Memory Intelligent Reduction.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2021
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
Proceedings of the ASPLOS '21: 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021
2020
Toward Collaborative Inferencing of Deep Neural Networks on Internet-of-Things Devices.
IEEE Internet Things J., 2020
Copernicus: Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads.
CoRR, 2020
Secure Location-Aware Authentication and Communication for Intelligent Transportation Systems.
CoRR, 2020
CoRR, 2020
Edge-Tailored Perception: Fast Inferencing in-the-Edge with Efficient Model Distribution.
CoRR, 2020
Proceedings of the 53rd Annual IEEE/ACM International Symposium on Microarchitecture, 2020
Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020
Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020
Proceedings of the MEMSYS 2020: The International Symposium on Memory Systems, 2020
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2020
Proceedings of the 38th IEEE International Conference on Computer Design, 2020
Proceedings of the IEEE International Symposium on High Performance Computer Architecture, 2020
Proceedings of the 30th International Conference on Field-Programmable Logic and Applications, 2020
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020
Proceedings of the 28th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2020
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020
Proceedings of the 2020 Design, Automation & Test in Europe Conference & Exhibition, 2020
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
Proceedings of the ASPLOS '20: Architectural Support for Programming Languages and Operating Systems, 2020
2019
IEEE Micro, 2019
J. Parallel Distributed Comput., 2019
CoRR, 2019
CoRR, 2019
Characterizing the Execution of Deep Neural Networks on Collaborative Robots and Edge Devices.
Proceedings of the Practice and Experience in Advanced Research Computing on Rise of the Machines (learning), 2019
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2019
Proceedings of the IEEE International Symposium on Workload Characterization, 2019
Capella: Customizing Perception for Edge Devices by Efficiently Allocating FPGAs to DNNs.
Proceedings of the 29th International Conference on Field Programmable Logic and Applications, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Proceedings of the 56th Annual Design Automation Conference 2019, 2019
Proceedings of the International Conference on Hardware/Software Codesign and System Synthesis Companion, 2019
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2019
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019
2018
StaleLearn: Learning Acceleration with Asynchronous Synchronization Between Model Replicas on PIM.
IEEE Trans. Computers, 2018
ACM Trans. Archit. Code Optim., 2018
CoRR, 2018
Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Proceedings of the 1st on Reproducible Quality-Efficient Systems Tournament on Co-designing Pareto-efficient Deep Learning, 2018
2017
CAIRO: A Compiler-Assisted Technique for Enabling Instruction-Level Offloading of Processing-In-Memory.
ACM Trans. Archit. Code Optim., 2017
J. Parallel Distributed Comput., 2017
CoRR, 2017
Proceedings of the 26th USENIX Security Symposium, 2017
Proceedings of the International Symposium on Memory Systems, 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
Demystifying the characteristics of 3D-stacked memories: A case study for Hybrid Memory Cube.
Proceedings of the 2017 IEEE International Symposium on Workload Characterization, 2017
Proceedings of the 2017 IEEE International Symposium on High Performance Computer Architecture, 2017
2016
Proceedings of the Second International Symposium on Memory Systems, 2016
2015
IEEE Trans. Computers, 2015
Block-Precise Processors: Low-Power Processors with Reduced Operand Store Accesses and Result Broadcasts.
IEEE Trans. Computers, 2015
IEEE Micro, 2015
IEEE Comput. Archit. Lett., 2015
Proceedings of the International Conference for High Performance Computing, 2015
Proceedings of the 2015 International Symposium on Memory Systems, 2015
Proceedings of the 2015 International Symposium on Memory Systems, 2015
Proceedings of the 2015 International Symposium on Memory Systems, 2015
BSSync: Processing Near Memory for Machine Learning Workloads with Bounded Staleness Consistency Models.
Proceedings of the 2015 International Conference on Parallel Architectures and Compilation, 2015
2014
ACM Trans. Design Autom. Electr. Syst., 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014
Proceedings of the 22nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2014
2013
Adaptive virtual channel partitioning for network-on-chip in heterogeneous architectures.
ACM Trans. Design Autom. Electr. Syst., 2013
IEEE Trans. Computers, 2013
Design space exploration of on-chip ring interconnection for a CPU-GPU heterogeneous architecture.
J. Parallel Distributed Comput., 2013
SESH Framework: A Space Exploration Framework for GPU Application and Hardware Codesign.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
2012
Performance Analysis and Tuning for General Purpose Graphics Processing Units (GPGPU)
Synthesis Lectures on Computer Architecture, Morgan & Claypool Publishers, ISBN: 978-3-031-01737-7, 2012
ACM Trans. Archit. Code Optim., 2012
IEEE Comput. Archit. Lett., 2012
A performance analysis framework for identifying potential benefits in GPGPU applications.
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012
Proceedings of the 2012 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '12, 2012
Proceedings of the 2012 ACM SIGPLAN workshop on Memory Systems Performance and Correctness: held in conjunction with PLDI '12, 2012
Proceedings of the 45th Annual IEEE/ACM International Symposium on Microarchitecture, 2012
Proceedings of the 39th International Symposium on Computer Architecture (ISCA 2012), 2012
Predicting Potential Speedup of Serial Code via Lightweight Profiling and Emulations with Memory Performance Model.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012
Proceedings of the 18th IEEE International Symposium on High Performance Computer Architecture, 2012
2010
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
Proceedings of the 43rd Annual IEEE/ACM International Symposium on Microarchitecture, 2010
Proceedings of the 37th International Symposium on Computer Architecture (ISCA 2010), 2010
Proceedings of the 2010 International Conference on Compilers, 2010
2009
Virtual Program Counter (VPC) Prediction: Very Low Cost Indirect Branch Prediction Using Conditional Branch Prediction Hardware.
IEEE Trans. Computers, 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping.
Proceedings of the 42nd Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-42 2009), 2009
An analytical model for a GPU architecture with memory-level and thread-level parallelism awareness.
Proceedings of the 36th International Symposium on Computer Architecture (ISCA 2009), 2009
2008
Proceedings of the 26th International Conference on Computer Design, 2008
Proceedings of the 14th International Conference on High-Performance Computer Architecture (HPCA-14 2008), 2008
Improving the performance of object-oriented languages with dynamic predication of indirect jumps.
Proceedings of the 13th International Conference on Architectural Support for Programming Languages and Operating Systems, 2008
2007
IEEE Micro, 2007
VPC prediction: reducing the cost of indirect branches via hardware-based dynamic devirtualization.
Proceedings of the 34th International Symposium on Computer Architecture (ISCA 2007), 2007
Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers.
Proceedings of the 13th International Conference on High-Performance Computer Architecture (HPCA-13 2007), 2007
Profile-assisted Compiler Support for Dynamic Predication in Diverge-Merge Processors.
Proceedings of the Fifth International Symposium on Code Generation and Optimization (CGO 2007), 2007
2006
Address-Value Delta (AVD) Prediction: A Hardware Technique for Efficiently Parallelizing Dependent Cache Misses.
IEEE Trans. Computers, 2006
IEEE Micro, 2006
IEEE Micro, 2006
Diverge-Merge Processor (DMP): Dynamic Predicated Execution of Complex Control-Flow Graphs Based on Frequently Executed Paths.
Proceedings of the 39th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-39 2006), 2006
Proceedings of the Fourth IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2006), 2006
2005
An Analysis of the Performance Impact of Wrong-Path Memory References on Out-of-Order and Runahead Execution Processors.
IEEE Trans. Computers, 2005
Using the First-Level Caches as Filters to Reduce the Pollution Caused by Speculative Memory References.
Int. J. Parallel Program., 2005
On Reusing the Results of Pre-Executed Instructions in a Runahead Execution Processor.
IEEE Comput. Archit. Lett., 2005
Address-Value Delta (AVD) Prediction: Increasing the Effectiveness of Runahead Execution by Exploiting Regular Memory Allocation Patterns.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005
Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution.
Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-38 2005), 2005
Proceedings of the 32nd International Symposium on Computer Architecture (ISCA 2005), 2005
2004
Proceedings of the 3rd Workshop on Memory Performance Issues, 2004
Cache Filtering Techniques to Reduce the Negative Impact of Useless Speculative Memory References on Processor Performance.
Proceedings of the 16th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2004), 2004
Wrong Path Events: Exploiting Unusual and Illegal Program Behavior for Early Misprediction Detection and Recovery.
Proceedings of the 37th Annual International Symposium on Microarchitecture (MICRO-37 2004), 2004