Youngsok Kim
Orcid: 0000-0002-1015-9969
According to our database, Youngsok Kim authored at least 45 papers between 2014 and 2025.
Bibliography
2025
IEEE Comput. Archit. Lett., 2025
2024
SPID-Join: A Skew-resistant Processing-in-DIMM Join Algorithm Exploiting the Bank- and Rank-level Parallelisms of DIMMs.
Proc. ACM Manag. Data, December, 2024
GCStack: A GPU Cycle Accounting Mechanism for Providing Accurate Insight Into GPU Performance.
IEEE Comput. Archit. Lett., 2024
IEEE Access, 2024
AGAThA: Fast and Efficient GPU Acceleration of Guided Sequence Alignment for Long Read Mapping.
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
Proceedings of the 25th ACM SIGPLAN/SIGBED International Conference on Languages, 2024
PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices.
Proceedings of the 51st ACM/IEEE Annual International Symposium on Computer Architecture, 2024
CR2: Community-aware Compressed Regular Representation for Graph Processing on a GPU.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2024
MPC-Wrapper: Fully Harnessing the Potential of Samsung Aquabolt-XL HBM2-PIM on FPGAs.
Proceedings of the 32nd IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2024
GraNNDis: Fast Distributed Graph Neural Network Training Framework for Multi-Server Clusters.
Proceedings of the 2024 International Conference on Parallel Architectures and Compilation Techniques, 2024
2023
Enabling Fine-Grained Spatial Multitasking on Systolic-Array NPUs Using Dataflow Mirroring.
IEEE Trans. Computers, December, 2023
Design and Analysis of a Processing-in-DIMM Join Algorithm: A Case Study with UPMEM DIMMs.
Proc. ACM Manag. Data, 2023
GraNNDis: Efficient Unified Distributed Training Framework for Deep GNNs on Large Clusters.
CoRR, 2023
Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, 2023
SGCN: Exploiting Compressed-Sparse Features in Deep Graph Convolutional Network Accelerators.
Proceedings of the IEEE International Symposium on High-Performance Computer Architecture, 2023
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2023
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Optimus-CC: Efficient Large NLP Model Training with 3D Parallelism Aware Communication Compression.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023
Virtual PIM: Resource-Aware Dynamic DPU Allocation and Workload Scheduling Framework for Multi-DPU PIM Architecture.
Proceedings of the 32nd International Conference on Parallel Architectures and Compilation Techniques, 2023
2022
GuardiaNN: Fast and Secure On-Device Inference in TrustZone Using Embedded SRAM and Cryptographic Hardware.
Proceedings of the Middleware '22: 23rd International Middleware Conference, Quebec, QC, Canada, November 7, 2022
Proceedings of the ISCA '22: The 49th Annual International Symposium on Computer Architecture, New York, New York, USA, June 18, 2022
SALoBa: Maximizing Data Locality and Workload Balance for Fast Sequence Alignment on GPUs.
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Enabling hard constraints in differentiable neural network and accelerator co-exploration.
Proceedings of the DAC '22: 59th ACM/IEEE Design Automation Conference, San Francisco, California, USA, July 10, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Slice-and-Forge: Making Better Use of Caches for Graph Convolutional Network Accelerators.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022
Decoupling Schedule, Topology Layout, and Algorithm to Easily Enlarge the Tuning Space of GPU Graph Processing.
Proceedings of the International Conference on Parallel Architectures and Compilation Techniques, 2022
2021
Making a Better Use of Caches for GCN Accelerators with Feature Slicing and Automatic Tile Morphing.
IEEE Comput. Archit. Lett., 2021
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Dataflow Mirroring: Architectural Support for Highly Efficient Fine-Grained Spatial Multitasking on Systolic-Array NPUs.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization, 2021
2020
Proceedings of the IEEE Real-Time and Embedded Technology and Applications Symposium, 2020
2019
FlexLearn: Fast and Highly Efficient Brain Simulations Using Flexible On-Chip Learning.
Proceedings of the 52nd Annual IEEE/ACM International Symposium on Microarchitecture, 2019
μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization.
Proceedings of the Fourteenth EuroSys Conference 2019, Dresden, Germany, March 25-28, 2019, 2019
2018
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
DCS-ctrl: A Fast and Flexible Device-Control Mechanism for Device-Centric Server Architecture.
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, 2018
2017
GPUpd: a fast and scalable multi-GPU architecture using cooperative projection and distribution.
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
2016
Proceedings of the 2016 IEEE International Symposium on High Performance Computer Architecture, 2016
Proceedings of the IEEE/ACM 16th International Symposium on Cluster, 2016
2015
Proceedings of the 48th International Symposium on Microarchitecture, 2015
2014
IEEE Comput. Archit. Lett., 2014
Proceedings of the 2014 IEEE Symposium on Security and Privacy, 2014
GPUdmm: A high-performance and memory-oblivious GPU architecture using dynamic memory management.
Proceedings of the 20th IEEE International Symposium on High Performance Computer Architecture, 2014