Ivy Bo Peng

Orcid: 0000-0003-4158-3583

Affiliations:
  • KTH Royal Institute of Technology, Sweden


According to our database, Ivy Bo Peng authored at least 70 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Disaggregated Memory with SmartNIC Offloading: a Case Study on Graph Processing.
CoRR, 2024

Multi-level Memory-Centric Profiling on ARM Processors with ARM SPE.
CoRR, 2024

Understanding Data Movement in AMD Multi-GPU Systems with Infinity Fabric.
CoRR, 2024

Characterizing the Performance of the Implicit Massively Parallel Particle-in-Cell iPIC3D Code.
CoRR, 2024

Understanding Layered Portability from HPC to Cloud in Containerized Environments.
CoRR, 2024

On the Rise of AMD Matrix Cores: Performance, Power Efficiency, and Programmability.
Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software, 2024

Integration of Modern HPC Performance Tools in Vlasiator for Exascale Analysis and Optimization.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Harnessing Integrated CPU-GPU System Memory for HPC: a first look into Grace Hopper.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

Beyond the Buzz: Strategic Paths for Enabling Useful NISQ Applications.
Proceedings of the 21st ACM International Conference on Computing Frontiers, 2024

2023
Quantum Computer Simulations at Warp Speed: Assessing the Impact of GPU Acceleration.
CoRR, 2023

Leveraging HPC Profiling & Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations.
CoRR, 2023

Perspectives on AI Architectures and Co-design for Earth System Predictability.
CoRR, 2023

HM-Keeper: Scalable Page Management for Multi-Tiered Large Memory Systems.
CoRR, 2023

A GPU-Accelerated Molecular Docking Workflow with Kubernetes and Apache Airflow.
Proceedings of the High Performance Computing, 2023

A Quantitative Approach for Adopting Disaggregated Memory in HPC Systems.
Proceedings of the International Conference for High Performance Computing, 2023

Accelerator integration in a tile-based SoC: lessons learned with a hardware floating point compression engine.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023

Kub: Enabling Elastic HPC Workloads on Containerized Environments.
Proceedings of the 35th IEEE International Symposium on Computer Architecture and High Performance Computing, 2023

LibCOS: Enabling Converged HPC and Cloud Data Stores with MPI.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2023

Leveraging HPC Profiling and Tracing Tools to Understand the Performance of Particle-in-Cell Monte Carlo Simulations.
Proceedings of the Euro-Par 2023: Parallel Processing Workshops - Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28, 2023

Boosting the Performance of Object Tracking with a Half-Precision Particle Filter on GPU.
Proceedings of the Euro-Par 2023: Parallel Processing Workshops - Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28, 2023

Accelerating Drug Discovery in AutoDock-GPU with Tensor Cores.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28, 2023

OpenCUBE: Building an Open Source Cloud Blueprint with EPI Systems.
Proceedings of the Euro-Par 2023: Parallel Processing Workshops - Euro-Par 2023 International Workshops, Limassol, Cyprus, August 28, 2023

Quantum Computer Simulations at Warp Speed: Assessing the Impact of GPU Acceleration: A Case Study with IBM Qiskit Aer, Nvidia Thrust & cuQuantum.
Proceedings of the 19th IEEE International Conference on e-Science, 2023


2022
Enabling Scalable and Extensible Memory-Mapped Datastores in Userspace.
IEEE Trans. Parallel Distributed Syst., 2022

FPGA-accelerated simulation of variable latency memory systems.
Proceedings of the 2022 International Symposium on Memory Systems, 2022

Evaluating Emerging CXL-enabled Memory Pooling for HPC Systems.
Proceedings of the IEEE/ACM Workshop on Memory Centric High Performance Computing, 2022

2021
A Holistic View of Memory Utilization on HPC Systems: Current and Future Trends.
Proceedings of the MEMSYS 2021: The International Symposium on Memory Systems, Washington, USA, September 27, 2021

MD-HM: memoization-based molecular dynamics simulations on big memory system.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

Optimizing large-scale plasma simulations on persistent memory-based heterogeneous memory with effective data placement across memory hierarchy.
Proceedings of the ICS '21: 2021 International Conference on Supercomputing, 2021

StreamBrain: An HPC Framework for Brain-like Neural Networks on CPUs, GPUs and FPGAs.
Proceedings of the HEART '21: 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2021

ArchTM: Architecture-Aware, High Performance Transaction for Persistent Memory.
Proceedings of the 19th USENIX Conference on File and Storage Technologies, 2021

2020
Automatic Particle Trajectory Classification in Plasma Simulations.
Proceedings of the 6th IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments, 2020

On the Memory Underutilization: Exploring Disaggregated Memory on HPC Systems.
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020

sputniPIC: An Implicit Particle-in-Cell Code for Multi-GPU Systems.
Proceedings of the 32nd IEEE International Symposium on Computer Architecture and High Performance Computing, 2020

Demystifying the Performance of HPC Scientific Applications on NVM-based Memory Systems.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

ATMem: adaptive data placement in graph applications on heterogeneous memories.
Proceedings of the CGO '20: 18th ACM/IEEE International Symposium on Code Generation and Optimization, 2020

Ribbon: High Performance Cache Line Flushing for Persistent Memory.
Proceedings of the PACT '20: International Conference on Parallel Architectures and Compilation Techniques, 2020

2019
SAGE: Percipient Storage for Exascale Data Centric Computing.
Parallel Comput., 2019

UMap: Enabling Application-driven Optimizations for Page Management.
Proceedings of the 2019 IEEE/ACM Workshop on Memory Centric High Performance Computing, 2019

Performance Evaluation of Advanced Features in CUDA Unified Memory.
Proceedings of the 2019 IEEE/ACM Workshop on Memory Centric High Performance Computing, 2019

Posit NPB: Assessing the Precision Improvement in HPC Scientific Applications.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

System evaluation of the Intel Optane byte-addressable NVM.
Proceedings of the International Symposium on Memory Systems, 2019

Analyzing the suitability of contemporary 3D-stacked PIM architectures for HPC scientific applications.
Proceedings of the 16th ACM International Conference on Computing Frontiers, 2019

2018
MPI windows on storage for HPC applications.
Parallel Comput., 2018

Characterizing the performance benefit of hybrid memory system for HPC applications.
Parallel Comput., 2018

The SAGE Project: a Storage Centric Approach for Exascale Computing.
CoRR, 2018

Siena: exploring the design space of heterogeneous memory systems.
Proceedings of the International Conference for High Performance Computing, 2018

NVIDIA Tensor Core Programmability, Performance & Precision.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Tuyere: enabling scalable memory workloads for system exploration.
Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, 2018

The SAGE project: a storage centric approach for exascale computing: invited paper.
Proceedings of the 15th ACM International Conference on Computing Frontiers, 2018

Understanding Scale-Dependent Soft-Error Behavior of Scientific Applications.
Proceedings of the 18th IEEE/ACM International Symposium on Cluster, 2018

2017
Data Movement on Emerging Large-Scale Parallel Systems.
PhD thesis, 2017

Efficient alarm behavior analytics for telecom networks.
Inf. Sci., 2017

MPI Streams for HPC Applications.
CoRR, 2017

Exploring the Performance Benefit of Hybrid Memory System on HPC Environments.
CoRR, 2017

RTHMS: a tool for data placement on hybrid memory system.
Proceedings of the 2017 ACM SIGPLAN International Symposium on Memory Management, 2017

Exploring the Performance Benefit of Hybrid Memory System on HPC Environments.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Preparing HPC Applications for the Exascale Era: A Decoupling Strategy.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Extending Message Passing Interface Windows to Storage.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016
The EPiGRAM Project: Preparing Parallel Programming Models for Exascale.
Proceedings of the High Performance Computing, 2016

A Performance Characterization of Streaming Computing on Supercomputers.
Proceedings of the International Conference on Computational Science 2016, 2016

Idle Period Propagation in Message-Passing Applications.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

Exploring Application Performance on Emerging Hybrid-Memory Supercomputers.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016

2015
A data streaming model in MPI.
Proceedings of the 3rd Workshop on Exascale MPI, 2015

Spectral Solver for Multi-scale Plasma Physics Simulations with Dynamically Adaptive Number of Moments.
Proceedings of the International Conference on Computational Science, 2015

The Formation of a Magnetosphere with Implicit Particle-in-Cell Simulations.
Proceedings of the International Conference on Computational Science, 2015

The Cost of Synchronizing Imbalanced Processes in Message Passing Systems.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Evaluation of Parallel Communication Models in Nekbone, a Nek5000 Mini-Application.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

