I-Hsin Chung

Orcid: 0000-0003-4555-9257

Affiliations:
  • IBM Thomas J. Watson Research Center
  • University of Maryland, Department of Computer Science


According to our database, I-Hsin Chung authored at least 77 papers between 2002 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Attention Tracker: Detecting Prompt Injection Attacks in LLMs.
CoRR, 2024

The infrastructure powering IBM's Gen AI model development.
CoRR, 2024

Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness.
CoRR, 2024

Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks.
CoRR, 2024

Overload: Latency Attacks on Object Detection for Edge Devices.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023
CODAG: Characterizing and Optimizing Decompression Algorithms for GPUs.
CoRR, 2023

GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture.
Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2023

Enabling Scalability in the Cloud for Scientific Workflows: An Earth Science Use Case.
Proceedings of the 16th IEEE International Conference on Cloud Computing, 2023

2022
GPU-Initiated On-Demand High-Throughput Storage Access in the BaM System Architecture.
Dataset, October, 2022

BaM: A Case for Enabling Fine-grain High Throughput GPU-Orchestrated Access to Storage.
CoRR, 2022

NVMe Virtualization for Cloud Virtual Machines.
Proceedings of the ICPE '22: ACM/SPEC International Conference on Performance Engineering, Beijing, China, April 9, 2022

A Locality-aware Cooperative Distributed Memory Caching for Parallel Data Analytic Applications.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

2021
TEMPI: An Interposed MPI Library with a Canonical Representation of CUDA-aware Datatypes.
Proceedings of the HPDC '21: The 30th International Symposium on High-Performance Parallel and Distributed Computing, 2021

A Deep Reinforcement Learning Method for Solving Task Mapping Problems with Dynamic Traffic on Parallel Systems.
Proceedings of the HPC Asia 2021: The International Conference on High Performance Computing in Asia-Pacific Region, 2021

Toward an Enterprise-ready Composable Infrastructure as a Service.
Proceedings of the IEEE International Conference on Services Computing, 2021

2020
Transformation of application enablement tools on CORAL systems.
IBM J. Res. Dev., 2020

Fast CUDA-Aware MPI Datatypes without Platform Support.
CoRR, 2020

Tearing Down the Memory Wall.
CoRR, 2020

Node-Aware Stencil Communication for Heterogeneous Supercomputers.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Workshop 14: iWAPT Automatic Performance Tuning.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

ECS2: A Fast Erasure Coding Library for GPU-Accelerated Storage Systems with Parallel & Direct IO.
Proceedings of the IEEE International Conference on Cluster Computing, 2020

2019
Evaluating Characteristics of CUDA Communication Primitives on High-Bandwidth Interconnects.
Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering, 2019

Temporal-based Load Adaptive SDN Controller Failover Mechanism.
Proceedings of the 16th IEEE Annual Consumer Communications & Networking Conference, 2019

Towards VR/AR Multimedia Content Multicast over Wireless LAN.
Proceedings of the 16th IEEE Annual Consumer Communications & Networking Conference, 2019

2018
Deadline Is Not Enough: Importance-Aware Transmission Control Protocol for Server-Centric Data Centers.
IEEE Syst. J., 2018

NUMA-Aware Data-Transfer Measurements for Power/NVLink Multi-GPU Systems.
Proceedings of the High Performance Computing, 2018

Towards a Single-Host Many-GPU System.
Proceedings of the 30th International Symposium on Computer Architecture and High Performance Computing, 2018

Towards a Composable Computer System.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

FlexProtect: A SDN-based DDoS Attack Protection Architecture for Multi-tenant Data Centers.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

2017
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q.
IEEE Trans. Parallel Distributed Syst., 2017

Incremental Hybrid SDN Deployment for Enterprise Networks.
Proceedings of the 15th IEEE Intl Conf on Dependable, 2017

2016
A low-latency two-tier measurement and control platform for commodity SDN.
IEEE Commun. Mag., 2016

Particle Swarm Stepwise Algorithm (PaSS) on Multicore Hybrid CPU-GPU Clusters.
Proceedings of the 2016 IEEE International Conference on Computer and Information Technology, 2016

Application Characterization Assisted System Design.
Proceedings of the 2016 IEEE International Conference on Computer and Information Technology, 2016

2015
Hardware Thread-Level Speculation Performance Analysis.
Proceedings of the 17th IEEE International Conference on High Performance Computing and Communications, 2015

2014
Improving GPU Memory Performance with Artificial Barrier Synchronization.
IEEE Trans. Parallel Distributed Syst., 2014

Performance Modeling for Hardware Thread-Level Speculation.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Parallel deep neural network training for LVCSR tasks using blue gene/Q.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013
Optimizing Pairwise Box Intersection Checking on GPUs for Large-Scale Simulations.
ACM Trans. Model. Comput. Simul., 2013

TLA: Temporal look-ahead processor allocation method for heterogeneous multi-cluster systems.
J. Parallel Distributed Comput., 2013

Determination of performance characteristics of scientific applications on IBM Blue Gene/Q.
IBM J. Res. Dev., 2013

2012
A Systematic Approach toward Automated Performance Analysis and Tuning.
IEEE Trans. Parallel Distributed Syst., 2012

Application data prefetching on the IBM blue gene/Q supercomputer.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

A static analysis tool using a three-step approach for data races in HPC programs.
Proceedings of the 10th Workshop on Parallel and Distributed Systems: Testing, 2012

An Efficient Framework for Multi-dimensional Tuning of High Performance Computing Applications.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

Scalable Communication-aware Task Mapping Algorithms for Interconnected Multicore Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Comparison Studies of Large-scale Conventional Molecular Dynamics Simulation on Parallel Machines.
Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

GPU Performance Enhancement via Communication Cost Reduction: Case Studies of Radix Sort and WSN Relay Node Placement Problem.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011
Hierarchical Mapping for HPC Applications.
Parallel Process. Lett., 2011

Scalable Communication-Aware Task Mapping Algorithms for Interconnected Multicore Systems.
Proceedings of the 13th IEEE International Conference on High Performance Computing & Communication, 2011

A Performance Goal Oriented Processor Allocation Technique for Centralized Heterogeneous Multi-cluster Environments.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

A Parallel Rectangle Intersection Algorithm on GPU+CPU.
Proceedings of the 11th IEEE/ACM International Symposium on Cluster, 2011

2010
Workload performance characterization of DARPA HPCS benchmarks.
Concurr. Comput. Pract. Exp., 2010

Masking I/O latency using application level I/O caching and prefetching on Blue Gene systems.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Parallelization of DQMC simulation for strongly correlated electron systems.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Application tuning through bottleneck-driven refactoring.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Automated mapping of regular communication graphs on mesh interconnects.
Proceedings of the 2010 International Conference on High Performance Computing, 2010

Guided Performance Analysis Combining Profile and Trace Tools.
Proceedings of the Euro-Par 2010 Parallel Processing Workshops, 2010

2009
Application level I/O caching on Blue Gene/P systems.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Towards a framework for automated performance tuning.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Tools for scalable performance analysis on Petascale systems.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

A Holistic Approach towards Automated Performance Analysis and Tuning.
Proceedings of the Euro-Par 2009 Parallel Processing, 2009

2008
Early experiences in application level I/O tracing on blue gene systems.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

A framework for automated performance bottleneck detection.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

2007
A productivity centered application performance tuning framework.
Proceedings of the 2nd International Conference on Performance Evaluation Methodologies and Tools, 2007

A Productivity Centered Tools Framework for Application Performance Tuning.
Proceedings of the Fourth International Conference on the Quantitative Evaluation of Systems (QEST 2007), 2007

Performance Studies of a WebSphere Application, Trade, in Scale-out and Scale-up Environments.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
Blue Gene system software - Topology mapping for Blue Gene/L supercomputer.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

MPI tools and performance studies - MPI performance analysis tools on Blue Gene/L.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

A study of MPI performance analysis tools on Blue Gene/L.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

A Case Study Using Automatic Performance Tuning for Large-Scale Scientific Programs.
Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing, 2006

2004
Towards Automatic Performance Tuning.
PhD thesis, 2004

Using Information from Prior Runs to Improve Automated Tuning Systems.
Proceedings of the ACM/IEEE SC2004 Conference on High Performance Networking and Computing, 2004

Automated Cluster-Based Web Service Performance Tuning.
Proceedings of the 13th International Symposium on High-Performance Distributed Computing (HPDC-13 2004), 2004

2003
Runtime Selection among Different API Implementations.
Parallel Process. Lett., 2003

2002
Design of Scalable Continuous Media Servers.
Multim. Tools Appl., 2002

Active harmony: towards automated performance tuning.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

