2024
Runtime Performance Anomaly Diagnosis in Production HPC Systems Using Active Learning.
IEEE Trans. Parallel Distributed Syst., April, 2024
2023
Prodigy: Towards Unsupervised Anomaly Detection in Production HPC Systems.
Proceedings of the International Conference for High Performance Computing, 2023
2022
ALBADross: Active Learning Based Anomaly Diagnosis for Production HPC Systems.
Proceedings of the IEEE International Conference on Cluster Computing, 2022
2021
Proctor: A Semi-Supervised Performance Anomaly Diagnosis Framework for Production HPC Systems.
Proceedings of the High Performance Computing - 36th International Conference, 2021
Using Monitoring Data to Improve HPC Performance via Network-Data-Driven Allocation.
Proceedings of the 2021 IEEE High Performance Extreme Computing Conference, 2021
E2EWatch: An End-to-End Anomaly Diagnosis Framework for Production HPC Systems.
Proceedings of the Euro-Par 2021: Parallel Processing, 2021
2020
Counterfactual Explanations for Machine Learning on Multivariate Time Series Data.
CoRR, 2020
2019
Online Diagnosis of Performance Variation in HPC Systems Using Machine Learning.
IEEE Trans. Parallel Distributed Syst., 2019
HPAS: An HPC Performance Anomaly Suite for Reproducing Performance Variations.
Proceedings of the 48th International Conference on Parallel Processing, 2019
2018
Level-Spread: A New Job Allocation Policy for Dragonfly Networks.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Taxonomist: Application Detection Through Rich Monitoring Data.
Proceedings of the Euro-Par 2018: Parallel Processing, 2018
Statistical Models of Dengue Fever.
Proceedings of the Data Mining - 16th Australasian Conference, AusDM 2018, Bathurst, NSW, 2018
2017
Trends in Data Locality Abstractions for HPC Systems.
IEEE Trans. Parallel Distributed Syst., 2017
Diagnosing Performance Variations in HPC Applications Using Machine Learning.
Proceedings of the High Performance Computing - 32nd International Conference, 2017
Unveiling the Interplay Between Global Link Arrangements and Network Management Algorithms on Dragonfly Networks.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017
2016
Local search to improve coordinate-based task mapping.
Parallel Comput., 2016
2015
Simulation and optimization of HPC job allocation for jointly reducing communication and cooling costs.
Sustain. Comput. Informatics Syst., 2015
PaCMap: Topology Mapping of Unstructured Communication Patterns onto Non-contiguous Allocations.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015
Comparing Global Link Arrangements for Dragonfly Networks.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
2014
Abstract machine models and proxy architectures for exascale computing.
Proceedings of the 1st International Workshop on Hardware-Software Co-Design for High Performance Computing, 2014
Task mapping stencil computations for non-contiguous allocations.
Proceedings of the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2014
Exploiting Geometric Partitioning in Task Mapping for Parallel Computers.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
PReMAS: Simulator for Resource Management.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014
2013
Backfilling with guarantees made as jobs arrive.
Concurr. Comput. Pract. Exp., 2013
Efficient scheduling to minimize calibrations.
Proceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures, 2013
Variations of Conservative Backfilling to Improve Fairness.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2013
2012
Brief announcement: subgraph isomorphism on a multithreaded shared memory architecture.
Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures, 2012
2011
Backfilling with Guarantees Granted upon Job Submission.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
2010
An Interoperable, Data-Structure-Neutral Component for Mesh Query and Manipulation.
ACM Trans. Math. Softw., 2010
Parallel Job Scheduling Policies to Improve Fairness: A Case Study.
Proceedings of the 39th International Conference on Parallel Processing, 2010
A Tie-Breaking Strategy for Processor Allocation in Meshes.
Proceedings of the 39th International Conference on Parallel Processing, 2010
2009
Scheduling Restartable Jobs with Short Test Runs.
Proceedings of the Job Scheduling Strategies for Parallel Processing, 2009
2008
Probabilistic analysis for scheduling with conflicts.
Theor. Comput. Sci., 2008
Communication-Aware Processor Allocation for Supercomputers: Finding Point Sets of Small Average Distance.
Algorithmica, 2008
2005
Communication-Aware Processor Allocation for Supercomputers.
Proceedings of the Algorithms and Data Structures, 9th International Workshop, 2005
2004
Communication Patterns and Allocation Strategies.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
2003
Scheduling with Conflicts on Bipartite and Interval Graphs.
J. Sched., 2003
2002
Processor Allocation on Cplant: Achieving General Processor Locality Using One-Dimensional Allocation Strategies.
Proceedings of the 2002 IEEE International Conference on Cluster Computing (CLUSTER 2002), 2002
2001
Experimental Results on Statistical Approaches to Page Replacement Policies.
Proceedings of the Algorithm Engineering and Experimentation, Third International Workshop, 2001
2000
Strengthening integrality gaps for capacitated network design and covering problems.
Proceedings of the Eleventh Annual ACM-SIAM Symposium on Discrete Algorithms, 2000
1997
The Undecidability of the Unrestricted Modified Edit Distance.
Theor. Comput. Sci., 1997
1996
Scheduling with Conflicts, and Applications to Traffic Signal Control.
Proceedings of the Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, 1996
1992
Graph Tool - A Tool for Interactive Design and Manipulation of Graphs and Graph Algorithms.
Proceedings of the Computational Support for Discrete Mathematics, 1992