Zhiling Lan

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

Fault-aware, utility-based job scheduling on Blue Gene/P systems.

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

Performance under Failures of DAG-based Parallel Computing.

Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009

2008

Adaptive Fault Management of Parallel Applications for High-Performance Computing.

IEEE Trans. Computers, 2008

Analytical study of migration-enhanced fault tolerance for long-running applications in IFR systems.

Int. J. Parallel Emergent Distributed Syst., 2008

Enhancing application robustness through adaptive fault tolerance.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Dynamic Meta-Learning for Failure Prediction in Large-Scale Systems: A Case Study.

Proceedings of the 2008 International Conference on Parallel Processing, 2008

A fast restart mechanism for checkpoint/recovery protocols in networked environments.

Proceedings of the 38th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2008

2007

Fault-Driven Re-Scheduling For Improving System-level Fault Resilience.

Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

A Meta-Learning Failure Predictor for Blue Gene/L Systems.

Proceedings of the 2007 International Conference on Parallel Processing (ICPP 2007), 2007

Anomaly localization in large-scale clusters.

Ziming Zheng

Proceedings of the 2007 IEEE International Conference on Cluster Computing, 2007

2006

DistDLB: Improving cosmology SAMR simulations on distributed computing systems through hierarchical load balancing.

J. Parallel Distributed Comput., 2006

Poster reception - Improving fault resilience of high performance applications.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Exploit Failure Prediction for Adaptive Fault-Tolerance in Cluster Computing.

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Evaluating Performance and Scalability of Advanced Accelerator Simulations.

Jungmin Lee

James F. Amundson

Panagiotis Spentzouris

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

2005

A novel workload migration scheme for heterogeneous distributed computing.

Proceedings of the 5th International Symposium on Cluster Computing and the Grid (CCGrid 2005), 2005

2004

Performance analysis of a large-scale cosmology application on three cluster systems.

Prathibha Deshikachar

Int. J. High Perform. Comput. Netw., 2004

A Survey of Load Balancing in Grid Computing.

Proceedings of the Computational and Information Science, First International Symposium, 2004

2003

Exploring cosmology applications on distributed environments.

Future Gener. Comput. Syst., 2003

2002

Dynamic load balancing of SAMR applications on distributed systems.

Sci. Program., 2002

A novel dynamic load balancing scheme for parallel systems.

J. Parallel Distributed Comput., 2002

2001

Design and Development of the Prophesy Performance Database for Distributed Scientific Applications.

Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

Prophesy: Automating the Modeling Process.

Proceedings of the 3rd Annual International Workshop on Active Middleware Services (AMS 2001), 2001

Dynamic Load Balancing for Structured Adaptive Mesh Refinement Applications.
