R. Clint Whaley

  • Indiana University Bloomington, USA
  • Louisiana State University, Baton Rouge, LA, USA (former)

According to our database, R. Clint Whaley authored at least 25 papers between 1993 and 2014.

Effectively Exploiting Parallel Scale for All Problem Sizes in LU Factorization.
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014

Scaling LAPACK panel operations using parallel cache assignment.
ACM Trans. Math. Softw., 2013

Vectorization past dependent branches through speculation.
Proceedings of the 22nd International Conference on Parallel Architectures and Compilation Techniques, 2013

ATLAS (Automatically Tuned Linear Algebra Software).
Encyclopedia of Parallel Computing, 2011

Achieving Scalable Parallelization for the Hessenberg Factorization.
Proceedings of the 2011 IEEE International Conference on Cluster Computing (CLUSTER), 2011

Scaling LAPACK panel operations using parallel cache assignment.
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2010

ATLAS Version 3.9: Overview and Status.
Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

Minimizing startup costs for performance-critical threading.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Achieving accurate and context-sensitive timing for code optimization.
Softw. Pract. Exp., 2008

Reducing Floating Point Error in Dot Product Using the Superblock Family of Algorithms.
SIAM J. Sci. Comput., 2008

Empirically tuning LAPACK's blocking factor for increased performance.
Proceedings of the International Multiconference on Computer Science and Information Technology, 2008

Minimizing development and maintenance costs in supporting persistently optimized BLAS.
Softw. Pract. Exp., 2005

Self-Adapting Linear Algebra Algorithms and Software.
Proc. IEEE, 2005

Tuning High Performance Kernels through Empirical Compilation.
Proceedings of the 34th International Conference on Parallel Processing (ICPP 2005), 2005

Automated empirical optimizations of software and the ATLAS project.
Parallel Comput., 2001

Parallel and Distributed Scientific Computing.
Handbook on Parallel and Distributed Processing, 2000

Automatically Tuned Linear Algebra Software.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1998

Practical Experience in the Numerical Dangers of Heterogeneous Computing.
ACM Trans. Math. Softw., 1997

ScaLAPACK: A Linear Algebra Library for Message-Passing Computers.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997

Design and Implementation of the ScaLAPACK LU, QR, and Cholesky Factorization Routines.
Sci. Program., 1996

ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.
Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996

Practical Experience in the Dangers of Heterogeneous Computing.
Proceedings of the Applied Parallel Computing, 1996

A Proposal for a Set of Parallel Basic Linear Algebra Subprograms.
Proceedings of the Applied Parallel Computing, 1995

ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.
Proceedings of the Applied Parallel Computing, 1995

Two Dimensional Basic Linear Algebra Communication Subprograms.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993
