James Demmel
Orcid: 0000-0002-0550-5476
Affiliations:
- University of California, Berkeley, USA
According to our database, James Demmel authored at least 246 papers between 1985 and 2024.
Awards
IEEE Fellow
IEEE Fellow 2002, "For contributions to the field of computational mathematics and the development of mathematical software."
Links
Online presence:
- on zbmath.org
- on viaf.org
- on orcid.org
- on id.loc.gov
- on d-nb.info
- on isni.org
- on dl.acm.org
Bibliography
2024
IEEE Trans. Knowl. Data Eng., November, 2024
IEEE Trans. Big Data, June, 2024
On Multilinear Inequalities of Hölder-Brascamp-Lieb Type for Torsion-Free Discrete Abelian Groups.
J. Log. Anal., 2024
Int. J. High Perform. Comput. Appl., 2024
WallFacer: Guiding Transformer Model Training Out of the Long-Context Dark Forest with N-body Problem.
CoRR, 2024
LPSim: Large Scale Multi-GPU Parallel Computing based Regional Scale Traffic Simulation Framework.
CoRR, 2024
Proceedings of the 36th ACM Symposium on Parallelism in Algorithms and Architectures, 2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
2023
An Improved Analysis and Unified Perspective on Deterministic and Randomized Low-Rank Matrix Approximation.
SIAM J. Matrix Anal. Appl., June, 2023
Surrogate-based Autotuning for Randomized Sketching Algorithms in Regression Problems.
CoRR, 2023
CoRR, 2023
Generalized Pseudospectral Shattering and Inverse-Free Matrix Pencil Diagonalization.
CoRR, 2023
Randomized Numerical Linear Algebra : A Perspective on the Field With an Eye to Software.
CoRR, 2023
Fast Exact Leverage Score Sampling from Khatri-Rao Products with Applications to Tensor Decomposition.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
2022
Proceedings of the PASC '22: Platform for Advanced Scientific Computing Conference, Basel, Switzerland, June 27, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the Sixth IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2022
2021
SIAM J. Sci. Comput., 2021
SIAM J. Sci. Comput., 2021
Comput. Methods Appl. Math., 2021
CCF Trans. High Perform. Comput., 2021
Proceedings of the High Performance Computing - 36th International Conference, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
Proceedings of the 14th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2021
Proceedings of the 48th ACM/IEEE Annual International Symposium on Computer Architecture, 2021
Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
2020
ACM Trans. Math. Softw., 2020
ACM Trans. Math. Softw., 2020
Knowl. Inf. Syst., 2020
Proceedings of the SPAA '20: 32nd ACM Symposium on Parallelism in Algorithms and Architectures, 2020
Proceedings of the 8th International Conference on Learning Representations, 2020
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020
Proceedings of the 27th IEEE International Conference on High Performance Computing, 2020
2019
IEEE Trans. Parallel Distributed Syst., 2019
SIAM J. Sci. Comput., 2019
An improved analysis and unified perspective on deterministic and randomized low rank matrix approximations.
CoRR, 2019
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the 2019 IEEE International Conference on Data Mining, 2019
2018
Low Rank Approximation of a Sparse Matrix Based on LU Factorization with Column and Row Tournament Pivoting.
SIAM J. Sci. Comput., 2018
Proceedings of the 30th on Symposium on Parallelism in Algorithms and Architectures, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Accurate, Fast and Scalable Kernel Ridge Regression on Parallel and Distributed Systems.
Proceedings of the 32nd International Conference on Supercomputing, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Proceedings of the 47th International Conference on Parallel Processing, 2018
Proceedings of the 25th IEEE Symposium on Computer Arithmetic, 2018
2017
Design and Implementation of a Communication-Optimal Classifier for Distributed Kernel Support Vector Machines.
IEEE Trans. Parallel Distributed Syst., 2017
CoRR, 2017
Proceedings of the 29th ACM Symposium on Parallelism in Algorithms and Architectures, 2017
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the 46th International Conference on Parallel Processing, 2017
2016
SIAM J. Sci. Comput., 2016
Comput. Sci. Eng., 2016
Matrix Factorization at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies.
CoRR, 2016
Proceedings of the First International Workshop on Communication Optimizations in HPC, 2016
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium, 2016
Proceedings of the 38th International Conference on Software Engineering, 2016
Matrix factorizations at scale: A comparison of scientific data analytics in spark and C+MPI using three case studies.
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016
2015
ACM Trans. Parallel Comput., 2015
SIAM J. Matrix Anal. Appl., 2015
Accuracy of the s-Step Lanczos Method for the Symmetric Eigenproblem in Finite Precision.
SIAM J. Matrix Anal. Appl., 2015
J. Parallel Distributed Comput., 2015
Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure, St. Louis, MO, USA, July 26, 2015
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015
Proceedings of the 22nd IEEE Symposium on Computer Arithmetic, 2015
2014
A Residual Replacement Strategy for Improving the Maximum Attainable Accuracy of s-Step Krylov Subspace Methods.
SIAM J. Matrix Anal. Appl., 2014
SIAM J. Matrix Anal. Appl., 2014
J. Parallel Distributed Comput., 2014
Acta Numer., 2014
Proceedings of the Annual Conference of the Extreme Science and Engineering Discovery Environment, 2014
Tradeoffs between synchronization, communication, and computation in parallel linear algebra computations.
Proceedings of the 26th ACM Symposium on Parallelism in Algorithms and Architectures, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Author retrospective for optimizing matrix multiply using PHiPAC: a portable high-performance ANSI C coding methodology.
Proceedings of the ACM International Conference on Supercomputing 25th Anniversary Volume, 2014
2013
SIAM J. Sci. Comput., 2013
LU Factorization with Panel Rank Revealing Pivoting and Its Communication Avoiding Version.
SIAM J. Matrix Anal. Appl., 2013
Communication lower bounds and optimal algorithms for programs that reference arrays - Part 1.
CoRR, 2013
Proceedings of the Extreme Science and Engineering Discovery Environment: Gateway to Discovery, 2013
Communication efficient gaussian elimination with partial pivoting using a shape morphing data layout.
Proceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures, 2013
Proceedings of the 25th ACM Symposium on Parallelism in Algorithms and Architectures, 2013
Proceedings of the International Conference for High Performance Computing, 2013
Proceedings of the Parallel Processing and Applied Mathematics, 2013
Cyclops Tensor Framework: Reducing Communication and Eliminating Load Imbalance in Massively Parallel Contractions.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Implementing a Blocked Aasen's Algorithm with a Dynamic Scheduler on Multicore Architectures.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 2013
Proceedings of the 21st IEEE Symposium on Computer Arithmetic, 2013
Proceedings of the 21st IEEE Symposium on Computer Arithmetic, 2013
2012
Fast ℓ₁-SPIRiT Compressed Sensing Parallel Imaging MRI: Scalable Parallel Implementation and Clinically Feasible Runtime.
IEEE Trans. Medical Imaging, 2012
SIAM J. Sci. Comput., 2012
Strong Scaling of Matrix Multiplication Algorithms and Memory-Independent Communication Lower Bounds.
CoRR, 2012
Proceedings of the High Performance Computing for Computational Science, 2012
Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures, 2012
Brief announcement: strong scaling of matrix multiplication algorithms and memory-independent communication lower bounds.
Proceedings of the 24th ACM Symposium on Parallelism in Algorithms and Architectures, 2012
Proceedings of the SC Conference on High Performance Computing Networking, 2012
Poster: Beating MKL and ScaLAPACK at Rectangular Matrix Multiplication Using the BFS/DFS Approach.
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the 17th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2012
Graph Expansion Analysis for Communication Costs of Fast Rectangular Matrix Multiplication.
Proceedings of the Design and Analysis of Algorithms, 2012
2011
SIAM J. Matrix Anal. Appl., 2011
SIAM J. Matrix Anal. Appl., 2011
Graph expansion and communication costs of fast matrix multiplication: regular submission.
Proceedings of the SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2011
Proceedings of the SPAA 2011: Proceedings of the 23rd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2011
Accurate and efficient expression evaluation and linear algebra, or why it can be easier to compute accurate eigenvalues of a Vandermonde matrix than the accurate sum of 3 numbers.
Proceedings of the SNC 2011, 2011
Improving communication performance in dense linear algebra via topology aware collectives.
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
On improving trust-region variable projection algorithms for separable nonlinear least squares learning.
Proceedings of the 2011 International Joint Conference on Neural Networks, 2011
Proceedings of the 2011 IEEE Hot Chips 23 Symposium (HCS), 2011
Communication-Optimal Parallel 2.5D Matrix Multiplication and LU Factorization Algorithms.
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
Proceedings of the Thirteenth Workshop on Algorithm Engineering and Experiments, 2011
2010
SIAM J. Sci. Comput., 2010
CoRR, 2010
Brief announcement: Lower bounds on communication for sparse Cholesky factorization of a model problem.
Proceedings of the SPAA 2010: Proceedings of the 22nd Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2010
2009
ACM Trans. Math. Softw., 2009
Nonnegative Diagonals and High Performance on Low-Profile Matrices from Householder QR.
SIAM J. Sci. Comput., 2009
Parallel Comput., 2009
Communication-optimal parallel and sequential Cholesky decomposition: extended abstract.
Proceedings of the SPAA 2009: Proceedings of the 21st Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2009
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2009
2008
ACM Trans. Math. Softw., 2008
ACM Trans. Math. Softw., 2008
SIAM J. Sci. Comput., 2008
SIAM J. Sci. Comput., 2008
Sparse SOS Relaxations for Minimizing Functions that are Summations of Small Polynomials.
SIAM J. Optim., 2008
J. Glob. Optim., 2008
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008
Proceedings of the ACM/IEEE Conference on High Performance Computing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
2007
Proceedings of the Handbook of Parallel Computing - Models, Algorithms and Applications., 2007
SIAM J. Sci. Comput., 2007
Appl. Algebra Eng. Commun. Comput., 2007
Proceedings of the 6th International Conference on Information Processing in Sensor Networks, 2007
2006
Math. Program., 2006
Proceedings of the 4th International Conference on Embedded Networked Sensor Systems, 2006
Automatic Performance Tuning for the Multi-section with Multiple Eigenvalues Method for Symmetric Tridiagonal Eigenproblems.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006
2005
The Accurate and Efficient Solution of a Totally Positive Generalized Vandermonde Linear System.
SIAM J. Matrix Anal. Appl., 2005
J. Glob. Optim., 2005
Proceedings of the Computational Science, 2005
Proceedings of the Algebraic and Numerical Algorithms and Computer-assisted Proofs, 2005
2004
Fast and Accurate Floating Point Summation with Application to Computational Geometry.
Numer. Algorithms, 2004
Int. J. High Perform. Comput. Appl., 2004
Proceedings of the Applied Parallel Computing, 2004
Proceedings of the Applied Parallel Computing, 2004
Performance Models for Evaluation and Automatic Tuning of Symmetric Sparse Matrix-Vector Multiply.
Proceedings of the 33rd International Conference on Parallel Processing (ICPP 2004), 2004
2003
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems.
ACM Trans. Math. Softw., 2003
On structure-exploiting trust-region regularized nonlinear least squares algorithms for neural-network learning.
Neural Networks, 2003
Iterative Scaled Trust-Region Learning in Krylov Subspaces via Pearlmutter's Implicit Sparse Hessian-Vector Multiply.
Proceedings of the Advances in Neural Information Processing Systems 16, 2003
Proceedings of the Computational Science - ICCS 2003, 2003
2002
ACM Trans. Math. Softw., 2002
ACM Trans. Math. Softw., 2002
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002
2001
Proceedings of the Computational Science - ICCS 2001, 2001
Proceedings of the Computational Science - ICCS 2001, 2001
2000
Computing Connecting Orbits via an Improved Algorithm for Continuing Invariant Subspaces.
SIAM J. Sci. Comput., 2000
SIAM J. Matrix Anal. Appl., 2000
Proceedings of the Semantics, 2000
On Iterative Krylov-Dogleg Trust-Region Steps for Solving Neural Networks Nonlinear Least Squares Problems.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000
Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000
Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000
Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000
Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000
1999
SIAM J. Matrix Anal. Appl., 1999
Proceedings of the Eleventh Annual ACM Symposium on Parallel Algorithms and Architectures, 1999
Proceedings of the ACM/IEEE Conference on Supercomputing, 1999
A Scalable Sparse Direct Solver Using Static Pivoting.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999
Software, Environments and Tools, SIAM, ISBN: 978-0-89871-960-4, 1999
1998
SIAM J. Matrix Anal. Appl., January, 1998
Proceedings of the ACM/IEEE Conference on Supercomputing, 1998
1997
ACM Trans. Math. Softw., 1997
The Spectral Decomposition of Nonsymmetric Matrices on Distributed Memory Parallel Computers.
SIAM J. Sci. Comput., 1997
J. Parallel Distributed Comput., 1997
ScaLAPACK: A Linear Algebra Library for Message-Passing Computers.
Proceedings of the Eighth SIAM Conference on Parallel Processing for Scientific Computing, 1997
Optimizing Matrix Multiply Using PHiPAC: A Portable, High-Performance, ANSI C Coding Methodology.
Proceedings of the 11th international conference on Supercomputing, 1997
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997
1996
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.
Proceedings of the 1996 ACM/IEEE Conference on Supercomputing, 1996
Proceedings of the Applied Parallel Computing, 1996
1995
Algorithms for Intersecting Parametric and Algebraic Curves II: Multiple Intersections.
CVGIP Graph. Model. Image Process., 1995
Proceedings of the 7th Annual ACM Symposium on Parallel Algorithms and Architectures, 1995
Proceedings of Supercomputing '95, San Diego, CA, USA, December 4-8, 1995
The Performance of Finding Eigenvalues and Eigenvectors of Dense Symmetric Matrices on Distributed Memory Computers.
Proceedings of the Seventh SIAM Conference on Parallel Processing for Scientific Computing, 1995
ScaLAPACK: A Portable Linear Algebra Library for Distributed Memory Computers - Design Issues and Performance.
Proceedings of the Applied Parallel Computing, 1995
Proceedings of the Computer Science Today: Recent Trends and Developments, 1995
1994
ACM Trans. Graph., 1994
Other Titles in Applied Mathematics, SIAM, ISBN: 978-1-61197-153-8, 1994
1993
SIAM J. Matrix Anal. Appl., January, 1993
The generalized Schur decomposition of an arbitrary pencil A-λB - robust software with error bounds and applications. Part II: software and applications.
ACM Trans. Math. Softw., 1993
The generalized Schur decomposition of an arbitrary pencil A-λB - robust software with error bounds and applications. Part I: theory and algorithms.
ACM Trans. Math. Softw., 1993
ACM Trans. Math. Softw., 1993
LAPACK for Distributed Memory Architectures: The Next Generation.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993
Design of a Parallel Nonsymmetric Eigenroutine Toolbox, Part I.
Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, 1993
1992
SIAM J. Matrix Anal. Appl., 1992
1991
Concurr. Pract. Exp., 1991
1990
SIAM Rev., 1990
Proceedings of Supercomputing '90, New York, NY, USA, November 12-16, 1990
1989
Int. J. High Speed Comput., 1989
Proceedings of the 1989 IEEE International Conference on Robotics and Automation, 1989
1988
Proceedings of the 1988 IEEE International Conference on Robotics and Automation, 1988
1987
Proceedings of the 8th IEEE Symposium on Computer Arithmetic, 1987
1985
An interval algorithm for solving systems of linear equations to prespecified accuracy.
Computing, 1985