Xiaoye S. Li

Orcid: 0000-0002-0747-698X

According to our database, Xiaoye S. Li authored at least 115 papers between 1991 and 2024.

Bibliography

2024
Non-smooth Bayesian optimization in tuning scientific applications.
Int. J. High Perform. Comput. Appl., 2024

Batched sparse direct solver design and evaluation in SuperLU_DIST.
Int. J. High Perform. Comput. Appl., 2024

Then and Now: Improving Software Portability, Productivity, and 100× Performance.
Comput. Sci. Eng., 2024

Elucidating the Design Choice of Probability Paths in Flow Matching for Forecasting.
CoRR, 2024

2023
Sparse Approximate Multifrontal Factorization with Composite Compression Methods.
ACM Trans. Math. Softw., September, 2023

Newly Released Capabilities in the Distributed-Memory SuperLU Sparse Direct Solver.
ACM Trans. Math. Softw., March, 2023

Recent Trends in Graph Decomposition (Dagstuhl Seminar 23331).
Dagstuhl Reports, 2023

Open Problems in (Hyper)Graph Decomposition.
CoRR, 2023

Construction of Hierarchically Semi-Separable matrix Representation using Adaptive Johnson-Lindenstrauss Sketching.
CoRR, 2023

Brief Announcement: Communication Optimal Sparse LU Factorization for Planar Matrices.
Proceedings of the 35th ACM Symposium on Parallelism in Algorithms and Architectures, 2023

Unified Communication Optimization Strategies for Sparse Triangular Solver on CPU and GPU Clusters.
Proceedings of the International Conference for High Performance Computing, 2023

Harnessing the Crowd for Autotuning High-Performance Computing Applications.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

2022
gSoFa: Scalable Sparse Symbolic LU Factorization on GPUs.
IEEE Trans. Parallel Distributed Syst., 2022

Resiliency in numerical algorithm design for extreme scale simulations.
Int. J. High Perform. Comput. Appl., 2022

Hybrid Models for Mixed Variables in Bayesian Optimization.
CoRR, 2022

Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers.
Proceedings of the SC22: International Conference for High Performance Computing, 2022

GPTuneBand: Multi-task and Multi-fidelity Autotuning for Large-scale High Performance Computing Applications.
Proceedings of the 2022 SIAM Conference on Parallel Processing for Scientific Computing, 2022

Proposed Consistent Exception Handling for the BLAS and LAPACK.
Proceedings of the Sixth IEEE/ACM International Workshop on Software Correctness for HPC Applications, 2022

2021
Trust: Triangle Counting Reloaded on GPUs.
IEEE Trans. Parallel Distributed Syst., 2021

Butterfly Factorization Via Randomized Matrix-Vector Multiplications.
SIAM J. Sci. Comput., 2021

Sparse Approximate Multifrontal Factorization with Butterfly Compression for High-Frequency Wave Equations.
SIAM J. Sci. Comput., 2021

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic.
Int. J. High Perform. Comput. Appl., 2021

Non-smooth Bayesian Optimization in Tuning Problems.
CoRR, 2021

Dr. Top-k: delegate-centric Top-k on GPUs.
Proceedings of the International Conference for High Performance Computing, 2021

GPTune: multitask learning for autotuning exascale applications.
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

Enhancing Autotuning Capability with a History Database.
Proceedings of the 14th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2021

A Message-Driven, Multi-GPU Parallel Sparse Triangular Solver.
Proceedings of the 2021 SIAM Conference on Applied and Computational Discrete Algorithms, 2021

2020
A Distributed-Memory Algorithm for Computing a Heavy-Weight Perfect Matching on Bipartite Graphs.
SIAM J. Sci. Comput., 2020

A parallel hierarchical blocked adaptive cross approximation algorithm.
Int. J. High Perform. Comput. Appl., 2020

A Survey of Numerical Methods Utilizing Mixed Precision Arithmetic.
CoRR, 2020

GSoFa: Scalable Sparse LU Symbolic Factorization on GPUs.
CoRR, 2020

C-SAW: a framework for graph sampling and random walk on GPUs.
Proceedings of the International Conference for High Performance Computing, 2020

Leveraging One-Sided Communication for Sparse Triangular Solvers.
Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing, 2020

Distributed macroscopic traffic simulation with Open Traffic Models.
Proceedings of the 23rd IEEE International Conference on Intelligent Transportation Systems, 2020

Scalable and Memory-Efficient Kernel Ridge Regression.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2020

2019
Robust and Accurate Stopping Criteria for Adaptive Randomized Sampling in Matrix-Free Hierarchically Semiseparable Construction.
SIAM J. Sci. Comput., 2019

A communication-avoiding 3D algorithm for sparse LU factorization on heterogeneous systems.
J. Parallel Distributed Comput., 2019

Multitask and Transfer Learning for Autotuning Exascale Applications.
CoRR, 2019

A communication-avoiding 3D sparse triangular solver.
Proceedings of the ACM International Conference on Supercomputing, 2019

H-INDEX: Hash-Indexing for Parallel Triangle Counting on GPUs.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019

2018
Consensus Ensemble System for Traffic Flow Prediction.
IEEE Trans. Intell. Transp. Syst., 2018

Efficient Online Hyperparameter Optimization for Kernel Ridge Regression with Applications to Traffic Time Series Prediction.
CoRR, 2018

Matrix-free construction of HSS representation using adaptive randomized sampling.
CoRR, 2018

A unified software framework for solving traffic assignment problems.
CoRR, 2018

A distributed-memory approximation algorithm for maximum weight perfect bipartite matching.
CoRR, 2018

Highly scalable distributed-memory sparse triangular solution algorithms.
Proceedings of the Eighth SIAM Workshop on Combinatorial Scientific Computing, 2018

Efficient Online Hyperparameter Learning for Traffic Flow Prediction.
Proceedings of the 21st International Conference on Intelligent Transportation Systems, 2018

A Unified Software Framework to Enable Solution of Traffic Assignment Problems at Extreme Scale.
Proceedings of the 21st International Conference on Intelligent Transportation Systems, 2018

A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

A Study of Clustering Techniques and Hierarchical Matrix Formats for Kernel Ridge Regression.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

2017
xSDK Foundations: Toward an Extreme-scale Scientific Software Development Kit.
Supercomput. Front. Innov., 2017

A Robust Parallel Preconditioner for Indefinite Systems Using Hierarchical Matrices and Randomized Sampling.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016
A Parallel Geometric Multifrontal Solver Using Hierarchically Semiseparable Structure.
ACM Trans. Math. Softw., 2016

A Distributed-Memory Package for Dense Hierarchically Semi-Separable Matrix Computations Using Randomization.
ACM Trans. Math. Softw., 2016

An Efficient Multicore Implementation of a Novel HSS-Structured Multifrontal Solver Using Randomized Sampling.
SIAM J. Sci. Comput., 2016

An algebraic multifrontal preconditioner that exploits the low-rank property.
Numer. Linear Algebra Appl., 2016

Tuning the Coarse Space Construction in a Spectral AMG Solver.
Proceedings of the International Conference on Computational Science 2016, 2016

Achieving High Parallel Efficiency on Modern Processors for X-Ray Scattering Data Analysis.
Proceedings of the Euro-Par 2016: Parallel Processing Workshops, 2016

2015
Extending Summation Precision for Network Reduction Operations.
Int. J. Parallel Program., 2015

An efficient multi-core implementation of a novel HSS-structured multifrontal solver using randomized sampling.
CoRR, 2015

Comparative Performance Analysis of Coarse Solvers for Algebraic Multigrid on Multicore and Manycore Architectures.
Proceedings of the Parallel Processing and Applied Mathematics, 2015

A Sparse Direct Solver for Distributed Memory Xeon Phi-Accelerated Systems.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, 2015

Resilient Matrix Multiplication of Hierarchical Semi-Separable Matrices.
Proceedings of the 5th Workshop on Fault Tolerance for HPC at eXtreme Scale, 2015

2014
Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014

High-Performance Inverse Modeling with Reverse Monte Carlo Simulations.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

A Distributed CPU-GPU Sparse Direct Solver.
Proceedings of the Euro-Par 2014 Parallel Processing, 2014

2013
Efficient Scalable Algorithms for Solving Dense Linear Systems with Hierarchically Semiseparable Structures.
SIAM J. Sci. Comput., 2013

Tuning HipGISAXS on Multi and Many Core Supercomputers.
Proceedings of the High Performance Computing Systems. Performance Modeling, Benchmarking and Simulation, 2013

On Partitioning and Reordering Problems in a Hierarchically Parallel Hybrid Linear Solver.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

2012
Massively parallel X-ray scattering simulations.
Proceedings of the SC Conference on High Performance Computing Networking, 2012

New Scheduling Strategies and Hybrid Programming for a Parallel Right-looking Sparse LU Factorization Algorithm on Multicore Cluster Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012

2011
SuperLU.
Proceedings of the Encyclopedia of Parallel Computing, 2011

A Supernodal Approach to Incomplete LU Factorization with Partial Pivoting.
ACM Trans. Math. Softw., 2011

2010
Direction-Preserving and Schur-Monotonic Semiseparable Approximations of Symmetric Positive Definite Matrices.
SIAM J. Matrix Anal. Appl., 2010

Fast algorithms for hierarchically semiseparable matrices.
Numer. Linear Algebra Appl., 2010

Particle-field decomposition and domain decomposition in parallel particle-in-cell beam dynamics simulation.
Comput. Phys. Commun., 2010

On Techniques to Improve Robustness and Scalability of a Parallel Hybrid Linear Solver.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010

2009
Extra-Precise Iterative Refinement for Overdetermined Least Squares Problems.
ACM Trans. Math. Softw., 2009

Superfast Multifrontal Method for Large Structured Linear Systems of Equations.
SIAM J. Matrix Anal. Appl., 2009

Performance Modeling Tools for Parallel Sparse Linear Algebra Computations.
Proceedings of the Parallel Computing: From Multicores and GPU's to Petascale, 2009

2008
An Implementation and Evaluation of the AMLS Method for Sparse Eigenvalue Problems.
ACM Trans. Math. Softw., 2008

Evaluation of Sparse LU Factorization and Triangular Solution on Multicore Platforms.
Proceedings of the High Performance Computing for Computational Science, 2008

2007
Prospectus for a Dense Linear Algebra Software Library.
Proceedings of the Handbook of Parallel Computing - Models, Algorithms and Applications, 2007

Parallel Symbolic Factorization for Sparse LU with Static Pivoting.
SIAM J. Sci. Comput., 2007

Towards an accurate performance modeling of parallel sparse factorization.
Appl. Algebra Eng. Commun. Comput., 2007

2006
Error bounds from extra-precise iterative refinement.
ACM Trans. Math. Softw., 2006

Unsymmetric Ordering Using A Constrained Markowitz Scheme.
SIAM J. Matrix Anal. Appl., 2006

Diagonal Markowitz Scheme with Local Symmetrization.
SIAM J. Matrix Anal. Appl., 2006

Prospectus for the Next LAPACK and ScaLAPACK Libraries.
Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

2005
An overview of SuperLU: Algorithms, implementation, and user interface.
ACM Trans. Math. Softw., 2005

An Algebraic Substructuring Method for Large-Scale Eigenvalue Calculation.
SIAM J. Sci. Comput., 2005

A Comparison of Three High-Precision Quadrature Schemes.
Exp. Math., 2005

2004
Algebraic Sub-structuring for Electromagnetic Applications.
Proceedings of the Applied Parallel Computing, 2004

Performance Analysis of Parallel Right-Looking Sparse LU Factorization on Two Dimensional Grids of Processors.
Proceedings of the Applied Parallel Computing, 2004

2003
SuperLU_DIST: A scalable distributed-memory sparse direct solver for unsymmetric linear systems.
ACM Trans. Math. Softw., 2003

Impact of the implementation of MPI point-to-point communications on the performance of two general sparse solvers.
Parallel Comput., 2003

2002
Design, implementation and testing of extended and mixed precision BLAS.
ACM Trans. Math. Softw., 2002

Effects of Ordering Strategies and Programming Paradigms on Sparse Matrix Computations.
SIAM Rev., 2002

A new scheduling algorithm for parallel sparse LU factorization with static pivoting.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

High performance computing meets experimental mathematics.
Proceedings of the 2002 ACM/IEEE conference on Supercomputing, 2002

Memory-Intensive Benchmarks: IRAM vs. Cache-Based Machines.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

2001
Analysis and comparison of two general sparse solvers for distributed memory computers.
ACM Trans. Math. Softw., 2001

Solution of a three-body problem in quantum mechanics using sparse linear algebra on parallel computers.
Proceedings of the 2001 ACM/IEEE conference on Supercomputing, 2001

Performance and tuning of two distributed memory sparse solvers.
Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

Algorithms for Quad-Double Precision Floating Point Arithmetic.
Proceedings of the 15th IEEE Symposium on Computer Arithmetic (Arith-15 2001), 2001

2000
Collection offers overview of research on structured matrices [Book Review].
IEEE Concurr., 2000

Ordering Unstructured Meshes for Sparse Matrix Computations on Leading Parallel Systems.
Proceedings of the Parallel and Distributed Processing, 2000

Common Issues.
Proceedings of the Templates for the Solution of Algebraic Eigenvalue Problems, 2000

1999
An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination.
SIAM J. Matrix Anal. Appl., 1999

A Supernodal Approach to Sparse Partial Pivoting.
SIAM J. Matrix Anal. Appl., 1999

A Scalable Sparse Direct Solver Using Static Pivoting.
Proceedings of the Ninth SIAM Conference on Parallel Processing for Scientific Computing, 1999

1998
Making Sparse Gaussian Elimination Scalable by Static Pivoting.
Proceedings of the ACM/IEEE Conference on Supercomputing, 1998

1994
Faster Numerical Algorithms via Exception Handling.
IEEE Trans. Computers, 1994

1993
Decentralized optimal power pricing: the development of a parallel program.
IEEE Parallel Distributed Technol. Syst. Appl., 1993

1991
On a Massively Parallel e-Relaxization Algorithm for Linear Transformation Problems.
Proceedings of the International Conference on Parallel Processing, 1991

