Takeshi Fukaya
Orcid: 0000-0003-1217-6444
According to our database, Takeshi Fukaya authored at least 33 papers between 2007 and 2024.
Bibliography
2024
A Cholesky QR type algorithm for computing tall-skinny QR factorization with column pivoting.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
2023
Convergence acceleration of preconditioned conjugate gradient solver based on error vector sampling for a sequence of linear systems.
Numer. Linear Algebra Appl., December, 2023
Numerical Behavior of Mixed Precision Iterative Refinement Using the BiCGSTAB Method.
J. Inf. Process., 2023
Subspace Correction Preconditioning for Solving a Sequence of Asymmetric Linear Systems Using the Bi-CGSTAB Method.
J. Inf. Process., 2023
A novel ILU preconditioning method with a block structure suitable for SIMD vectorization.
J. Comput. Appl. Math., 2023
2022
JSIAM Lett., 2022
Numerical Investigation into the Mixed Precision GMRES(m) Method Using FP64 and FP32.
J. Inf. Process., 2022
J. Inf. Process., 2022
Convergence Acceleration of Preconditioned CG Solver Based on Error Vector Sampling for a Sequence of Linear Systems.
CoRR, 2022
Distributed Parallel Tall-Skinny QR Factorization: Performance Evaluation of Various Algorithms on Various Systems.
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2022
2021
Accelerating the SpMV kernel on standard CPUs by exploiting the partially diagonal structures.
CoRR, 2021
2020
SIAM J. Sci. Comput., 2020
White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing.
CoRR, 2020
Hierarchical block multi-color ordering: a new parallel ordering method for vectorization and parallelization of the sparse triangular solver in the ICCG method.
CCF Trans. High Perform. Comput., 2020
An Integer Arithmetic-Based Sparse Linear Solver Using a GMRES Method and Iterative Refinement.
Proceedings of the 11th IEEE/ACM Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2020
Effect of Mixed Precision Computing on H-Matrix Vector Multiplication in BEM Analysis.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2020
2019
Enhancement of Algebraic Block Multi-Color Ordering for ILU Preconditioning and Its Performance Evaluation in Preconditioned GMRES Solver.
J. Inf. Process., 2019
An investigation into the impact of the structured QR kernel on the overall performance of the TSQR algorithm.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2019
2018
A Case Study on Modeling the Performance of Dense Matrix Computation: Tridiagonalization in the EigenExa Eigensolver on the K Computer.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018
2016
JSIAM Lett., 2016
On Constructing Cost Models for Online Automatic Tuning Using ATMathCoreLib: Case Studies through the SVD Computation on a Multicore Processor.
Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016
2015
Performance Analysis of the Chebyshev Basis Conjugate Gradient Method on the K Computer.
Proceedings of the Parallel Processing and Applied Mathematics, 2015
Proceedings of the Parallel Computing: On the Road to Exascale, 2015
Performance Evaluation of the Eigen Exa Eigensolver on Oakleaf-FX: Tridiagonalization Versus Pentadiagonalization.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
2014
Performance Analysis of the Householder-Type Parallel Tall-Skinny QR Factorizations Toward Automatic Algorithm Selection.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014
CholeskyQR2: a simple and communication-avoiding algorithm for computing a tall-skinny QR factorization on a large-scale parallel system.
Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014
2011
Acceleration of Hessenberg Reduction for Nonsymmetric Eigenvalue Problems in a Hybrid CPU-GPU Computing Environment.
Int. J. Netw. Comput., 2011
2010
Differential qd algorithm for totally nonnegative Hessenberg matrices: introduction of origin shifts and relationship with the discrete hungry Lotka-Volterra system.
JSIAM Lett., 2010
Dynamic Programming Approaches to Optimizing the Blocking Strategy for Basic Matrix Decompositions.
Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010
2009
Differential qd algorithm for totally nonnegative band matrices: convergence properties and error analysis.
JSIAM Lett., 2009
2008
A dynamic programming approach to optimizing the blocking strategy for the Householder QR decomposition.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008
2007
Accelerating the Singular Value Decomposition of Rectangular Matrices with the CSX600 and the Integrable SVD.
Proceedings of the Parallel Computing Technologies, 2007