Ichitaro Yamazaki
ORCID: 0000-0002-6196-2508
According to our database, Ichitaro Yamazaki authored at least 80 papers between 2006 and 2024.
Bibliography
2024
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
2023
CoRR, 2023
An Experimental Study of Two-level Schwarz Domain-Decomposition Preconditioners on GPUs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
2022
Parallel Comput., 2022
Numer. Linear Algebra Appl., 2022
High-Performance GMRES Multi-Precision Benchmark: Design, Performance, and Challenges.
Proceedings of the IEEE/ACM International Workshop on Performance Modeling, 2022
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2022
Proceedings of the 2022 IEEE International Parallel and Distributed Processing Symposium, 2022
2021
Two-Stage Gauss-Seidel Preconditioners and Smoothers for Krylov Solvers on a GPU cluster.
CoRR, 2021
CoRR, 2021
Exploiting Block Structures of KKT Matrices for Efficient Solution of Convex Optimization Problems.
IEEE Access, 2021
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021
2020
Concurr. Comput. Pract. Exp., 2020
Low-synchronization orthogonalization schemes for s-step and pipelined Krylov solvers in Trilinos.
Proceedings of the 2020 SIAM Conference on Parallel Processing for Scientific Computing, 2020
Performance Portable Supernode-based Sparse Triangular Solver for Manycore Architectures.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
2019
ACM Trans. Math. Softw., 2019
Parallel Comput., 2019
J. Inf. Process., 2019
Int. J. High Perform. Comput. Appl., 2019
Evaluation of Programming Models to Address Load Imbalance on Distributed Multi-Core CPUs: A Case Study with Block Low-Rank Factorization.
Proceedings of the 2019 IEEE/ACM Parallel Applications Workshop, Alternatives To MPI, 2019
Optimization of Numerous Small Dense-Matrix-Vector Multiplications in H-Matrix Arithmetic on GPU.
Proceedings of the 13th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2019
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
Increasing Accuracy of Iterative Refinement in Limited Floating-Point Arithmetic on Half-Precision Accelerators.
Proceedings of the 2019 IEEE High Performance Extreme Computing Conference, 2019
Proceedings of the Euro-Par 2019: Parallel Processing, 2019
2018
IEEE Trans. Parallel Distributed Syst., 2018
Supercomput. Front. Innov., 2018
The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale.
SIAM Rev., 2018
Proceedings of the Supercomputing Frontiers - 4th Asian Conference, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
2017
Design and Implementation of the PULSAR Programming System for Large Scale Computing.
Supercomput. Front. Innov., 2017
IEEE Embed. Syst. Lett., 2017
Concurr. Comput. Pract. Exp., 2017
Concurr. Comput. Pract. Exp., 2017
Improving Performance of GMRES by Reducing Communication and Pipelining Global Collectives.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017
Proceedings of the 2017 IEEE High Performance Extreme Computing Conference, 2017
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017
Scaling point set registration in 3D across thread counts on multicore and hardware accelerator platforms through autotuning for large scale analysis of scientific point clouds.
Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017
Proceedings of the Handbook of Big Data Technologies, 2017
2016
Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU.
ACM Trans. Math. Softw., 2016
Acta Numer., 2016
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
2015
Supercomput. Front. Innov., 2015
Computing Low-Rank Approximation of a Dense Matrix on Multicore CPUs with a GPU and Its Application to Solving a Hierarchically Semiseparable Linear System of Equations.
Sci. Program., 2015
Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs.
SIAM J. Sci. Comput., 2015
Concurr. Comput. Pract. Exp., 2015
Proceedings of the 6th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2015
Randomized algorithms to update partial singular value decomposition on a hybrid CPU/GPU cluster.
Proceedings of the International Conference for High Performance Computing, 2015
Performance of random sampling for computing low-rank approximations of a dense matrix on GPUs.
Proceedings of the International Conference for High Performance Computing, 2015
Proceedings of the Parallel Processing and Applied Mathematics, 2015
2014
SIAM J. Matrix Anal. Appl., 2014
Design and Implementation of a Large Scale Tree-Based QR Decomposition Using a 3D Virtual Systolic Array and a Lightweight Runtime.
Parallel Process. Lett., 2014
Tridiagonalization of a dense symmetric matrix on multiple GPUs and its application to symmetric eigenvalue problems.
Concurr. Comput. Pract. Exp., 2014
Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for Improving the Stability and Performance of CA-GMRES on GPUs.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2014 - 11th International Conference, Eugene, OR, USA, June 30, 2014
Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014
Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster.
Proceedings of the International Conference for High Performance Computing, 2014
Performance and portability with OpenCL for throughput-oriented HPC workloads across accelerators, coprocessors, and multicore processors.
Proceedings of the 5th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014
Proceedings of the 2014 IEEE International Conference on Big Data (IEEE BigData 2014), 2014
Proceedings of the Numerical Computations with GPUs, 2014
2013
Performance comparison of parallel eigensolvers based on a contour integral method and a Lanczos method.
Parallel Comput., 2013
On Partitioning and Reordering Problems in a Hierarchically Parallel Hybrid Linear Solver.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Implementing a Blocked Aasen's Algorithm with a Dynamic Scheduler on Multicore Architectures.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
2012
Proceedings of the International Conference on Computational Science, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
New Scheduling Strategies and Hybrid Programming for a Parallel Right-looking Sparse LU Factorization Algorithm on Multicore Cluster Systems.
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium, 2012
2011
A Communication-Avoiding Thick-Restart Lanczos Method on a Distributed-Memory System.
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
2010
ACM Trans. Math. Softw., 2010
On Techniques to Improve Robustness and Scalability of a Parallel Hybrid Linear Solver.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2010, 2010
2008
CompostBin: A DNA Composition-Based Algorithm for Binning Environmental Shotgun Reads.
Proceedings of the Research in Computational Molecular Biology, 2008
2006
Proceedings of the 2006 International Conference on Shape Modeling and Applications (SMI 2006), 2006