Taisuke Boku
Orcid: 0000-0001-8730-2228
According to our database, Taisuke Boku authored at least 163 papers between 1985 and 2024.
Bibliography
2024
Correction: Design and performance evaluation of UCX for the Tofu Interconnect D on Fugaku towards efficient multithreaded communication.
J. Supercomput., November, 2024
Design and performance evaluation of UCX for the Tofu Interconnect D on Fugaku towards efficient multithreaded communication.
J. Supercomput., September, 2024
Improving Performance on Replica-Exchange Molecular Dynamics Simulations by Optimizing GPU Core Utilization.
Proceedings of the 53rd International Conference on Parallel Processing, 2024
Using Intel oneAPI for Multi-hybrid Acceleration Programming with GPU and FPGA Coupling.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region Workshops, 2024
CHARM-SYCL & IRIS: A Tool Chain for Performance Portability on Extremely Heterogeneous Systems.
Proceedings of the 20th IEEE International Conference on e-Science, 2024
2023
ACM Trans. Reconfigurable Technol. Syst., March, 2023
OpenACC Unified Programming Environment for Multi-hybrid Acceleration with GPU and FPGA.
Proceedings of the High Performance Computing, 2023
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2023
Implementation and Performance Evaluation of Collective Communications Using CIRCUS on Multiple FPGAs.
Proceedings of the HPC Asia 2023 Workshops, 2023
Performance improvement by enhancing spatial parallelism on FPGA for HPC applications.
Proceedings of the IEEE International Conference on Cluster Computing, 2023
2022
Large-scale ab initio simulation of light-matter interaction at the atomic scale in Fugaku.
Int. J. High Perform. Comput. Appl., 2022
Proceedings of the 2022 International Symposium on VLSI Design, Automation and Test, 2022
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2022
Design and Performance Evaluation of UCX for Tofu-D Interconnect with OpenSHMEM-UCX on Fugaku.
Proceedings of the IEEE/ACM Parallel Applications Workshop: Alternatives To MPI+X, 2022
Workshop Proceedings of the 51st International Conference on Parallel Processing, 2022
Multi-hetero Acceleration by GPU and FPGA for Astrophysics Simulation on oneAPI Environment.
Proceedings of the HPC Asia 2022: International Conference on High Performance Computing in Asia-Pacific Region, Virtual Event, Japan, January 12, 2022
Performance Evaluation on GPU-FPGA Accelerated Computing Considering Interconnections between Accelerators.
Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022
Implementation and Performance Evaluation of Memory System Using Addressable Cache for HPC Applications on HBM2 Equipped FPGAs.
Proceedings of the Euro-Par 2022: Parallel Processing Workshops, 2022
Proceedings of the IEEE International Conference on Big Data, 2022
2021
IEEE Access, 2021
High Resolution of City-Level Climate Simulation by GPU with Multi-physical Phenomena.
Proceedings of the Network and Parallel Computing, 2021
Performance Evaluation of OpenCL-Enabled Inter-FPGA Optical Link Communication Framework CIRCUS and SMI.
Proceedings of the HPC Asia 2021: The International Conference on High Performance Computing in Asia-Pacific Region, 2021
Proceedings of the HEART '21: 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2021
An efficient RTL buffering scheme for an FPGA-accelerated simulation of diffuse radiative transfer.
Proceedings of the International Conference on Field-Programmable Technology, 2021
Proceedings of the IEEE International Conference on Cluster Computing, 2021
Proceedings of the IEEE International Conference on Cluster Computing, 2021
Proceedings of the XcalableMP PGAS Programming Language, 2021
Proceedings of the XcalableMP PGAS Programming Language, 2021
Proceedings of the XcalableMP PGAS Programming Language, 2021
2020
Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020
Multi-Hybrid Accelerated Simulation by GPU and FPGA on Radiative Transfer Simulation in Astrophysics.
J. Inf. Process., 2020
White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing.
CoRR, 2020
OpenCL-enabled Parallel Raytracing for Astrophysical Application on Multiple FPGAs with Optical Links.
Proceedings of the 2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2020
Proceedings of the 19th International Symposium on Parallel and Distributed Computing, 2020
Performance Evaluation of Pipelined Communication Combined with Computation in OpenCL Programming on FPGA.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020
Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020
Condensing an overload of parallel computing ingredients into a single architecture recipe.
Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020
2019
Evaluation of XcalableACC with tightly coupled accelerators/InfiniBand hybrid communication on accelerated cluster.
Int. J. High Perform. Comput. Appl., 2019
Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language.
Int. J. High Perform. Comput. Appl., 2019
Comput. Phys. Commun., 2019
Proceedings of the High Performance Computing, 2019
Proceedings of the 13th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2019
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019
Parallel Processing on FPGA Combining Computation and Communication in OpenCL Programming.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019
Scalable communication performance prediction using auto-generated pseudo MPI event trace.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2019
Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019
Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019
2018
Performance Optimization and Evaluation of Scalable Optoelectronics Application on Large Scale KNL Cluster.
Proceedings of the High Performance Computing - 33rd International Conference, 2018
Proceedings of the Supercomputing Frontiers - 4th Asian Conference, 2018
Proceedings of the Evolving OpenMP for Evolving Architectures, 2018
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018
Performance evaluation for a hydrodynamics application in XcalableACC PGAS language for accelerated clusters.
Proceedings of Workshops of HPC Asia 2018, 2018
Performance evaluation for omni XcalableMP compiler on many-core cluster system based on knights landing.
Proceedings of Workshops of HPC Asia 2018, 2018
Linkage of XcalableMP and Python languages for high productivity on HPC cluster system: application to graph order/degree problem.
Proceedings of Workshops of HPC Asia 2018, 2018
Proceedings of Workshops of HPC Asia 2018, 2018
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018
Scaling collectives on large clusters using Intel(R) architecture processors and fabric.
Proceedings of Workshops of HPC Asia 2018, 2018
Performance Evaluation of Large Scale Electron Dynamics Simulation under Many-core Cluster based on Knights Landing.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018
Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2018
2017
Proceedings of the First International Workshop on Software Correctness for HPC Applications, 2017
Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2017
Mixed Precision Solver Scalable to 16000 MPI Processes for Lattice Quantum Chromodynamics Simulations on the Oakforest-PACS System.
Proceedings of the Fifth International Symposium on Computing and Networking, 2017
Implementing Lattice QCD Application with XcalableACC Language on Accelerated Cluster.
Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017
Implementation and Evaluation of One-sided PGAS Communication in XcalableACC for Accelerated Clusters.
Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017
2016
Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP.
Parallel Comput., 2016
Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA.
Proceedings of the High Performance Computing for Computational Science - VECPAR 2016, 2016
Design and Preliminary Evaluation of Omni OpenACC Compiler for Massive MIMD Processor PEZY-SC.
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016
Electron Dynamics Simulation with Time-Dependent Density Functional Theory on Large Scale Symmetric Mode Xeon Phi Cluster.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016
Performance evaluation of Stratix V DE5-Net FPGA board for high performance computing.
Proceedings of the 2016 International Conference on Computer, 2016
2015
Implementation of CG Method on GPU Cluster with Proprietary Interconnect TCA for GPU Direct Communication.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
Hybrid Communication with TCA and InfiniBand on a Parallel Programming Language XcalableACC for GPU Clusters.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
Improving Strong-Scaling on GPU Cluster Based on Tightly Coupled Accelerators Architecture.
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015
Towards Unification of Accelerated Computing and Interconnection For Extreme-Scale Computing.
Proceedings of the Applied Reconfigurable Computing - 11th International Symposium, 2015
2014
SIGARCH Comput. Archit. News, 2014
Massively-parallel electron dynamics calculations in real-time and real-space: Toward applications to nanostructures of more than ten-nanometers in size.
J. Comput. Phys., 2014
Performance evaluation of ultra-large-scale first-principles electronic structure calculation code on the K computer.
Int. J. High Perform. Comput. Appl., 2014
XcalableACC: extension of XcalableMP PGAS language using OpenACC for accelerator clusters.
Proceedings of the First Workshop on Accelerator Programming using Directives, 2014
Nuclear Fusion Simulation Code Optimization and Performance Evaluation on GPU Cluster.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014
Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP.
Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014
A Preliminarily Evaluation of PEACH3: A Switching Hub for Tightly Coupled Accelerators.
Proceedings of the Second International Symposium on Computing and Networking, 2014
QCD Library for GPU Cluster with Proprietary Interconnect for GPU Direct Communication.
Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014
2013
Tightly Coupled Accelerators Architecture for Minimizing Communication Latency among Accelerators.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013
Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013
Proceedings of the Algorithms and Architectures for Parallel Processing, 2013
Proceedings of the IEEE 21st Annual Symposium on High-Performance Interconnects, 2013
Task level pipelining with PEACH2: An FPGA switching fabric for high performance computing.
Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013
2012
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
GPU/CPU Work Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing.
Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012
Productivity and Performance of Global-View Programming with XcalableMP PGAS Language.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012
2011
Int. J. High Perform. Comput. Appl., 2011
First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer.
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the IEEE International Solid-State Circuits Conference, 2011
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011
Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011
Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011
Proceedings of the IEEE/IFIP 9th International Conference on Embedded and Ubiquitous Computing, 2011
An 80 Gbps dependable multicore communication SoC with PCI express I/F and intelligent interrupt controller.
Proceedings of the 2011 IEEE Symposium on Low-Power and High-Speed Chips, 2011
2010
A massively-parallel electronic-structure calculations based on real-space density functional theory.
J. Comput. Phys., 2010
Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, 2010
PEARL: Power-Aware, Dependable, and High-Performance Communication Link Using PCI Express.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010
2009
Evaluation of Multicore Processors for Embedded Systems by Parallel Benchmark Program Using OpenMP.
Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009
Proceedings of the 2009 IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, 2009
RI2N/DRV: Multi-link ethernet for high-bandwidth and fault-tolerant network on PC clusters.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
Flexible Multi-link Ethernet Binding System for PC Clusters with Asymmetric Topology.
Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009
Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009
2008
Integrating Computing Resources on Multiple Grid-Enabled Job Scheduling Systems Through a Grid RPC System.
J. Grid Comput., 2008
A dynamic routing control system for high-performance PC cluster with multi-path Ethernet connection.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
OpenMPD: A Directive-Based Data Parallel Language Extension for Distributed Memory Systems.
Proceedings of the 37th International Conference on Parallel Processing, 2008
RI2N: High-bandwidth and fault-tolerant network with multi-link Ethernet for PC clusters.
Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008
2007
Design and Implementation of OpenMPD: An OpenMP-Like Programming Language for Distributed Memory Systems.
Proceedings of the A Practical Programming Model for the Multi-Core Era, 2007
RI2N/UDP: High bandwidth and fault-tolerant network for a PC-cluster based on multi-link Ethernet.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007
2006
Storage challenge - High performance data analysis for particle physics using the Gfarm file system.
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006
Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
A scalable communication layer for multi-dimensional hyper crossbar network using multiple gigabit ethernet.
Proceedings of the 20th Annual International Conference on Supercomputing, 2006
Proceedings of the Advances in Grid and Pervasive Computing, 2006
Empirical study on Reducing Energy of Parallel Programs using Slack Reclamation by DVFS in a Power-scalable High Performance Cluster.
Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006
Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006
2005
Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005
Computation of High-Precision Mathematical Constants in a Combined Cluster and Grid Environment.
Proceedings of the Large-Scale Scientific Computing, 5th International Conference, 2005
Design of a Software Distributed Shared Memory System using an MPI communication layer.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005
Low-cost High-bandwidth Tree Network for PC Clusters based on Tagged-VLAN Technology.
Proceedings of the 8th International Symposium on Parallel Architectures, 2005
Proceedings of the High-Performance Computing - 6th International Symposium, 2005
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
Grid Environment for Computational Astrophysics Driven by GRAPE-6 with HMCS-G and OmniRPC.
Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005
2004
Proceedings of the 3rd Workshop on Memory Performance Issues, 2004
Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004
Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004
Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004
An Implementation of Parallel 3-D FFT Using Short Vector SIMD Instructions on Clusters of PCs.
Proceedings of the Applied Parallel Computing, 2004
Parallel Implementation of Strassen's Matrix Multiplication Algorithm for Heterogeneous Clusters.
Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004
Implementation and performance evaluation of CONFLEX-G: grid-enabled molecular conformational space search program with OmniRPC.
Proceedings of the 18th Annual International Conference on Supercomputing, 2004
Formation of Dwarf Galaxies in Reionized Universe with Heterogeneous Multi-computer System.
Proceedings of the Computational Science, 2004
2003
Int. J. Parallel Program., 2003
Proceedings of the OpenMP Shared Memory Parallel Programming, 2003
RI2N - Interconnection Network System for Clusters with Wide-Bandwidth and Fault-Tolerancy Based on Multiple Links.
Proceedings of the High Performance Computing, 5th International Symposium, 2003
Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003
Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003
2002
Proceedings of the 2002 International Conference on Parallel Computing in Electrical Engineering (PARELEC 2002), 2002
Proceedings of the High Performance Computing, 4th International Symposium, 2002
Heterogeneous multi-computer system: a new platform for multi-paradigm scientific simulation.
Proceedings of the 16th international conference on Supercomputing, 2002
Proceedings of the Euro-Par 2002, 2002
2001
Proceedings of the High-Performance Computing and Networking, 9th International Conference, 2001
2000
Proceedings of the Intelligent Memory Systems, Second International Workshop, 2000
SCIMA: Software Controlled Integrated Memory Architecture for High Performance Computing.
Proceedings of the IEEE International Conference On Computer Design: VLSI In Computers & Processors, 2000
1999
Parallel Comput., 1999
Commodity Network Based Parallel I/O System for Massively Parallel Processors.
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999
1998
Practical Simulation of Large-Scale Parallel Programs and Its Performance Analysis of the NAS Parallel Benchmarks.
Proceedings of the Euro-Par '98 Parallel Processing, 1998
1997
Proceedings of the 11th international conference on Supercomputing, 1997
Proceedings of the ASP-DAC '97 Asia and South Pacific Design Automation Conference, 1997
1996
Syst. Comput. Jpn., 1996
1994
Evaluation of Pseudo Vector Processor Based on Slide-Windowed Registers.
Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994
1993
A Scalar Architecture for Pseudo Vector Processing Based on Slide-Windowed Registers.
Proceedings of the 7th international conference on Supercomputing, 1993
1991
NCC: A concurrent description language for scientific calculation on multiprocessors.
Syst. Comput. Jpn., 1991
1990
IEEE Trans. Computers, 1990
1988
IMPULSE: A High Performance Processing Unit for Multiprocessors for Scientific Calculation.
Proceedings of the 15th Annual International Symposium on Computer Architecture, 1988
1985
Proceedings of the 12th Annual Symposium on Computer Architecture, 1985