Taisuke Boku

Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022

Implementation and Performance Evaluation of Memory System Using Addressable Cache for HPC Applications on HBM2 Equipped FPGAs.

Proceedings of the Euro-Par 2022: Parallel Processing Workshops, 2022

An FPGA-based Accelerator for Regular Path Queries over Edge-labeled Graphs.

Proceedings of the IEEE International Conference on Big Data, 2022

2021

A Highly-Efficient and Tightly-Connected Many-Core Overlay Architecture.

IEEE Access, 2021

High Resolution of City-Level Climate Simulation by GPU with Multi-physical Phenomena.

Proceedings of the Network and Parallel Computing, 2021

Performance Evaluation of OpenCL-Enabled Inter-FPGA Optical Link Communication Framework CIRCUS and SMI.

Proceedings of the HPC Asia 2021: The International Conference on High Performance Computing in Asia-Pacific Region, 2021

A Sorting Library for FPGA Implementation in OpenCL Programming.

Proceedings of the HEART '21: 11th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2021

An efficient RTL buffering scheme for an FPGA-accelerated simulation of diffuse radiative transfer.

Proceedings of the International Conference on Field-Programmable Technology, 2021

An FPGA-based storage control with load balancing.

Naoya Umezu

Proceedings of the IEEE International Conference on Cluster Computing, 2021

HBM2 Memory System for HPC Applications on an FPGA.

Proceedings of the IEEE International Conference on Cluster Computing, 2021

Multi-SPMD Programming Model with YML and XcalableMP.

Proceedings of the XcalableMP PGAS Programming Language, 2021

Hybrid-View Programming of Nuclear Fusion Simulation Code in XcalableMP.

Proceedings of the XcalableMP PGAS Programming Language, 2021

XcalableACC: An Integration of XcalableMP and OpenACC.

Proceedings of the XcalableMP PGAS Programming Language, 2021

2020

MYX: Runtime Correctness Analysis for Multi-Level Parallel Programming Paradigms.

Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020

Multi-Hybrid Accelerated Simulation by GPU and FPGA on Radiative Transfer Simulation in Astrophysics.

J. Inf. Process., 2020

White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing.

CoRR, 2020

OpenCL-enabled Parallel Raytracing for Astrophysical Application on Multiple FPGAs with Optical Links.

Proceedings of the 2020 IEEE/ACM International Workshop on Heterogeneous High-performance Reconfigurable Computing, 2020

Parallelized GPU Code of City-Level Large Eddy Simulation.

Proceedings of the 19th International Symposium on Parallel and Distributed Computing, 2020

Performance Evaluation of Pipelined Communication Combined with Computation in OpenCL Programming on FPGA.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

AsHES 2020 Keynote Speaker (5:30 pm CDT).

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Accelerating Radiative Transfer Simulation with GPU-FPGA Cooperative Computation.

Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020

Condensing an overload of parallel computing ingredients into a single architecture recipe.

Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020

2019

Evaluation of XcalableACC with tightly coupled accelerators/InfiniBand hybrid communication on accelerated cluster.

Int. J. High Perform. Comput. Appl., 2019

Implementation and evaluation of the HPC challenge benchmark in the XcalableMP PGAS language.

Int. J. High Perform. Comput. Appl., 2019

SALMON: Scalable Ab-initio Light-Matter simulator for Optics and Nanoscience.

Comput. Phys. Commun., 2019

Using FPGAs to Accelerate HPC and Data Analytics on Intel-Based Systems.

Proceedings of the High Performance Computing, 2019

MITRACA: A Next-Gen Heterogeneous Architecture.

Proceedings of the 13th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2019

GPU-FPGA Heterogeneous Computing with OpenCL-Enabled Direct Memory Access.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

Parallel Processing on FPGA Combining Computation and Communication in OpenCL Programming.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

Scalable communication performance prediction using auto-generated pseudo MPI event trace.

Miwako Tsuji

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2019

FPGA-based Implementation of Memory-Intensive Application using OpenCL.

Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019

MITRACA: Manycore Interlinked Torus Reconfigurable Accelerator Architecture.

Proceedings of the 30th IEEE International Conference on Application-specific Systems, 2019

2018

Performance Optimization and Evaluation of Scalable Optoelectronics Application on Large Scale KNL Cluster.

Proceedings of the High Performance Computing - 33rd International Conference, 2018

MACC: An OpenACC Transpiler for Automatic Multi-GPU Use.

Proceedings of the Supercomputing Frontiers - 4th Asian Conference, 2018

Trade-Off of Offloading to FPGA in OpenMP Task-Based Programming.

Proceedings of the Evolving OpenMP for Evolving Architectures, 2018

Performance and Scalability of Lightweight Multi-kernel Based Operating Systems.

Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium, 2018

Performance evaluation for a hydrodynamics application in XcalableACC PGAS language for accelerated clusters.

Proceedings of Workshops of HPC Asia 2018, 2018

Performance evaluation for omni XcalableMP compiler on many-core cluster system based on knights landing.

Proceedings of Workshops of HPC Asia 2018, 2018

Linkage of XcalableMP and Python languages for high productivity on HPC cluster system: application to graph order/degree problem.

Proceedings of Workshops of HPC Asia 2018, 2018

Multiple endpoints for improved MPI performance on a lattice QCD code.

Proceedings of Workshops of HPC Asia 2018, 2018

OpenCL-ready High Speed FPGA Network for Reconfigurable High Performance Computing.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

Scaling collectives on large clusters using Intel(R) architecture processors and fabric.

Proceedings of Workshops of HPC Asia 2018, 2018

Performance Evaluation of Large Scale Electron Dynamics Simulation under Many-core Cluster based on Knights Landing.

Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018

Accelerating Space Radiative Transfer on FPGA using OpenCL.

Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2018

2017

Runtime Correctness Checking for Emerging Programming Paradigms.

Proceedings of the First International Workshop on Software Correctness for HPC Applications, 2017

Thorough analysis of PCIe Gen3 communication.

Proceedings of the International Conference on ReConFigurable Computing and FPGAs, 2017

Mixed Precision Solver Scalable to 16000 MPI Processes for Lattice Quantum Chromodynamics Simulations on the Oakforest-PACS System.

Proceedings of the Fifth International Symposium on Computing and Networking, 2017

Implementing Lattice QCD Application with XcalableACC Language on Accelerated Cluster.

Proceedings of the 2017 IEEE International Conference on Cluster Computing, 2017

Implementation and Evaluation of One-sided PGAS Communication in XcalableACC for Accelerated Clusters.

Proceedings of the 17th IEEE/ACM International Symposium on Cluster, 2017

2016

Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP.

Parallel Comput., 2016

Implementation and Evaluation of NAS Parallel CG Benchmark on GPU Cluster with Proprietary Interconnect TCA.

Proceedings of the High Performance Computing for Computational Science - VECPAR 2016, 2016

Design and Preliminary Evaluation of Omni OpenACC Compiler for Massive MIMD Processor PEZY-SC.

Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016

Electron Dynamics Simulation with Time-Dependent Density Functional Theory on Large Scale Symmetric Mode Xeon Phi Cluster.

Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Performance evaluation of Stratix V DE5-Net FPGA board for high performance computing.

Iman Firmansyah

Proceedings of the 2016 International Conference on Computer, 2016

2015

Implementation of CG Method on GPU Cluster with Proprietary Interconnect TCA for GPU Direct Communication.

Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Hybrid Communication with TCA and InfiniBand on a Parallel Programming Language XcalableACC for GPU Clusters.

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Improving Strong-Scaling on GPU Cluster Based on Tightly Coupled Accelerators Architecture.

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Evaluation of FFT for GPU Cluster Using Tightly Coupled Accelerators Architecture.

Proceedings of the 2015 IEEE International Conference on Cluster Computing, 2015

Towards Unification of Accelerated Computing and Interconnection For Extreme-Scale Computing.

Proceedings of the Applied Reconfigurable Computing - 11th International Symposium, 2015

2014

PEACH2: An FPGA-based PCIe network device for Tightly Coupled Accelerators.

SIGARCH Comput. Archit. News, 2014

Massively-parallel electron dynamics calculations in real-time and real-space: Toward applications to nanostructures of more than ten-nanometers in size.

J. Comput. Phys., 2014

Performance evaluation of ultra-large-scale first-principles electronic structure calculation code on the K computer.

Int. J. High Perform. Comput. Appl., 2014

XcalableACC: extension of XcalableMP PGAS language using OpenACC for accelerator clusters.

Proceedings of the First Workshop on Accelerator Programming using Directives, 2014

Nuclear Fusion Simulation Code Optimization and Performance Evaluation on GPU Cluster.

Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Hybrid-view programming of nuclear fusion simulation code in the PGAS parallel programming language XcalableMP.

Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems, 2014

A Preliminarily Evaluation of PEACH3: A Switching Hub for Tightly Coupled Accelerators.

Proceedings of the Second International Symposium on Computing and Networking, 2014

QCD Library for GPU Cluster with Proprietary Interconnect for GPU Direct Communication.

Proceedings of the Euro-Par 2014: Parallel Processing Workshops, 2014

2013

Tightly Coupled Accelerators Architecture for Minimizing Communication Latency among Accelerators.

Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

Nuclear Fusion Simulation Code Optimization on GPU Clusters.

Proceedings of the 19th IEEE International Conference on Parallel and Distributed Systems, 2013

Adaptive Task Size Control on High Level Programming for GPU/CPU Work Sharing.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

Interconnection Network for Tightly Coupled Accelerators Architecture.

Proceedings of the IEEE 21st Annual Symposium on High-Performance Interconnects, 2013

Task level pipelining with PEACH2: An FPGA switching fabric for high performance computing.

Proceedings of the 2013 International Conference on Field-Programmable Technology, 2013

2012

Implementation of XcalableMP Device Acceleration Extension with OpenCL.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

GPU/CPU Work Sharing with Parallel Language XcalableMP-dev for Parallelized Accelerated Computing.

Proceedings of the 41st International Conference on Parallel Processing Workshops, 2012

Productivity and Performance of Global-View Programming with XcalableMP PGAS Language.

Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012

2011

Peach: A Multicore Communication System on Chip with PCI Express.

IEEE Micro, 2011

The International Exascale Software Project roadmap.

Bertrand Braunschweig

Int. J. High Perform. Comput. Appl., 2011

First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer.

Proceedings of the Conference on High Performance Computing Networking, 2011

An 80Gb/s dependable communication SoC with PCI express I/F and 8 CPUs.

Proceedings of the IEEE International Solid-State Circuits Conference, 2011

PEARL and PEACH: A Novel PCI Express Direct Link and Its Implementation.

Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

An Extension of XcalableMP PGAS Language for Multi-node GPU Clusters.

Proceedings of the Euro-Par 2011: Parallel Processing Workshops - CCPI, CGWS, HeteroPar, HiBB, HPCVirt, HPPC, HPSS, MDGS, ProPer, Resilience, UCHPC, VHPC, Bordeaux, France, August 29, 2011

Introduction.

Proceedings of the Euro-Par 2011 Parallel Processing - 17th International Conference, 2011

XMCAPI: Inter-core Communication Interface on Multi-chip Embedded Systems.

Proceedings of the IEEE/IFIP 9th International Conference on Embedded and Ubiquitous Computing, 2011

An 80 Gbps dependable multicore communication SoC with PCI express I/F and intelligent interrupt controller.

Proceedings of the 2011 IEEE Symposium on Low-Power and High-Speed Chips, 2011

2010

A massively-parallel electronic-structure calculations based on real-space density functional theory.

J. Comput. Phys., 2010

XcalableMP implementation and performance of NAS Parallel Benchmarks.

Proceedings of the Fourth Conference on Partitioned Global Address Space Programming Model, 2010

PEARL: Power-Aware, Dependable, and High-Performance Communication Link Using PCI Express.

Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

2009

Evaluation of Multicore Processors for Embedded Systems by Parallel Benchmark Program Using OpenMP.

Proceedings of the Evolving OpenMP in an Age of Extreme Parallelism, 2009

Towards an Open Dependable Operating System.

Proceedings of the 2009 IEEE International Symposium on Object/Component/Service-Oriented Real-Time Distributed Computing, 2009

RI2N/DRV: Multi-link ethernet for high-bandwidth and fault-tolerant network on PC clusters.

Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

Flexible Multi-link Ethernet Binding System for PC Clusters with Asymmetric Topology.

Proceedings of the 15th IEEE International Conference on Parallel and Distributed Systems, 2009

Using a cluster as a memory resource: A fast and large virtual memory on MPI.

Proceedings of the 2009 IEEE International Conference on Cluster Computing, August 31, 2009

2008

Integrating Computing Resources on Multiple Grid-Enabled Job Scheduling Systems Through a Grid RPC System.

J. Grid Comput., 2008

A dynamic routing control system for high-performance PC cluster with multi-path Ethernet connection.

Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

OpenMPD: A Directive-Based Data Parallel Language Extension for Distributed Memory Systems.

Jinpil Lee

Proceedings of the 37th International Conference on Parallel Processing, 2008

RI2N: High-bandwidth and fault-tolerant network with multi-link Ethernet for PC clusters.

Proceedings of the 2008 IEEE International Conference on Cluster Computing, 29 September, 2008

2007

Design and Implementation of OpenMPD: An OpenMP-Like Programming Language for Distributed Memory Systems.

Jinpil Lee

Proceedings of the A Practical Programming Model for the Multi-Core Era, 2007

RI2N/UDP: High bandwidth and fault-tolerant network for a PC-cluster based on multi-link Ethernet.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006

Storage challenge - High performance data analysis for particle physics using the Gfarm file system.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

MegaProto/E: power-aware high-performance cluster with commodity technology.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

A scalable communication layer for multi-dimensional hyper crossbar network using multiple gigabit ethernet.

Proceedings of the 20th Annual International Conference on Supercomputing, 2006

Performance Improvement by Data Management Layer in a Grid RPC System.

Proceedings of the Advances in Grid and Pervasive Computing, 2006

Empirical study on Reducing Energy of Parallel Programs using Slack Reclamation by DVFS in a Power-scalable High Performance Cluster.

Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

PACS-CS: A Large-Scale Bandwidth-Aware PC Cluster for Scientific Computations.

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

2005

MegaProto: 1 TFlops/10kW Rack Is Feasible Even with Only Commodity Technology.

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

Computation of High-Precision Mathematical Constants in a Combined Cluster and Grid Environment.

Proceedings of the Large-Scale Scientific Computing, 5th International Conference, 2005

Design of a Software Distributed Shared Memory System using an MPI communication layer.

Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Low-cost High-bandwidth Tree Network for PC Clusters based on Tagged-VLAN Technology.

Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Empirical Study for Optimization of Power-Performance with On-Chip Memory.

Proceedings of the High-Performance Computing - 6th International Symposium, 2005

MegaProto: A Low-Power and Compact Cluster for High-Performance Computing.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Grid Environment for Computational Astrophysics Driven by GRAPE-6 with HMCS-G and OmniRPC.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

2004

SCIMA-SMP: on-chip memory processor architecture for SMP.

Proceedings of the 3rd Workshop on Memory Performance Issues, 2004

The Second Trans-Pacific Grid Datafarm Testbed and Experiments for SC2003.

Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004

Heterogeneous Remote Computing System for Computational Astrophysics with OmniRPC.

Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004

Performance Evaluation of OmniRPC in a Grid Environment.

Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004

An Implementation of Parallel 3-D FFT Using Short Vector SIMD Instructions on Clusters of PCs.

Proceedings of the Applied Parallel Computing, 2004

Parallel Implementation of Strassen's Matrix Multiplication Algorithm for Heterogeneous Clusters.

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Implementation and performance evaluation of CONFLEX-G: grid-enabled molecular conformational space search program with OmniRPC.

Proceedings of the 18th Annual International Conference on Supercomputing, 2004

Formation of Dwarf Galaxies in Reionized Universe with Heterogeneous Multi-computer System.

Proceedings of the Computational Science, 2004

2003

Performance Evaluation of the Hitachi SR8000 Using SPEC OMP2001 Benchmarks.

Int. J. Parallel Program., 2003

An OpenMP Implementation of Parallel FFT and Its Performance on IA-64 Processors.

Proceedings of the OpenMP Shared Memory Parallel Programming, 2003

RI2N - Interconnection Network System for Clusters with Wide-Bandwidth and Fault-Tolerancy Based on Multiple Links.

Proceedings of the High Performance Computing, 5th International Symposium, 2003

OmniRPC: a Grid RPC System for Parallel Programming in Cluster and Grid Environment.

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

HMCS-G: Grid-enabled Hybrid Computing System for Computational Astrophysics.

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002

Heterogeneous Multi-Computer System: A New Paradigm of Parallel Processing.

Proceedings of the 2002 International Conference on Parallel Computing in Electrical Engineering (PARELEC 2002), 2002

Performance Evaluation of the Hitachi SR8000 Using OpenMP Benchmarks.

Proceedings of the High Performance Computing, 4th International Symposium, 2002

Heterogeneous multi-computer system: a new platform for multi-paradigm scientific simulation.

Proceedings of the 16th international conference on Supercomputing, 2002

A Blocking Algorithm for Parallel 1-D FFT on Clusters of PCs.

Proceedings of the Euro-Par 2002, 2002

2001

PIO: Parallel I/O System for Massively Parallel Processors.

Masazumi Matsubara

Ken'ichi Itakura

Proceedings of the High-Performance Computing and Networking, 9th International Conference, 2001

2000

Software Controlled Reconfigurable On-Chip Memory for High Performance Computing.

Hiroshi Nakamura

Masaaki Kondo

Proceedings of the Intelligent Memory Systems, Second International Workshop, 2000

SCIMA: Software Controlled Integrated Memory Architecture for High Performance Computing.

Proceedings of the IEEE International Conference On Computer Design: VLSI In Computers & Processors, 2000

1999

CP-PACS: A massively parallel processor at the University of Tsukuba.

Parallel Comput., 1999

Performance of lattice QCD programs on CP-PACS.

Parallel Comput., 1999

Commodity Network Based Parallel I/O System for Massively Parallel Processors.

Masazumi Matsubara

Hisataka Numa

Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, 1999

1998

Practical Simulation of Large-Scale Parallel Programs and Its Performance Analysis of the NAS Parallel Benchmarks.

Proceedings of the Euro-Par '98 Parallel Processing, 1998

1997

CP-PACS: A Massively Parallel Processor for Large Scale Scientific Calculations.

Proceedings of the 11th international conference on Supercomputing, 1997

Advanced processor design using hardware description language AIDL.

Proceedings of the ASP-DAC '97 Asia and South Pacific Design Automation Conference, 1997

1996

Adaptive routing technique on hypercrossbar network and its evaluation.

Syst. Comput. Jpn., 1996

1994

Evaluation of Pseudo Vector Processor Based on Slide-Windowed Registers.

Proceedings of the 27th Annual Hawaii International Conference on System Sciences (HICSS-27), 1994

1993

A Scalar Architecture for Pseudo Vector Processing Based on Slide-Windowed Registers.

Proceedings of the 7th international conference on Supercomputing, 1993

1991

NCC: A concurrent description language for scientific calculation on multiprocessors.

Syst. Comput. Jpn., 1991

1990

(SM)²-II: A Large-Scale Multiprocessor for Sparse Matrix Calculations.

Hideharu Amano

Tomohiro Kudoh

IEEE Trans. Computers, 1990

1988

IMPULSE: A High Performance Processing Unit for Multiprocessors for Scientific Calculation.
