Kentaro Sano

Proceedings of the International Conference on Field Programmable Technology, 2023

Performance Modeling and Scalability Analysis of Stream Computing in ESSPER FPGA Clusters.

Proceedings of the International Conference on Field Programmable Technology, 2023

Senju: A Framework for the Design of Highly Parallel FPGA-based Iterative Stencil Loop Accelerators.

Emanuele Del Sozzo

Davide Conficconi

Marco D. Santambrogio

Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023

2022

The First International Workshop on Coarse-Grained Reconfigurable Architectures for High-Performance Computing (CGRA4HPC).

Artur Podobas

Jason Anderson

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

An Architecture-Independent CGRA Compiler enabling OpenMP Applications.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Exploration Framework for Synthesizable CGRAs Targeting HPC: Initial Design and Evaluation.

Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

FPGA-Dedicated Network vs. Server Network for Pipelined Computing with Multiple FPGAs.

Takaaki Miyajima

Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022

Stream Computation of 3D Approximate Convex Hulls with an FPGA.

Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022

A SYCL-based high-level programming framework for HPC programmers to use remote FPGA clusters.

Satoshi Kaneko

Hiroyuki Takizawa

Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022

ESSPER: Elastic and Scalable System for High-Performance Reconfigurable Computing with Software-bridged APIs.

Proceedings of the International Conference on Field-Programmable Technology, 2022

Elastic Sample Filter: An FPGA-based Accelerator for Bayesian Network Structure Learning.

Proceedings of the International Conference on Field-Programmable Technology, 2022

Exploring Inter-tile Connectivity for HPC-oriented CGRA with Lower Resource Usage.

Proceedings of the International Conference on Field-Programmable Technology, 2022

The Cost of Flexibility: Embedded versus Discrete Routers in CGRAs for HPC.

Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021

Efficient Queue-Balancing Switch for FPGAs.

Philippos Papaphilippou

Boma A. Adhi

Wayne Luk

Proceedings of the International Conference on Field-Programmable Technology, 2021

A memory bandwidth improvement with memory space partitioning for single-precision floating-point FFT on Stratix 10 FPGA.

Takaaki Miyajima

Proceedings of the IEEE International Conference on Cluster Computing, 2021

Virtual Circuit-Switching Network with Flexible Topology for High-Performance FPGA Cluster.

Atsushi Koshiba

Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021

2020

White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing.

CoRR, 2020

A Survey on Coarse-Grained Reconfigurable Architectures From a Performance Perspective.

Artur Podobas

Satoshi Matsuoka

IEEE Access, 2020

OpenMP Device Offloading to FPGAs Using the Nymble Infrastructure.

Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020

Performance Evaluation of Pipelined Communication Combined with Computation in OpenCL Programming on FPGA.

Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Performance Evaluation and Power Analysis of Teraflop-scale Fluid Simulation with Stratix 10 FPGA.

Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020

Extending High-Level Synthesis with High-Performance Computing Performance Visualization.

Proceedings of the IEEE International Conference on Cluster Computing, 2020

A Template-based Framework for Exploring Coarse-Grained Reconfigurable Architectures.

Artur Podobas

Satoshi Matsuoka

Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020

Comparison of Direct and Indirect Networks for High-Performance FPGA Clusters.

Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2020

2019

Scalability Analysis of Deeply Pipelined Tsunami Simulation with Multiple FPGAs.

IEICE Trans. Inf. Syst., 2019

A High Level Synthesis Approach for Application Specific DMA Controllers.

Proceedings of the 2019 International Conference on ReConFigurable Computing and FPGAs, 2019

Crossbar Implementation with Partial Reconfiguration for Stream Switching Applications on an FPGA.

Proceedings of the Parallel Computing: Technology Trends, 2019

Hybrid Network Utilization for Efficient Communication in a Tightly Coupled FPGA Cluster.

Proceedings of the International Conference on Field-Programmable Technology, 2019

A software bridged data transfer on a FPGA cluster by using pipelining and InfiniBand verbs.

Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019

Scaling Performance for N-Body Stream Computation with a Ring of FPGAs.

Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019

FPGA implementation of a robot control algorithm.

Proceedings of the 24th IEEE International Conference on Emerging Technologies and Factory Automation, 2019

2018

A Guide of Fingerprint Based Radio Emitter Localization Using Multiple Sensors.

IEICE Trans. Commun., 2018

High-productivity Programming and Optimization Framework for Stream Processing on FPGA.

Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2018

Enhancing Memory Bandwidth in a Single Stream Computation with Multiple FPGAs.

Antoniette Mondigo

Hiroyuki Takizawa

Proceedings of the International Conference on Field-Programmable Technology, 2018

Performance Analysis of Hardware-Based Numerical Data Compression on Various Data Formats.

Takashi Furusawa

Proceedings of the 2018 Data Compression Conference, 2018

Performance Estimation of Deeply Pipelined Fluid Simulation on Multiple FPGAs with High-speed Communication Subsystem.

Antoniette Mondigo

Hiroyuki Takizawa

Proceedings of the 29th IEEE International Conference on Application-specific Systems, 2018

Hardware Algorithms.

Hiroki Nakahara

Proceedings of the Principles and Structures of FPGAs, 2018

2017

Bandwidth Compression of Floating-Point Numerical Data Streams for FPGA-Based High-Performance Computing.

ACM Trans. Reconfigurable Technol. Syst., 2017

FPGA-Based Scalable and Power-Efficient Fluid Simulation using Floating-Point DSP Blocks.

IEEE Trans. Parallel Distributed Syst., 2017

FPGA-based tsunami simulation: Performance comparison with GPUs, and roofline model for scalability analysis.

J. Parallel Distributed Comput., 2017

Design and scalability analysis of bandwidth-compressed stream computing with multiple FPGAs.

Proceedings of the 12th International Symposium on Reconfigurable Communication-centric Systems-on-Chip, 2017

FPGA-based Stream Computing for High-Performance N-Body Simulation using Floating-Point DSP Blocks.

Shin Abiko

Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2017

2016

Parallelism for High-Performance Tsunami Simulation with FPGA: Spatial or Temporal?

Stanislav G. Sedukhin

Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016

2015

Stream Computation of Shallow Water Equation Solver for FPGA-based 1D Tsunami Simulation.

SIGARCH Comput. Archit. News, 2015

DSL-based Design Space Exploration for Temporal and Spatial Parallelism of Custom Stream Computing.

CoRR, 2015

2014

Multi-FPGA Accelerator for Scalable Stencil Computation with Constant Memory Bandwidth.

Yoshiaki Hatsuda

IEEE Trans. Parallel Distributed Syst., 2014

FPGA-based Custom Computing Architecture for Large-Scale Fluid Simulation with Building Cube Method.

SIGARCH Comput. Archit. News, 2014

Stream Processor Generator for HPC to Embedded Applications on FPGA-based System Platform.

CoRR, 2014

Bandwidth compression of multiple numerical data streams for high performance custom computing.

Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014

2013

Efficient custom computing of fully-streamed lattice boltzmann method on tightly-coupled FPGA cluster.

SIGARCH Comput. Archit. News, 2013

Parallel and scalable custom computing for real-time fluid simulation on a cluster node with four tightly-coupled FPGAs.

Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013

Parameterized Design and Evaluation of Bandwidth Compressor for Floating-Point Data Streams in FPGA-Based Custom Computing.

Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2013

2012

FPGA-based Connect6 solver with hardware-accelerated move refinement.

Yoshiaki Kono

SIGARCH Comput. Archit. News, 2012

The NII Shonan Configurable Computing Workshop (NII Shonan Meeting 2012-11).

Peter M. Athanas

Brad L. Hutchings

NII Shonan Meet. Rep., 2012

High-Performance Reconfigurable Computing.

Int. J. Reconfigurable Comput., 2012

Multi-sensor location estimation for illegal cell-phone use in real-life indoor environment.

Proceedings of the IEEE International Conference on Communication Systems, 2012

Cooling efficiency aware workload placement using historical sensor data on IT-facility collaborative control.

Proceedings of the 2012 International Green Computing Conference, 2012

Scalability analysis of tightly-coupled FPGA-cluster for lattice Boltzmann computation.

Yoshiaki Kono

Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012

Domain-Specific Language and Compiler for Stencil Computation on FPGA-Based Systolic Computational-Memory Array.

Luzhou Wang

Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2012

2011

Domain-specific programmable design of scalable streaming-array for power-efficient stencil computation.

Yoshiaki Hatsuda

SIGARCH Comput. Archit. News, 2011

SW and HW co-design of Connect6 accelerator with scalable streaming cores.

Proceedings of the 2011 International Conference on Field-Programmable Technology, 2011

Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth.

Yoshiaki Hatsuda

Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011

2010

FPGA-Array with Bandwidth-Reduction Mechanism for Scalable and Power-Efficient Numerical Simulations Based on Finite Difference Methods.

ACM Trans. Reconfigurable Technol. Syst., 2010

Prototype implementation of array-processor extensible over multiple FPGAs for scalable stencil computation.

Luzhou Wang

SIGARCH Comput. Archit. News, 2010

Local-and-global stall mechanism for systolic computational-memory array on extensible multi-FPGA system.

Luzhou Wang

Proceedings of the International Conference on Field-Programmable Technology, 2010

Segment-Parallel Predictor for FPGA-Based Hardware Compressor and Decompressor of Floating-Point Data Streams to Enhance Memory I/O Bandwidth.

Kazuya Katahira

Proceedings of the 2010 Data Compression Conference (DCC 2010), 2010

FPGA-based lossless compressors of floating-point data streams to enhance memory bandwidth.

Kazuya Katahira

Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010

2008

Scalable FPGA-array for high-performance and power-efficient computation based on difference schemes.

Proceedings of the 2008 Second International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2008

Evaluating power and energy consumption of FPGA-based custom computing machines for scientific floating-point computation.

Proceedings of the 2008 International Conference on Field-Programmable Technology, 2008

2007

FPGA-based Streaming Computation for Lattice Boltzmann Method.

Proceedings of the 2007 International Conference on Field-Programmable Technology, 2007

Systolic Architecture for Computational Fluid Dynamics on FPGAs.

Takanori Iizuka

Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, 2007

2005

A Competitive Learning Algorithm with Controlling Maximum Distortion.

J. Adv. Comput. Intell. Intell. Informatics, 2005

2004

Efficient parallel processing of competitive learning algorithms.

Parallel Comput., 2004

Differential coding scheme for efficient parallel image composition on a PC cluster system.

Yusuke Kobayashi

Tadao Nakamura

Parallel Comput., 2004

A Systolic Memory Architecture for Fast Codebook Design based on MMPDCL Algorithm.

Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04), 2004

Parallel competitive learning algorithm for fast codebook design on partitioned space.

Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004

2003

A Comparison Study of Vector Quantization Codebook Design Algorithms based on the Equidistortion Principle.

Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics (AI 2003), 2003

2002

Parallel Algorithm for the Law-of-the-Jungle Learning to the Fast Design of Optimal Codebooks.

Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2002

Hardware Support for Concurrent Execution of Loops Containing Loop-carried Data Dependences.

C. D. Lima