Kentaro Sano
ORCID: 0000-0002-6681-4192
According to our database, Kentaro Sano authored at least 101 papers between 1997 and 2025.
Bibliography
2025
Multifacets of lossy compression for scientific data in the Joint-Laboratory of Extreme Scale Computing.
Future Gener. Comput. Syst., 2025
2024
Across Time and Space: Senju's Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAs.
ACM Trans. Reconfigurable Technol. Syst., June, 2024
ACM Trans. Reconfigurable Technol. Syst., June, 2024
Future Gener. Comput. Syst., 2024
Exploration of Trade-offs Between General-Purpose and Specialized Processing Elements in HPC-Oriented CGRA.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
RAW 2024 Invited Talk-6: Reconfigurable Architectures for High-Performance Computing.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
Proceedings of the IEEE International Conference on Consumer Electronics, 2024
HLS Implementation of a Building Cube Stencil Computation Framework for an FPGA Accelerator.
Proceedings of the IEEE International Conference on Consumer Electronics, 2024
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2024
Proceedings of the IEEE International Conference on Cluster Computing, 2024
Proceedings of the IEEE International Conference on Cluster Computing, 2024
2023
VCSN: Virtual Circuit-Switching Network for Flexible and Simple-to-Operate Communication in HPC FPGA Cluster.
ACM Trans. Reconfigurable Technol. Syst., June, 2023
IEEE Trans. Parallel Distributed Syst., May, 2023
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
Hardware Specialization: Estimating Monte Carlo Cross-Section Lookup Kernel Performance and Area.
Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, 2023
Achieving Scalable Quantum Error Correction with Union-Find on Systolic Arrays by Using Multi-Context Processing Elements.
Proceedings of the IEEE International Conference on Quantum Computing and Engineering, 2023
Novel Union-Find-based Decoders for Scalable Quantum Error Correction on Systolic Arrays.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Less for More: Reducing Intra-CGRA Connectivity for Higher Performance and Efficiency in HPC.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
ESSPER: Elastic and Scalable FPGA-Cluster System for High-Performance Reconfigurable Computing with Supercomputer Fugaku.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2023
Proceedings of the 13th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2023
Journal Track Paper ICFPT 2023: Across Time and Space: Senju's Approach for Scaling Iterative Stencil Loop Accelerators on Single and Multiple FPGAs.
Proceedings of the International Conference on Field Programmable Technology, 2023
Performance Modeling and Scalability Analysis of Stream Computing in ESSPER FPGA Clusters.
Proceedings of the International Conference on Field Programmable Technology, 2023
Senju: A Framework for the Design of Highly Parallel FPGA-based Iterative Stencil Loop Accelerators.
Proceedings of the 2023 ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2023
2022
The First International Workshop on Coarse-Grained Reconfigurable Architectures for High-Performance Computing (CGRA4HPC).
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
Exploration Framework for Synthesizable CGRAs Targeting HPC: Initial Design and Evaluation.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022
FPGA-Dedicated Network vs. Server Network for Pipelined Computing with Multiple FPGAs.
Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022
Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022
A SYCL-based high-level programming framework for HPC programmers to use remote FPGA clusters.
Proceedings of the HEART 2022: International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, Tsukuba, Japan, June 9, 2022
ESSPER: Elastic and Scalable System for High-Performance Reconfigurable Computing with Software-bridged APIs.
Proceedings of the International Conference on Field-Programmable Technology, 2022
Elastic Sample Filter: An FPGA-based Accelerator for Bayesian Network Structure Learning.
Proceedings of the International Conference on Field-Programmable Technology, 2022
Proceedings of the International Conference on Field-Programmable Technology, 2022
Proceedings of the IEEE International Conference on Cluster Computing, 2022
2021
Proceedings of the International Conference on Field-Programmable Technology, 2021
A memory bandwidth improvement with memory space partitioning for single-precision floating-point FFT on Stratix 10 FPGA.
Proceedings of the IEEE International Conference on Cluster Computing, 2021
Virtual Circuit-Switching Network with Flexible Topology for High-Performance FPGA Cluster.
Proceedings of the 32nd IEEE International Conference on Application-specific Systems, 2021
2020
White Paper from Workshop on Large-scale Parallel Numerical Computing Technology (LSPANC 2020): HPC and Computer Arithmetic toward Minimal-Precision Computing.
CoRR, 2020
A Survey on Coarse-Grained Reconfigurable Architectures From a Performance Perspective.
IEEE Access, 2020
Proceedings of the OpenMP: Portable Multi-Level Parallelism on Modern Systems, 2020
Performance Evaluation of Pipelined Communication Combined with Computation in OpenCL Programming on FPGA.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020
Performance Evaluation and Power Analysis of Teraflop-scale Fluid Simulation with Stratix 10 FPGA.
Proceedings of the FPGA '20: The 2020 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2020
Extending High-Level Synthesis with High-Performance Computing Performance Visualization.
Proceedings of the IEEE International Conference on Cluster Computing, 2020
A Template-based Framework for Exploring Coarse-Grained Reconfigurable Architectures.
Proceedings of the 31st IEEE International Conference on Application-specific Systems, 2020
Proceedings of the Applied Reconfigurable Computing. Architectures, Tools, and Applications, 2020
2019
IEICE Trans. Inf. Syst., 2019
Proceedings of the 2019 International Conference on ReConFigurable Computing and FPGAs, 2019
Crossbar Implementation with Partial Reconfiguration for Stream Switching Applications on an FPGA.
Proceedings of the Parallel Computing: Technology Trends, 2019
Hybrid Network Utilization for Efficient Communication in a Tightly Coupled FPGA Cluster.
Proceedings of the International Conference on Field-Programmable Technology, 2019
A software bridged data transfer on a FPGA cluster by using pipelining and InfiniBand verbs.
Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019
Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2019
Proceedings of the 24th IEEE International Conference on Emerging Technologies and Factory Automation, 2019
2018
IEICE Trans. Commun., 2018
High-productivity Programming and Optimization Framework for Stream Processing on FPGA.
Proceedings of the 9th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies, 2018
Proceedings of the International Conference on Field-Programmable Technology, 2018
Performance Analysis of Hardware-Based Numerical Data Compression on Various Data Formats.
Proceedings of the 2018 Data Compression Conference, 2018
Performance Estimation of Deeply Pipelined Fluid Simulation on Multiple FPGAs with High-speed Communication Subsystem.
Proceedings of the 29th IEEE International Conference on Application-specific Systems, 2018
2017
Bandwidth Compression of Floating-Point Numerical Data Streams for FPGA-Based High-Performance Computing.
ACM Trans. Reconfigurable Technol. Syst., 2017
FPGA-Based Scalable and Power-Efficient Fluid Simulation using Floating-Point DSP Blocks.
IEEE Trans. Parallel Distributed Syst., 2017
FPGA-based tsunami simulation: Performance comparison with GPUs, and roofline model for scalability analysis.
J. Parallel Distributed Comput., 2017
Design and scalability analysis of bandwidth-compressed stream computing with multiple FPGAs.
Proceedings of the 12th International Symposium on Reconfigurable Communication-centric Systems-on-Chip, 2017
FPGA-based Stream Computing for High-Performance N-Body Simulation using Floating-Point DSP Blocks.
Proceedings of the 8th International Symposium on Highly Efficient Accelerators and Reconfigurable Technologies, 2017
2016
Proceedings of the 24th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2016
2015
Stream Computation of Shallow Water Equation Solver for FPGA-based 1D Tsunami Simulation.
SIGARCH Comput. Archit. News, 2015
DSL-based Design Space Exploration for Temporal and Spatial Parallelism of Custom Stream Computing.
CoRR, 2015
2014
Multi-FPGA Accelerator for Scalable Stencil Computation with Constant Memory Bandwidth.
IEEE Trans. Parallel Distributed Syst., 2014
FPGA-based Custom Computing Architecture for Large-Scale Fluid Simulation with Building Cube Method.
SIGARCH Comput. Archit. News, 2014
Stream Processor Generator for HPC to Embedded Applications on FPGA-based System Platform.
CoRR, 2014
Bandwidth compression of multiple numerical data streams for high performance custom computing.
Proceedings of the IEEE 25th International Conference on Application-Specific Systems, 2014
2013
Efficient custom computing of fully-streamed lattice boltzmann method on tightly-coupled FPGA cluster.
SIGARCH Comput. Archit. News, 2013
Parallel and scalable custom computing for real-time fluid simulation on a cluster node with four tightly-coupled FPGAs.
Proceedings of the 23rd International Conference on Field programmable Logic and Applications, 2013
Parameterized Design and Evaluation of Bandwidth Compressor for Floating-Point Data Streams in FPGA-Based Custom Computing.
Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2013
2012
SIGARCH Comput. Archit. News, 2012
NII Shonan Meet. Rep., 2012
Multi-sensor location estimation for illegal cell-phone use in real-life indoor environment.
Proceedings of the IEEE International Conference on Communication Systems, 2012
Cooling efficiency aware workload placement using historical sensor data on IT-facility collaborative control.
Proceedings of the 2012 International Green Computing Conference, 2012
Scalability analysis of tightly-coupled FPGA-cluster for lattice Boltzmann computation.
Proceedings of the 22nd International Conference on Field Programmable Logic and Applications (FPL), 2012
Domain-Specific Language and Compiler for Stencil Computation on FPGA-Based Systolic Computational-Memory Array.
Proceedings of the Reconfigurable Computing: Architectures, Tools and Applications, 2012
2011
Domain-specific programmable design of scalable streaming-array for power-efficient stencil computation.
SIGARCH Comput. Archit. News, 2011
Proceedings of the 2011 International Conference on Field-Programmable Technology, 2011
Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth.
Proceedings of the IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, 2011
2010
FPGA-Array with Bandwidth-Reduction Mechanism for Scalable and Power-Efficient Numerical Simulations Based on Finite Difference Methods.
ACM Trans. Reconfigurable Technol. Syst., 2010
Prototype implementation of array-processor extensible over multiple FPGAs for scalable stencil computation.
SIGARCH Comput. Archit. News, 2010
Local-and-global stall mechanism for systolic computational-memory array on extensible multi-FPGA system.
Proceedings of the International Conference on Field-Programmable Technology, 2010
Segment-Parallel Predictor for FPGA-Based Hardware Compressor and Decompressor of Floating-Point Data Streams to Enhance Memory I/O Bandwidth.
Proceedings of the 2010 Data Compression Conference (DCC 2010), 2010
FPGA-based lossless compressors of floating-point data streams to enhance memory bandwidth.
Proceedings of the 21st IEEE International Conference on Application-specific Systems Architectures and Processors, 2010
2008
Scalable FPGA-array for high-performance and power-efficient computation based on difference schemes.
Proceedings of the 2008 Second International Workshop on High-Performance Reconfigurable Computing Technology and Applications, 2008
Evaluating power and energy consumption of FPGA-based custom computing machines for scientific floating-point computation.
Proceedings of the 2008 International Conference on Field-Programmable Technology, 2008
2007
Proceedings of the 2007 International Conference on Field-Programmable Technology, 2007
Proceedings of the IEEE Symposium on Field-Programmable Custom Computing Machines, 2007
2005
J. Adv. Comput. Intell. Intell. Informatics, 2005
2004
Parallel Comput., 2004
Differential coding scheme for efficient parallel image composition on a PC cluster system.
Parallel Comput., 2004
Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04), 2004
Parallel competitive learning algorithm for fast codebook design on partitioned space.
Proceedings of the 2004 IEEE International Conference on Cluster Computing (CLUSTER 2004), 2004
2003
A Comparison Study of Vector Quantization Codebook Design Algorithms based on the Equidistortion Principle.
Proceedings of the 21st IASTED International Multi-Conference on Applied Informatics (AI 2003), 2003
2002
Parallel Algorithm for the Law-of-the-Jungle Learning to the Fast Design of Optimal Codebooks.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2002
Hardware Support for Concurrent Execution of Loops Containing Loop-carried Data Dependences.
Proceedings of the International Conference on Parallel and Distributed Computing Systems, 2002
2001
Proceedings of the 19th International Conference on Computer Design (ICCD 2001), 2001
1997
Parallel processing of the shear-warp factorization with the binary-swap method on a distributed-memory multiprocessor system.
Proceedings of the IEEE Symposium on Parallel Rendering, 1997