Fumihiko Ino
Orcid: 0000-0002-5757-7631
According to our database, Fumihiko Ino authored at least 92 papers between 2001 and 2024.
Bibliography
2024
Knowl. Based Syst., 2024
Lazy Qubit Reordering for Accelerating Parallel State-Vector-based Quantum Circuit Simulation.
CoRR, 2024
2023
A compression-based memory-efficient optimization for out-of-core GPU stencil computation.
J. Supercomput., July, 2023
A Synergy between On- and Off-Chip Data Reuse for GPU-based Out-of-Core Stencil Computation.
CoRR, 2023
IEEE Access, 2023
PRF: A Fast Parallel Relaxed Flooding Algorithm for Voronoi Diagram Generation on GPU.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
2022
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2022
Proceedings of the International Joint Conference on Neural Networks, 2022
2021
Accelerating In-Transit Co-Processing for Scientific Simulations Using Region-Based Data-Driven Analysis.
Algorithms, 2021
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2021
Proceedings of the 12th International Symposium on Parallel Architectures, 2021
2020
A Data-Centric Directive-Based Framework to Accelerate Out-of-Core Stencil Computation on a GPU.
IEICE Trans. Inf. Syst., 2020
Concurr. Comput. Pract. Exp., 2020
Accelerating Human Genome Phenotypic Analysis with Bitwise Search and Batched Computation.
Proceedings of the 28th Euromicro International Conference on Parallel, 2020
Proceedings of the 9th International Conference on Software and Computer Applications, 2020
2019
PACC: a directive-based programming framework for out-of-core stencil computation on accelerators.
Int. J. High Perform. Comput. Netw., 2019
Memory Efficient Load Balancing for Distributed Large-Scale Volume Rendering Using a Two-Layered Group Structure.
IEICE Trans. Inf. Syst., 2019
IEICE Trans. Inf. Syst., 2019
GPU-based branch-and-bound method to solve large 0-1 knapsack problems with data-centric strategies.
Concurr. Comput. Pract. Exp., 2019
Transparent In-memory Cache Management in Apache Spark based on Post-Mortem Analysis.
Proceedings of the 2019 IEEE International Conference on Big Data (IEEE BigData), 2019
2018
Proceedings of the 11th Workshop on General Purpose Processing using GPUs, 2018
Proceedings of the VII International Conference on Network, Communication and Computing, 2018
An Automated Method for Generating Training Sets for Deep Learning based Image Registration.
Proceedings of the 11th International Joint Conference on Biomedical Engineering Systems and Technologies (BIOSTEC 2018), 2018
2017
IEEE Trans. Parallel Distributed Syst., 2017
IEICE Trans. Inf. Syst., 2017
CoRR, 2017
An Out-of-Core Branch and Bound Method for Solving the 0-1 Knapsack Problem on a GPU.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2017
Proceedings of the 2017 IEEE International Conference on Bioinformatics and Biomedicine, 2017
2016
Reducing memory usage by the lifting-based discrete wavelet transform with a unified buffer on a GPU.
J. Parallel Distributed Comput., 2016
Cache-Aware GPU Optimization for Out-of-Core Cone Beam CT Reconstruction of High-Resolution Volumes.
IEICE Trans. Inf. Syst., 2016
An Extension of OpenACC Directives for Out-of-Core Stencil Computation with Temporal Blocking.
Proceedings of the Third Workshop on Accelerator Programming Using Directives, 2016
Proceedings of the 24th Euromicro International Conference on Parallel, 2016
Towards Automating Multi-dimensional Data Decomposition for Executing a Single-GPU Code on a Multi-GPU System.
Proceedings of the Fourth International Symposium on Computing and Networking, 2016
2015
J. Parallel Distributed Comput., 2015
Enumerating Joint Weight of a Binary Linear Code Using Parallel Architectures: multi-core CPUs and GPUs.
Int. J. Netw. Comput., 2015
Accelerating the Smith-Waterman algorithm with interpair pruning and band optimization for the all-pairs comparison of base sequences.
BMC Bioinform., 2015
2014
Efficient Acceleration of Mutual Information Computation for Nonrigid Registration Using CUDA.
IEEE J. Biomed. Health Informatics, 2014
Int. J. Netw. Comput., 2014
Concurr. Comput. Pract. Exp., 2014
A Parallel Algorithm for Enumerating Joint Weight of a Binary Linear Code in Network Coding.
Proceedings of the Second International Symposium on Computing and Networking, 2014
2013
GPU-Chariot: A Programming Framework for Stream Applications Running on Multi-GPU Systems.
IEICE Trans. Inf. Syst., 2013
Proceedings of the First International Symposium on Computing and Networking, 2013
2012
IEEE Trans. Parallel Distributed Syst., 2012
Int. J. High Perform. Comput. Netw., 2012
Concurr. Comput. Pract. Exp., 2012
Proceedings of the 2012 International Conference on High Performance Computing & Simulation, 2012
Proceedings of the ARCS 2012 Workshops, 28. Februar - 2. März 2012, München, Germany, 2012
2011
Proceedings of the 19th International Euromicro Conference on Parallel, 2011
2010
Parallel Comput., 2010
Accelerating Smith-Waterman Algorithm for Biological Database Search on CUDA-Compatible GPUs.
IEICE Trans. Inf. Syst., 2010
Proceedings of the 2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010
2009
Parallel Process. Lett., 2009
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009
2008
A decompression pipeline for accelerating out-of-core volume rendering of time-varying data.
Comput. Graph., 2008
A Task Parallel Algorithm for Computing the Costs of All-Pairs Shortest Paths on the CUDA-Compatible GPU.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2008
Proceedings of the High Performance Computing, 2008
Design and implementation of the Smith-Waterman algorithm on the CUDA-compatible GPU.
Proceedings of the 8th IEEE International Conference on Bioinformatics and Bioengineering, 2008
2007
Parallel Adaptive Estimation of Hip Range of Motion for Total Hip Replacement Surgery.
IEICE Trans. Inf. Syst., 2007
Real-time rendering of time-varying volume data using a single COTS computer.
Proceedings of the GRAPP 2007, 2007
2006
Trace reduction for performance improvement assessment of message passing parallel programs.
Syst. Comput. Jpn., 2006
A parallel implementation of 2-D/3-D image registration for computer-assisted surgery.
Int. J. Bioinform. Res. Appl., 2006
IEICE Trans. Inf. Syst., 2006
Proceedings of the Frontiers of High Performance Computing and Networking, 2006
Proceedings of the Parallel and Distributed Processing and Applications, 2006
Proceedings of the Biological and Medical Data Analysis, 7th International Symposium, 2006
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006
Proceedings of the 4th International Conference on Computer Graphics and Interactive Techniques in Australasia and Southeast Asia 2006, Kuala Lumpur, Malaysia, November 29, 2006
2005
Parallel Comput., 2005
Performance Study of Nonrigid Registration Algorithm for Investigating Lung Disease on Clusters.
Proceedings of the Sixth International Conference on Parallel and Distributed Computing, 2005
Proceedings of the High Performance Computing, 2005
2004
High-performance computing service over the Internet for intraoperative image processing.
IEEE Trans. Inf. Technol. Biomed., 2004
IEICE Trans. Inf. Syst., 2004
Proceedings of the High Performance Computing for Computational Science, 2004
Proceedings of the Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2004, 2004
Parallel Volume Rendering with Early Ray Termination for Visualizing Large-Scale Datasets.
Proceedings of the Parallel and Distributed Processing and Applications, 2004
A Performance Analysis Tool for Performance Debugging of Message Passing Parallel Programs.
Proceedings of the 33rd International Conference on Parallel Processing Workshops (ICPP 2004 Workshops), 2004
2003
An improved binary-swap compositing for sort-last parallel rendering on distributed memory multiprocessors.
Parallel Comput., 2003
CoRR, 2003
Proceedings of the 2003 ACM Symposium on Applied Computing (SAC), 2003
Design and Implementation of Parallel Nonrigid Image Registration Using Off-the-Shelf Supercomputers.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention, 2003
A Divided-Screenwise Hierarchical Compositing for Sort-Last Parallel Volume Rendering.
Proceedings of the 17th International Parallel and Distributed Processing Symposium (IPDPS 2003), 2003
A High Performance Computing System for Medical Imaging in the Remote Operating Room.
Proceedings of the High Performance Computing - HiPC 2003, 10th International Conference, 2003
Proceedings of the Euro-Par 2003. Parallel Processing, 2003
A high-performance computing service over the Internet for nonrigid image registration.
Proceedings of the CARS 2003. Computer Assisted Radiology and Surgery. Proceedings of the 17th International Congress and Exhibition, 2003
2001
Proceedings of the 2001 ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP'01), 2001