Naoya Maruyama
According to our database, Naoya Maruyama authored at least 65 papers between 2006 and 2024.
Bibliography
2024
Proceedings of the International Conference on Microelectronics, 2024
2021
The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs With Hybrid Parallelism.
IEEE Trans. Parallel Distributed Syst., 2021
Int. J. High Perform. Comput. Appl., 2021
2020
Proceedings of the Software for Exascale Computing - SPPEXA 2016-2019, 2020
2019
Preparation and optimization of a diverse workload for a large-scale heterogeneous system.
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the International Conference for High Performance Computing, 2019
Proceedings of the 2019 IEEE International Parallel and Distributed Processing Symposium, 2019
2018
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018
A Portability Layer of an All-pairs Operation for Hierarchical N-Body Algorithm Framework Tapas.
Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region, 2018
2017
IEEE Trans. Parallel Distributed Syst., 2017
Efficient Breadth-First Search on Massively Parallel and Distributed-Memory Machines.
Data Sci. Eng., 2017
Optimizations of Two Compute-Bound Scientific Kernels on the SW26010 Many-Core Processor.
Proceedings of the 46th International Conference on Parallel Processing, 2017
Proceedings of the 27th International Conference on Field Programmable Logic and Applications, 2017
2016
Int. J. High Perform. Comput. Appl., 2016
Proceedings of the International Conference for High Performance Computing, 2016
Proceedings of the International Conference for High Performance Computing, 2016
Proceedings of the OpenMP: Memory, Devices, and Tasks, 2016
Tapas: An Implicitly Parallel Programming Framework for Hierarchical N-Body Algorithms.
Proceedings of the 22nd IEEE International Conference on Parallel and Distributed Systems, 2016
A Directive-Based Data Layout Abstraction for Performance Portability of OpenACC Applications.
Proceedings of the 18th IEEE International Conference on High Performance Computing and Communications; 14th IEEE International Conference on Smart City; 2nd IEEE International Conference on Data Science and Systems, 2016
From FLOPS to BYTES: disruptive change in high-performance computing towards the post-moore era.
Proceedings of the ACM International Conference on Computing Frontiers, CF'16, 2016
Proceedings of the 2016 IEEE International Conference on Big Data (IEEE BigData 2016), 2016
2015
Proceedings of the 5th Workshop on Irregular Applications - Architectures and Algorithms, 2015
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015
Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, 2015
2014
Proceedings of the International Conference for High Performance Computing, 2014
Proceedings of the First Workshop on Accelerator Programming using Directives, 2014
Proceedings of the 21st European MPI Users' Group Meeting, 2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Proceedings of the 14th IEEE/ACM International Symposium on Cluster, 2014
2013
Fork-Join and Data-Driven Execution Models on Multi-core Architectures: Case Study of the FMM.
Proceedings of the Supercomputing - 28th International Supercomputing Conference, 2013
Improving the Computing Efficiency of HPC Systems Using a Combination of Proactive and Preventive Checkpointing.
Proceedings of the 27th IEEE International Symposium on Parallel and Distributed Processing, 2013
Proceedings of the 42nd International Conference on Parallel Processing, 2013
Proceedings of the Euro-Par 2013 Parallel Processing, 2013
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013
K MapReduce: A scalable tool for data-processing and search/ensemble applications on large-scale supercomputers.
Proceedings of the 2013 IEEE International Conference on Cluster Computing, 2013
CUDA vs OpenACC: Performance Case Studies with Kernel Benchmarks and a Memory-Bound CFD Application.
Proceedings of the 13th IEEE/ACM International Symposium on Cluster, 2013
2012
Proceedings of the High Performance Computing for Computational Science, 2012
Proceedings of the 2012 SC Companion: High Performance Computing, 2012
Proceedings of the SC Conference on High Performance Computing Networking, 2012
Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012
Proceedings of the Euro-Par 2012: Parallel Processing Workshops, 2012
Scalable Reed-Solomon-Based Reliable Local Storage for HPC Applications on IaaS Clouds.
Proceedings of the Euro-Par 2012 Parallel Processing - 18th International Conference, 2012
Proceedings of the 2012 IEEE International Conference on Cluster Computing, 2012
Design and Implementation of Portable and Efficient Non-blocking Collective Communication.
Proceedings of the 12th IEEE/ACM International Symposium on Cluster, 2012
2011
An exact algorithm for energy-efficient acceleration of task trees on CPU/GPU architectures.
Proceedings of SYSTOR 2011: The 4th Annual Haifa Experimental Systems Conference, Haifa, Israel, May 30, 2011
Peta-scale phase-field simulation for dendritic solidification on the TSUBAME 2.0 supercomputer.
Proceedings of the Conference on High Performance Computing Networking, 2011
Physis: an implicitly parallel programming model for stencil computations on large-scale GPU-accelerated supercomputers.
Proceedings of the Conference on High Performance Computing Networking, 2011
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, 2011
Proceedings of the Conference on High Performance Computing Networking, 2011
2010
Model-based Fault Localization: Finding Behavioral Outliers in Large-scale Computing Systems.
New Gener. Comput., 2010
An 80-Fold Speedup, 15.0 TFlops Full GPU Acceleration of Non-Hydrostatic Weather Model ASUCA Production Code.
Proceedings of the Conference on High Performance Computing Networking, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010
Proceedings of the 2010 International Conference on High Performance Computing, 2010
Proceedings of the International Green Computing Conference 2010, 2010
Proceedings of the 10th IEEE/ACM International Conference on Cluster, 2010
2009
Proceedings of the 9th IEEE/ACM International Symposium on Cluster Computing and the Grid, 2009
2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008
Proceedings of the 9th IEEE/ACM International Conference on Grid Computing (Grid 2008), Tsukuba, Japan, September 29, 2008
2007
Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing, 2007
Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2007), 2007
2006
Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006
Proceedings of the Frontiers of High Performance Computing and Networking, 2006