Daisuke Takahashi

Proceedings of the Computational Science and Its Applications - ICCSA 2017, 2017

2016

Automatic Thread-Block Size Adjustment for Memory-Bound BLAS Kernels on GPUs.

Toshiyuki Imamura

Proceedings of the 10th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2016

Implementation of Multiple-Precision Floating-Point Arithmetic on Intel Xeon Phi Coprocessors.

Proceedings of the Computational Science and Its Applications - ICCSA 2016, 2016

Parallel Sparse Matrix-Vector Multiplication Using Accelerators.

Hiroshi Maeda

Proceedings of the Computational Science and Its Applications - ICCSA 2016, 2016

Automatic Tuning of Computation-Communication Overlap for Parallel 1-D FFT.

Proceedings of the 2016 IEEE Intl Conference on Computational Science and Engineering, 2016

2015

Fast Implementation of General Matrix-Vector Multiplication (GEMV) on Kepler GPUs.

Toshiyuki Imamura

Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

Performance Evaluation of Sparse Matrix-Vector Multiplication Using GPU/MIC Cluster.

Hiroshi Maeda

Proceedings of the Third International Symposium on Computing and Networking, 2015

2014

Virtual flow-net for accountability and forensics of computer and network systems.

Ke Meng

Secur. Commun. Networks, 2014

Massively parallel implementation of 3D-RISM calculation with volumetric 3D-FFT.

J. Comput. Chem., 2014

Performance evaluation of ultra-large-scale first-principles electronic structure calculation code on the K computer.

Int. J. High Perform. Comput. Appl., 2014

A study on application-aware power-saving control method for sensor stations in home gateway.

Shigeru Uchida

Eiichi Horiuchi

Proceedings of the IEEE 3rd Global Conference on Consumer Electronics, 2014

A study on application-aware QoS control in OSGi based home gateway.

Shigeru Uchida

Eiichi Horiuchi

Proceedings of the IEEE 3rd Global Conference on Consumer Electronics, 2014

2013

Highly scalable implementation of an N-body code on a GPU cluster.

Yohei Miki

Masao Mori

Comput. Phys. Commun., 2013

Using Quadruple Precision Arithmetic to Accelerate Krylov Subspace Methods on GPUs.

Proceedings of the Parallel Processing and Applied Mathematics, 2013

Optimization of Sparse Matrix-Vector Multiplication for CRS Format on NVIDIA Kepler Architecture GPUs.

Proceedings of the Computational Science and Its Applications - ICCSA 2013, 2013

Efficient Hybrid Breadth-First Search on GPUs.

Takaaki Hiragushi

Proceedings of the Algorithms and Architectures for Parallel Processing, 2013

A study on OSGi based home gateway employing application-aware QoS control.

Eiichi Horiuchi

Proceedings of the IEEE 2nd Global Conference on Consumer Electronics, 2013

Implementation of Parallel 1-D FFT on GPU Clusters.

Proceedings of the 16th IEEE International Conference on Computational Science and Engineering, 2013

Optimizing Objective Function Parameters for Strength in Computer Game-Playing.

Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012

Accountability using flow-net: design, implementation, and performance evaluation.

Ke Meng

Secur. Commun. Networks, 2012

A Fast Implementation and Performance Analysis of Collisionless N-body Code Based on GPGPU.

Yohei Miki

Masao Mori

Proceedings of the International Conference on Computational Science, 2012

Implementation of XcalableMP Device Acceleration Extention with OpenCL.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

Implementation and Evaluation of Triple Precision BLAS Subroutines on GPUs.

Proceedings of the 26th IEEE International Parallel and Distributed Processing Symposium Workshops & PhD Forum, 2012

An Implementation of Parallel 2-D FFT Using Intel AVX Instructions on Multi-core Processors.

Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

An Implementation of Parallel 1-D FFT on the K Computer.

Atsuya Uno

Mitsuo Yokokawa

Proceedings of the 14th IEEE International Conference on High Performance Computing and Communication & 9th IEEE International Conference on Embedded Software and Systems, 2012

Automatic Tuning of Sparse Matrix-Vector Multiplication for CRS Format on GPUs.

Hiroki Yoshizawa

Proceedings of the 15th IEEE International Conference on Computational Science and Engineering, 2012

2011

Wireless telemedicine and m-health: technologies, applications and research issues.

Int. J. Sens. Networks, 2011

First-principles calculations of electron states of a silicon nanowire with 100,000 atoms on the K computer.

Proceedings of the Conference on High Performance Computing Networking, 2011

Optimization of Sparse Matrix-Vector Multiplication by Auto Selecting Storage Schemes on GPU.

Yuji Kubota

Proceedings of the Computational Science and Its Applications - ICCSA 2011, 2011

2010

Parallel implementation of multiple-precision arithmetic and 2,576,980,370,000 decimal digits of pi calculation.

Parallel Comput., 2010

A massively-parallel electronic-structure calculations based on real-space density functional theory.

J. Comput. Phys., 2010

A Shogi Program Based on Monte-Carlo Tree Search.

Yoshikuni Sato

Reijer Grimbergen

J. Int. Comput. Games Assoc., 2010

IEEE 802.11 user fingerprinting and its applications for intrusion detection.

Yan Zhang

Periklis Chatzimisios

Hsiao-Hwa Chen

Comput. Math. Appl., 2010

Implementation and Evaluation of Quadruple Precision BLAS Functions on GPUs.

Proceedings of the Applied Parallel and Scientific Computing, 2010

Automatic Tuning for Parallel FFTs.

Proceedings of the Software Automatic Tuning, From Concepts to State-of-the-Art Results, 2010

2009

On a discrete optimal velocity model and its continuous and ultradiscrete relatives.

Junta Matsukidaira

JSIAM Lett., 2009

An Implementation of Parallel 3-D FFT with 2-D Decomposition on a Massively Parallel Cluster of Multi-core Processors.

Proceedings of the Parallel Processing and Applied Mathematics, 2009

2008

Retrieving knowledge from auditing log-files for computer and network forensics and accountability.

Secur. Commun. Networks, 2008

A parallel method for large sparse generalized eigenvalue problems using a GridRPC system.

Future Gener. Comput. Syst., 2008

Temperature-Aware Routing for Telemedicine Applications in Embedded Biomedical Sensor Networks.

EURASIP J. Wirel. Commun. Netw., 2008

On-Demand Anonymous Routing with Distance Vector Protecting Traffic Privacy in Wireless Multi-hop Networks.

Xiaoyan Hong

Proceedings of the MSN 2008, 2008

Complexity Analysis of Retrieving Knowledge from Auditing Log Files for Computer and Network Forensics and Accountability.

Proceedings of IEEE International Conference on Communications, 2008

2007

Telemedicine Usage and Potentials.

Fei Hu

Proceedings of the IEEE Wireless Communications and Networking Conference, 2007

A Parallel Algorithm for Multiple-Precision Division by a Single-Precision Integer.

Proceedings of the Large-Scale Scientific Computing, 6th International Conference, 2007

RI2N/UDP: High bandwidth and fault-tolerant network for a PC-cluster based on multi-link Ethernet.

Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

High Performance FFT on SGI Altix 3700.

Proceedings of the High Performance Computing and Communications, 2007

LTRT: Least Total-Route Temperature Routing for Embedded Biomedical Sensor Networks.

Fei Hu

Proceedings of the Global Communications Conference, 2007

2006

S12 - The HPC Challenge (HPCC) benchmark suite.

Proceedings of the ACM/IEEE SC2006 Conference on High Performance Networking and Computing, 2006

An Implementation of Parallel 1-D FFT Using SSE3 Instructions on Dual-Core Processors.

Proceedings of the Applied Parallel Computing. State of the Art in Scientific Computing, 2006

Profile-based optimization of power performance by using dynamic voltage scaling on a PC cluster.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

MegaProto/E: power-aware high-performance cluster with commodity technology.

Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

Performance Improvement by Data Management Layer in a Grid RPC System.

Proceedings of the Advances in Grid and Pervasive Computing, 2006

Empirical study on Reducing Energy of Parallel Programs using Slack Reclamation by DVFS in a Power-scalable High Performance Cluster.

Proceedings of the 2006 IEEE International Conference on Cluster Computing, 2006

PACS-CS: A Large-Scale Bandwidth-Aware PC Cluster for Scientific Computations.

Proceedings of the Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2006), 2006

Robust Posture Estimation of the Human Face in Rapid Lighting Changes using a 3-D Reference Picture.

Noriyoshi Okamoto

Proceedings of the Canadian Conference on Electrical and Computer Engineering, 2006

2005

An algorithm for multiple-precision floating-point multiplication.

Appl. Math. Comput., 2005

MegaProto: 1 TFlops/10kW Rack Is Feasible Even with Only Commodity Technology.

Proceedings of the ACM/IEEE SC2005 Conference on High Performance Networking and Computing, 2005

A Hybrid MPI/OpenMP Implementation of a Parallel 3-D FFT on SMP Clusters.

Proceedings of the Parallel Processing and Applied Mathematics, 2005

Computation of High-Precision Mathematical Constants in a Combined Cluster and Grid Environment.

Proceedings of the Large-Scale Scientific Computing, 5th International Conference, 2005

Design of a Software Distributed Shared Memory System using an MPI communication layer.

Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Low-cost High-bandwidth Tree Network for PC Clusters based on Tagged-VLAN Technology.

Proceedings of the 8th International Symposium on Parallel Architectures, 2005

Empirical Study for Optimization of Power-Performance with On-Chip Memory.

Proceedings of the High-Performance Computing - 6th International Symposium, 2005

MegaProto: A Low-Power and Compact Cluster for High-Performance Computing.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Grid Environment for Computational Astrophysics Driven by GRAPE-6 with HMCS-G and OmniRPC.

Proceedings of the 19th International Parallel and Distributed Processing Symposium (IPDPS 2005), 2005

Low Temperature Limit of Equations - Hidden Discrete Structure.

Proceedings of the CCA 2005, 2005

2004

A stochastic model for solitons.

Yoshiaki Itoh

Hosam M. Mahmoud

Random Struct. Algorithms, 2004

SCIMA-SMP: on-chip memory processor architecture for SMP.

Proceedings of the 3rd Workshop on Memory Performance Issues, 2004

Heterogeneous Remote Computing System for Computational Astrophysics with OmniRPC.

Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004

Performance Evaluation of OmniRPC in a Grid Environment.

Proceedings of the 2004 Symposium on Applications and the Internet Workshops (SAINT 2004 Workshops), 2004

An Implementation of Parallel 3-D FFT Using Short Vector SIMD Instructions on Clusters of PCs.

Proceedings of the Applied Parallel Computing, 2004

A Parallel Method for Large Sparse Generalized Eigenvalue Problems by OmniRPC in a Grid Environment.

Proceedings of the Applied Parallel Computing, 2004

Parallel Implementation of Strassen's Matrix Multiplication Algorithm for Heterogeneous Clusters.

Proceedings of the 18th International Parallel and Distributed Processing Symposium (IPDPS 2004), 2004

Implementation and performance evaluation of CONFLEX-G: grid-enabled molecular conformational space search program with OmniRPC.

Proceedings of the 18th Annual International Conference on Supercomputing, 2004

Formation of Dwarf Galaxies in Reionized Universe with Heterogeneous Multi-computer System.

Proceedings of the Computational Science, 2004

2003

A parallel 1-D FFT algorithm for the Hitachi SR8000.

Parallel Comput., 2003

Performance Evaluation of the Hitachi SR8000 Using SPEC OMP2001 Benchmarks.

Int. J. Parallel Program., 2003

An OpenMP Implementation of Parallel FFT and Its Performance on IA-64 Processors.

Proceedings of the OpenMP Shared Memory Parallel Programming, 2003

RI2N - Interconnection Network System for Clusters with Wide-Bandwidth and Fault-Tolerancy Based on Multiple Links.

Proceedings of the High Performance Computing, 5th International Symposium, 2003

A radix-16 FFT algorithm suitable for multiply-add instruction based on Goedecker method.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

OmniRPC: a Grid RPC System for Parallel Programming in Cluster and Grid Environment.

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

HMCS-G: Grid-enabled Hybrid Computing System for Computational Astrophysics.

Proceedings of the 3rd IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2003), 2003

2002

A Blocking Algorithm for Parallel 1-D FFT on Shared-Memory Parallel Computers.

Proceedings of the Applied Parallel Computing Advanced Scientific Computing, 2002

Performance Evaluation of the Hitachi SR8000 Using OpenMP Benchmarks.

Proceedings of the High Performance Computing, 4th International Symposium, 2002

A Blocking Algorithm for Parallel 1-D FFT on Clusters of PCs.

Proceedings of the Euro-Par 2002, 2002

2001

An extended split-radix FFT algorithm.

IEEE Signal Process. Lett., 2001

A Mixed-Radix Parallel Three-Dimensional FFT Algorithm on Clusters of Vector SMPs.

Proceedings of the Tenth SIAM Conference on Parallel Processing for Scientific Computing, 2001

A Blocking Algorithm for FFT on Cache-Based Processors.

Proceedings of the High-Performance Computing and Networking, 9th International Conference, 2001

2000

High-Performance Radix-2, 3 and 5 Parallel 1-D Complex FFT Algorithms for Distributed-Memory Parallel Computers.

Yasumasa Kanada

J. Supercomput., 2000

A fast algorithm for computing large Fibonacci numbers.

Inf. Process. Lett., 2000

A Parallel 3-D FFT Algorithm on Clusters of Vector SMPs.

Proceedings of the Applied Parallel Computing, 2000

A Performance Study on a Single Processing Node of the HITACHI SR8000.

Proceedings of the Numerical Analysis and Its Applications, 2000

Implementation of Multiple-Precision Parallel Division and Square Root on Distributed-Memory Parallel Computers.

Proceedings of the 2000 International Workshop on Parallel Processing, 2000

A new radix-6 FFT algorithm suitable for multiply-add instruction.

Proceedings of the IEEE International Conference on Acoustics, 2000

1999

Fast High-Precision Arithmetic on Distributed Memory Parallel Machines.
