Koji Nakano

ORCID: 0000-0002-2040-4032

Affiliations:
  • Hiroshima University, Department of Information Engineering


According to our database, Koji Nakano authored at least 291 papers between 1992 and 2024.

Bibliography

2024
Generating hard quadratic unconstrained binary optimization instances via the method of combining bit reduction and duplication technique.
Int. J. Parallel Emergent Distributed Syst., September, 2024

Preface: Special issue on the Eleventh International Symposium on Networking and Computing.
Int. J. Netw. Comput., 2024

Designing Unit Ising Models for Logic Gate Simulation through Integer Linear Programming.
CoRR, 2024

Bit duplication technique to generate hard quadratic unconstrained binary optimization problems with adjustable sizes.
Concurr. Comput. Pract. Exp., 2024

The Logarithmic Random Bidding for the Parallel Roulette Wheel Selection with Precise Probabilities.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

Introduction to Computational Quantum Chemistry for Computer Scientists.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

APDCM 2024 Preface and Committee List.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024

2023
Preface: Special issue on the Tenth International Symposium on Networking and Computing.
Int. J. Netw. Comput., 2023

Dual-Matrix Domain-Wall: A Novel Technique for Generating Permutations by QUBO and Ising Models with Quadratic Sizes.
CoRR, 2023

Designing low-diameter interconnection networks with multi-ported host-switch graphs.
Concurr. Comput. Pract. Exp., 2023

GPU implementations of deflate encoding and decoding.
Concurr. Comput. Pract. Exp., 2023

Efficient parallel implementations to compute the diameter of a graph.
Concurr. Comput. Pract. Exp., 2023

A novel structured sparse fully connected layer in convolutional neural networks.
Concurr. Comput. Pract. Exp., 2023

High-throughput FPGA implementation for quadratic unconstrained binary optimization.
Concurr. Comput. Pract. Exp., 2023

Simple iterative trial search for the maximum independent set problem optimized for the GPUs.
Concurr. Comput. Pract. Exp., 2023

Graphics processing unit-accelerated high-quality watercolor painting image generation.
Concurr. Comput. Pract. Exp., 2023

Diverse Adaptive Bulk Search: a Framework for Solving QUBO Problems on Multiple GPUs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023

Solving the N-Queens Puzzle by a QUBO Model with Quadratic Size.
Proceedings of the Eleventh International Symposium on Computing and Networking, CANDAR 2023, Matsue, Japan, November 28, 2023

Efficient GPU-Accelerated Bulk Evaluation of the Boys Function for Quantum Chemistry.
Proceedings of the Eleventh International Symposium on Computing and Networking, CANDAR 2023, Matsue, Japan, November 28, 2023

2022
GPU-accelerated scalable solver with bit permutated cyclic-min algorithm for quadratic unconstrained binary optimization.
J. Parallel Distributed Comput., 2022

Preface: Special issue on the Ninth International Symposium on Networking and Computing.
Int. J. Netw. Comput., 2022

Graph-theoretic Formulation of QUBO for Scalable Local Search on GPUs.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

Optimal Triangulation on the High Bandwidth Memory Model.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2022

The Bonsai Hypothesis: An Efficient Network Pruning Technique.
Proceedings of the Artificial Intelligence Applications and Innovations, 2022

BERT-Based Scientific Paper Quality Prediction.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2022, 2022

ConvUNeXt: A Lightweight Convolutional Neural Network for Watercolor Image Translation.
Proceedings of the 2022 Tenth International Symposium on Computing and Networking, CANDAR 2022, 2022

A benchmark QUBO problem inspired by digital halftoning based on the human visual system.
Proceedings of the Tenth International Symposium on Computing and Networking, 2022

Bit duplication technique to generate hard QUBO problems.
Proceedings of the 2022 Tenth International Symposium on Computing and Networking, CANDAR 2022, 2022

A Bokeh Image Generation Technique using Machine Learning.
Proceedings of the Tenth International Symposium on Computing and Networking, 2022

Message from the Organizers: CANDAR 2022.
Proceedings of the Tenth International Symposium on Computing and Networking, 2022

2021
Preface: Special Issue on the Eighth International Symposium on Networking and Computing.
Int. J. Netw. Comput., 2021

Efficient implementations of Bloom filter using block RAMs and DSP slices on the FPGA.
Concurr. Comput. Pract. Exp., 2021

Tile art image generation using parallel greedy algorithm on the GPU and its approximation with machine learning.
Concurr. Comput. Pract. Exp., 2021

On the Computational Power of Convolution Pooling: A Theoretical Approach for Deep Learning.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2021

Acceleration of Deflate Encoding and Decoding with GPU implementations.
Proceedings of the Ninth International Symposium on Computing and Networking, 2021

Solving the sparse QUBO on multiple GPUs for Simulating a Quantum Annealer.
Proceedings of the Ninth International Symposium on Computing and Networking, 2021

A GPU Implementation of Watercolor Painting Image Generation.
Proceedings of the Ninth International Symposium on Computing and Networking, 2021

2020
Efficient convolution pooling on the GPU.
J. Parallel Distributed Comput., 2020

Preface: Special issue on the Seventh International Symposium on Networking and Computing.
Int. J. Netw. Comput., 2020

A Rabin-Karp Implementation for Handling Multiple Pattern-Matching on the GPU.
IEICE Trans. Inf. Syst., 2020

A Work-Time Optimal Parallel Exhaustive Search Algorithm for the QUBO and the Ising model, with GPU implementation.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

An Efficient Multicore CPU Implementation for Convolution-Pooling Computation in CNNs.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Workshop 10: APDCM Advances in Parallel and Distributed Computational Models.
Proceedings of the 2020 IEEE International Parallel and Distributed Processing Symposium Workshops, 2020

Adaptive Bulk Search: Solving Quadratic Unconstrained Binary Optimization Problems on Multiple GPUs.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Huffman Coding with Gap Arrays for GPU Acceleration.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020

Art Font Image Generation with Conditional Generative Adversarial Networks.
Proceedings of the Eighth International Symposium on Computing and Networking Workshops, 2020

Category-oriented Sentiment Polarity Dictionary for Rating Prediction of Japanese Hotels.
Proceedings of the Eighth International Symposium on Computing and Networking Workshops, 2020

Fully-Pipelined Architecture for Simulated Annealing-based QUBO Solver on the FPGA.
Proceedings of the Eighth International Symposium on Computing and Networking, 2020

Efficient GPU Implementation for Solving the Maximum Independent Set Problem.
Proceedings of the Eighth International Symposium on Computing and Networking, 2020

2019
Designing High-Performance Interconnection Networks with Host-Switch Graphs.
IEEE Trans. Parallel Distributed Syst., 2019

Preface: Special issue on the Sixth International Symposium on Networking and Computing.
Int. J. Netw. Comput., 2019

Accelerating the Smith-Waterman Algorithm Using the Bitwise Parallel Bulk Computation Technique on the GPU.
IEICE Trans. Inf. Syst., 2019

Bulk execution of the dynamic programming for the optimal polygon triangulation problem on the GPU.
Concurr. Comput. Pract. Exp., 2019

Efficient cuDNN-Compatible Convolution-Pooling on the GPU.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

Stained Glass Image Generation Using Voronoi Diagram and Its GPU Acceleration.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

Efficient Triangular Matrix Vector Multiplication on the GPU.
Proceedings of the Parallel Processing and Applied Mathematics, 2019

FIFO-Based Hardware Sorters for High Bandwidth Memory.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

Introduction to APDCM 2019.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium Workshops, 2019

The Degree Diameter Problem for Host-Switch Graphs.
Proceedings of the Seventh International Symposium on Computing and Networking Workshops, 2019

A Watercolor Painting Image Generation Using Stroke-Based Rendering.
Proceedings of the Seventh International Symposium on Computing and Networking Workshops, 2019

Efficient GPU Implementations to Compute the Diameter of a Graph.
Proceedings of the 2019 Seventh International Symposium on Computing and Networking, 2019

Structured Sparse Fully-Connected Layers in the CNNs and Its GPU Acceleration.
Proceedings of the Seventh International Symposium on Computing and Networking Workshops, 2019

Throughput-Optimal Hardware Implementation of LZW Decompression on the FPGA.
Proceedings of the Seventh International Symposium on Computing and Networking Workshops, 2019

Folded Bloom Filter for High Bandwidth Memory, with GPU Implementations.
Proceedings of the 2019 Seventh International Symposium on Computing and Networking, 2019

2018
Almost optimal column-wise prefix-sum computation on the GPU.
J. Supercomput., 2018

Preface: Special issue on the Fifth International Symposium on Computing and Networking.
Int. J. Netw. Comput., 2018

Introduction to APDCM 2018.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

An Optimal Parallel Algorithm for Computing the Summed Area Table on the GPU.
Proceedings of the 2018 IEEE International Parallel and Distributed Processing Symposium Workshops, 2018

Efficient Byte Stream Pattern Test using Bloom Filter with Rolling Hash Functions on the FPGA.
Proceedings of the Sixth International Symposium on Computing and Networking, 2018

A Prefix-Sum-Based Rabin-Karp Implementation for Multiple Pattern Matching on GPGPU.
Proceedings of the Sixth International Symposium on Computing and Networking, 2018

Tile Art Image Generation Using Conditional Generative Adversarial Networks.
Proceedings of the Sixth International Symposium on Computing and Networking, 2018

2017
An Efficient GPU Implementation of Bulk Computation of the Eigenvalue Problem for Many Small Real Non-symmetric Matrices.
Int. J. Netw. Comput., 2017

Preface: Special Issue on the Fourth International Symposium on Computing and Networking.
Int. J. Netw. Comput., 2017

GPU-accelerated Exhaustive Verification of the Collatz Conjecture.
Int. J. Netw. Comput., 2017

An Efficient GPU Implementation of CKY Parsing Using the Bitwise Parallel Bulk Computation Technique.
IEICE Trans. Inf. Syst., 2017

C2CU: a CUDA C program generator for bulk execution of a sequential algorithm.
Concurr. Comput. Pract. Exp., 2017

Accelerating digital halftoning using the local exhaustive search on the GPU.
Concurr. Comput. Pract. Exp., 2017

Adaptive loss-less data compression method optimized for GPU decompression.
Concurr. Comput. Pract. Exp., 2017

Algorithms and applications towards the convergence of high-end data-intensive and computing systems.
Concurr. Comput. Pract. Exp., 2017

A GPU Implementation of Bulk Execution of the Dynamic Programming for the Optimal Polygon Triangulation.
Proceedings of the Parallel Processing and Applied Mathematics, 2017

Almost Optimal Column-wise Prefix-sum Computation on the GPU.
Proceedings of the Parallel Processing and Applied Mathematics, 2017

Photomosaic Generation by Rearranging Subimages, with GPU Acceleration.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Introduction to APDCM Workshop.
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017

Order/Radix Problem: Towards Low End-to-End Latency Interconnection Networks.
Proceedings of the 46th International Conference on Parallel Processing, 2017

Simple and Fast Parallel Algorithms for the Voronoi Map and the Euclidean Distance Map, with GPU Implementations.
Proceedings of the 46th International Conference on Parallel Processing, 2017

A Hybrid Architecture for the Approximate String Matching on an FPGA.
Proceedings of the Fifth International Symposium on Computing and Networking, 2017

A Square Pointillism Image Generation, and Its GPU Acceleration.
Proceedings of the Fifth International Symposium on Computing and Networking, 2017

Single Kernel Soft Synchronization Technique for Task Arrays on CUDA-enabled GPUs, with Applications.
Proceedings of the Fifth International Symposium on Computing and Networking, 2017

2016
A character art generator using the local exhaustive search, with GPU acceleration.
Int. J. Parallel Emergent Distributed Syst., 2016

Efficient Implementation of FDFM Approach for Euclidean Algorithms on the FPGA.
Int. J. Netw. Comput., 2016

Preface: Special issue on the Third International Symposium on Computing and Networking.
Int. J. Netw. Comput., 2016

Bulk execution of Euclidean algorithms on the CUDA-enabled GPU.
Int. J. Netw. Comput., 2016

Fast Simulation of Conway's Game of Life Using Bitwise Parallel Bulk Computation on a GPU.
Int. J. Found. Comput. Sci., 2016

A Memory-Access-Efficient Implementation for Computing the Approximate String Matching Algorithm on GPUs.
IEICE Trans. Inf. Syst., 2016

An FPGA Implementation for a Flexible-Length-Arithmetic Processor Employing the FDFM Processor Core Approach.
IEICE Trans. Inf. Syst., 2016

GPU-Accelerated Bulk Execution of Multiple-Length Multiplication with Warp-Synchronous Programming Technique.
IEICE Trans. Inf. Syst., 2016

Fully Parallelized LZW Decompression for CUDA-Enabled GPUs.
IEICE Trans. Inf. Syst., 2016

An Efficient Implementation of LZW Decompression in the FPGA.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Bitwise Parallel Bulk Computation on the GPU, with Application to the CKY Parsing for Context-Free Grammars.
Proceedings of the 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, 2016

Randomly Optimized Grid Graph for Low-Latency Interconnection Networks.
Proceedings of the 45th International Conference on Parallel Processing, 2016

An Efficient Implementation of LZW Compression in the FPGA.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

Light Loss-Less Data Compression, with GPU Implementation.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

Deterministic Construction of Regular Geometric Graphs with Short Average Distance and Limited Edge Length.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2016

GPU-Accelerated Bulk Computation of the Eigenvalue Problem for Many Small Real Non-symmetric Matrices.
Proceedings of the Fourth International Symposium on Computing and Networking, 2016

A Memory-Access-Efficient Implementation of the Approximate String Matching Algorithm on GPU.
Proceedings of the Fourth International Symposium on Computing and Networking, 2016

Accelerating Ant Colony Optimization for the Vertex Coloring Problem on the GPU.
Proceedings of the Fourth International Symposium on Computing and Networking, 2016

A Hardware Sorter for Almost Sorted Sequences, with FPGA Implementations.
Proceedings of the Fourth International Symposium on Computing and Networking, 2016

2015
Preface: Special Issue on the Second International Symposium on Computing and Networking.
Int. J. Netw. Comput., 2015

Using Pulse/Tone Signals as an Alternative to Boost Channel Reservation on Directional Communications.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2015

Parallel FDFM Approach for Computing GCDs Using the FPGA.
Proceedings of the Parallel Processing and Applied Mathematics, 2015

A Parallel Algorithm for LZW Decompression, with GPU Implementation.
Proceedings of the Parallel Processing and Applied Mathematics, 2015

Optimality of Fundamental Parallel Algorithms on the Hierarchical Memory Machine, with GPU Implementation.
Proceedings of the 23rd Euromicro International Conference on Parallel, 2015

Optimal Parallel Hardware K-Sorter and Top K-Sorter, with FPGA Implementations.
Proceedings of the 14th International Symposium on Parallel and Distributed Computing, 2015

GPU-Accelerated Digital Halftoning by the Local Exhaustive Search.
Proceedings of the 14th International Symposium on Parallel and Distributed Computing, 2015

APDCM Introduction and Committees.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Bulk GCD Computation Using a GPU to Break Weak RSA Keys.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

Asterisk PBX Capacity Evaluation.
Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium Workshop, 2015

A Fast Approximate String Matching Algorithm on GPU.
Proceedings of the Third International Symposium on Computing and Networking, 2015

A Flexible-Length-Arithmetic Processor Based on FDFM Approach in FPGAs.
Proceedings of the Third International Symposium on Computing and Networking, 2015

Parallelization Techniques for Error Diffusion with GPU Implementations.
Proceedings of the Third International Symposium on Computing and Networking, 2015

A Warp-Synchronous Implementation for Multiple-Length Multiplication on the GPU.
Proceedings of the Third International Symposium on Computing and Networking, 2015

Fast LZW Compression Using a GPU.
Proceedings of the Third International Symposium on Computing and Networking, 2015

Efficient GPU Implementations for the Conway's Game of Life.
Proceedings of the Third International Symposium on Computing and Networking, 2015

2014
Accelerating ant colony optimisation for the travelling salesman problem on the GPU.
Int. J. Parallel Emergent Distributed Syst., 2014

Optimal implementations of the approximate string matching and the approximate discrete signal matching on the memory machine models.
Int. J. Parallel Emergent Distributed Syst., 2014

Simple memory machine models for GPUs.
Int. J. Parallel Emergent Distributed Syst., 2014

Implementations of the Hough Transform on the Embedded Multicore Processors.
Int. J. Netw. Comput., 2014

Asynchronous Memory Machine Models with Barrier Synchronization.
IEICE Trans. Inf. Syst., 2014

An Optimal Implementation of the Approximate String Matching on the Hierarchical Memory Machine, with Performance Evaluation on the GPU.
IEICE Trans. Inf. Syst., 2014

Offline Permutation on the CUDA-enabled GPU.
IEICE Trans. Inf. Syst., 2014

An Efficient Implementation of the Gradient-Based Hough Transform Using DSP Slices and Block RAMs on the FPGA.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Bulk Execution of Oblivious Algorithms on the Unified Memory Machine, with GPU Implementation.
Proceedings of the 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, 2014

Random Address Permute-Shift Technique for the Shared Memory on GPUs.
Proceedings of the 43rd International Conference on Parallel Processing Workshops, 2014

Parallel Algorithms for the Summed Area Table on the Asynchronous Hierarchical Memory Machine, with GPU implementations.
Proceedings of the 43rd International Conference on Parallel Processing, 2014

C2CU : A CUDA C Program Generator for Bulk Execution of a Sequential Algorithm.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

A GPU Implementation of Clipping-Free Halftoning Using the Direct Binary Search.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

GPU-Accelerated Verification of the Collatz Conjecture.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2014

An Efficient Implementation of the One-Dimensional Hough Transform Algorithm for Circle Detection on the FPGA.
Proceedings of the Second International Symposium on Computing and Networking, 2014

Thorough Evaluation of GPU Shared Memory Load and Store Instructions.
Proceedings of the Second International Symposium on Computing and Networking, 2014

A Time Optimal Parallel Algorithm for the Dynamic Programming on the Hierarchical Memory Machine.
Proceedings of the Second International Symposium on Computing and Networking, 2014

Theoretical Parallel Computing Models for GPU Computing.
Proceedings of the Open Problems in Mathematics and Computational Science, 2014

2013
Accelerating computation of Euclidean distance map using the GPU with efficient memory access.
Int. J. Parallel Emergent Distributed Syst., 2013

An FPGA implementation for neural networks with the FDFM processor core approach.
Int. J. Parallel Emergent Distributed Syst., 2013

Optimal Parallel Algorithms for Computing the Sum, the Prefix-Sums, and the Summed Area Table on the Memory Machine Models.
IEICE Trans. Inf. Syst., 2013

Offline Permutation Algorithms on the Discrete Memory Machine with Performance Evaluation on the GPU.
IEICE Trans. Inf. Syst., 2013

A GPU Implementation of Dynamic Programming for the Optimal Polygon Triangulation.
IEICE Trans. Inf. Syst., 2013

Efficient Hough Transform on the FPGA using DSP Slices and Block RAMs.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

The Hierarchical Memory Machine Model for GPUs.
Proceedings of the 2013 IEEE International Symposium on Parallel & Distributed Processing, 2013

An Optimal Offline Permutation Algorithm on the Hierarchical Memory Machine, with the GPU Implementation.
Proceedings of the 42nd International Conference on Parallel Processing, 2013

ASCII Art Generation Using the Local Exhaustive Search on the GPU.
Proceedings of the First International Symposium on Computing and Networking, 2013

The Random Address Shift to Reduce the Memory Access Congestion on the Discrete Memory Machine.
Proceedings of the First International Symposium on Computing and Networking, 2013

Sequential Memory Access on the Unified Memory Machine with Application to the Dynamic Programming.
Proceedings of the First International Symposium on Computing and Networking, 2013

TinyCSE: Tiny Computer System for Education.
Proceedings of the First International Symposium on Computing and Networking, 2013

A Flexible-Length-Arithmetic Processor Using Embedded DSP Slices and Block RAMs in FPGAs.
Proceedings of the First International Symposium on Computing and Networking, 2013

Template Matching Using DSP Slices on the FPGA.
Proceedings of the First International Symposium on Computing and Networking, 2013

The super warp architecture with random address shift.
Proceedings of the 20th Annual International Conference on High Performance Computing, 2013

2012
Preface.
Int. J. Netw. Comput., 2012

A Rewriting Approach to Replace Asynchronous ROMs with Synchronous Ones for the Circuits with Cycles.
Int. J. Netw. Comput., 2012

The Parallel FDFM Processor Core Approach for CRT-based RSA Decryption.
Int. J. Netw. Comput., 2012

Preface.
Int. J. Found. Comput. Sci., 2012

Accelerating the Dynamic Programming for the Optimal Polygon Triangulation on the GPU.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

An Optimal Parallel Prefix-Sums Algorithm on the Memory Machine Models for GPUs.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2012

An Efficient GPU Implementation of Ant Colony Optimization for the Traveling Salesman Problem.
Proceedings of the Third International Conference on Networking and Computing, 2012

Efficient Implementations of the Approximate String Matching on the Memory Machine Models.
Proceedings of the Third International Conference on Networking and Computing, 2012

An Implementation of Conflict-Free Offline Permutation on the GPU.
Proceedings of the Third International Conference on Networking and Computing, 2012

2011
Implementations of a Parallel Algorithm for Computing Euclidean Distance Map in Multicore Processors and GPUs.
Int. J. Netw. Comput., 2011

Efficient Exhaustive Verification of the Collatz Conjecture using DSP blocks of Xilinx FPGAs.
Int. J. Netw. Comput., 2011

An RSA Encryption Hardware Algorithm using a Single DSP Block and a Single Block RAM on the FPGA.
Int. J. Netw. Comput., 2011

An Efficient Parallel Sorting Compatible with the Standard Qsort.
Int. J. Found. Comput. Sci., 2011

Preface.
Int. J. Found. Comput. Sci., 2011

A Graph Rewriting Approach for Converting Asynchronous ROMs into Synchronous Ones.
IEICE Trans. Inf. Syst., 2011

CRT-Based DSP Decryption Using Montgomery Modular Multiplication on the FPGA.
Proceedings of the 25th IEEE International Symposium on Parallel and Distributed Processing, 2011

Fast and Accurate Template Matching Using Pixel Rearrangement on the GPU.
Proceedings of the Second International Conference on Networking and Computing, 2011

Accelerating the Dynamic Programming for the Matrix Chain Product on the GPU.
Proceedings of the Second International Conference on Networking and Computing, 2011

An Algorithm to Remove Asynchronous ROMs in Circuits with Cycles.
Proceedings of the Second International Conference on Networking and Computing, 2011

A GPU Implementation of Computing Euclidean Distance Map with Efficient Memory Access.
Proceedings of the Second International Conference on Networking and Computing, 2011

Fast Ellipse Detection Algorithm Using Hough Transform on the GPU.
Proceedings of the Second International Conference on Networking and Computing, 2011

The Parallel FDFM Processor Core Approach for Neural Networks.
Proceedings of the Second International Conference on Networking and Computing, 2011

2010
Halftoning via Error Diffusion using Circular Dot-overlap Model.
J. Digit. Content Technol. its Appl., 2010

Low-Latency Connected Component Labeling Using an FPGA.
Int. J. Found. Comput. Sci., 2010

Deafness Resilient MAC Protocol for Directional Communications.
IEICE Trans. Inf. Syst., 2010

Efficient exhaustive verification of the Collatz conjecture using DSP48E blocks of Xilinx Virtex-5 FPGAs.
Proceedings of the 24th IEEE International Symposium on Parallel and Distributed Processing, 2010

Efficient Canny Edge Detection Using a GPU.
Proceedings of the First International Conference on Networking and Computing, 2010

A Rewriting Algorithm to Generate AROM-free Fully Synchronous Circuits.
Proceedings of the First International Conference on Networking and Computing, 2010

Implementations of Parallel Computation of Euclidean Distance Map in Multicore Processors and GPUs.
Proceedings of the First International Conference on Networking and Computing, 2010

A Perspective on the Experiential Learning of Computer Architecture.
Proceedings of the 2010 IEEE/ACM Int'l Conference on Green Computing and Communications, 2010

2009
Special issue on Advances in Parallel and Distributed Computational Models.
Int. J. Parallel Emergent Distributed Syst., 2009

Clipping-Free Halftoning and Multitoning Using the Direct Binary Search.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2009

A Simple Parallel Convex Hulls Algorithm for Sorted Points and the Performance Evaluation on the Multicore Processors.
Proceedings of the 2009 International Conference on Parallel and Distributed Computing, 2009

A Hardware-Software Cooperative Approach for the Exhaustive Verification of the Collatz Conjecture.
Proceedings of the IEEE International Symposium on Parallel and Distributed Processing with Applications, 2009

RSA encryption and decryption using the redundant number system on the FPGA.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

A distributed approach for the problem of routing and wavelength assignment in WDM networks.
Proceedings of the 23rd IEEE International Symposium on Parallel and Distributed Processing, 2009

2008
A New FM Screening Method to Generate Cluster-Dot Binary Images Using the Local Exhaustive Search with FPGA Acceleration.
Int. J. Found. Comput. Sci., 2008

Redundant Radix-2r Number System for Accelerating Arithmetic Operations on the FPGAs.
Proceedings of the Ninth International Conference on Parallel and Distributed Computing, 2008

Optimized Component Labeling Algorithm for Using in Medium Sized FPGAs.
Proceedings of the Ninth International Conference on Parallel and Distributed Computing, 2008

Component labeling for k-concave binary images using an FPGA.
Proceedings of the 22nd IEEE International Symposium on Parallel and Distributed Processing, 2008

Processor, Assembler, and Compiler Design Education Using an FPGA.
Proceedings of the 14th International Conference on Parallel and Distributed Systems, 2008

Accelerating Montgomery Modulo Multiplication for Redundant Radix-64k Number System on the FPGA Using Dual-Port Block RAMs.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

A Tiny Processing System for Education and Small Embedded Systems on the FPGAs.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

MAC Layer Misbehavior on Ad Hoc Networks.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

The Impact of Backup Routes on the Routing and Wavelength Assignment Problem in WDM Networks.
Proceedings of the 2008 IEEE/IFIP International Conference on Embedded and Ubiquitous Computing (EUC 2008), 2008

An Error Diffusion Based Algorithm for Hiding an Image in Distinct Two Images.
Proceedings of the International Conference on Computer Science and Software Engineering, 2008

2007
Fundamental Algorithms on the Reconfigurable Mesh.
Proceedings of the Handbook of Parallel Computing - Models, Algorithms and Applications., 2007

Efficient Hardware Algorithms for n Choose k Counters Using the Bitonic Merger.
Int. J. Found. Comput. Sci., 2007

Special Section on Parallel/Distributed Processing and Systems.
IEICE Trans. Inf. Syst., 2007

Randomized Initialization on the 1-Dimensional Reconfigurable Mesh.
Proceedings of the Eighth International Conference on Parallel and Distributed Computing, 2007

Proteus: An Architecture for Adapting Web Page on Small-Screen Devices.
Proceedings of the Network and Parallel Computing, IFIP International Conference, 2007

Cluster-dot Screening by Local Exhaustive Search with Hardware Accelaration.
Proceedings of the 21st International Parallel and Distributed Processing Symposium (IPDPS 2007), 2007

2006
Preface.
Int. J. Found. Comput. Sci., 2006

Special Section on Challenges in Ad-hoc and Multi-hop Wireless Communications.
IEICE Trans. Inf. Syst., 2006

An Energy Efficient Ranking Protocol for Radio Networks.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2006

An Energy Efficient Leader Election Protocol for Radio Network with a Single Transceiver.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2006

Randomized Leader Election Protocols in Noisy Radio Networks with a Single Transceiver.
Proceedings of the Parallel and Distributed Processing and Applications, 2006

Limiting the Effects of Deafness and Hidden Terminal Problems in Directional Communications.
Proceedings of the Parallel and Distributed Processing and Applications, 2006

Efficient hardware algorithms for n choose k counters.
Proceedings of the 20th International Parallel and Distributed Processing Symposium (IPDPS 2006), 2006

2005
FM Screening By The Local Exhaustive Search, With Hardware Acceleration.
Int. J. Found. Comput. Sci., 2005

Foreword.
Int. J. Found. Comput. Sci., 2005

Hardware n Choose k Counters with Applications to the Partial Exhaustive Search.
IEICE Trans. Inf. Syst., 2005

Adaptive Carrier Sensing and Packet Sending - An Alternative to Boost the Performance in Directional Communications.
Proceedings of the Sixth International Conference on Parallel and Distributed Computing, 2005

2004
Preface.
Int. J. Found. Comput. Sci., 2004

Time And Energy Optimal List Ranking Algorithms On The K-Channel Broadcast Communication Model With No Collision Detection.
Int. J. Found. Comput. Sci., 2004

Instance-Specific Solutions For Accelerating The CKY Parsing Of Large Context-Free Grammars.
Int. J. Found. Comput. Sci., 2004

Foreword.
IEICE Trans. Inf. Syst., 2004

Foreword.
IEICE Trans. Inf. Syst., 2004

Fundamental Protocols to Gather Information in Wireless Sensor Networks.
Proceedings of the Handbook of Sensor Networks, 2004

2003
An Efficient Parallel Prefix Sums Architecture with Domino Logic.
IEEE Trans. Parallel Distributed Syst., 2003

A time-optimal solution for the path cover problem on cographs.
Theor. Comput. Sci., 2003

The LD and DLAD Bio-Operations on Formal Languages.
J. Autom. Lang. Comb., 2003

Linear Layout of Generalized Hypercubes.
Int. J. Found. Comput. Sci., 2003

Sorting on Single-Channel Wireless Sensor Networks.
Int. J. Found. Comput. Sci., 2003

Randomized Time- and Energy-Optimal Routing in Single-Hop, Single-Channel Radio Networks.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2003

Instance-Specific Solutions to Accelerate the CKY Parsing.
Proceedings of the International Conference on Engineering of Reconfigurable Systems and Algorithms, June 23, 2003

An image retrieval system using FPGAs.
Proceedings of the 2003 Asia and South Pacific Design Automation Conference, 2003

2002
An Algorithm Visualization Tool on the Reconfigurable Mesh.
VLSI Design, 2002

Guest Editors' Introduction to Special Section on Mobile Computing and Wireless Networks.
IEEE Trans. Parallel Distributed Syst., 2002

Energy-Efficient Routing in the Broadcast Communication Model.
IEEE Trans. Parallel Distributed Syst., 2002

Uniform Leader Election Protocols for Radio Networks.
IEEE Trans. Parallel Distributed Syst., 2002

Identifying Faulty Nodes in Wireless Sensor Networks.
J. Interconnect. Networks, 2002

Doubly-Logarithmic Energy-Efficient Initialization Protocols for Single-Hop Radio Networks.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2002

Fundamental Protocols to Gather Information in Wireless Sensor Networks.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2002

An Energy-Efficient Initialization Protocol for Wireless Sensor Networks with No Collision Detection.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2002

A Survey on Leader Election Protocols for Radio Networks.
Proceedings of the International Symposium on Parallel Architectures, 2002

Workshop Introduction.
Proceedings of the 16th International Parallel and Distributed Processing Symposium (IPDPS 2002), 2002

An Optimal Randomized Ranking Algorithm on the k-channel Broadcast Communication Model.
Proceedings of the 31st International Conference on Parallel Processing (ICPP 2002), 2002

Accelerating the CKY Parsing Using FPGAs.
Proceedings of the High Performance Computing, 2002

Time and Energy Optimal List Ranking Algorithms on the k-Channel Broadcast Communication Model.
Proceedings of the Computing and Combinatorics, 8th Annual International Conference, 2002

2001
Energy-Efficient Permutation Routing in Radio Networks.
IEEE Trans. Parallel Distributed Syst., 2001

Optimal Algorithms for the Multiple Query Problem on Reconfigurable Meshes, with Applications.
IEEE Trans. Parallel Distributed Syst., 2001

Fundamental Protocols on Wireless Sensor Networks.
Proceedings of the 15th International Parallel & Distributed Processing Symposium (IPDPS-01), 2001

An Energy-Efficient Initialization Protocol for Wireless Sensor Networks.
Proceedings of the 30th International Workshops on Parallel Processing (ICPP 2001 Workshops), 2001

2000
Energy-Efficient Initialization Protocols for Single-Hop Radio Networks with No Collision Detection.
IEEE Trans. Parallel Distributed Syst., 2000

Randomized Initialization Protocols for Ad Hoc Networks.
IEEE Trans. Parallel Distributed Syst., 2000

Scalable Hardware-Algorithms for Binary Prefix Sums.
IEEE Trans. Parallel Distributed Syst., 2000

A randomized leader election protocol for ad-hoc networks.
Proceedings of the SIROCCO 7, 2000

Randomized Leader Election Protocols in Radio Networks with No Collision Detection.
Proceedings of the Algorithms and Computation, 11th International Conference, 2000

Workshop on Advances in Parallel and Distributed Computational Models.
Proceedings of the Parallel and Distributed Processing, 2000

Multithreaded Parallel Computer Model with Performance Evaluation.
Proceedings of the Parallel and Distributed Processing, 2000

Energy-Efficient Deterministic Routing Protocols in Radio Networks.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

Energy-Efficient Initialization Protocols for Radio Networks with No Collision Detection.
Proceedings of the 2000 International Conference on Parallel Processing, 2000

Energy-efficient randomized routing in radio networks.
Proceedings of the 4th International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications (DIAL-M 2000), 2000

1999
Broadcast-Efficient Protocols for Mobile Radio Networks.
IEEE Trans. Parallel Distributed Syst., 1999

Guest Editors' Introduction.
Int. J. Found. Comput. Sci., 1999

A Tool for Algorithm Visualization on the Reconfigurable Mesh.
Proceedings of the 1999 International Symposium on Parallel Architectures, 1999

Energy-Efficient Initialization Protocols for Ad-hoc Radio Networks.
Proceedings of the Algorithms and Computation, 10th International Symposium, 1999

An Efficient VLSI Architecture for Parallel Prefix Counting With Domino Logic.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

Randomized Initialization Protocols for Packet Radio Networks.
Proceedings of the 13th International Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS / SPDP '99), 1999

1998
An Efficient Algorithm for Row Minima Computations on Basic Reconfigurable Meshes.
IEEE Trans. Parallel Distributed Syst., 1998

Work-Time Optimal k-Merge Algorithms on the PRAM.
IEEE Trans. Parallel Distributed Syst., 1998

An O((log log n)^2) Time Algorithm to Compute the Convex Hull of Sorted Points on Reconfigurable Meshes.
IEEE Trans. Parallel Distributed Syst., 1998

Optimal Parallel Algorithms for Finding Proximate Points, with Applications.
IEEE Trans. Parallel Distributed Syst., 1998

Integer Summing Algorithms on Reconfigurable Meshes.
Theor. Comput. Sci., 1998

Efficient List Ranking on the Reconfigurable Mesh with Applications.
Theory Comput. Syst., 1998

Randomized O(log log n)-Round Leader Election Protocols in Packet Radio Networks.
Proceedings of the Algorithms and Computation, 9th International Symposium, 1998

Broadcast-Efficient Algorithms on the Coarse-Grain Broadcast Communication Model with Few Channels.
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

A Scalable VLSI Architecture for Binary Prefix Sums.
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

An O((log log n)^2) Time Convex Hull Algorithm on Reconfigurable Meshes.
Proceedings of the 12th International Parallel Processing Symposium / 9th Symposium on Parallel and Distributed Processing (IPPS/SPDP '98), March 30, 1998

1997
An Optimal Algorithm for the Angle-Restricted All Nearest Neighbor Problem on the Reconfigurable Mesh, with Applications.
IEEE Trans. Parallel Distributed Syst., 1997

An Approximation Algorithm for the Minimum Common Supertree Problem.
Nord. J. Comput., 1997

Optimal Parallel Algorithms for Finding Proximate Points, with Applications (Extended Abstract).
Proceedings of the Algorithms and Data Structures, 5th International Workshop, 1997

Weighted and Unweighted Selection Algorithms for k Sorted Sequences.
Proceedings of the Algorithms and Computation, 8th International Symposium, 1997

Broadcast-Efficient Sorting in the Presence of Few Channels.
Proceedings of the 1997 International Conference on Parallel Processing (ICPP '97), 1997

1996
Computation of the Convex Hull for Sorted Points on a Reconfigurable Mesh.
Parallel Algorithms Appl., 1996

An Optimal Algorithm for the Angle-Restricted All Nearest Neighbor Problem on the Reconfigurable.
Proceedings of IPPS '96, 1996

An Efficient Algorithm for Row Minima Computations in Monotone Matrices.
Proceedings of the 1996 International Conference on Parallel Processing, 1996

1995
A Bibliography of Published Papers on Dynamically Reconfigurable Architectures.
Parallel Process. Lett., 1995

Prefix-Sums Algorithms on Reconfigurable Meshes.
Parallel Process. Lett., 1995

Optimal Initializing Algorithms for a Reconfigurable Mesh.
J. Parallel Distributed Comput., 1995

1993
An optimal parallel algorithm for finding shortest paths inside simple polygons.
Syst. Comput. Jpn., 1993

Linear Layouts of Generalized Hypercubes.
Proceedings of the Graph-Theoretic Concepts in Computer Science, 1993

1992
Simple parallel algorithms to compute interval maxima.
Syst. Comput. Jpn., 1992

Methods for realizing a priority bus system.
Syst. Comput. Jpn., 1992
