Hasitha Muthumala Waidyasooriya

Orcid: 0000-0001-5108-9891

According to our database, Hasitha Muthumala Waidyasooriya authored at least 46 papers between 2008 and 2025.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.







Large-Scale AGV Routing Based on Multi-FPGA SQA Acceleration.
Proceedings of the 30th Asia and South Pacific Design Automation Conference, 2025

Performance evaluation of Word2vec accelerators exploiting spatial and temporal parallelism on DDR/HBM-based FPGAs.
J. Supercomput., August, 2024

Temporal and spatial parallel processing of simulated quantum annealing on a multicore CPU.
J. Supercomput., 2022

Design space exploration for an FPGA-based quantum annealing simulator with interaction-coefficient-generators.
J. Supercomput., 2022

FPGA-Accelerated Searchable Encrypted Database Management Systems for Cloud Services.
IEEE Trans. Cloud Comput., 2022

A Scalable Emulator for Quantum Fourier Transform Using Multiple-FPGAs With High-Bandwidth-Memory.
IEEE Access, 2022

Word2Vec FPGA Accelerator Based on Spatial and Temporal Parallelism.
Proceedings of the Parallel and Distributed Computing, Applications and Technologies, 2022

OpenCL-Based Design of an FPGA Accelerator for H.266/VVC Transform and Quantization.
Proceedings of the 65th IEEE International Midwest Symposium on Circuits and Systems, 2022

Implementation of an FPGA-Oriented Complex Number Computation Library Using Intel OneAPI DPC++.
Proceedings of the 65th IEEE International Midwest Symposium on Circuits and Systems, 2022

FPGA-Based Prototype of a Quantum Annealing Simulator for Sparse Ising Model.
Proceedings of the 15th IEEE International Symposium on Embedded Multicore/Many-core Systems-on-Chip, 2022

Highly-Parallel FPGA Accelerator for Simulated Quantum Annealing.
IEEE Trans. Emerg. Top. Comput., 2021

A GPU-Based Quantum Annealing Simulator for Fully-Connected Ising Models Utilizing Spatial and Temporal Parallelism.
IEEE Access, 2020

OpenCL-based design of an FPGA accelerator for quantum annealing simulation.
J. Supercomput., 2019

Multi-FPGA Accelerator Architecture for Stencil Computation Exploiting Spacial and Temporal Scalability.
IEEE Access, 2019

FPGA-Based Acceleration of Word2vec using OpenCL.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2019

Benchmarks for FPGA-Targeted High-Level-Synthesis.
Proceedings of the 2019 Seventh International Symposium on Computing and Networking, 2019

A Memory-Bandwidth-Efficient Word2vec Accelerator Using OpenCL for FPGA.
Proceedings of the Seventh International Symposium on Computing and Networking Workshops, 2019

Data-Transfer-Bottleneck-Less Architecture for FPGA-Based Quantum Annealing Simulation.
Proceedings of the 2019 Seventh International Symposium on Computing and Networking, 2019

Architecture of an FPGA-Based Heterogeneous System for Code-Search Problems.
Proceedings of the Supercomputing Frontiers - 4th Asian Conference, 2018

Accelerator Architecture for Simulated Quantum Annealing Based on Resource-Utilization-Aware Scheduling and its Implementation Using OpenCL.
Proceedings of the 2018 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), 2018

OpenCL-Based FPGA-Platform for Stencil Computation and Its Optimization Methodology.
IEEE Trans. Parallel Distributed Syst., 2017

OpenCL-Based FPGA Accelerator for 3D FDTD with Periodic and Absorbing Boundary Conditions.
Int. J. Reconfigurable Comput., 2017

An FPGA Accelerator for Molecular Dynamics Simulation Using OpenCL.
Int. J. Networked Distributed Comput., 2017

Architecture of an FPGA accelerator for LDA-based inference.
Proceedings of the 18th IEEE/ACIS International Conference on Software Engineering, 2017

Hardware-Acceleration of Short-Read Alignment Based on the Burrows-Wheeler Transform.
IEEE Trans. Parallel Distributed Syst., 2016

Architecture of an FPGA accelerator for molecular dynamics simulation using OpenCL.
Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

FPGA-based deep-pipelined architecture for FDTD acceleration using OpenCL.
Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

Data-Transfer-Aware Design of an FPGA-Based Heterogeneous Multicore Platform with Custom Accelerators.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2015

Hardware-oriented succinct-data-structure based on block-size-constrained compression.
Proceedings of the 7th International Conference of Soft Computing and Pattern Recognition, 2015

FDTD Acceleration for Cylindrical Resonator Design Based on the Hybrid of Single and Double Precision Floating-Point Computation.
J. Comput. Eng., 2014

Efficient data transfer scheme using word-pair-encoding-based compression for large-scale text-data processing.
Proceedings of the 2014 IEEE Asia Pacific Conference on Circuits and Systems, 2014

Evaluation of an FPGA-Based Heterogeneous Multicore Platform with SIMD/MIMD Custom Accelerators.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2013

Implementation of a custom hardware-accelerator for short-read mapping using Burrows-Wheeler alignment.
Proceedings of the 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2013

Memory-Access-Driven Context Partitioning for Window-Based Image Processing on Heterogeneous Multicore Processors.
IEICE Trans. Inf. Syst., 2012

Acceleration of Block Matching on a Low-Power Heterogeneous Multi-Core Processor Based on DTU Data-Transfer with Data Re-Allocation.
IEICE Trans. Electron., 2012

FPGA implementation of heterogeneous multicore platform with SIMD/MIMD custom accelerators.
Proceedings of the 2012 IEEE International Symposium on Circuits and Systems, 2012

Memory Allocation Exploiting Temporal Locality for Reducing Data-Transfer Bottlenecks in Heterogeneous Multicore Processors.
IEEE Trans. Circuits Syst. Video Technol., 2011

Memory Allocation for Window-Based Image Processing on Multiple Memory Modules with Simple Addressing Functions.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2011

Task Allocation with Algorithm Transformation for Reducing Data-Transfer Bottlenecks in Heterogeneous Multi-Core Processors: A Case Study of HOG Descriptor Computation.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2010

Mapping for a Heterogeneous Multi-Core Media Processor Considering the Data Transfer Time.
Proceedings of the 2010 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2010

Architecture of an FPGA-Oriented Heterogeneous Multi-core Processor with SIMD-Accelerator Cores.
Proceedings of the 2010 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2010

Implementation of a Partially Reconfigurable Multi-Context FPGA Based on Asynchronous Architecture.
IEICE Trans. Electron., 2009

Acceleration of Optical-Flow Extraction Using Dynamically Reconfigurable ALU Arrays.
Proceedings of the 2009 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2009

Evaluation of Interconnect-Complexity-Aware Low-Power VLSI Design Using Multiple Supply and Threshold Voltages.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2008

Multi-Context FPGA Using Fine-Grained Interconnection Blocks and Its CAD Environment.
IEICE Trans. Electron., 2008

Implementation of a Multi-Context FPGA Based on Flexible-Context-Partitioning.
Proceedings of the 2008 International Conference on Engineering of Reconfigurable Systems & Algorithms, 2008
