Guiming Wu

Orcid: 0000-0002-6703-3195

According to our database, Guiming Wu authored at least 25 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Acceleration of Multi-Body Molecular Dynamics With Customized Parallel Dataflow.
IEEE Trans. Parallel Distributed Syst., December, 2024

MSMAC: Accelerating Multi-Scalar Multiplication for Zero-Knowledge Proof.
IACR Cryptol. ePrint Arch., 2024

2023
Topgun: An ECC Accelerator for Private Set Intersection.
ACM Trans. Reconfigurable Technol. Syst., December, 2023

E-Booster: A Field-Programmable Gate Array-Based Accelerator for Secure Tree Boosting Using Additively Homomorphic Encryption.
IEEE Micro, 2023

2022
A High-Performance Hardware Architecture for ECC Point Multiplication over Curve25519.
Proceedings of the 30th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2022

2019
Research Advances and Future Challenges of FPGA-based High Performance Computing (in Chinese).
计算机科学 (Computer Science), 2019

2017
A High-Performance Accelerator for Floating-Point Matrix Multiplication.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

2015
Hardware Implementation of Scalar Multiplication on Elliptic Curves over GF(2^m) (in Chinese).
计算机科学 (Computer Science), 2015

Sparse Matrix Blocking Method for Custom Architecture (in Chinese).
计算机科学 (Computer Science), 2015

A deeply-pipelined FPGA-based SpMV accelerator with a hardware-friendly storage scheme.
IEICE Electron. Express, 2015

2013
High-Performance Architecture for the Conjugate Gradient Solver on FPGAs.
IEEE Trans. Circuits Syst. II Express Briefs, 2013

2012
A High Performance and Memory Efficient LU Decomposer on FPGAs.
IEEE Trans. Computers, 2012

Parallelizing sparse LU decomposition on FPGAs.
Proceedings of the 2012 International Conference on Field-Programmable Technology, 2012

2010
A Unified Co-Processor Architecture for Matrix Decomposition.
J. Comput. Sci. Technol., 2010

FPGA accelerating double/quad-double high precision floating-point applications for ExaScale computing.
Proceedings of the 24th International Conference on Supercomputing, 2010

Automatic synthesis of processor arrays with local memories on FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2010

High performance and memory efficient implementation of matrix multiplication on FPGAs.
Proceedings of the International Conference on Field-Programmable Technology, 2010

Blocking LU Decomposition for FPGAs.
Proceedings of the 18th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2010

2009
A coarse-grained reconfigurable computing architecture with loop self-pipelining.
Sci. China Ser. F Inf. Sci., 2009

Exploiting Fine-Grained Pipeline Parallelism for Wavefront Computations on Multicore Platforms.
Proceedings of the ICPPW 2009, 2009

A Fine-grained Pipelined Implementation of the LINPACK Benchmark on FPGAs.
Proceedings of the FCCM 2009, 2009

2008
Computation rotating for data reuse.
Proceedings of the 13th Asia-Pacific Computer Systems Architecture Conference, 2008

2007
Instruction Selection for Subword Level Parallelism Optimizations for Application Specific Instruction Processors.
Proceedings of the Parallel and Distributed Processing and Applications, 2007

The Implementation of a Coarse-Grained Reconfigurable Architecture with Loop Self-pipelining.
Proceedings of the Reconfigurable Computing: Architectures, 2007

2006
Designing a Coarse-Grained Reconfigurable Architecture Using Loop Self-Pipelining.
Proceedings of the Advances in Computer Systems Architecture, 11th Asia-Pacific Conference, 2006
