Weifeng Liu
ORCID: 0000-0002-2150-5759
Affiliations:
- China University of Petroleum, Beijing, China
- Norwegian University of Science and Technology (former)
- University of Copenhagen, Niels Bohr Institute (NBI) (former)
- STFC Rutherford Appleton Laboratory, Didcot, UK (former)
According to our database, Weifeng Liu authored at least 52 papers between 2014 and 2024.
Links
Online presence:
- on ntnu.edu
- on orcid.org
- on dl.acm.org
Bibliography
2024
CCF Trans. High Perform. Comput., October, 2024
Mille-feuille: A Tile-Grained Mixed Precision Single-Kernel Conjugate Gradient Solver on GPUs.
Proceedings of the International Conference for High Performance Computing, 2024
Proceedings of the International Conference for High Performance Computing, 2024
Cuper: Customized Dataflow and Perceptual Decoding for Sparse Matrix-Vector Multiplication on HBM-Equipped FPGAs.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
Efficient Spectral-Aware Power Supply Noise Analysis for Low-Power Design Verification.
Proceedings of the Design, Automation & Test in Europe Conference & Exhibition, 2024
MASC: A Memory-Efficient Adjoint Sensitivity Analysis through Compression Using Novel Spatiotemporal Prediction.
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
Proceedings of the 61st ACM/IEEE Design Automation Conference, 2024
Machine Learning and GPU Accelerated Sparse Linear Solvers for Transistor-Level Circuit Simulation: A Perspective Survey (Invited Paper).
Proceedings of the 29th Asia and South Pacific Design Automation Conference, 2024
2023
CCF Trans. High Perform. Comput., June, 2023
Editorial for the special issue on architecture, algorithms and applications of high performance sparse matrix computations.
CCF Trans. High Perform. Comput., June, 2023
DASP: Specific Dense Matrix Multiply-Accumulate Units Accelerated General Sparse Matrix-Vector Multiplication.
Proceedings of the International Conference for High Performance Computing, 2023
PanguLU: A Scalable Regular Two-Dimensional Block-Cyclic Sparse Direct Solver on Distributed Heterogeneous Systems.
Proceedings of the International Conference for High Performance Computing, 2023
HASpGEMM: Heterogeneity-Aware Sparse General Matrix-Matrix Multiplication on Modern Asymmetric Multicore Processors.
Proceedings of the 52nd International Conference on Parallel Processing, 2023
Accelerating Sparse LU Factorization with Density-Aware Adaptive Matrix Multiplication for Circuit Simulation.
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
Proceedings of the 60th ACM/IEEE Design Automation Conference, 2023
HASpMV: Heterogeneity-Aware Sparse Matrix-Vector Multiplication on Modern Asymmetric Multicore Processors.
Proceedings of the IEEE International Conference on Cluster Computing, 2023
Balancing Computation and Communication in Distributed Sparse Matrix-Vector Multiplication.
Proceedings of the 23rd IEEE/ACM International Symposium on Cluster, 2023
2022
IEEE Trans. Parallel Distributed Syst., 2022
TileSpGEMM: a tiled algorithm for parallel sparse general matrix-matrix multiplication on GPUs.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
TileSpMSpV: A Tiled Algorithm for Sparse Matrix-Sparse Vector Multiplication on GPUs.
Proceedings of the 51st International Conference on Parallel Processing, 2022
2021
YuenyeungSpTRSV: A Thread-Level and Warp-Level Fusion Synchronization-Free Sparse Triangular Solve.
IEEE Trans. Parallel Distributed Syst., 2021
BALS: Blocked Alternating Least Squares for Parallel Sparse Matrix Factorization on GPUs.
IEEE Trans. Parallel Distributed Syst., 2021
Int. J. Parallel Program., 2021
CCF Trans. High Perform. Comput., 2021
Proceedings of the 35th IEEE International Parallel and Distributed Processing Symposium, 2021
PALBBD: A Parallel ArcLength Method Using Bordered Block Diagonal Form for DC Analysis.
Proceedings of the GLSVLSI '21: Great Lakes Symposium on VLSI 2021, 2021
SFLU: Synchronization-Free Sparse LU Factorization for Fast Circuit Simulation on GPUs.
Proceedings of the 58th ACM/IEEE Design Automation Conference, 2021
2020
clMF: A fine-grained and portable alternating least squares algorithm for parallel matrix factorization.
Future Gener. Comput. Syst., 2020
NUMA-Aware Optimization of Sparse Matrix-Vector Multiplication on ARMv8-Based Many-Core Architectures.
Proceedings of the Network and Parallel Computing, 2020
Proceedings of the Network and Parallel Computing, 2020
CapelliniSpTRSV: A Thread-Level Synchronization-Free Sparse Triangular Solve on GPUs.
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
Proceedings of the ICPP 2020: 49th International Conference on Parallel Processing, 2020
2019
Int. J. Parallel Program., 2019
Performance evaluation and analysis of sparse matrix and graph kernels on heterogeneous processors.
CCF Trans. High Perform. Comput., 2019
IA-SpGEMM: an input-aware auto-tuning framework for parallel sparse matrix-matrix multiplication.
Proceedings of the ACM International Conference on Supercomputing, 2019
2018
swSpTRSV: a fast sparse triangular solve with sparse level tile layout on sunway architectures.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Register-based implementation of the sparse general matrix-matrix multiplication on GPUs.
Proceedings of the 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2018
Proceedings of the 32nd International Conference on Supercomputing, 2018
2017
Fast synchronization-free algorithms for parallel sparse triangular solves with multiple right-hand sides.
Concurr. Comput. Pract. Exp., 2017
Exploring and analyzing the real impact of modern on-package memory on HPC scientific kernels.
Proceedings of the International Conference for High Performance Computing, 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium Workshops, 2017
Proceedings of the International Conference on Supercomputing, 2017
Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems, 2017
2016
Proceedings of the 2016 International Conference on Supercomputing, 2016
Proceedings of the Euro-Par 2016: Parallel Processing, 2016
2015
Speculative segmented sum for sparse matrix-vector multiplication on heterogeneous processors.
Parallel Comput., 2015
A framework for general sparse matrix-matrix multiplication on GPUs and heterogeneous processors.
J. Parallel Distributed Comput., 2015
CSR5: An Efficient Storage Format for Cross-Platform Sparse Matrix-Vector Multiplication.
Proceedings of the 29th ACM on International Conference on Supercomputing, 2015
Proceedings of the British Machine Vision Conference 2015, 2015
2014
Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium, 2014
Proceedings of the Seventh Workshop on General Purpose Processing Using GPUs, 2014