2024
Towards optimized tensor code generation for deep learning on Sunway many-core processor.
Frontiers Comput. Sci., April, 2024

2022
Evaluating performance of AI operators using roofline model.
Appl. Intell., 2022

2019
swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture.
CoRR, 2019

2018
Cooperative Preprocessing at Petabytes on High Performance Computing System.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018

2015
A Balanced Vertex Cut Partition Method in Distributed Graph Computing.
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015