Towards optimized tensor code generation for deep learning on Sunway many-core processor.
Frontiers Comput. Sci., April, 2024
Evaluating performance of AI operators using roofline model.
Appl. Intell., 2022
swTVM: Exploring the Automated Compilation for Deep Learning on Sunway Architecture.
CoRR, 2019
Cooperative Preprocessing at Petabytes on High Performance Computing System.
Proceedings of the Algorithms and Architectures for Parallel Processing, 2018
A Balanced Vertex Cut Partition Method in Distributed Graph Computing.
Proceedings of the Intelligence Science and Big Data Engineering. Big Data and Machine Learning Techniques, 2015