Jiarui Fang
Orcid: 0000-0002-6724-2763
According to our database, Jiarui Fang authored at least 31 papers between 2013 and 2024.
Bibliography
2024
xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism.
CoRR, 2024
CoRR, 2024
PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024
2023
IEEE Trans. Parallel Distributed Syst., 2023
Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models.
CoRR, 2023
Proceedings of the 52nd International Conference on Parallel Processing, 2023
2022
CoRR, 2022
CoRR, 2022
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
PatrickStar: Parallel Training of Pre-trained Models via a Chunk-based Memory Management.
CoRR, 2021
Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021
2020
Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach.
J. Parallel Distributed Comput., 2020
2019
Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data.
Remote. Sens., 2019
RedSync: Reducing synchronization bandwidth for distributed deep learning training system.
J. Parallel Distributed Comput., 2019
swATOP: Automatically Optimizing Deep Learning Operators on SW26010 Many-Core Processor.
Proceedings of the 48th International Conference on Parallel Processing, 2019
2018
ACM Trans. Archit. Code Optim., 2018
A dynamic agricultural prediction system for large-scale drought assessment on the Sunway TaihuLight supercomputer.
Comput. Electron. Agric., 2018
Semantic Segmentation Based Building Extraction Method Using Multi-Source GIS Map Datasets and Satellite Imagery.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018
swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight.
Proceedings of the IEEE International Conference on Cluster Computing, 2018
2017
Parallel Multiclass Support Vector Machine for Remote Sensing Data Classification on Multicore and Many-Core Architectures.
IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2017
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017
Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017
2016
Refactoring and optimizing the Community Atmosphere Model (CAM) on the Sunway TaihuLight supercomputer.
Proceedings of the International Conference for High Performance Computing, 2016
Cache-Friendly Design for Complex Spatially-Variable Coefficient Stencils on Many-Core Architectures.
Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016
2015
Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU.
Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015
2013
Sourcing strategies in supply risk management: An approximate dynamic programming approach.
Comput. Oper. Res., 2013