Jiarui Fang

Orcid: 0000-0002-6724-2763

According to our database, Jiarui Fang authored at least 33 papers between 2013 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Timeline

[Chart: number of publications per year, by type (book, in proceedings, article, PhD thesis, dataset, other).]

Bibliography

2024

FDI-SIL.

Dataset, March, 2024

Unveiling Redundancy in Diffusion Transformers (DiTs): A Systematic Study.

CoRR, 2024

xDiT: an Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism.

CoRR, 2024

LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism.

CoRR, 2024

PipeFusion: Displaced Patch Pipeline Parallelism for Inference of Diffusion Transformer Models.

CoRR, 2024

USP: A Unified Sequence Parallelism Approach for Long Context Generative AI.

CoRR, 2024

AutoChunk: Automated Activation Chunk for Memory-Efficient Long Sequence Inference.

CoRR, 2024

FastFold: Optimizing AlphaFold Training and Inference on GPU Clusters.

Proceedings of the 29th ACM SIGPLAN Annual Symposium on Principles and Practice of Parallel Programming, 2024

Brief Analysis of False Data Injection Attacks Based on Two Data Modalities in IoTs-based Solar Insecticidal Lamps.

Proceedings of the 22nd IEEE International Conference on Industrial Informatics, 2024

2023

Parallel Training of Pre-Trained Models via Chunk-Based Dynamic Memory Management.

IEEE Trans. Parallel Distributed Syst., 2023

Colossal-Auto: Unified Automation of Parallelization and Activation Checkpoint for Large-scale Models.

CoRR, 2023

Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training.

Proceedings of the 52nd International Conference on Parallel Processing, 2023

2022

Elixir: Train a Large Language Model on a Small GPU Cluster.

CoRR, 2022

EnergonAI: An Inference System for 10-100 Billion Parameter Transformer Models.

CoRR, 2022

A Frequency-aware Software Cache for Large Recommendation System Embeddings.

CoRR, 2022

RoCBert: Robust Chinese Bert with Multimodal Contrastive Pretraining.

Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021

PatrickStar: Parallel Training of Pre-trained Models via a Chunk-based Memory Management.

CoRR, 2021

TurboTransformers: an efficient GPU serving system for transformer models.

Proceedings of the PPoPP '21: 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, 2021

2020

Efficient AES implementation on Sunway TaihuLight supercomputer: A systematic approach.

J. Parallel Distributed Comput., 2020

2019

Semantic Segmentation-Based Building Footprint Extraction Using Very High-Resolution Satellite Images and Multi-Source GIS Data.

Remote. Sens., 2019

RedSync: Reducing synchronization bandwidth for distributed deep learning training system.

J. Parallel Distributed Comput., 2019

swATOP: Automatically Optimizing Deep Learning Operators on SW26010 Many-Core Processor.

Proceedings of the 48th International Conference on Parallel Processing, 2019

2018

Optimizing Convolutional Neural Networks on the Sunway TaihuLight Supercomputer.

ACM Trans. Archit. Code Optim., 2018

A dynamic agricultural prediction system for large-scale drought assessment on the Sunway TaihuLight supercomputer.

Comput. Electron. Agric., 2018

Semantic Segmentation Based Building Extraction Method Using Multi-Source GIS Map Datasets and Satellite Imagery.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

swCaffe: A Parallel Framework for Accelerating Deep Learning Applications on Sunway TaihuLight.

Proceedings of the IEEE International Conference on Cluster Computing, 2018

2017

Parallel Multiclass Support Vector Machine for Remote Sensing Data Classification on Multicore and Many-Core Architectures.

IEEE J. Sel. Top. Appl. Earth Obs. Remote. Sens., 2017

SW-AES: Accelerating AES Algorithm on the Sunway TaihuLight.

Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

swDNN: A Library for Accelerating Deep Learning Applications on Sunway TaihuLight.

Proceedings of the 2017 IEEE International Parallel and Distributed Processing Symposium, 2017

2016

Refactoring and optimizing the community atmosphere model (CAM) on the Sunway TaihuLight supercomputer.

Proceedings of the International Conference for High Performance Computing, 2016

Cache-Friendly Design for Complex Spatially-Variable Coefficient Stencils on Many-Core Architectures.

Proceedings of the 23rd IEEE International Conference on High Performance Computing, 2016

2015

Optimizing Complex Spatially-Variant Coefficient Stencils for Seismic Modeling on GPU.

Proceedings of the 21st IEEE International Conference on Parallel and Distributed Systems, 2015

2013

Sourcing strategies in supply risk management: An approximate dynamic programming approach.

Tom Van Woensel

Comput. Oper. Res., 2013
