Ke-shi Ge

ORCID: 0000-0002-0669-6892

According to our database, Ke-shi Ge authored at least 19 papers between 2017 and 2024.

Bibliography

2024
A Multidimensional Communication Scheduling Method for Hybrid Parallel DNN Training.
IEEE Trans. Parallel Distributed Syst., August 2024

Advances of Pipeline Model Parallelism for Deep Learning Training: An Overview.
J. Comput. Sci. Technol., May 2024

2023
Merak: An Efficient Distributed DNN Training Framework With Automated 3D Parallelism for Giant Foundation Models.
IEEE Trans. Parallel Distributed Syst., May 2023

Compressed Collective Sparse-Sketch for Distributed Data-Parallel Training of Deep Learning Models.
IEEE J. Sel. Areas Commun., April 2023

Automated Tensor Model Parallelism with Overlapped Communication for Efficient Foundation Model Training.
CoRR, 2023

Auto-Divide GNN: Accelerating GNN Training with Subgraph Division.
Proceedings of the Euro-Par 2023: Parallel Processing - 29th International Conference on Parallel and Distributed Computing, Limassol, Cyprus, August 28 - September 1, 2023

Prophet: Fine-grained Load Balancing for Parallel Training of Large-scale MoE Models.
Proceedings of the IEEE International Conference on Cluster Computing, 2023

2022
BRGraph: An efficient graph neural network training system by reusing batch data on GPU.
Concurr. Comput. Pract. Exp., 2022

S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

AutoPipe: A Fast Pipeline Parallelism Approach with Balanced Partitioning and Micro-batch Slicing.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

HPH: Hybrid Parallelism on Heterogeneous Clusters for Accelerating Large-scale DNNs Training.
Proceedings of the IEEE International Conference on Cluster Computing, 2022

2021
S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning.
CoRR, 2021

CASQ: Accelerate Distributed Deep Learning with Sketch-Based Gradient Quantization.
Proceedings of the IEEE International Conference on Cluster Computing, 2021

2020
An efficient parallel and distributed solution to nonconvex penalized linear SVMs.
Frontiers Inf. Technol. Electron. Eng., 2020

Tag Pollution Detection in Web Videos via Cross-Modal Relevance Estimation.
Proceedings of the 28th IEEE/ACM International Symposium on Quality of Service, 2020

2019
HPDL: Towards a General Framework for High-performance Distributed Deep Learning.
Proceedings of the 39th IEEE International Conference on Distributed Computing Systems, 2019

2018
Deep Discriminative Clustering Network.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

An Efficient ADMM-Based Algorithm to Nonconvex Penalized Support Vector Machines.
Proceedings of the 2018 IEEE International Conference on Data Mining Workshops, 2018

2017
Efficient parallel implementation of a density peaks clustering algorithm on graphics processing unit.
Frontiers Inf. Technol. Electron. Eng., 2017

