Jiawei Fei

Orcid: 0000-0001-9325-0516

According to our database, Jiawei Fei authored at least 23 papers between 2017 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
OmNICCL: Zero-cost Sparse AllReduce with Direct Cache Access and SmartNICs.
Proceedings of the 2024 SIGCOMM Workshop on Networks for AI Computing, 2024

2023
SLAMB: Accelerated Large Batch Training with Sparse Communication.
Proceedings of the International Conference on Machine Learning, 2023

2022
Unlocking the Power of Inline Floating-Point Operations on Programmable Switches.
Proceedings of the 19th USENIX Symposium on Networked Systems Design and Implementation, 2022

TILE-SIM: A Systematic Approach to Systolic Array-based Accelerator Evaluation.
Proceedings of the International IEEE Symposium on Performance Analysis of Systems and Software, 2022

Mentha: Enabling Sparse-Packing Computation on Systolic Arrays.
Proceedings of the 51st International Conference on Parallel Processing, 2022

A Smartnic-Based Secure Aggregation Scheme for Federated Learning.
Proceedings of the 3rd International Conference on Computer Engineering and Intelligent Control Virtual Event, 2022

BP-Im2col: Implicit Im2col Supporting AI Backpropagation on Systolic Arrays.
Proceedings of the IEEE 40th International Conference on Computer Design, 2022

MZ Core: An Enhanced Matrix Acceleration Engine for HPC/AI Applications.
Proceedings of the 24th IEEE Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, 2022

2021
Efficient sparse collective communication and its application to accelerate distributed deep learning.
Proceedings of the ACM SIGCOMM 2021 Conference, Virtual Event, USA, August 23-27, 2021

CNN+LSTM Accelerated Turbulent Flow Simulation with Link-Wise Artificial Compressibility Method.
Proceedings of the ICPP 2021: 50th International Conference on Parallel Processing, Lemont, IL, USA, August 9, 2021

2019
Metaflow: A DAG-Based Network Abstraction for Distributed Applications.
CoRR, 2019

Application-Oriented Network Scheduling With Metaflow.
IEEE Access, 2019

KVSwitch: An In-network Load Balancer for Key-Value Stores.
Proceedings of the 2019 IEEE Symposium on Computers and Communications, 2019

Metaflow: A Better Traffic Abstraction for Distributed Applications.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

SACC: Configuring Application-Level Cache Intelligently for In-Memory Database Based on Long Short-Term Memory.
Proceedings of the 21st IEEE International Conference on High Performance Computing and Communications; 17th IEEE International Conference on Smart City; 5th IEEE International Conference on Data Science and Systems, 2019

2018
Accelerating Deep Learning with a Parallel Mechanism Using CPU + MIC.
Int. J. Parallel Program., 2018

2017
A Fine-grained Parallel Approach for one Logical Process on Multi-core Machines.
Proceedings of the 10th EAI International Conference on Simulation Tools and Techniques, 2017

A Reusable Behavior Modeling Method Based on Atom Action and Atom Condition.
Proceedings of the 10th EAI International Conference on Simulation Tools and Techniques, 2017

An incremental face recognition system based on deep learning.
Proceedings of the Fifteenth IAPR International Conference on Machine Vision Applications, 2017

Parallel Computing in DNNs Using CPU and MIC.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

OTR: A Fine-Grained Dynamic Power Scaling Pipeline Based on Trace.
Proceedings of the 2017 IEEE International Symposium on Parallel and Distributed Processing with Applications and 2017 IEEE International Conference on Ubiquitous Computing and Communications (ISPA/IUCC), 2017

Performance Exploration for high Performance Simulation in Lightweight-Virtualization-Based Cloud.
Proceedings of the 10th International Symposium on Computational Intelligence and Design, 2017

A Model Description Transformation Tool: From Platform-Specific to Platform-Independent.
Proceedings of the 2017 International Conference on Management Engineering, 2017

