Shijie Cao

ORCID: 0009-0000-2001-3763

According to our database, Shijie Cao authored at least 16 papers between 2013 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation.
CoRR, 2024

2023
Inverse model and adaptive neighborhood search based cooperative optimizer for energy-efficient distributed flexible job shop scheduling.
Swarm Evol. Comput., December, 2023

AFPQ: Asymmetric Floating Point Quantization for LLMs.
CoRR, 2023

Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference.
CoRR, 2023

Integer or Floating Point? New Outlooks for Low-Bit Quantization on Large Language Models.
CoRR, 2023

NN-Stretch: Automatic Neural Network Branching for Parallel Inference on Heterogeneous Multi-Processors.
Proceedings of the 21st Annual International Conference on Mobile Systems, 2023

Efficient GPU Kernels for N:M-Sparse Weights in Deep Learning.
Proceedings of the Sixth Conference on Machine Learning and Systems, 2023

Accurate and Structured Pruning for Efficient Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adam Accumulation to Reduce Memory Footprints of Both Activations and Gradients for Large-Scale DNN Training.
Proceedings of the ECAI 2023 - 26th European Conference on Artificial Intelligence, September 30 - October 4, 2023, Kraków, Poland, 2023

2021
Dense-to-Sparse Gate for Mixture-of-Experts.
CoRR, 2021

Building a COVID-19 Literature Knowledge Graph Based on PubMed.
Proceedings of 2021 International Conference on Medical Imaging and Computer-Aided Diagnosis, 2021

2019
FlexSaaS: A Reconfigurable Accelerator for Web Search Selection.
ACM Trans. Reconfigurable Technol. Syst., 2019

Efficient and Effective Sparse LSTM on FPGA with Bank-Balanced Sparsity.
Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2019

SeerNet: Predicting Convolutional Neural Network Feature-Map Sparsity Through Low-Bit Quantization.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Balanced Sparsity for Efficient DNN Inference on GPU.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2013
Information Technology Education Based on Cloud Computing.
Proceedings of the Information Computing and Applications - 4th International Conference, 2013

