Reza Yazdani
ORCID: 0000-0002-7949-6453
According to our database, Reza Yazdani authored at least 28 papers between 2016 and 2024.
Bibliography
2024
Patterns and factors associated with dental service utilization among insured people: a data mining approach.
BMC Medical Informatics Decis. Mak., December, 2024
Strategies for Humanitarian Logistics and Supply Chain in Organizational Contexts: Pre- and Post-Disaster Management Perspectives.
Syst., 2024
DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference.
CoRR, 2024
System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2024
2023
ACM Trans. Embed. Comput. Syst., March, 2023
The hybrid DHP method for evaluation, ranking and selection of green suppliers in the supply chain.
Int. J. Math. Oper. Res., 2023
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks.
CoRR, 2023
ZeroQuant-HERO: Hardware-Enhanced Robust Optimized Post-Training Quantization Framework for W8A8 Transformers.
CoRR, 2023
DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales.
CoRR, 2023
Understanding INT4 Quantization for Transformer Models: Latency Speedup, Composability, and Failure Cases.
CoRR, 2023
Understanding Int4 Quantization for Language Models: Latency Speedup, Composability, and Failure Cases.
Proceedings of the International Conference on Machine Learning, 2023
2022
A lion optimization algorithm for an integrating maintenance planning and production scheduling problem with a total absolute deviation of completion times objective.
Soft Comput., December, 2022
Exploring the impacts of COVID-19 pandemic on risks faced by infrastructure projects in Pakistan.
Int. J. Appl. Decis. Sci., 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.
CoRR, 2022
DeepSpeed-Inference: Enabling Efficient Inference of Transformer Models at Unprecedented Scale.
Proceedings of the SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, 2022
ZeroQuant: Efficient and Affordable Post-Training Quantization for Large-Scale Transformers.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
DeepSpeed-MoE: Advancing Mixture-of-Experts Inference and Training to Power Next-Generation AI Scale.
Proceedings of the International Conference on Machine Learning, 2022
2021
Proceedings of the 2021 USENIX Annual Technical Conference, 2021
2020
IEEE Trans. Computers, 2020
2019
PhD thesis, 2019
IEEE Trans. Computers, 2019
LSTM-Sharp: An Adaptable, Energy-Efficient Hardware Accelerator for Long Short-Term Memory.
CoRR, 2019
Proceedings of the 28th International Conference on Parallel Architectures and Compilation Techniques, 2019
2018
Proceedings of the 45th ACM/IEEE Annual International Symposium on Computer Architecture, 2018
2017
Low-Power Automatic Speech Recognition Through a Mobile GPU and a Viterbi Accelerator.
IEEE Micro, 2017
Proceedings of the 50th Annual IEEE/ACM International Symposium on Microarchitecture, 2017
2016
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, 2016
Proceedings of the 2016 Design, Automation & Test in Europe Conference & Exhibition, 2016