Towards Comprehensive Preference Data Collection for Reward Modeling.
CoRR, 2024
Accelerating the Training of Large Language Models using Efficient Activation Rematerialization and Optimal Hybrid Parallelism.
Proceedings of the 2024 USENIX Annual Technical Conference, 2024
DSMPC-DCBF: A Hierarchical Parking Trajectory Optimization Method Based on MPC.
Proceedings of the 9th Asia-Pacific Conference on Intelligent Robot Systems, 2024
NAND-SPIN-based processing-in-MRAM architecture for convolutional neural network acceleration.
Sci. China Inf. Sci., April 2023
S<sup>2</sup> Engine: A Novel Systolic Architecture for Sparse Convolutional Neural Networks.
IEEE Trans. Computers, 2022
S2Engine: A Novel Systolic Architecture for Sparse Convolutional Neural Networks.
CoRR, 2021
RoSearch: Search for Robust Student Architectures When Distilling Pre-trained Language Models.
CoRR, 2021
FedSkel: Efficient Federated Learning on Heterogeneous Systems with Skeleton Gradients Update.
Proceedings of the 30th ACM International Conference on Information and Knowledge Management (CIKM '21), 2021
Accelerating CNN Training by Pruning Activation Gradients.
Proceedings of the European Conference on Computer Vision (ECCV 2020), 2020
SparseTrain: Exploiting Dataflow Sparsity for Efficient Convolutional Neural Networks Training.
Proceedings of the 57th ACM/IEEE Design Automation Conference, 2020
Accelerating CNN Training by Sparsifying Activation Gradients.
CoRR, 2019