WorldModelBench: Judging Video Generation Models As World Models.
CoRR, February, 2025
S*: Test Time Scaling for Code Generation.
CoRR, February, 2025
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters!
CoRR, February, 2025
Locality-aware Fair Scheduling in LLM Serving.
CoRR, January, 2025
GraphPipe: Improving Performance and Scalability of DNN Training with Graph Pipeline Parallelism.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025
MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs.
Proceedings of the 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2025
NEO: Saving GPU Memory Crisis with CPU Offloading for Online LLM Inference.
CoRR, 2024
Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs (Extended Version).
CoRR, 2024
Optimizing LLM Queries in Relational Workloads.
CoRR, 2024
Atlas: Hierarchical Partitioning for Quantum Circuit Simulation on GPUs.
Proceedings of the International Conference for High Performance Computing, 2024
Fairness in Serving Large Language Models.
Proceedings of the 18th USENIX Symposium on Operating Systems Design and Implementation, 2024
SGLang: Efficient Execution of Structured Language Model Programs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
SLoRA: Scalable Serving of Thousands of LoRA Adapters.
Proceedings of the Seventh Annual Conference on Machine Learning and Systems, 2024
Understanding Spatial-Temporal Interactions of Ecosystem Services and Their Drivers in a Multi-Scale Perspective of Miluo Using Multi-Source Remote Sensing Data.
Remote Sens., July, 2023
Efficiently Programming Large Language Models using SGLang.
CoRR, 2023
S-LoRA: Serving Thousands of Concurrent LoRA Adapters.
CoRR, 2023
Low-loss Mode Field Adapter Using Reverse Tapering for Fundamental Mode Transmission over MMFs.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2022
Novel Mirror-flipped Mode Permutation Technique for Long-haul Mode-division Multiplexing Transmissions.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2022
LightPro: Lightweight Probabilistic Workload Prediction Framework for Database-as-a-Service.
Proceedings of the IEEE International Conference on Web Services, 2022
Accelerating Data Serialization/Deserialization Protocols with In-Network Compute.
Proceedings of the IEEE/ACM International Workshop on Exascale MPI, 2022
AdaM: An Adaptive Fine-Grained Scheme for Distributed Metadata Management.
Proceedings of the 48th International Conference on Parallel Processing, 2019
Demonstration of ultra-compact contentionless-ROADM based on flexible wavelength router.
Proceedings of the European Conference on Optical Communication, 2014
A hybrid seasonal prediction model for tuberculosis incidence in China.
BMC Medical Informatics Decis. Mak., 2013
Green and agile petabit optical sub-wavelength switching prototype for the future OTN multi-chassis switch cluster.
Proceedings of the 2013 Optical Fiber Communication Conference and Exposition and the National Fiber Optic Engineers Conference (OFC/NFOEC), 2013