CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
MEPipe: Democratizing LLM Training with Memory-Efficient Slice-Level Pipeline Scheduling on Cost-Effective Accelerators.
Proceedings of the Twentieth European Conference on Computer Systems, 2025
Enabling Window-Based Monotonic Graph Analytics with Reusable Transitional Results for Pattern-Consistent Queries.
Proc. VLDB Endow., July, 2024
CogVLM2: Visual Language Models for Image and Video Understanding.
CoRR, 2024
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer.
CoRR, 2024
SampleAttention: Near-Lossless Acceleration of Long Context LLM Inference with Adaptive Structured Sparse Attention.
CoRR, 2024
AdaPipe: Optimizing Pipeline Parallelism with Adaptive Recomputation and Partitioning.
Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2024
TriCache: A User-Transparent Block Cache Enabling High-Performance Out-of-Core Processing with In-Memory Programs.
ACM Trans. Storage, May, 2023
GeaFlow: A Graph Extended and Accelerated Dataflow System.
Proc. ACM Manag. Data, 2023
BaGuaLu: targeting brain scale pretrained models with over 37 million cores.
Proceedings of the PPoPP '22: 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Seoul, Republic of Korea, April 2, 2022
Efficiently emulating high-bitwidth computation with low-bitwidth hardware.
Proceedings of the ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28, 2022
Chukonu: A Fully-Featured Big Data Processing System by Efficiently Integrating a Native Compute Engine into Spark.
Proc. VLDB Endow., 2021
LotusSQL: SQL engine for high-performance big data systems.
Big Data Min. Anal., 2021
RisGraph: A Real-Time Streaming System for Evolving Graphs to Support Sub-millisecond Per-update Analysis at Millions Ops/s.
Proceedings of the SIGMOD '21: International Conference on Management of Data, 2021
LiveGraph: A Transactional Graph Storage System with Purely Sequential Adjacency List Scans.
Proc. VLDB Endow., 2020
RisGraph: A Real-Time Streaming System for Evolving Graphs.
CoRR, 2020
LiveGraph: A Transactional Graph Storage System with Purely Sequential Adjacency List Scans.
CoRR, 2019
T2S-Tensor: Productively Generating High-Performance Spatial Hardware for Dense Tensor Computations.
Proceedings of the 27th IEEE Annual International Symposium on Field-Programmable Custom Computing Machines, 2019
Student cluster competition 2017, team Tsinghua University: Reproducing vectorization of the Tersoff multi-body potential on the Intel Skylake and NVIDIA Volta architectures.
Parallel Comput., 2018