2025
Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought.
CoRR, May 2025

MEMO: Fine-grained Tensor Management For Ultra-long Context LLM Training.
Proc. ACM Manag. Data, February 2025

BeamVQ: Beam Search with Vector Quantization to Mitigate Data Scarcity in Physical Spatiotemporal Forecasting.
CoRR, February 2025

Hunyuan3D 2.0: Scaling Diffusion Models for High Resolution Textured 3D Assets Generation.
CoRR, January 2025

Scaling Laws for Floating Point Quantization Training.
CoRR, January 2025

2024
HunyuanVideo: A Systematic Framework For Large Video Generative Models.
CoRR, 2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent.
CoRR, 2024

Efficiently Training 7B LLM with 1 Million Sequence Length on 8 GPUs.
CoRR, 2024

BeamVQ: Aligning Space-Time Forecasting Model via Self-training on Physics-aware Metrics.
CoRR, 2024

Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding.
CoRR, 2024

PURE: Prompt Evolution with Graph ODE for Out-of-distribution Fluid Dynamics Modeling.
Advances in Neural Information Processing Systems 38 (NeurIPS 2024), 2024

Surge Phenomenon in Optimal Learning Rate and Batch Size Scaling.
Advances in Neural Information Processing Systems 38 (NeurIPS 2024), 2024

2023
Angel-PTM: A Scalable and Economical Large-scale Pre-training System in Tencent.
Proc. VLDB Endow., 2023

2021
M6: A Chinese Multimodal Pretrainer.
CoRR, 2021

2020
A Multi-Semantic Metapath Model for Large Scale Heterogeneous Network Representation Learning.
CoRR, 2020