Xiaozhe Ren

Orcid: 0000-0002-0432-5510

According to our database, Xiaozhe Ren authored at least 20 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation.

CoRR, 2024

CAPE: Context-Adaptive Positional Encoding for Length Extrapolation.

CoRR, 2024

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation.

CoRR, 2024

Poster Abstract: Tasking Heterogeneous Sensor Systems with LLMs.

Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 2024

ScheMoE: An Extensible Mixture-of-Experts Distributed Training System with Tasks Scheduling.

Proceedings of the Nineteenth European Conference on Computer Systems, 2024

2023

A Survey of Reasoning with Foundation Models.

CoRR, 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing.

CoRR, 2023

EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge.

Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems, 2023

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Study on Transformer Configuration and Training Objective.

Proceedings of the International Conference on Machine Learning, 2023

One Student Knows All Experts Know: From Sparse to Dense.

Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

CAME: Confidence-guided Adaptive Memory Efficient Optimization.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022

Deeper vs Wider: A Revisit of Transformer Configuration.

CoRR, 2022

AutoBERT-Zero: Evolving BERT Backbone from Scratch.

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Large-Scale Deep Learning Optimizations: A Comprehensive Survey.

CoRR, 2021

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models.

CoRR, 2021

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation.

CoRR, 2021

SparseBERT: Rethinking the Importance Analysis in Self-attention.

Proceedings of the 38th International Conference on Machine Learning, 2021

EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2019

NEZHA: Neural Contextualized Representation for Chinese Language Understanding.

CoRR, 2019
