Xiaozhe Ren

Orcid: 0000-0002-0432-5510

According to our database, Xiaozhe Ren authored at least 20 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation.
CoRR, 2024

CAPE: Context-Adaptive Positional Encoding for Length Extrapolation.
CoRR, 2024

Poster Abstract: Tasking Heterogeneous Sensor Systems with LLMs.
Proceedings of the 22nd ACM Conference on Embedded Networked Sensor Systems, 2024

ScheMoE: An Extensible Mixture-of-Experts Distributed Training System with Tasks Scheduling.
Proceedings of the Nineteenth European Conference on Computer Systems, 2024

PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
A Survey of Reasoning with Foundation Models.
CoRR, 2023

PanGu-Σ: Towards Trillion Parameter Language Model with Sparse Heterogeneous Computing.
CoRR, 2023

EdgeFM: Leveraging Foundation Model for Open-set Learning on the Edge.
Proceedings of the 21st ACM Conference on Embedded Networked Sensor Systems, 2023

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Study on Transformer Configuration and Training Objective.
Proceedings of the International Conference on Machine Learning, 2023

One Student Knows All Experts Know: From Sparse to Dense.
Proceedings of the First Tiny Papers Track at ICLR 2023, 2023

CAME: Confidence-guided Adaptive Memory Efficient Optimization.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Deeper vs Wider: A Revisit of Transformer Configuration.
CoRR, 2022

AutoBERT-Zero: Evolving BERT Backbone from Scratch.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Large-Scale Deep Learning Optimizations: A Comprehensive Survey.
CoRR, 2021

NumGPT: Improving Numeracy Ability of Generative Pre-trained Models.
CoRR, 2021

PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation.
CoRR, 2021

SparseBERT: Rethinking the Importance Analysis in Self-attention.
Proceedings of the 38th International Conference on Machine Learning, 2021

EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2019
NEZHA: Neural Contextualized Representation for Chinese Language Understanding.
CoRR, 2019

