2025
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification.
CoRR, June 2025
Pangu Light: Weight Re-Initialization for Pruning and Accelerating LLMs.
CoRR, May 2025
Pangu Ultra MoE: How to Train Your Big MoE on Ascend NPUs.
CoRR, May 2025
Pangu Ultra: Pushing the Limits of Dense Large Language Models on Ascend NPUs.
CoRR, April 2025
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving.
CoRR, February 2025
Optical Mie Scattering Concentration-Dynamics Tracer Joint Detection Method for Micron-Nano Dust in GIS/GIL Systems.
IEEE Trans. Instrum. Meas., 2025
2024
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models.
CoRR, 2024
DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for In-Context Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Preparing Lessons for Progressive Training on Language Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
FIMO: A Challenge Formal Dataset for Automated Theorem Proving.
CoRR, 2023
Reusing Pretrained Models by Multi-linear Operators for Efficient Training.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
TRIGO: Benchmarking Formal Mathematical Proof Reduction for Generative Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
DT-Solver: Automated Theorem Proving with Dynamic-Tree Sampling Guided by Proof-level Value Function.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
One Cannot Stand for Everyone! Leveraging Multiple User Simulators to train Task-oriented Dialogue Systems.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
AutoConv: Automatically Generating Information-seeking Conversations with Large Language Models.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023
NewsDialogues: Towards Proactive News Grounded Conversation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
G-MAP: General Memory-Augmented Pre-trained Language Model for Domain Tasks.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
FPT: Improving Prompt Tuning Efficiency via Progressive Training.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
bert2BERT: Towards Reusable Pretrained Language Models.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
Improving task-agnostic BERT distillation with layer mapping search.
Neurocomputing, 2021
Integrating Regular Expressions with Neural Networks via DFA.
CoRR, 2021
LightMBERT: A Simple Yet Effective Method for Multilingual BERT Distillation.
CoRR, 2021
Extract then Distill: Efficient and Effective Task-Agnostic BERT Distillation.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2021, 2021
Generate & Rank: A Multi-task Framework for Math Word Problems.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021
AutoTinyBERT: Automatic Hyper-parameter Optimization for Efficient Pre-trained Language Models.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
The Solution of Huawei Cloud & Noah's Ark Lab to the NLPCC-2020 Challenge: Light Pre-Training Chinese Language Model for NLP Task.
Proceedings of the Natural Language Processing and Chinese Computing, 2020
TernaryBERT: Distillation-aware Ultra-low Bit BERT.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020
TinyBERT: Distilling BERT for Natural Language Understanding.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020
PoD: Positional Dependency-Based Word Embedding for Aspect Term Extraction.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Dialog State Tracking with Reinforced Data Augmentation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
More Chinese women needed to hold up half the computing sky.
Proceedings of the ACM Turing Celebration Conference - China, 2019
2017
NNEMBs at SemEval-2017 Task 4: Neural Twitter Sentiment Classification: a Simple Ensemble Method with Different Embeddings.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017
Socialized Word Embeddings.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
Document-Level Multi-Aspect Sentiment Classification as Machine Comprehension.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017
2016
Unsupervised Word and Dependency Path Embeddings for Aspect Term Extraction.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016
2015
Splusplus: A Feature-Rich Two-stage Classifier for Sentiment Analysis of Tweets.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015