Correction to: MOSS: An Open Conversational Large Language Model.
Mach. Intell. Res., December, 2024
MOSS: An Open Conversational Large Language Model.
Mach. Intell. Res., October, 2024
Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective.
CoRR, 2024
GAOKAO-Eval: Does high scores truly reflect strong capabilities in LLMs?
CoRR, 2024
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments.
CoRR, 2024
Llama Scope: Extracting Millions of Features from Llama-3.1-8B with Sparse Autoencoders.
CoRR, 2024
Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures.
CoRR, 2024
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance.
CoRR, 2024
Data-free Weight Compress and Denoise for Large Language Models.
CoRR, 2024
Memorize Step by Step: Efficient Long-Context Prefilling with Incremental Memory and Decremental Chunk.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Turn Waste into Worth: Rectifying Top-k Router of MoE.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
The Open-World Lottery Ticket Hypothesis for OOD Intent Classification.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Code Needs Comments: Enhancing Code LLMs with Comment Augmentation.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Graph Structure Learning via Lottery Hypothesis at Scale.
Proceedings of the Asian Conference on Machine Learning, 2023
Two Birds One Stone: Dynamic Ensemble for OOD Intent Classification.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
A Probabilistic Framework for Discovering New Intents.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Towards Open Environment Intent Prediction.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
UTC-IE: A Unified Token-pair Classification Architecture for Information Extraction.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Discovering New Intents Using Latent Variables.
CoRR, 2022
An Open-World Lottery Ticket for Out-of-Domain Intent Classification.
CoRR, 2022
What Dense Graph Do You Need for Self-Attention?
CoRR, 2022
What Dense Graph Do You Need for Self-Attention?
Proceedings of the International Conference on Machine Learning, 2022
BBTv2: Towards a Gradient-Free Future with Large Language Models.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
KNN-Contrastive Learning for Out-of-Domain Intent Classification.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Early Exiting with Ensemble Internal Classifiers.
CoRR, 2021
Mention Recommendation in Twitter with Cooperative Multi-Agent Reinforcement Learning.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019
A Uniform Construction of New Exact Travelling Wave Solutions and its Applications.
Proceedings of the International Conference on Networked Computing and Advanced Information Management, 2009