2025
Group Preference Alignment: Customized LLM Response Generation from In-Situ Conversations.
CoRR, March, 2025
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs.
CoRR, March, 2025
GenTool: Enhancing Tool Generalization in Language Models through Zero-to-One and Weak-to-Strong Simulation.
CoRR, February, 2025
A Practical Analysis of Human Alignment with *PO.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
2024
Training Recommenders Over Large Item Corpus With Importance Sampling.
IEEE Trans. Knowl. Data Eng., December, 2024
POROver: Improving Safety and Reducing Overrefusal in Large Language Models with Overgeneration and Preference Optimization.
CoRR, 2024
Scaling Laws for Multilingual Language Models.
CoRR, 2024
Scaling Optimal LR Across Token Horizon.
CoRR, 2024
WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback.
CoRR, 2024
Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads.
CoRR, 2024
The Hitchhiker's Guide to Human Alignment with *PO.
CoRR, 2024
Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone.
CoRR, 2024
On the Adaptation of Unlimiformer for Decoder-Only Transformers.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024
Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
2023
Language Is Not All You Need: Aligning Perception with Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Magneto: A Foundation Transformer.
Proceedings of the International Conference on Machine Learning, 2023
A Length-Extrapolatable Transformer.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
2022
Can people experience romantic love for artificial intelligence? An empirical study of intelligent assistants.
Inf. Manag., 2022
TorchScale: Transformers at Scale.
CoRR, 2022
CoRR, 2022
On the Representation Collapse of Sparse Mixture of Experts.
CoRR, 2022
METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals.
CoRR, 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.
CoRR, 2022
On the Representation Collapse of Sparse Mixture of Experts.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators.
Proceedings of the Tenth International Conference on Learning Representations, 2022
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
Comparative Analysis of Two Machine Learning Algorithms in Predicting Site-Level Net Ecosystem Exchange in Major Biomes.
Remote. Sens., 2021
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA.
CoRR, 2021
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders.
CoRR, 2021
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task.
Proceedings of the Sixth Conference on Machine Translation, 2021
COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
Language Scaling for Universal Suggested Replies Model.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, 2021
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Consistency Regularization for Cross-Lingual Fine-Tuning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
Joint Task Offloading, CNN Layer Scheduling, and Resource Allocation in Cooperative Computing System.
IEEE Syst. J., 2020
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders.
CoRR, 2020
Knowledge-Aware Language Model Pretraining.
CoRR, 2020
Leading Conversational Search by Suggesting Useful Questions.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020
Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention.
Proceedings of the 8th International Conference on Learning Representations, 2020
2019
Security situation assessment for massive MIMO systems for 5G communications.
Future Gener. Comput. Syst., 2019
Generic Intent Representation in Web Search.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019
An Axiomatic Approach to Regularizing Neural Ranking Models.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019
Research on recommender algorithm optimization based on statistics and preference model.
Proceedings of the 2nd International Conference on Artificial Intelligence and Pattern Recognition, 2019
Towards Language Agnostic Universal Representations.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019
2018
Neural Ranking Models with Multiple Document Fields.
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018
2016
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset.
Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), 2016
Research on the Application of Data Mining in the Field of Electronic Commerce.
Proceedings of the Fuzzy Systems and Data Mining II, 2016
2015
Uncertain linguistic fuzzy soft sets and their applications in group decision making.
Appl. Soft Comput., 2015