Xia Song

Orcid: 0000-0001-8482-8283

According to our database, Xia Song authored at least 46 papers between 2015 and 2024.

Bibliography

2024
Scaling Optimal LR Across Token Horizon.
CoRR, 2024

WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback.
CoRR, 2024

Efficient LLM Training and Serving with Heterogeneous Context Sharding among Attention Heads.
CoRR, 2024

The Hitchhiker's Guide to Human Alignment with *PO.
CoRR, 2024

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone.
CoRR, 2024

On the Adaptation of Unlimiformer for Decoder-Only Transformers.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Language Is Not All You Need: Aligning Perception with Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Magneto: A Foundation Transformer.
Proceedings of the International Conference on Machine Learning, 2023

A Length-Extrapolatable Transformer.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Beyond English-Centric Bitexts for Better Multilingual Language Representation Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Can people experience romantic love for artificial intelligence? An empirical study of intelligent assistants.
Inf. Manag., 2022

TorchScale: Transformers at Scale.
CoRR, 2022

Foundation Transformers.
CoRR, 2022

On the Representation Collapse of Sparse Mixture of Experts.
CoRR, 2022

METRO: Efficient Denoising Pretraining of Large Scale Autoencoding Language Models with Model Generated Signals.
CoRR, 2022

Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model.
CoRR, 2022

On the Representation Collapse of Sparse Mixture of Experts.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators.
Proceedings of the Tenth International Conference on Learning Representations, 2022

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

2021
Comparative Analysis of Two Machine Learning Algorithms in Predicting Site-Level Net Ecosystem Exchange in Major Biomes.
Remote. Sens., 2021

XLM-E: Cross-lingual Language Model Pre-training via ELECTRA.
CoRR, 2021

DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders.
CoRR, 2021

Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task.
Proceedings of the Sixth Conference on Machine Translation, 2021

COCO-LM: Correcting and Contrasting Text Sequences for Language Model Pretraining.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Language Scaling for Universal Suggested Replies Model.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Papers, 2021

InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Allocating Large Vocabulary Capacity for Cross-Lingual Language Model Pre-Training.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Consistency Regularization for Cross-Lingual Fine-Tuning.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Joint Task Offloading, CNN Layer Scheduling, and Resource Allocation in Cooperative Computing System.
IEEE Syst. J., 2020

XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders.
CoRR, 2020

Knowledge-Aware Language Model Pretraining.
CoRR, 2020

Leading Conversational Search by Suggesting Useful Questions.
Proceedings of the WWW '20: The Web Conference 2020, Taipei, Taiwan, April 20-24, 2020, 2020

Pushing the Limits of Narrow Precision Inferencing at Cloud Scale with Microsoft Floating Point.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Transformer-XH: Multi-Evidence Reasoning with eXtra Hop Attention.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Security situation assessment for massive MIMO systems for 5G communications.
Future Gener. Comput. Syst., 2019

Generic Intent Representation in Web Search.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

An Axiomatic Approach to Regularizing Neural Ranking Models.
Proceedings of the 42nd International ACM SIGIR Conference on Research and Development in Information Retrieval, 2019

Research on recommender algorithm optimization based on statistics and preference model.
Proceedings of the 2nd International Conference on Artificial Intelligence and Pattern Recognition, 2019

Towards Language Agnostic Universal Representations.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Neural Ranking Models with Multiple Document Fields.
Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 2018

2016
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset.
Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016 co-located with the 30th Annual Conference on Neural Information Processing Systems (NIPS 2016), 2016

Research on the Application of Data Mining in the Field of Electronic Commerce.
Proceedings of the Fuzzy Systems and Data Mining II, 2016

2015
Uncertain linguistic fuzzy soft sets and their applications in group decision making.
Appl. Soft Comput., 2015

