Zihang Dai

According to our database¹, Zihang Dai authored at least 31 papers between 2016 and 2023.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

[Figure: timeline of publications by year, 2016 to 2023. Legend: Book, In proceedings, Article, PhD thesis, Dataset, Other.]

Bibliography

2023

Combined scaling for zero-shot transfer learning.

Neurocomputing, October, 2023

2022

Transformer Quality in Linear Time.

Proceedings of the International Conference on Machine Learning, 2022

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision.

Proceedings of the Tenth International Conference on Learning Representations, 2022

2021

Combined Scaling for Zero-shot Transfer Learning.

CoRR, 2021

Primer: Searching for Efficient Transformers for Language Modeling.

CoRR, 2021

Searching for Efficient Transformers for Language Modeling.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Combiner: Full Attention Transformer with Sparse Computation Cost.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pay Attention to MLPs.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Meta Pseudo Labels.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020

Unsupervised Parallel Corpus Mining on Web Data.

Guokun Lai

Zihang Dai

Yiming Yang

CoRR, 2020

Unsupervised Data Augmentation for Consistency Training.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Wiki-40B: Multilingual Language Model Dataset.

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

A Mutual Information Maximization Perspective of Language Representation Learning.

Lingpeng Kong

Cyprien de Masson d'Autume

Proceedings of the 8th International Conference on Learning Representations, 2020

2019

Unsupervised Data Augmentation.

CoRR, 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Re-examination of the Role of Latent Variables in Sequence Modeling.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Characterizing and Avoiding Negative Transfer.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Breaking the Softmax Bottleneck: A High-Rank RNN Language Model.

Proceedings of the 6th International Conference on Learning Representations, 2018

Large-scale Cloze Test Dataset Created by Teachers.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation.

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction.

Zihang Dai

Qizhe Xie

Eduard H. Hovy

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017

Large-scale Cloze Test Dataset Designed by Teachers.

CoRR, 2017

Controllable Invariance through Adversarial Feature Learning.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Good Semi-supervised Learning That Requires a Bad GAN.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Calibrating Energy-based Generative Adversarial Networks.

Proceedings of the 5th International Conference on Learning Representations, 2017

An Interpretable Knowledge Transfer Model for Knowledge Base Completion.

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016

CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases.

Zihang Dai

Lei Li

Wei Xu

Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016
