Zihang Dai

According to our database, Zihang Dai authored at least 31 papers between 2016 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Combined scaling for zero-shot transfer learning.
Neurocomputing, October, 2023

2022
Transformer Quality in Linear Time.
Proceedings of the International Conference on Machine Learning, 2022

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Combined Scaling for Zero-shot Transfer Learning.
CoRR, 2021

Primer: Searching for Efficient Transformers for Language Modeling.
CoRR, 2021

Searching for Efficient Transformers for Language Modeling.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Combiner: Full Attention Transformer with Sparse Computation Cost.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Pay Attention to MLPs.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CoAtNet: Marrying Convolution and Attention for All Data Sizes.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Meta Pseudo Labels.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

2020
Unsupervised Parallel Corpus Mining on Web Data.
CoRR, 2020

Unsupervised Data Augmentation for Consistency Training.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Wiki-40B: Multilingual Language Model Dataset.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

A Mutual Information Maximization Perspective of Language Representation Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Unsupervised Data Augmentation.
CoRR, 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Re-examination of the Role of Latent Variables in Sequence Modeling.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Characterizing and Avoiding Negative Transfer.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Fast and Simple Mixture of Softmaxes with BPE and Hybrid-LightRNN for Language Generation.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Breaking the Softmax Bottleneck: A High-Rank RNN Language Model.
Proceedings of the 6th International Conference on Learning Representations, 2018

Large-scale Cloze Test Dataset Created by Teachers.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

From Credit Assignment to Entropy Regularization: Two New Algorithms for Neural Sequence Prediction.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Large-scale Cloze Test Dataset Designed by Teachers.
CoRR, 2017

Controllable Invariance through Adversarial Feature Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Good Semi-supervised Learning That Requires a Bad GAN.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Calibrating Energy-based Generative Adversarial Networks.
Proceedings of the 5th International Conference on Learning Representations, 2017

An Interpretable Knowledge Transfer Model for Knowledge Base Completion.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
CFO: Conditional Focused Neural Question Answering with Large-scale Knowledge Bases.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

