Shwai He

According to our database, Shwai He authored at least 23 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Bibliography

2024
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers.
CoRR, 2024

What Matters in Transformers? Not All Attention is Needed.
CoRR, 2024

Loki: Low-Rank Keys for Efficient Sparse Attention.
CoRR, 2024

Demystifying the Compression of Mixture-of-Experts Through a Unified Framework.
CoRR, 2024

RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation.
CoRR, 2024

Reformatted Alignment.
CoRR, 2024

Accurate prediction of antibody function and structure using bio-inspired antibody language model.
Briefings Bioinform., 2024

Reformatted Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning.
CoRR, 2023

MerA: Merging Pretrained Adapters For Few-Shot Learning.
CoRR, 2023

SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes.
Proceedings of the International Conference on Machine Learning, 2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

PAD-Net: An Efficient Framework for Dynamic Networks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Cherry Hypothesis: Identifying the Cherry on the Cake for Dynamic Networks.
CoRR, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
CoRR, 2022

Vega-MT: The JD Explore Academy Translation System for WMT22.
CoRR, 2022

When Sparsity Meets Dynamic Convolution.
CoRR, 2022

Vega-MT: The JD Explore Academy Machine Translation System for WMT22.
Proceedings of the Seventh Conference on Machine Translation, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Multi-modal Attention Network for Stock Movements Prediction.
CoRR, 2021
