Shwai He

According to our database, Shwai He authored at least 23 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



Bibliography

2024
Router-Tuning: A Simple and Effective Approach for Enabling Dynamic-Depth in Transformers.
CoRR, 2024

What Matters in Transformers? Not All Attention is Needed.
CoRR, 2024

Loki: Low-Rank Keys for Efficient Sparse Attention.
CoRR, 2024

Demystifying the Compression of Mixture-of-Experts Through a Unified Framework.
CoRR, 2024

RESSA: Repair Sparse Vision-Language Models via Sparse Cross-Modality Adaptation.
CoRR, 2024

Reformatted Alignment.
CoRR, 2024

Accurate prediction of antibody function and structure using bio-inspired antibody language model.
Briefings Bioinform., 2024

Reformatted Alignment.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Reflection-Tuning: Data Recycling Improves LLM Instruction-Tuning.
CoRR, 2023

MerA: Merging Pretrained Adapters For Few-Shot Learning.
CoRR, 2023

SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

NeuralSlice: Neural 3D Triangle Mesh Reconstruction via Slicing 4D Tetrahedral Meshes.
Proceedings of the International Conference on Machine Learning, 2023

Merging Experts into One: Improving Computational Efficiency of Mixture of Experts.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

PAD-Net: An Efficient Framework for Dynamic Networks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Cherry Hypothesis: Identifying the Cherry on the Cake for Dynamic Networks.
CoRR, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
CoRR, 2022

Vega-MT: The JD Explore Academy Translation System for WMT22.
CoRR, 2022

When Sparsity Meets Dynamic Convolution.
CoRR, 2022

Vega-MT: The JD Explore Academy Machine Translation System for WMT22.
Proceedings of the Seventh Conference on Machine Translation, 2022

SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

2021
Multi-modal Attention Network for Stock Movements Prediction.
CoRR, 2021
