2025
On Efficient Training of Large-Scale Deep Learning Models.
ACM Comput. Surv., March, 2025
Explicit and Implicit Box Equivariance Learning for Weakly-Supervised Rotated Object Detection.
IEEE Trans. Emerg. Top. Comput. Intell., February, 2025
Hypnos: A domain-specific large language model for anesthesiology.
Neurocomputing, 2025
Code-switching finetuning: Bridging multilingual pretrained language models for enhanced cross-lingual performance.
Eng. Appl. Artif. Intell., 2025
Intention Analysis Makes LLMs A Good Jailbreak Defender.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
Self-Evolution Knowledge Distillation for LLM-based Machine Translation.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
MQM-APE: Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
2024
Can Linguistic Knowledge Improve Multimodal Alignment in Vision-Language Pretraining?
ACM Trans. Multim. Comput. Commun. Appl., December, 2024
E2S2: Encoding-Enhanced Sequence-to-Sequence Pretraining for Language Understanding and Generation.
IEEE Trans. Knowl. Data Eng., December, 2024
Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks.
IEEE Trans. Neural Networks Learn. Syst., October, 2024
Free-Form Composition Networks for Egocentric Action Recognition.
IEEE Trans. Circuits Syst. Video Technol., October, 2024
PanDa: Prompt Transfer Meets Knowledge Distillation for Efficient Model Adaptation.
IEEE Trans. Knowl. Data Eng., September, 2024
AdaSAM: Boosting sharpness-aware minimization with adaptive learning rate and momentum for training deep neural networks.
Neural Networks, January, 2024
Parameter-Efficient and Student-Friendly Knowledge Distillation.
IEEE Trans. Multim., 2024
Exploring sparsity in graph transformers.
Neural Networks, 2024
DynamicKV: Task-Aware Adaptive KV Cache Compression for Long Context LLMs.
CoRR, 2024
Simultaneous Computation and Memory Efficient Zeroth-Order Optimizer for Fine-Tuning Large Language Models.
CoRR, 2024
Exploring and Enhancing the Transfer of Distribution in Knowledge Distillation for Autoregressive Language Models.
CoRR, 2024
USCD: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding.
CoRR, 2024
Divide, Conquer and Combine: A Training-Free Framework for High-Resolution Image Perception in Multimodal Large Language Models.
CoRR, 2024
Aligning Large Language Models from Self-Reference AI Feedback with one General Principle.
CoRR, 2024
Demystifying the Compression of Mixture-of-Experts Through a Unified Framework.
CoRR, 2024
Improving Complex Reasoning over Knowledge Graph with Logic-Aware Curriculum Tuning.
CoRR, 2024
Achieving >97% on GSM8K: Deeply Understanding the Problems Makes LLMs Better Reasoners.
CoRR, 2024
Building Accurate Translation-Tailored LLMs with Language Aware Instruction Tuning.
CoRR, 2024
Towards Training A Chinese Large Language Model for Anesthesiology.
CoRR, 2024
Balanced Similarity with Auxiliary Prompts: Towards Alleviating Text-to-Image Retrieval Bias for CLIP in Zero-shot Learning.
CoRR, 2024
Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation.
CoRR, 2024
Mitigating Reward Hacking via Information-Theoretic Reward Modeling.
CoRR, 2024
Intention Analysis Prompting Makes Large Language Models A Good Jailbreak Defender.
CoRR, 2024
POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation.
CoRR, 2024
InfoRM: Mitigating Reward Hacking in RLHF via Information-Theoretic Reward Modeling.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
WisdoM: Improving Multimodal Sentiment Analysis by Fusing Contextual World Knowledge.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Learning from Imperfect Data: Towards Efficient Knowledge Distillation of Autoregressive Language Models for Text-to-SQL.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Self-Powered LLM Modality Expansion for Large Speech-Text Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Revisiting Catastrophic Forgetting in Large Language Model Tuning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Context-aware Watermark with Semantic Balanced Green-red Lists for Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024
Sheared Backpropagation for Fine-Tuning Foundation Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Take Care of Your Prompt Bias! Investigating and Mitigating Prompt Bias in Factual Knowledge Extraction.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024
3AM: An Ambiguity-Aware Multi-Modal Machine Translation Dataset.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation, 2024
ROSE Doesn't Do That: Boosting the Safety of Instruction-Tuned Large Language Models with Reverse Prompt Contrastive Decoding.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Revisiting Knowledge Distillation for Autoregressive Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Speech Sense Disambiguation: Tackling Homophone Ambiguity in End-to-End Speech Translation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Uncertainty Aware Learning for Language Model Alignment.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Revisiting Demonstration Selection Strategies in In-Context Learning.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
DB-LLM: Accurate Dual-Binarization for Efficient LLMs.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Multi-Step Denoising Scheduled Sampling: Towards Alleviating Exposure Bias for Diffusion Models.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Efficient Federated Learning Via Local Adaptive Amended Optimizer With Linear Speedup.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023
A perioperative risk assessment dataset with multi-view data based on online accelerated pairwise comparison.
Inf. Fusion, November, 2023
Knowledge Graph Augmented Network Towards Multiview Representation Learning for Aspect-Based Sentiment Analysis.
IEEE Trans. Knowl. Data Eng., October, 2023
KE-X: Towards subgraph explanations of knowledge graph embedding based on knowledge information gain.
Knowl. Based Syst., October, 2023
Recurrent graph encoder for syntax-aware neural machine translation.
Int. J. Mach. Learn. Cybern., April, 2023
Dynamic Contrastive Distillation for Image-Text Retrieval.
IEEE Trans. Multim., 2023
Unified Instance and Knowledge Alignment Pretraining for Aspect-Based Sentiment Analysis.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Concrete Subspace Learning based Interference Elimination for Multi-task Model Fusion.
CoRR, 2023
SpliceMix: A Cross-scale and Semantic Blending Augmentation Strategy for Multi-label Image Classification.
CoRR, 2023
Unlikelihood Tuning on Negative Samples Amazingly Improves Zero-Shot Translation.
CoRR, 2023
Deep Model Fusion: A Survey.
CoRR, 2023
MerA: Merging Pretrained Adapters For Few-Shot Learning.
CoRR, 2023
Recursively Summarizing Enables Long-Term Dialogue Memory in Large Language Models.
CoRR, 2023
Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks.
CoRR, 2023
On Efficient Training of Large-Scale Deep Learning Models: A Literature Review.
CoRR, 2023
Error Analysis Prompting Enables Human-Like Translation Evaluation in Large Language Models: A Case Study on ChatGPT.
CoRR, 2023
OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System.
CoRR, 2023
Can ChatGPT Understand Too? A Comparative Study on ChatGPT and Fine-tuned BERT.
CoRR, 2023
Bag of Tricks for Effective Language Model Pretraining and Downstream Adaptation: A Case Study on GLUE.
CoRR, 2023
SD-Conv: Towards the Parameter-Efficiency of Dynamic Convolution.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023
MPipeMoE: Memory Efficient MoE for Pre-trained Models with Adaptive Pipeline Parallelism.
Proceedings of the IEEE International Parallel and Distributed Processing Symposium, 2023
Prompt-Learning for Cross-Lingual Relation Extraction.
Proceedings of the International Joint Conference on Neural Networks, 2023
Gapformer: Graph Transformer with Graph Pooling for Node Classification.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023
Dynamic Regularized Sharpness Aware Minimization in Federated Learning: Approaching Global Consistency and Smooth Landscape.
Proceedings of the International Conference on Machine Learning, 2023
FedSpeed: Larger Local Interval, Less Communication Round, and Higher Generalization Accuracy.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Zero-shot Sharpness-Aware Quantization for Pre-trained Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Self-Evolution Learning for Mixup: Enhance Data Augmentation on Few-Shot Text Classification Tasks.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
PromptST: Abstract Prompt Learning for End-to-End Speech Translation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Towards Making the Most of ChatGPT for Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
Merging Experts into One: Improving Computational Efficiency of Mixture of Experts.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Revisiting Token Dropping Strategy in Efficient BERT Pretraining.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Self-Evolution Learning for Discriminative Language Model Pretraining.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Divide, Conquer, and Combine: Mixture of Semantic-Independent Experts for Zero-Shot Dialogue State Tracking.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Token-Level Self-Evolution Training for Sequence-to-Sequence Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023
Toward Human-Like Evaluation for Natural Language Generation with Error Analysis.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Unsupervised Dense Retrieval with Relevance-Aware Contrastive Pre-Training.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
PAD-Net: An Efficient Framework for Dynamic Networks.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
TransGEC: Improving Grammatical Error Correction with Translationese.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
CASN: Class-Aware Score Network for Textual Adversarial Detection.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Improving Simultaneous Machine Translation with Monolingual Data.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Original or Translated? On the Use of Parallel Data for Translation Quality Estimation.
CoRR, 2022
Toward Efficient Language Model Pretraining and Downstream Adaptation via Self-Evolution: A Case Study on SuperGLUE.
CoRR, 2022
Cherry Hypothesis: Identifying the Cherry on the Cake for Dynamic Networks.
CoRR, 2022
SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
CoRR, 2022
Vega-MT: The JD Explore Academy Translation System for WMT22.
CoRR, 2022
Parameter-Efficient and Student-Friendly Knowledge Distillation.
CoRR, 2022
BLISS: Robust Sequence-to-Sequence Learning via Self-Supervised Input Representation.
CoRR, 2022
Bridging Cross-Lingual Gaps During Leveraging the Multilingual Sequence-to-Sequence Pretraining for Text Generation.
CoRR, 2022
Improving Neural Machine Translation by Denoising Training.
CoRR, 2022
Vega-MT: The JD Explore Academy Machine Translation System for WMT22.
Proceedings of the Seventh Conference on Machine Translation, 2022
Where Does the Performance Improvement Come From? A Reproducibility Concern about Image-Text Retrieval.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022
MirrorAlign: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning.
Proceedings of the 19th International Conference on Spoken Language Translation, 2022
Improving Sharpness-Aware Minimization with Fisher Mask for Better Generalization on Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
SparseAdapter: An Easy Approach for Improving the Parameter-Efficiency of Adapters.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
Fine-tuning Global Model via Data-Free Knowledge Distillation for Non-IID Federated Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
On the Complementarity between Pre-Training and Random-Initialization for Resource-Rich Machine Translation.
Proceedings of the 29th International Conference on Computational Linguistics, 2022
A Contrastive Cross-Channel Data Augmentation Framework for Aspect-Based Sentiment Analysis.
Proceedings of the 29th International Conference on Computational Linguistics, 2022
Redistributing Low-Frequency Words: Making the Most of Monolingual Data in Non-Autoregressive Translation.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
2021
The USYD-JD Speech Translation System for IWSLT 2021.
CoRR, 2021
Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding.
CoRR, 2021
SLUA: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning.
CoRR, 2021
The USYD-JD Speech Translation System for IWSLT2021.
Proceedings of the 18th International Conference on Spoken Language Translation, 2021
Understanding and Improving Lexical Choice in Non-Autoregressive Translation.
Proceedings of the 9th International Conference on Learning Representations, 2021
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning.
Proceedings of the 9th International Conference on Learning Representations, 2021
Towards Efficiently Diversifying Dialogue Generation Via Embedding Augmentation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021
Improving Neural Machine Translation by Bidirectional Training.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
On the Complementarity between Pre-Training and Back-Translation for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021
On the Copying Behaviors of Pre-Training for Neural Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021
Progressive Multi-Granularity Training for Non-Autoregressive Translation.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021
Rejuvenating Low-Frequency Words: Making the Most of Parallel Data in Non-Autoregressive Translation.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
2020
Tencent AI Lab Machine Translation Systems for WMT20 Chat Translation Task.
Proceedings of the Fifth Conference on Machine Translation, 2020
Context-Aware Cross-Attention for Non-Autoregressive Translation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Self-Attention with Cross-Lingual Position Representation.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
2019
Recurrent Graph Syntax Encoder for Neural Machine Translation.
CoRR, 2019
The University of Sydney's Machine Translation System for WMT19.
Proceedings of the Fourth Conference on Machine Translation, 2019