Raj Dabre
Orcid: 0000-0003-0664-3421
According to our database, Raj Dabre authored at least 124 papers between 2012 and 2025.
Bibliography
2025
CoRR, February, 2025
PrahokBART: A Pre-trained Sequence-to-Sequence Model for Khmer Natural Language Generation.
Proceedings of the 31st International Conference on Computational Linguistics, 2025
2024
Trans. Assoc. Comput. Linguistics, 2024
Bilingual Corpus Mining and Multistage Fine-tuning for Improving Machine Translation of Lecture Transcripts.
J. Inf. Process., 2024
Are Language Models Agnostic to Linguistically Grounded Perturbations? A Case Study of Indic Languages.
CoRR, 2024
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines.
CoRR, 2024
An Empirical Comparison of Vocabulary Expansion and Initialization Approaches for Language Models.
CoRR, 2024
How effective is Multi-source pivoting for Translation of Low Resource Indian Languages?
CoRR, 2024
CoRR, 2024
Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese.
CoRR, 2024
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages.
CoRR, 2024
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization.
CoRR, 2024
PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities.
CoRR, 2024
Findings of WMT 2024's MultiIndic22MT Shared Task for Machine Translation of 22 Indian Languages.
Proceedings of the Ninth Conference on Machine Translation, 2024
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Kreyòl-MT: Building MT for Latin American, Caribbean and Colonial African Creole Languages.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
SubMerge: Merging Equivalent Subword Tokenizations for Subword Regularized Models in Neural Machine Translation.
Proceedings of the 25th Annual Conference of the European Association for Machine Translation (Volume 1), 2024
Proceedings of the 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD), 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
How Effective is Synthetic Data and Instruction Fine-tuning for Translation with Markup using LLMs?
Proceedings of the 16th Conference of the Association for Machine Translation in the Americas, 2024
PUB: A Pragmatics Understanding Benchmark for Assessing LLMs' Pragmatics Capabilities.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, 2024
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
SelfSeg: A Self-supervised Sub-word Segmentation Method for Neural Machine Translation.
ACM Trans. Asian Low Resour. Lang. Inf. Process., August, 2023
IndicTrans2: Towards High-Quality and Accessible Machine Translation Models for all 22 Scheduled Indian Languages.
Trans. Mach. Learn. Res., 2023
Low-resource Multilingual Neural Translation Using Linguistic Feature-based Relevance Mechanisms.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2023
CoRR, 2023
Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models.
CoRR, 2023
Variable-length Neural Interlingua Representations for Zero-shot Neural Machine Translation.
CoRR, 2023
Proceedings of the Eighth Conference on Machine Translation, 2023
Proceedings of the 20th International Conference on Spoken Language Translation, 2023
DecoMT: Decomposed Prompting for Machine Translation Between Related Languages using Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
CTQScorer: Combining Multiple Features for In-context Example Selection for Machine Translation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
An Empirical Study of Leveraging Knowledge Distillation for Compressing Multilingual Neural Machine Translation Models.
Proceedings of the 24th Annual Conference of the European Association for Machine Translation, 2023
Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023
IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
SciCap+: A Knowledge Augmented Dataset to Study the Challenges of Scientific Figure Captioning.
Proceedings of the Workshop on Scientific Document Understanding co-located with 37th AAAI Conference on Artificial Intelligence (AAAI 2023), 2023
2022
CoRR, 2022
NICT at MixMT 2022: Synthetic Code-Mixed Pre-training and Multi-way Fine-tuning for Hinglish-English Translation.
Proceedings of the Seventh Conference on Machine Translation, 2022
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2022, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
BERTSeg: BERT Based Unsupervised Subword Segmentation for Neural Machine Translation.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022
A Multilingual Multiway Evaluation Data Set for Structured Document Translation of Asian Languages.
Proceedings of the Findings of the Association for Computational Linguistics: AACL-IJCNLP 2022, 2022
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Proceedings of the 29th International Conference on Computational Linguistics, 2022
Proceedings of the 9th Workshop on Asian Translation, 2022
Proceedings of the 9th Workshop on Asian Translation, 2022
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
2021
CoRR, 2021
Recurrent Stacking of Layers in Neural Networks: An Application to Neural Machine Translation.
CoRR, 2021
Studying The Impact Of Document-level Context On Simultaneous Neural Machine Translation.
Proceedings of the 18th Biennial Machine Translation Summit - Volume 1: Research Track, 2021
Proceedings of the 18th Biennial Machine Translation Summit - Volume 1: Research Track, 2021
Proceedings of the 8th Workshop on Asian Translation, 2021
NICT-5's Submission To WAT 2021: MBART Pre-training And In-Domain Fine Tuning For Indic Languages.
Proceedings of the 8th Workshop on Asian Translation, 2021
2020
Mach. Transl., 2020
Pre-training via Leveraging Assisting Languages and Data Selection for Neural Machine Translation.
CoRR, 2020
Combining Sequence Distillation and Transfer Learning for Efficient Low-Resource Neural Machine Translation Models.
Proceedings of the Fifth Conference on Machine Translation, 2020
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020
Coursera Corpus Mining and Multistage Fine-Tuning for Improving Lectures Translation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Harnessing Cross-lingual Features to Improve Cognate Detection for Low-resource Languages.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Proceedings of the 7th Workshop on Asian Translation, 2020
NICT's Submission To WAT 2020: How Effective Are Simple Many-To-Many Neural Machine Translation Models?
Proceedings of the 7th Workshop on Asian Translation, 2020
Proceedings of the Fourth Workshop on Neural Generation and Translation, 2020
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 2020
2019
Multi-Layer Softmaxing during Training Neural Machine Translation for Flexible Decoding with Fewer Layers.
CoRR, 2019
CoRR, 2019
Proceedings of the Fourth Conference on Machine Translation, 2019
NICT's Supervised Neural Machine Translation Systems for the WMT19 Translation Robustness Task.
Proceedings of the Fourth Conference on Machine Translation, 2019
NICT's Supervised Neural Machine Translation Systems for the WMT19 News Translation Task.
Proceedings of the Fourth Conference on Machine Translation, 2019
Exploiting Out-of-Domain Parallel Data through Multilingual Transfer Learning for Low-Resource Neural Machine Translation.
Proceedings of Machine Translation Summit XVII Volume 1: Research Track, 2019
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Exploiting Multilingualism through Multistage Fine-Tuning for Low-Resource Neural Machine Translation.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Proceedings of the 6th Workshop on Asian Translation, 2019
NICT's participation to WAT 2019: Multilingualism and Multi-step Fine-Tuning for Low Resource NMT.
Proceedings of the 6th Workshop on Asian Translation, 2019
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019
2018
Exploiting Multilingual Corpora Simply and Efficiently in Neural Machine Translation.
J. Inf. Process., 2018
A Comprehensive Empirical Comparison of Domain Adaptation Methods for Neural Machine Translation.
J. Inf. Process., 2018
Proceedings of the 32nd Pacific Asia Conference on Language, 2018
NICT's Participation in WAT 2018: Approaches Using Multilingualism and Recurrently Stacked Layers.
Proceedings of the 32nd Pacific Asia Conference on Language, 2018
2017
CoRR, 2017
An Empirical Comparison of Simple Domain Adaptation Methods for Neural Machine Translation.
CoRR, 2017
An Empirical Study of Language Relatedness for Transfer Learning in Neural Machine Translation.
Proceedings of the 31st Pacific Asia Conference on Language, Information and Computation, 2017
Enabling Multi-Source Neural Machine Translation By Concatenating Source Sentences In Multiple Languages.
Proceedings of Machine Translation Summit XVI, Volume 1: Research Track, 2017
Proceedings of the 14th International Conference on Spoken Language Translation, 2017
Proceedings of the IJCNLP 2017, Taipei, Taiwan, November 27, 2017
Proceedings of the 4th Workshop on Asian Translation, 2017
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017
2016
Sophisticated Lexical Databases - Simplified Usage: Mobile Applications and Browser Plugins For Wordnets.
Proceedings of the 8th Global WordNet Conference, 2016
Proceedings of the First Conference on Machine Translation, 2016
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016
2015
Large-scale Dictionary Construction via Pivot-based Statistical Machine Translation with Significance Pruning and Neural Network Features.
Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, 2015
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015
Proceedings of the 12th International Conference on Natural Language Processing, 2015
Proceedings of the 2nd Workshop on Asian Translation, 2015
2014
Proceedings of the Seventh Global Wordnet Conference, 2014
Proceedings of the 11th International Conference on Natural Language Processing, 2014
Anou Tradir: Experiences In Building Statistical Machine Translation Systems For Mauritian Languages - Creole, English, French.
Proceedings of the 11th International Conference on Natural Language Processing, 2014
Tackling Close Cousins: Experiences In Developing Statistical Machine Translation Systems For Marathi And Hindi.
Proceedings of the 11th International Conference on Natural Language Processing, 2014
2012
Proceedings of the COLING 2012, 2012