Jan Christian Blaise Cruz

Orcid: 0000-0002-2676-7790

According to our database, Jan Christian Blaise Cruz authored at least 24 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Thank You, Stingray: Multilingual Large Language Models Can Not (Yet) Disambiguate Cross-Lingual Word Sense.
CoRR, 2024

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines.
CoRR, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.
CoRR, 2024

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark.
CoRR, 2024

Samsung R&D Institute Philippines @ WMT 2024 Low-resource Languages of Spain Shared Task.
Proceedings of the Ninth Conference on Machine Translation, 2024

Samsung R&D Institute Philippines @ WMT 2024 Indic MT Task.
Proceedings of the Ninth Conference on Machine Translation, 2024


2023
Multilingual Large Language Models Are Not (Yet) Code-Switchers.
CoRR, 2023

Prompting Multilingual Large Language Models to Generate Code-Mixed Texts: The Case of South East Asian Languages.
CoRR, 2023

Samsung R&D Institute Philippines at WMT 2023.
Proceedings of the Eighth Conference on Machine Translation, 2023

Multilingual Large Language Models Are Not (Yet) Code-Switchers.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Automatic WordNet Construction using Word Sense Induction through Sentence Embeddings.
CoRR, 2022

Using Synthetic Data for Conversational Response Generation in Low-resource Settings.
CoRR, 2022

Samsung Research Philippines - Datasaur AI's Submission for the WMT22 Large Scale Multilingual Translation Task.
Proceedings of the Seventh Conference on Machine Translation, 2022

Improving Large-scale Language Models and Resources for Filipino.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Using Synthetic Data to Train a Conversational Response Generation Model in Low Resource Settings.
Proceedings of the International Conference on Asian Language Processing, 2022

2021
Data Processing Matters: SRPH-Konvergen AI's Machine Translation System for WMT'21.
Proceedings of the Sixth Conference on Machine Translation, 2021

Simplifying Paragraph-Level Question Generation via Transformer Language Models.
Proceedings of the PRICAI 2021: Trends in Artificial Intelligence, 2021

Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets.
Proceedings of the PRICAI 2021: Trends in Artificial Intelligence, 2021

2020
Investigating the True Performance of Transformers in Low-Resource Languages: A Case Study in Automatic Corpus Creation.
CoRR, 2020

Establishing Baselines for Text Classification in Low-Resource Languages.
CoRR, 2020

Transformer-based End-to-End Question Generation.
CoRR, 2020

Localization of Fake News Detection via Multitask Transfer Learning.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

2019
Evaluating Language Model Finetuning Techniques for Low-resource Languages.
CoRR, 2019

