EDINET-Bench: Evaluating LLMs on Complex Financial Tasks using Japanese Financial Statements.
CoRR, June 2025
llm-jp-modernbert: A ModernBERT Model Trained on a Large-Scale Japanese Corpus with Long Context Length.
CoRR, April 2025
Developing Japanese CLIP Models Leveraging an Open-weight LLM for Large-scale Dataset Translation.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Constructing Multimodal Datasets from Scratch for Rapid Development of a Japanese Visual Language Model.
CoRR, 2024
A Comprehensive Analysis of Memorization in Large Language Models.
Proceedings of the 17th International Natural Language Generation Conference, 2024