Collaborative Research Project
Advanced digitization and AI analysis of Japanese Buddhist literature through data-driven methods
Overview
This project is a pilot initiative on digitizing Japanese translations of Buddhist canonical literature, with a particular focus on the modern collection Kokuyaku Issaikyō. We create high-quality digital data in which page images are precisely aligned with carefully corrected Japanese text, and use this as a foundation for building a vision–language model (VLM) infrastructure tailored to Japanese Buddhist texts. The aim is to explore how far such models can support automatic reading and layout understanding, to develop a reusable workflow that can be applied to similar materials, and ultimately to contribute to the enhancement of digital archives at Tohoku University, the use of Japanese sources for AI model training, and the worldwide accessibility of Japanese scholarly materials.
Researchers
Sebastian NEHRDICH
Principal Investigator
Assistant Professor, Center for Integrated Japanese Studies, Tohoku University
Field of research: Digital Humanities
Ryuta KIKUYA
Associate Professor, Graduate School of Arts & Letters, Tohoku University
Field of research: Indo-Tibetan Buddhism
Collaborators
・Satoshi KATŌ
(Tohoku University Archives, Professor)
・Kiyonori NAGASAKI
(Keio University, Faculty of Letters, Professor)
