Advanced digitization and AI analysis of Japanese Buddhist literature through data-driven methods

Collaborative Research Project

Research Term: December 2025 -
Principal Investigator: Sebastian Nehrdich (Assistant Professor, Center for Integrated Japanese Studies)

Overview

This project is a pilot initiative to digitize Japanese translations of Buddhist canonical literature, with a particular focus on the modern collection Kokuyaku Issaikyō. We create high-quality digital data in which page images are precisely aligned with carefully corrected Japanese text, and use this data as a foundation for building a vision–language model (VLM) infrastructure tailored to Japanese Buddhist texts. The project has two immediate aims: to explore how far such models can support automatic reading and layout understanding, and to develop a reusable workflow that can be applied to similar materials. Ultimately, it seeks to contribute to the enhancement of digital archives at Tohoku University, the use of Japanese sources for AI model training, and the worldwide accessibility of Japanese scholarly materials.

Researchers

Sebastian NEHRDICH

Principal Investigator
Assistant Professor, Center for Integrated Japanese Studies, Tohoku University

Field of research: Digital Humanities 

Ryuta KIKUYA

Associate Professor, Graduate School of Arts & Letters, Tohoku University

Field of research: Indo-Tibetan Buddhism 

Collaborators

・Satoshi KATŌ
(Tohoku University Archives, Professor)

・Kiyonori NAGASAKI
(Keio University, Faculty of Letters, Professor)
