For researchers (English)
The NDL Lab is looking for researchers and organizations to collaborate on our activities.
We invite you to experiment with novel technologies related to library services using National Diet Library (NDL) data.
The following data are available. Researchers who are interested in the data should contact the Research and Development for Next-Generation Systems Office.
(1) Metadata of materials held by the NDL (including text in the table of contents and authority data)
The NDL provides metadata of its collection materials as Linked Open Data (LOD). For more information, see "Use and Connect: What is NDL Linked Open Data (LOD)?" (link to NDL website).
(2) Image data and layout data sets of materials held by the NDL whose copyright protection has expired
Image data of books, rare books, and old materials in the NDL Digital Collection whose copyright protection has expired, together with data sets containing layout information for machine learning, are available in the following repository: https://github.com/ndl-lab/layout-dataset
(3) Text data obtained by applying Optical Character Recognition (OCR) to the image data of (2)
The text data set is available in the following repository: https://github.com/ndl-lab/tugidigi-txtdata
In some cases, data other than the above may also be provided. Please feel free to inquire.
(4) Trained model file for the NDC (Nippon Decimal Classification) predictor
The trained fastText model is available in the following repository: https://github.com/ndl-lab/ndc_predictor
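As a starting point for working with the repositories listed above, the following minimal Python sketch fetches them with git. The repository URLs are taken from this page; the helper names (clone_command, clone_all), the shallow-clone option, and the destination directory are assumptions made for illustration, not something the NDL Lab prescribes. It also assumes git is installed and on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List

# Repository URLs exactly as listed on this page.
REPOSITORIES = [
    "https://github.com/ndl-lab/layout-dataset",
    "https://github.com/ndl-lab/tugidigi-txtdata",
    "https://github.com/ndl-lab/ndc_predictor",
]


def clone_command(url: str, dest_root: Path) -> List[str]:
    """Build the git command that clones `url` into a subfolder of `dest_root`.

    The subfolder is named after the last path segment of the URL.
    A shallow clone (--depth 1) is an assumption made here to limit
    download size; drop it if the full history is needed.
    """
    name = url.rstrip("/").rsplit("/", 1)[-1]
    return ["git", "clone", "--depth", "1", url, str(dest_root / name)]


def clone_all(dest_root: Path) -> None:
    """Clone every listed repository under `dest_root` (hypothetical location)."""
    dest_root.mkdir(parents=True, exist_ok=True)
    for url in REPOSITORIES:
        # check=True raises CalledProcessError if git exits with an error.
        subprocess.run(clone_command(url, dest_root), check=True)
```

Calling clone_all(Path("ndl-lab-data")) would place each repository in its own folder under ndl-lab-data. The data sets may be large, so check the available disk space and each repository's README for terms of use and file layout before downloading.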