Traveling from place to place and visiting one library after another to consult ancient books and documents is a memory shared by many researchers of ancient books. In the digital age, that is changing. The National Library (National Center for the Protection of Ancient Books) and six other institutions recently released 6,786 new digital resources of ancient books online. So far, 130,000 digitized ancient books have been published online nationwide. Through digital means, these voluminous ancient books are coming out of their "secluded chambers and high shelves", putting this cultural heritage within reach.
Balancing "collection" and "use"
Ancient books, as cultural relics, must be protected; as documents, they must be available to readers. Balancing "collection" and "use" has always been the central concern in protecting ancient books, and digitization is the best way to achieve it. The digitization of Chinese ancient books began in the 1990s. As digital technology has matured, the work of supporting ancient books with science and technology has made gratifying progress.
"In 2016, the National Library set up the 'Chinese Ancient Books Resource Library' platform and released digital resources such as ordinary ancient books, oracle bones, and Dunhuang documents, all of which can be read online without logging in." Nan Jiangtao, associate researcher at the National Library, said that the National Library has also worked with collecting institutions at home and abroad to jointly release resources such as "Dunhuang Manuscripts Held in France", "Tianjin Library Ancient Books" and "Yunnan Provincial Library Ancient Books", basically completing the framework of a "national digital platform for ancient books".
With the in-depth development of the "Chinese Ancient Books Protection Plan", libraries around the country have invested manpower and material resources to vigorously promote the digitization of ancient books. The National Library has jointly released digital ancient books with 39 institutions. The release on January 4 was the seventh joint release, and it contains not only Ming and Qing dynasty printed editions but also distinctive resources such as tombstone rubbings. Relevant data show that of the 130,000 digitized ancient books now available, more than 102,000 titles (items) belong to the "Chinese Ancient Books Resource Library".
"The 130,000 digital resources of ancient books are especially precious to us researchers." Yang Haizheng, professor in the Chinese Department of Peking University, remarked that online access eliminates the time spent traveling to and from libraries and balances the nature of ancient books as both cultural relics and documents.
AI helps organize ancient books
Converting paper ancient books into digital form is only the first step in their preservation. "The existing digital ancient books are mostly converted from microfilm, with low resolution, and are inconvenient to use." Yang Haizheng explained that such ancient books usually have no search function: to read a particular passage, you have to page through the original, which makes it difficult to quickly find the knowledge you want.
The rapid development of artificial intelligence has brought revolutionary changes to the collation and classification of digital ancient books. The digital ancient books platform "Reading Ancient Books", developed by ByteDance and the Digital Humanities Research Center of Peking University and released in October 2022, is a vivid example.
On the "Reading Ancient Books" website, the reporter saw "Zhouyi", "Zuozhuan" and "Liji" displayed on the home page. Open any book, and the chapter directory is on the left and the text on the right. The layout not only suits the reading habits of modern readers but also recreates the visual appeal of reading ancient books on paper.
"Unlike some digital platforms, 'Reading Ancient Books' is completely free, and adds a series of convenient functions such as conversion between simplified and traditional characters, comparison against the source images, and full-text search." According to Tang Kai Xin, general manager of the Corporate Social Responsibility Department of Tiktok Group, the platform mainly uses three technologies: character recognition, automatic punctuation and named entity recognition. It can not only extract and organize the text from photocopies, but also identify personal names, place names and other information in the text through sequence labeling, with an accuracy of 96 to 97 percent.
"The platform has collated and launched 685 classic ancient books, totaling more than 79 million characters, mainly from the 'Four Series'." Tang said that the mobile version of "Reading Ancient Books" is now online, and the list of titles on the platform will continue to be updated.
Industry insiders predict that with the application of AI technology, the historical and cultural knowledge contained in ancient books and documents will continue to be extracted and built into a variety of knowledge bases, which will support Internet front-end applications in the form of knowledge graphs.
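For readers curious about what identifying names "through sequence labeling" can look like in practice, the sketch below shows one common approach: a model assigns each character a tag, and the tags are then merged into named entities. This is not the platform's actual code. The BIO tag scheme, the function name decode_bio, and the sample sentence and tags are all assumptions made for illustration, and the tags are written by hand rather than produced by a model.

```python
from typing import List, Optional, Tuple


def decode_bio(chars: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Merge per-character BIO tags into (entity_text, entity_type) pairs.

    "B-X" starts an entity of type X, "I-X" continues it, "O" is outside.
    """
    entities: List[Tuple[str, str]] = []
    current: List[str] = []             # characters of the entity being built
    current_type: Optional[str] = None  # type of the entity being built

    def flush() -> None:
        # Store the entity built so far, if any, and reset the buffer.
        nonlocal current, current_type
        if current and current_type is not None:
            entities.append(("".join(current), current_type))
        current, current_type = [], None

    for ch, tag in zip(chars, tags):
        if tag.startswith("B-"):
            flush()
            current, current_type = [ch], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(ch)
        else:
            # "O", or an "I-" tag that does not continue the open entity.
            flush()
    flush()
    return entities


# Hypothetical example: hand-written tags for a four-character sentence.
sentence = list("孔子適衛")
labels = ["B-PER", "I-PER", "O", "B-LOC"]
print(decode_bio(sentence, labels))
```

In a real system the tags would come from a trained model, and the extracted names could then feed the kind of knowledge bases described above.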
Cross-sector cooperation has become a trend
In fact, even before the launch of "Reading Ancient Books", cross-sector cooperation among cultural heritage protection institutions, research institutions and Internet companies had become increasingly common. For example, Tencent and the Dunhuang Research Academy developed AI technology for recognizing deterioration, helping to "diagnose" the thousand-year-old Dunhuang murals.
Thanks to their strengths in product development and design, the participation of non-governmental players such as Internet companies will further guarantee the service quality of digital platforms for ancient books. "We have excellent product managers, designers, and software engineers who can continuously optimize the product functions of the digital ancient books platform," Tang Kai Xin said.
The birth of "Reading Ancient Books" also depended on the support of experts and scholars. Wang Jun, director of the Digital Humanities Research Center of Peking University, said that in this cooperation Peking University is responsible for manual review and proofreading, which compensates for the recognition errors of artificial intelligence, and uses its own academic platform to bring in more professional researchers and students.
Experts believe that in the collation of ancient books, humanities and social science scholars should get actively involved and strengthen cooperation with technical personnel, making better use of machines rather than being led by them, so as to ensure the accuracy of the results.
"How to cultivate interdisciplinary talent with both technical and academic abilities, and how to build a multidisciplinary curriculum for classical literature and related majors in colleges and universities, are issues that need comprehensive consideration," Wang Jun said.