Title

Namsel: An Optical Character Recognition System for Tibetan Text

Creator

Rowinski, Zach; Keutzer, Kurt

Subject

Classification

Library

Center and Library of Islamic Studies in European Languages

Location

Province: Qom, City: Qom

Library contact: 32910706-025

National bibliography number

Number

LA6d5781k5

Language of the work

Language of the written or spoken text, etc.

English

Title and statement of responsibility

Title proper

Namsel: An Optical Character Recognition System for Tibetan Text

General material designation

[Article]

First statement of responsibility

Rowinski, Zach; Keutzer, Kurt

Summary or abstract notes

Text of note

The use of advanced computational methods for the analysis of large corpora of electronic texts is becoming increasingly popular in humanities and social science research. Unfortunately, Tibetan Studies has lacked such a repository of electronic, searchable texts. The automated recognition of printed texts, known as Optical Character Recognition (OCR), offers a solution to this problem; however, until recently, robust OCR systems for the Tibetan language have not been available. In this paper, we introduce one new system, called Namsel, which uses Optical Character Recognition (OCR) to support the production, review, and distribution of searchable Tibetan texts at a large scale. Namsel tackles a number of challenges unique to the recognition of complex scripts such as Tibetan uchen and has been able to achieve high accuracy rates on a wide range of machine-printed works. In this paper, we discuss the details of Tibetan OCR, how Namsel works, and the problems it is able to solve. We also discuss the collaborative work between Namsel and its partner libraries aimed at building a comprehensive database of historical and modern Tibetan works, a database that consists of more than one million pages of texts spanning over a thousand years of literary production.

Set

Date of publication

2016

Title

Himalayan Linguistics

Volume number

15/1

Personal name as main entry (primary intellectual responsibility)

Entry element

Rowinski, Zach; Keutzer, Kurt

Electronic location and access

Electronic name

Bibliographic record information

Material type

[Article]

Worksheet code

275578

Record access information

Access level

Completed