Theory and applications of natural language processing
GENERAL NOTES
Text of Note
Includes index.
CONTENTS NOTE
Text of Note
Intro; Preface; Acknowledgements; Contents; About the Authors; 1 Turkish and Its Challenges for Language and Speech Processing; 1.1 Introduction; 1.2 Turkish Morphology; 1.3 Constituent Order and Morphology-Syntax Interface; 1.4 Applications; 1.5 State-of-the-Art Tools and Resources for Turkish; Notes; References; 2 Morphological Processing for Turkish; 2.1 Introduction; 2.2 Overview of Turkish Morphology; 2.3 Morphophonology and Morphographemics; 2.4 Root Lexicons and Morphotactics; 2.4.1 Representational Convention; 2.4.2 Nominal Morphotactics; 2.4.3 Verbal Morphotactics; 2.4.4 Derivations
3.3.1.2 Constraints with Voting; 3.3.2 Learning the Rules; 3.3.3 Models Based on Inflectional Group n-Grams; 3.3.4 Discriminative Methods for Disambiguation; 3.4 Discussion; 3.4.1 Data Sets; 3.4.2 Experimental Results; 3.5 Conclusions; References; 4 Language Modeling for Turkish Text and Speech Processing; 4.1 Introduction; 4.2 Language Modeling; 4.3 Challenges in Statistical Language Modeling for Turkish; 4.4 Sub-lexical Units for Statistical Language Modeling; 4.4.1 Linguistic Sub-lexical Units; 4.4.2 Statistical Sub-lexical Units; 4.5 Statistical Language Modeling for Turkish
Text of Note
4.5.1 Language Modeling with Linguistic Sub-lexical Units; 4.5.1.1 Surface Form Stem+Ending Model; 4.5.1.2 Lexical Form Stem+Ending Model; 4.5.2 Statistical Sub-lexical Units: Morphs; 4.6 Discriminative Language Modeling for Turkish; 4.6.1 Discriminative Language Model; 4.6.2 Feature Sets for Turkish DLM; 4.6.2.1 Basic n-Gram Features; 4.6.2.2 Linguistically Motivated Features; 4.6.2.3 Statistically Motivated Features; 4.7 Conclusions; References; 5 Turkish Speech Recognition; 5.1 Introduction; 5.2 Foundations of Automatic Speech Recognition; 5.3 Turkish Language Resources for ASR
Text of Note
5.3.1 Turkish Acoustic and Text Data; 5.3.2 Linguistic Tools Used in Turkish ASR; 5.4 Turkish ASR Systems; 5.4.1 Newspaper Content Transcription System; 5.4.2 Turkish Broadcast News Transcription System; 5.4.3 LVCSR System for Call Center Conversations; 5.5 Conclusions; References; 6 Turkish Named-Entity Recognition; 6.1 Introduction; 6.2 NER on Turkish; 6.3 Task Description; 6.3.1 Representation; 6.3.2 Evaluating NER Performance; 6.4 Domain and Datasets; 6.4.1 Formal Texts; 6.4.2 Informal Texts; 6.4.3 Challenges of Informal Texts for NER; 6.5 Preprocessing for NER; 6.5.1 Tokenization
SUMMARY OR ABSTRACT
Text of Note
This book brings together work on Turkish natural language and speech processing over the last 25 years, covering numerous fundamental tasks ranging from morphological processing and language modeling to full-fledged deep parsing and machine translation, as well as computational resources developed along the way to enable most of this work. Owing to its complex morphology and free constituent order, Turkish has proved to be a fascinating language for natural language and speech processing research and applications. After an overview of the aspects of Turkish that make it challenging for natural language and speech processing tasks, this book discusses in detail the main tasks and applications of Turkish natural language and speech processing. A compendium of the work on Turkish natural language and speech processing, it is a valuable reference for new researchers considering computational work on Turkish, as well as a one-stop resource for commercial and research institutions planning to develop applications for Turkish. It also serves as a blueprint for similar work on other Turkic languages such as Azeri, Turkmen and Uzbek.