Latent semantic mapping (LSM) is a generalization of latent semantic analysis (LSA), a paradigm originally developed to capture hidden word patterns in a text document corpus. In information retrieval, LSA enables retrieval on the basis of conceptual content, instead of merely matching words between queries and documents. It operates under the assumption that there is some latent semantic structure in the data, which is partially obscured by the randomness of word choice with respect to retrieval. Algebraic and/or statistical techniques are brought to bear to estimate this structure and get rid of the obscuring "noise." This results in a parsimonious continuous parameter description of words and documents, which then replaces the original parameterization in indexing and retrieval. This approach exhibits three main characteristics: 1) discrete entities (words and documents) are mapped onto a continuous vector space; 2) this mapping is determined by global correlation patterns; and 3) dimensionality reduction is an integral part of the process. Such fairly generic properties are advantageous in a variety of different contexts, which motivates a broader interpretation of the underlying paradigm. The outcome (LSM) is a data-driven framework for modeling meaningful global relationships implicit in large volumes of (not necessarily textual) data. This monograph gives a general overview of the framework, and underscores the multifaceted benefits it can bring to a number of problems in natural language understanding and spoken language processing. It concludes with a discussion of the inherent tradeoffs associated with the approach, and some perspectives on its general applicability to data-driven information extraction.
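The three characteristics named in the abstract (mapping discrete entities to a continuous space, reliance on global correlation patterns, and built-in dimensionality reduction) can be illustrated with a minimal sketch. It assumes a truncated singular value decomposition of a raw word-by-document count matrix, which is one common way to realize LSA and not necessarily the exact formulation used in the monograph. The toy corpus, the function names `lsa_embed` and `cosine`, and the choice of `rank=2` are all hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical toy corpus: two "finance" documents and two "baking" documents.
docs = [
    "stock market shares trading",
    "market shares investors stock",
    "recipe oven bake flour",
    "bake flour sugar recipe",
]

def lsa_embed(docs, rank=2):
    """Map words and documents into a shared `rank`-dimensional space.

    Assumption: raw counts and a plain truncated SVD; practical systems
    usually apply a weighting scheme to the counts first.
    """
    vocab = sorted({w for d in docs for w in d.split()})
    index = {w: i for i, w in enumerate(vocab)}

    # Word-by-document count matrix (rows: words, columns: documents).
    counts = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.split():
            counts[index[w], j] += 1.0

    # Global correlation patterns are captured by the SVD of the full matrix.
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)

    # Dimensionality reduction: keep only the `rank` largest singular values.
    word_vecs = U[:, :rank] * s[:rank]
    doc_vecs = Vt[:rank, :].T * s[:rank]
    return vocab, word_vecs, doc_vecs

def cosine(a, b):
    """Cosine similarity between two vectors in the reduced space."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

vocab, word_vecs, doc_vecs = lsa_embed(docs, rank=2)

# Documents on the same topic should be closer than documents on different topics.
same_topic = cosine(doc_vecs[0], doc_vecs[1])
cross_topic = cosine(doc_vecs[0], doc_vecs[2])

# "trading" and "investors" never co-occur in any document, yet they share context.
word_sim = cosine(word_vecs[vocab.index("trading")],
                  word_vecs[vocab.index("investors")])
```

In this sketch, `word_sim` is high even though "trading" and "investors" never appear in the same document, which is the sense in which the mapping recovers structure that literal word matching would miss.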
Other edition of the work in another medium
International standard book or music number (ISBN/ISMN)
9781598291049
Subject (topical name or general noun phrase)
Subject (not authority-controlled)
Automatic speech recognition.
Subject (not authority-controlled)
Computational linguistics.
Subject (not authority-controlled)
Latent semantic indexing.
Subject (not authority-controlled)
Semantics -- Data processing.
Subject (not authority-controlled)
Semantics -- Mathematical models.
Subject (not authority-controlled)
LANGUAGE ARTS & DISCIPLINES -- Linguistics -- Psycholinguistics.
Subject category
Subject (not authority-controlled)
LAN009040
Dewey Decimal Classification
Number
401/.9
Edition
22
Library of Congress Classification
Class number
P325.5.D38
Book number
B45 2007
Personal name as main entry (primary intellectual responsibility)