Model-based clustering and classification for data science :
General material designation
[Book]
Other title information
with applications in R /
First statement of responsibility
Charles Bouveyron, Gilles Celeux, T. Brendan Murphy, Adrian E. Raftery.
Publication, distribution, etc.
Place of publication, distribution, etc.
Cambridge :
Name of publisher, distributor, etc.
Cambridge University Press,
Date of publication, distribution, etc.
2019.
Physical description
Specific material designation and extent of item
1 online resource (xvii, 427 pages)
Series
Series title
Cambridge series in statistical and probabilistic mathematics ;
Volume designation
50
Notes on bibliographies, glossaries, and indexes within the item
Text of note
Includes bibliographical references (pages 386-414) and index.
Contents notes
Text of note
Cover; Half-title; Series information; Title page; Copyright information; Dedication; Contents; Expanded Contents; Preface; 1 Introduction; 1.1 Cluster Analysis; 1.1.1 From Grouping to Clustering; 1.1.2 Model-based Clustering; 1.2 Classification; 1.2.1 From Taxonomy to Machine Learning; 1.2.2 Model-based Discriminant Analysis; 1.3 Examples; 1.4 Software; 1.5 Organization of the Book; 1.6 Bibliographic Notes; 2 Model-based Clustering: Basic Ideas; 2.1 Finite Mixture Models; 2.2 Geometrically Constrained Multivariate Normal Mixture Models; 2.3 Estimation by Maximum Likelihood
Text of note
2.4 Initializing the EM Algorithm; 2.4.1 Initialization by Hierarchical Model-based Clustering; 2.4.2 Initialization Using the smallEM Strategy; 2.5 Examples with Known Number of Clusters; 2.6 Choosing the Number of Clusters and the Clustering Model; 2.7 Illustrative Analyses; 2.7.1 Wine Varieties; 2.7.2 Craniometric Analysis; 2.8 Who Invented Model-based Clustering?; 2.9 Bibliographic Notes; 3 Dealing with Difficulties; 3.1 Outliers; 3.1.1 Outliers in Model-based Clustering; 3.1.2 Mixture Modeling with a Uniform Component for Outliers; 3.1.3 Trimming Data with tclust
Text of note
3.2 Dealing with Degeneracies: Bayesian Regularization; 3.3 Non-Gaussian Mixture Components and Merging; 3.4 Bibliographic Notes; 4 Model-based Classification; 4.1 Classification in the Probabilistic Framework; 4.1.1 Generative or Predictive Approach; 4.1.2 An Introductory Example; 4.2 Parameter Estimation; 4.3 Parsimonious Classification Models; 4.3.1 Gaussian Classification with EDDA; 4.3.2 Regularized Discriminant Analysis; 4.4 Multinomial Classification; 4.4.1 The Conditional Independence Model; 4.4.2 An Illustration; 4.5 Variable Selection; 4.6 Mixture Discriminant Analysis
Text of note
4.7 Model Assessment and Selection; 4.7.1 The Cross-validated Error Rate; 4.7.2 Model Selection and Assessing the Error Rate; 4.7.3 Penalized Log-likelihood Criteria; 5 Semi-supervised Clustering and Classification; 5.1 Semi-supervised Classification; 5.1.1 Estimating the Model Parameters through the EM Algorithm; 5.1.2 A First Experimental Comparison; 5.1.3 Model Selection Criteria for Semi-supervised Classification; 5.2 Semi-supervised Clustering; 5.2.1 Incorporating Must-link Constraints; 5.2.2 Incorporating Cannot-link Constraints; 5.3 Supervised Classification with Uncertain Labels
Text of note
5.3.1 The Label Noise Problem; 5.3.2 A Model-based Approach for the Binary Case; 5.3.3 A Model-based Approach for the Multi-class Case; 5.4 Novelty Detection: Supervised Classification with Unobserved Classes; 5.4.1 A Transductive Model-based Approach; 5.4.2 An Inductive Model-based Approach; 5.5 Bibliographic Notes; 6 Discrete Data Clustering; 6.1 Example Data; 6.2 The Latent Class Model for Categorical Data; 6.2.1 Maximum Likelihood Estimation; 6.2.2 Parsimonious Latent Class Models; 6.2.3 The Latent Class Model as a Cluster Analysis Tool; 6.2.4 Model Selection
Summary or abstract notes
Text of note
Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-clustering. Written for advanced undergraduates in data science, as well as researchers and practitioners, it assumes basic knowledge of multivariate calculus, linear algebra, probability and statistics.
Other edition of the work in another medium
Title
Model-based clustering and classification for data science.
International Standard Book/Music Number
110849420X
International Standard Book/Music Number
9781108640596
Subject (topical name or general noun phrase)
Uncontrolled subject term
Cluster analysis.
Uncontrolled subject term
Mathematical statistics.
Uncontrolled subject term
R (Computer program language)
Uncontrolled subject term
Statistics -- Classification.
Uncontrolled subject term
Statistics.
Dewey classification
Number
519.5/3
Edition
23
Library of Congress classification
Class number
QA278.55
Item number
.M63 2019
Personal name as main entry (primary intellectual responsibility)