Build machine learning models, natural language processing applications, and recommender systems with PySpark to solve various business challenges. This book starts with the fundamentals of Spark and its evolution, then covers the full range of traditional machine learning algorithms, along with natural language processing and recommender systems, using PySpark. Machine Learning with PySpark shows you how to build supervised machine learning models such as linear regression, logistic regression, decision trees, and random forests. You'll also see unsupervised machine learning models such as K-means and hierarchical clustering. A major portion of the book focuses on feature engineering: creating useful features with PySpark to train the machine learning models. The natural language processing section covers text processing, text mining, and embeddings for classification. After reading this book, you will understand how to use PySpark's machine learning library to build and train various machine learning models. Additionally, you'll become comfortable with related PySpark components, such as data ingestion, data processing, and data analysis, that you can use to develop data-driven intelligent applications. You will: build a spectrum of supervised and unsupervised machine learning algorithms; implement machine learning algorithms with Spark MLlib libraries; develop a recommender system with Spark MLlib libraries; and handle issues related to feature engineering, class balance, bias and variance, and cross-validation for building an optimal fit model.
Acquisition notes
Source of acquisition / subscription address
OverDrive, Inc.
Stock number
20BA8036-C37B-4BEE-9B39-7679978B32E4
Other edition of the work in another medium
International Standard Book and Music Number
9781484241301
International Standard Book and Music Number
9781484241325
Subject (topical name or topical noun phrase)
Uncontrolled subject term
Application software-- Development.
Uncontrolled subject term
Python (Computer program language)
Uncontrolled subject term
SPARK (Computer program language)
Uncontrolled subject term
Application software-- Development.
Uncontrolled subject term
COMPUTERS-- Databases-- Data Mining.
Uncontrolled subject term
Python (Computer program language)
Uncontrolled subject term
SPARK (Computer program language)
Subject category
Uncontrolled subject term
COM-- 021030
Uncontrolled subject term
UYQ
Dewey Classification
Number
005.7
Edition
23
Library of Congress Classification
Class number
QA76.76.A65
Personal name as main entry - (primary intellectual responsibility)