Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Big Data and Hadoop; Introduction; Defining a Big Data problem; Building a Hadoop-based Big Data platform; Choosing from Hadoop alternatives; Chapter 2: Preparing for Hadoop Installation; Introduction; Choosing hardware for cluster nodes; Designing the cluster network; Configuring the cluster administrator machine; Creating the kickstart file and boot media; Installing the Linux operating system; Installing Java and other tools; Configuring SSH
Note text
Chapter 3: Configuring a Hadoop Cluster; Introduction; Choosing a Hadoop version; Configuring Hadoop in pseudo-distributed mode; Configuring Hadoop in fully-distributed mode; Validating Hadoop installation; Configuring ZooKeeper; Installing HBase; Installing Hive; Installing Pig; Installing Mahout; Chapter 4: Managing a Hadoop Cluster; Introduction; Managing the HDFS cluster; Configuring SecondaryNameNode; Managing the MapReduce cluster; Managing TaskTracker; Decommissioning DataNode; Replacing a slave node; Managing MapReduce jobs; Checking job history from the web UI; Importing data to HDFS
Note text
Manipulating files on HDFS; Configuring the HDFS quota; Configuring CapacityScheduler; Configuring Fair Scheduler; Configuring Hadoop daemon logging; Configuring Hadoop audit logging; Upgrading Hadoop; Chapter 5: Hardening a Hadoop Cluster; Introduction; Configuring service-level authentication; Configuring job authorization with ACL; Securing a Hadoop cluster with Kerberos; Configuring web UI authentication; Recovering from NameNode failure; Configuring NameNode high availability; Configuring HDFS federation; Chapter 6: Monitoring a Hadoop Cluster; Introduction
Note text
Monitoring a Hadoop cluster with JMX; Monitoring a Hadoop cluster with Ganglia; Monitoring a Hadoop cluster with Nagios; Monitoring a Hadoop cluster with Ambari; Monitoring a Hadoop cluster with Chukwa; Chapter 7: Tuning Hadoop Cluster for Best Performance; Introduction; Benchmarking and profiling a Hadoop cluster; Analyzing job history with Rumen; Benchmarking a Hadoop cluster with GridMix; Using Hadoop Vaidya to identify performance problems; Balancing data blocks for a Hadoop cluster; Choosing a proper block size; Using compression for input and output; Configuring speculative execution
Note text
Setting proper number of map and reduce slots for the TaskTracker; Tuning the JobTracker configuration; Tuning the TaskTracker configuration; Tuning shuffle, merge, and sort parameters; Configuring memory for a Hadoop cluster; Setting proper number of parallel copies; Tuning JVM parameters; Configuring JVM Reuse; Configuring the reducer initialization time; Chapter 8: Building a Hadoop Cluster with Amazon EC2 and S3; Introduction; Registering with Amazon Web Services (AWS); Managing AWS security credentials; Preparing a local machine for EC2 connection; Creating an Amazon Machine Image (AMI)
Summary or abstract notes
Note text
Solve specific problems using individual self-contained code recipes, or work through the book to develop your capabilities. This book is packed with easy-to-follow code and commands used for illustration, which makes your learning curve easy and quick. If you are a Hadoop cluster system administrator with Unix/Linux system management experience and you are looking to get a good grounding in how to set up and manage a Hadoop cluster, then this book is for you. It's assumed that you will already have some experience with the Unix/Linux command line, as well as being familiar with network communication
Acquisition notes
Source of acquisition / subscription address
Safari Books Online
Stock number
CL0500000301
Other edition of the work in another medium
Title
Hadoop Operations and Cluster Management Cookbook
International standard book and music number
9781782165163
Piece
Title
Safari books online
Subject (topical name or common noun phrase)
Uncontrolled subject term
Electronic data processing -- Distributed processing.
Uncontrolled subject term
File organization (Computer science)
Uncontrolled subject term
Apache Hadoop (Computer file)
Uncontrolled subject term
Cloud computing
Uncontrolled subject term
Open source software
Dewey classification
Number
005.74
Library of Congress classification
Class number
QA76.9.F5
Personal name as main entry heading (primary intellectual responsibility)