Essential Techniques to Help You Process, and Get Unique Insights from, Big Data, 2nd Edition.
2nd ed.
Birmingham :
Packt Publishing Ltd,
2018.
1 online resource (203 pages)
Cover; Title Page; Copyright and Credits; Dedication; Packt Upsell; Contributors; Table of Contents; Preface; Chapter 1: Overview of Big Data and Hive; A short history; Introducing big data; The relational and NoSQL databases versus Hadoop; Batch, real-time, and stream processing; Overview of the Hadoop ecosystem; Hive overview; Summary; Chapter 2: Setting Up the Hive Environment; Installing Hive from Apache; Installing Hive from vendors; Using Hive in the cloud; Using the Hive command; Using the Hive IDE; Summary; Chapter 3: Data Definition and Description; Understanding data types.
Data type conversions; Data Definition Language; Database; Tables; Table creation; Table description; Table cleaning; Table alteration; Partitions; Buckets; Views; Summary; Chapter 4: Data Correlation and Scope; Project data with SELECT; Filtering data with conditions; Linking data with JOIN; INNER JOIN; OUTER JOIN; Special joins; Combining data with UNION; Summary; Chapter 5: Data Manipulation; Data exchanging with LOAD; Data exchange with INSERT; Data exchange with [EX|IM]PORT; Data sorting; Functions; Function tips for collections; Function tips for date and string; Virtual column functions.
Chapter 9: Security Considerations; Authentication; Metastore authentication; Hiveserver2 authentication; Authorization; Legacy mode; Storage-based mode; SQL standard-based mode; Mask and encryption; The data-hashing function; The data-masking function; The data-encryption function; Other methods; Summary; Chapter 10: Working with Other Tools; The JDBC/ODBC connector; NoSQL; The Hue/Ambari Hive view; HCatalog; Oozie; Spark; Hivemall; Summary; Other Books You May Enjoy; Index.
Apache Hive helps you deal with data summarization, queries, and analysis for huge amounts of data. This book will give you a background in big data and familiarize you with your Hive working environment. Next, you will cover advanced topics such as performance and security in Hive, and learn how to work efficiently to find solutions to big data problems.
01201872
B10778
Apache Hive Essentials : Essential Techniques to Help You Process, and Get Unique Insights from, Big Data, 2nd Edition.
9781788995092
Apache Hadoop.
Big data.
Cloud computing.
Databases-- Design-- Data processing.
Electronic data processing-- Distributed processing.
Computers-- Data Processing.
Computers-- Database Management-- Data Warehousing.