Document Type
|
:
|
BL
|
Record Number
|
:
|
592441
|
Doc. No
|
:
|
b421660
|
Main Entry
|
:
|
Guo, Shumin.
|
Title & Author
|
:
|
Hadoop Operations and Cluster Management Cookbook
|
Page No.
|
:
|
1 online resource (368 pages)
|
ISBN
|
:
|
9781782165170
|
|
:
|
1782165177
|
|
:
|
9781782165163
|
Notes
|
:
|
Description based upon print version of record
|
|
:
|
Using S3 to host data
|
Contents
|
:
|
Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface; Chapter 1: Big Data and Hadoop; Introduction; Defining a Big Data problem; Building a Hadoop-based Big Data platform; Choosing from Hadoop alternatives; Chapter 2: Preparing for Hadoop Installation; Introduction; Choosing hardware for cluster nodes; Designing the cluster network; Configuring the cluster administrator machine; Creating the kickstart file and boot media; Installing the Linux operating system; Installing Java and other tools; Configuring SSH
|
|
:
|
Chapter 3: Configuring a Hadoop Cluster; Introduction; Choosing a Hadoop version; Configuring Hadoop in pseudo-distributed mode; Configuring Hadoop in fully-distributed mode; Validating Hadoop installation; Configuring ZooKeeper; Installing HBase; Installing Hive; Installing Pig; Installing Mahout; Chapter 4: Managing a Hadoop Cluster; Introduction; Managing the HDFS cluster; Configuring SecondaryNameNode; Managing the MapReduce cluster; Managing TaskTracker; Decommissioning DataNode; Replacing a slave node; Managing MapReduce jobs; Checking job history from the web UI; Importing data to HDFS
|
|
:
|
Manipulating files on HDFS; Configuring the HDFS quota; Configuring CapacityScheduler; Configuring Fair Scheduler; Configuring Hadoop daemon logging; Configuring Hadoop audit logging; Upgrading Hadoop; Chapter 5: Hardening a Hadoop Cluster; Introduction; Configuring service-level authentication; Configuring job authorization with ACL; Securing a Hadoop cluster with Kerberos; Configuring web UI authentication; Recovering from NameNode failure; Configuring NameNode high availability; Configuring HDFS federation; Chapter 6: Monitoring a Hadoop Cluster; Introduction
|
|
:
|
Monitoring a Hadoop cluster with JMX; Monitoring a Hadoop cluster with Ganglia; Monitoring a Hadoop cluster with Nagios; Monitoring a Hadoop cluster with Ambari; Monitoring a Hadoop cluster with Chukwa; Chapter 7: Tuning Hadoop Cluster for Best Performance; Introduction; Benchmarking and profiling a Hadoop cluster; Analyzing job history with Rumen; Benchmarking a Hadoop cluster with GridMix; Using Hadoop Vaidya to identify performance problems; Balancing data blocks for a Hadoop cluster; Choosing a proper block size; Using compression for input and output; Configuring speculative execution
|
|
:
|
Setting proper number of map and reduce slots for the TaskTracker; Tuning the JobTracker configuration; Tuning the TaskTracker configuration; Tuning shuffle, merge, and sort parameters; Configuring memory for a Hadoop cluster; Setting proper number of parallel copies; Tuning JVM parameters; Configuring JVM Reuse; Configuring the reducer initialization time; Chapter 8: Building a Hadoop Cluster with Amazon EC2 and S3; Introduction; Registering with Amazon Web Services (AWS); Managing AWS security credentials; Preparing a local machine for EC2 connection; Creating an Amazon Machine Image (AMI)
|
Abstract
|
:
|
Solve specific problems using individual self-contained code recipes, or work through the book to develop your capabilities. This book is packed with easy-to-follow code and commands used for illustration, which makes your learning curve easy and quick. If you are a Hadoop cluster system administrator with Unix/Linux system management experience and you are looking to get a good grounding in how to set up and manage a Hadoop cluster, then this book is for you. It's assumed that you will have some experience with the Unix/Linux command line already, as well as being familiar with network communication
|
Subject
|
:
|
Apache Hadoop (Computer file)
|
Subject
|
:
|
Cloud computing
|
Subject
|
:
|
Electronic data processing-- Distributed processing
|
Subject
|
:
|
File organization (Computer science)
|
Subject
|
:
|
Open source software
|
Dewey Classification
|
:
|
005.74
|
LC Classification
|
:
|
QA76.9.F5
|