Document Type
|
:
|
BL
|
Record Number
|
:
|
851081
|
Main Entry
|
:
|
Chellappan, Subhashini
|
Title & Author
|
:
|
Practical Apache Spark : using the Scala API / Subhashini Chellappan and Dharanitharan Ganesan.
|
Publication Statement
|
:
|
[Place of publication not identified] : Apress, [2018]
|
|
:
|
©2018
|
Page. NO
|
:
|
1 online resource
|
ISBN
|
:
|
1484236521
|
|
:
|
9781484236529
|
|
:
|
9781484236512
|
Notes
|
:
|
Includes index.
|
Contents
|
:
|
Intro; Table of Contents; About the Authors; About the Technical Reviewers; Acknowledgments; Introduction; Chapter 1: Scala: Functional Programming Aspects; What Is Functional Programming?; What Is a Pure Function?; Example of Pure Function; Scala Programming Features; Variable Declaration and Initialization; Type Inference; Immutability; Lazy Evaluation; String Interpolation; String -- s Interpolator; String -- f Interpolator; String -- raw Interpolator; Pattern Matching; Scala Class vs. Object; Singleton Object; Companion Classes and Objects; Case Classes; Pattern Matching on Case Classes
|
|
:
|
Directed Acyclic Graph in Apache Spark; How DAG Works in Spark; How Spark Achieves Fault Tolerance Through DAG; Persisting RDD; Shared Variables; Broadcast Variables; Accumulators; Simple Build Tool (SBT); Assignments; Reference Links; Points to Remember; Chapter 4: Spark SQL, DataFrames, and Datasets; What Is Spark SQL?; Datasets and DataFrames; Spark Session; Creating DataFrames; DataFrame Operations; Untyped DataFrame Operation: Select; Untyped DataFrame Operation: Filter; Untyped DataFrame Operation: Aggregate Operations; Running SQL Queries Programmatically; Creating Views; Dataset Operations
|
|
:
|
Interoperating with RDDs; Reflection-Based Approach to Infer Schema; Different Data Sources; Generic Load and Save Functions; Manually Specifying Options; Run SQL on Files Directly; JDBC to External Databases; Working with Hive Tables; Building Spark SQL Application with SBT; Points to Remember; Chapter 5: Introduction to Spark Streaming; Data Processing; Streaming Data; Why Streaming Data Are Important; Introduction to Spark Streaming; Internal Working of Spark Streaming; Spark Streaming Concepts; Discretized Streams (DStream); Streaming Context; DStream Operations
|
|
:
|
Scala Collections; Iterating Over the Collection; Common Methods of Collection; Functional Programming Aspects of Scala; Anonymous Functions; Higher Order Functions; Function Composition; Function Currying; Nested Functions; Functions with Variable Length Parameters; Reference Links; Points to Remember; Chapter 2: Single and Multinode Cluster Setup; Spark Multinode Cluster Setup; Recommended Platform; Operating System; Prerequisites; Spark Installation Steps; Spark Web UI; Spark Master UI; Spark Application UI; Stopping the Spark Cluster; Spark Single-Node Cluster Setup; Prerequisites
|
|
:
|
Spark Installation Steps; Spark Master UI; Points to Remember; Chapter 3: Introduction to Apache Spark and Spark Core; What Is Apache Spark?; Why Apache Spark?; Spark vs. Hadoop MapReduce; Apache Spark Architecture; Spark Components; Spark Core (RDD); Spark SQL; Spark Streaming; MLib; GraphX; SparkR; Spark Shell; Spark Core: RDD; RDD Operations; Transformations; Actions; Creating an RDD; Using Parallelized Collection; From External Data Source; Creating an RDD from the Hadoop File System; Creating an RDD: File Partitioning; RDD Transformations; RDD Actions; Working with Pair RDDs
|
Abstract
|
:
|
Work with Apache Spark using Scala to deploy and set up single-node, multi-node, and high-availability clusters. This book discusses various components of Spark, such as Spark Core, DataFrames, Datasets and SQL, Spark Streaming, Spark MLlib, and R on Spark, with practical code snippets for each topic. Practical Apache Spark also covers the integration of Apache Spark with Kafka, with examples. You'll follow a learn-to-do-by-yourself approach: learn the concepts, practice the code snippets in Scala, and complete the assignments to gain overall exposure. On completion, you'll have knowledge of the functional programming aspects of Scala and hands-on expertise in various Spark components. You'll also become familiar with machine learning algorithms with real-time usage. What You Will Learn: Discover the functional programming features of Scala; Understand the complete architecture of Spark and its components; Integrate Apache Spark with Hive and Kafka; Use Spark SQL, DataFrames, and Datasets to process data using traditional SQL queries; Work with different machine learning concepts and libraries using Spark's MLlib packages. Who This Book Is For: Developers and professionals who deal with batch and stream data processing.
|
Subject
|
:
|
Scala (Computer program language)
|
Subject
|
:
|
COMPUTERS -- Databases -- General.
|
Subject
|
:
|
Spark (Electronic resource : Apache Software Foundation)
|
|
:
|
SPARK (Electronic resource)
|
Dewey Classification
|
:
|
005.758
|
LC Classification
|
:
|
QA76.9.D3
|
Added Entry
|
:
|
Ganesan, Dharanitharan
|