
"SQL on big data"


Document Type : BL
Record Number : 602301
Doc. No : b431520
Main Entry : Pal, Sumit
Title & Author : SQL on big data : technology, architecture, and innovation / Sumit Pal
Page. NO : 1 online resource
ISBN : 9781484222478
: 1484222474
: 1484222466
: 9781484222461
Contents : At a Glance; Contents; About the Author; About the Technical Reviewer; Acknowledgments; Introduction; Chapter 1: Why SQL on Big Data?; Why SQL on Big Data?; Why RDBMS Cannot Scale; SQL-on-Big-Data Goals; SQL-on-Big-Data Landscape; Open Source Tools; Apache Drill; Apache Phoenix; Apache Presto; BlinkDB; Impala; Hadapt; Hive; Kylin; Tajo; Spark SQL; Spark SQL with Tachyon; Splice Machine; Trafodion; Commercial Tools; Actian Vector; AtScale; Citus; Greenplum; HAWQ; JethroData; SQLstream; VoltDB; Appliances and Analytic DB Engines; IBM BLU; Microsoft PolyBase; Netezza; Oracle Exadata
: Teradata; Vertica; How to Choose an SQL-on-Big-Data Solution; Summary; Chapter 2: SQL-on-Big-Data Challenges & Solutions; Types of SQL; Query Workloads; Types of Data: Structured, Semi-Structured, and Unstructured; Semi-Structured Data; Unstructured Data; How to Implement SQL Engines on Big Data; SQL Engines on Traditional Databases; How an SQL Engine Works in an Analytic Database; Why Is DML Difficult on HDFS?; Challenges to Doing Low-Latency SQL on Big Data; Approaches to Solving SQL on Big Data; Approaches to Reduce Latency on SQL Queries; File Formats; Text/CSV Files; JSON Records
: Avro Format; Sequence Files; RC Files; ORC Files; Parquet Files; How to Choose a File Format?; Data Compression; Indexing, Partitioning, and Bucketing; Why Indexing Is Difficult; Partitioning; Advantages; Limitations; Bucketing; Recommendations; Summary; Chapter 3: Batch SQL-Architecture; Hive; Hive Architecture Deep Dive; How Hive Translates SQL into MR; Hive Query Compiler; Analytic Functions in Hive; Common Real-Life Use Cases of Analytic Functions; TopN; Clickstream Sessionization; Grouping Sets, Cube, and Rollup; ACID Support in Hive; Serialization and SerDe in Hive
: Performance Improvements in Hive; Optimization by Using a Broadcast Join; Pipelining the Data for Joins; Dynamically Partitioned Joins; Vectorization of Queries; Use of LLAP with Tez; CBO Optimizers; Join Order; Bushy Trees; Table Sizing; Recommendations to Speed Up Hive; Upcoming Features in Hive; Summary; Chapter 4: Interactive SQL-Architecture; Why Is Interactive SQL So Important?; SQL Engines for Interactive Workloads; Spark; Spark Stack; Spark Architecture; Spark SQL; Spark SQL Architecture; Spark SQL Optimization-Catalyst Optimizer; Spark SQL with Tachyon (Alluxio)
: Analytic Query Support in Spark SQL; General Architecture Pattern; Impala; Impala Architecture; Impala Optimizations; HDFS Caching; File Format Selection; Recommendations to Make Impala Queries Faster; Code Generation; SQL Enhancements and Impala Shortcomings; Apache Drill; Apache Drill Architecture; Key Features; Query Execution; Vertica; Vertica with Hadoop; Hadoop MapReduce Connector; Vertica Hadoop Connector for HDFS; Jethro Data; Others; MPP vs. Batch-Comparisons; Capabilities and Characteristics to Look for in the SQL Engine; Technical Decisions; Soft Decisions; Summary
Subject : SQL (Computer program language)
Subject : Big data
Dewey Classification : 005.75/6
LC Classification : QA76.73.S67
Added Entry : Ohio Library and Information Network