The Apache Spark cluster computing system aims to make data analytics fast, both fast to run and fast to write. But as powerful and useful as Spark is for distributed systems, many issues can arise during implementation. This practical cookbook contains recipes for solving the most common problems that Spark users face. Author Neelesh Srinivas Salian, a customer operations engineer at Cloudera, has seen all the things that can go wrong in the code for Spark applications.
Data engineers, system administrators, and architects will learn recipes for debugging common and unexpected problems that occur during key phases of Spark implementation in large distributed system environments. From setting up your cluster to running your first application, submitting to a cluster, understanding storage needs, and handling security and monitoring metrics, this book is your guide to facing any Spark operations issue.
ISBN: | 9781491971581 |
Publication date: | 31st July 2017 |
Author: | Neelesh Srinivas Salian |
Publisher: | O'Reilly, an imprint of O'Reilly Media |
Format: | Paperback |
Pagination: | 200 pages |
Genres: | Data mining |