Learning Spark: Lightning-Fast Data Analytics



Reviews

A**Z

Covers theoretical and practical aspects of the Spark ecosystem in great depth

This book is a great resource for learning about Spark. It covers in detail the concepts behind the Spark architecture, the theory of parallelization, and topics related to optimizing analytical pipelines running on Spark. The book has a very nice section about Delta Lake. It also covers MLflow in a good level of detail, more like a complement to the docs. The section on machine learning includes theoretical explanations of how some ML algorithms change when running them in parallel, as MLlib does. I used the book as an extra study resource when taking some Databricks certifications. It was a great addition to my study materials.

J**A

Well organized and solid information

It was easy to follow the book. The setup of the Spark shell was also clearly written, and I found the online instructions for installing Spark locally to be sufficient as well. The book is well organized and delineates the different components of Spark, e.g. intro, Structured APIs, streaming, optimizations, data lakes, and ML deployment options. While ML deployment needs for individual business use cases are highly specific, I found the overview of deployment frameworks provided by the book to be helpful. I also liked that the book uses screenshots of the Spark UI, with arrows pointing to parts of the screenshots, to explain the UI, since the UI can be hard to understand. The code samples and the graphics in other sections are useful as well. There’s also coverage of how to connect to different apps, like Beeline (which I’d never heard of), Tableau, and Thrift. Overall, the book contains solid information on the inner workings of Spark. I would recommend giving this book a read!

S**E

Decent introduction to Spark

I am always trying to learn new skills to make myself more marketable in the workplace. My background is mainly in SQL with some Python, and I am learning JS right now. I decided to give this book a shot to see whether Spark is another tool I want to add to my arsenal. The book does what it promises; it gives you a good introduction to Spark. I did have some issues installing the required programs on a MacBook, but once I had everything installed, I was able to follow along. My big complaint is what others have mentioned, which is that concepts are mentioned without any background on what they are or why they matter. If you have some programming background, this book should be sufficient to get you up and running in Spark.

C**S

Good book for getting started with Spark

It gives good examples in both Scala and Python; although they are not always in Python, Scala is similar (like a cross between Java and Python). It suggests using Databricks for practice if you don't want to install anything on-premises, or, if you prefer, installing Spark using WSL on Windows or a Linux virtual machine.

M**D

Must read

This book is a must read for anyone trying to learn Spark in the big data environment.

A**R

More Databricks-centric

Nice book if you really want to work hands-on without having to worry about the internals of Spark.

T**S

Great beginner book

I'm a software engineer who knows his way around SQL, mostly running queries/transforms on Postgres and Redshift. The majority of my background is in building and supporting services. Having no background knowledge in Spark, I was looking for a book that explains the fundamental concepts, helps me get up and running, and helps me expand my toolkit for working with "big data".
I was able to follow along in this book fairly easily. Working on a MacBook, I did have to first install Scala, download Spark, enable Spark in IntelliJ, etc. I didn't have trouble with this as it was fairly straightforward. With my environment set up, I found the book presents every code sample in Scala and Python. I worked through the code samples, chapter by chapter, writing Scala in IntelliJ or sometimes writing Scala in the Spark CLI itself.
I did take a slight detour from the book to learn a bit more about sbt, which is the Scala build tool.
For a beginner such as myself, this book is a godsend, but I do wish the authors had approached some things differently.
In my opinion, some topics are covered in a very "hand-wavy" manner. For example, Chapter 4 discusses managed vs. unmanaged tables. While knowing this difference exists is helpful for the reader, the authors never discuss when you should use a managed table or an unmanaged table. They could have included that information or pointed the user to some external source. This part of Chapter 4 then shows sample code on how to create a managed table from a CSV file. However, it's not clear what I should do with that information. What are the patterns applicable to a managed table vs. an unmanaged table? What are the trade-offs? Even for a beginner book, I still feel the authors could have written just 1 page, which would add significant value to this section.
Sometimes the book will share some interesting tidbit but using terminology or concepts that the authors haven't really described. I found this very frustrating. For example:
> (Chapter 4, page 92) ... you can create multiple SparkSessions within a single Spark application—this can be handy, for example, in cases where you want to access (and combine) data from two different SparkSessions that don’t share the same Hive metastore configurations.
If you search for mentions of Hive, you see the authors briefly mentioned that Spark uses a Hive metastore to persist table metadata. So are the authors saying I can use one Spark installation and access table metadata from different Hive metastores? Why would I ever want to access only the metadata for different tables? Again – the use case isn't clear.
As a beginner, I found this book very valuable, and I believe it is a great investment.

E**C

Decent introduction to Spark

You should probably have some familiarity with machine learning and Python before you pick this book up, but it's a decent introduction to Spark.

I**R

Best Reference for Spark

If you want to learn Spark 3.x+ this book is for you. Easy to read and with practical examples. Recommend it!

F**O

Up-to-date content

I think this is a good introductory book on the framework, above all because, as of the time of this review, it is one of the only ones with content updated to Spark version 3.0. Together with the book "Spark: The Definitive Guide: Big Data Processing Made Simple", it helped me a lot in passing the Databricks certification. I recommend it.

J**Y

Nicely laid out and explained

I've just started my role as a Data Engineer, where I looked at Azure's Data Factory. I needed to learn PySpark, so I picked up this book and found it a super useful guide. It is explained clearly, and whilst it's clearly aimed at someone who has been in the industry longer than I have, I found I could easily understand it. I haven't read the chapter on streaming or the two chapters on machine learning as they aren't applicable to me, but everything else has been just what I needed. Well done to the authors for putting together such an amazing guide. If you want to see the different chapter contents, I've added them as photos for your ease.

B**R

Very Good

Concepts are explained very well
