Getting started with Spark (part 3)

Hang Nguyen
Apr 18, 2022

OK, now let’s dig into Spark:

  • written in Scala, a language with strong support for functional programming
  • application programming interfaces (APIs) available for Scala, Java, R and Python

Read and Write data with Spark

Getting the schema of a DataFrame
Getting the column names
Previewing the first rows, much like df.head() in pandas

Manipulate the data

There are two ways to do this, with little difference in performance, since both run through the same Spark SQL optimizer:

  • Imperative programming: using Spark DataFrames and Python
  • Declarative programming: using SQL

Imperative way
