#69 PySpark in action with MongoDB

After learning Spark and MongoDB separately, how about connecting the two to analyze a large amount of data (around 4 GB)?

It’s absolutely possible to do so :)

Prerequisites

To make this happen, you should have Spark installed on your local machine. Tutorials on setting up Spark are easy to find online for any operating system, so I will skip that part and let you explore it yourself. I use Spark 3.3.0, the newest version available at the time of writing.
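
If you are not sure whether PySpark is visible to your Python environment, a quick check like the one below can help. This is only a minimal sketch: the helper name `installed_pyspark_version` is my own, and it assumes PySpark was installed as a Python package (for example with pip). A standalone Spark download that is only added to your path may not show up here.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_pyspark_version() -> Optional[str]:
    """Return the installed PySpark version string, or None if it is absent.

    Hypothetical helper for illustration; it only inspects Python package
    metadata and does not start a Spark session.
    """
    try:
        return version("pyspark")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyspark_version()
    if found is None:
        print("PySpark is not installed in this Python environment.")
    else:
        print(f"PySpark {found} is installed.")
```

If the check reports that PySpark is missing, go back to your Spark setup before continuing.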
