Getting started with Spark (part 2)
After covering some hardware fundamentals in part 1, let's move on to Hadoop.
In parallel computing, multiple CPUs share the same memory, while in distributed computing each CPU has its own memory and the machines are connected to one another over a network.
- Hadoop — an ecosystem of tools for big data storage and analysis. Hadoop is an older system than Spark but is still used by many companies. The major difference is that Spark performs its computations in memory, while Hadoop MapReduce writes intermediate results back to disk between stages.
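To make the processing model concrete, here is a minimal sketch of the MapReduce pattern that Hadoop popularized, written in plain Python (the function names and sample input are illustrative, not part of any Hadoop API). It counts words in three stages: map emits key–value pairs, shuffle groups them by key, and reduce aggregates each group:

```python
from collections import defaultdict

def map_phase(lines):
    # Map: emit a (word, 1) pair for every word in the input.
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle_phase(pairs):
    # Shuffle: group all values by key, as the framework would
    # do between the map and reduce stages.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts collected for each word.
    return {word: sum(counts) for word, counts in groups.items()}

lines = ["big data big ideas", "big data tools"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts)  # {'big': 3, 'data': 2, 'ideas': 1, 'tools': 1}
```

In real Hadoop, each stage runs across many machines and the shuffle moves data over the network, which is where the disk I/O between stages becomes the bottleneck that Spark's in-memory model avoids.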