#72 Prepare for Databricks Data Engineer Associate certification exam part #1: Basic Terminology

Hang Nguyen
5 min read · Oct 20, 2022

Before jumping into the actual preparation, let’s go over some of the most basic Databricks concepts!

Databricks Architecture and Services

The Databricks architecture has two main elements:

  • Control plane: contains the backend services that Databricks manages in its own cloud account. The majority of your data does not reside here. Notebook commands and many other workspace configurations are stored in the control plane and encrypted at rest.
  • Data plane: where your data is processed. Cluster compute resources run here; in the classic architecture, the data plane lives in your own cloud account.

Clusters

A Databricks cluster is a set of computation resources and configurations on which you run data engineering, data science, and data analytics workloads, such as production ETL pipelines, streaming analytics, ad-hoc analytics, and machine learning.
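To make "a set of computation resources and configurations" a bit more concrete, here is a minimal sketch of what a cluster definition might look like as plain data. The field names are modeled on the Databricks Clusters REST API; the values (runtime version, instance type, cluster name) are illustrative assumptions, and the helpers `build_cluster_spec` and `total_vm_count` are hypothetical names made up for this example, not part of any Databricks library.

```python
import json


def build_cluster_spec(name, num_workers):
    """Build a hypothetical cluster definition as a plain dictionary."""
    if num_workers < 0:
        raise ValueError("num_workers cannot be negative")
    return {
        "cluster_name": name,
        "spark_version": "11.3.x-scala2.12",  # assumed runtime version string
        "node_type_id": "i3.xlarge",          # assumed AWS instance type
        "num_workers": num_workers,
        "autotermination_minutes": 30,        # assumed idle timeout
    }


def total_vm_count(spec):
    # One VM for the driver plus one VM per worker.
    return spec["num_workers"] + 1


spec = build_cluster_spec("etl-nightly", num_workers=2)
print(json.dumps(spec, indent=2))
print(total_vm_count(spec))  # 3
```

The helper counts one VM for the driver plus one per worker, so `num_workers=2` gives three VMs in total.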

Clusters are made up of one or more virtual machine (VM) instances. The driver coordinates the activities of the executors, and the executors run the tasks that make up a Spark job.
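The driver and executor split can be pictured with a toy analogy in plain Python. This is not Spark code: a thread pool stands in for the executors, and the function names `run_task` and `run_job` are hypothetical. It only sketches the idea that one coordinator splits a job into tasks, hands them out, and combines the results.

```python
from concurrent.futures import ThreadPoolExecutor


def run_task(partition):
    # The work an "executor" does for one task: sum a single partition.
    return sum(partition)


def run_job(data, num_partitions, num_executors):
    # "Driver" role: split the data into partitions (one task each),
    # schedule the tasks on the pool, then combine the partial results.
    partitions = [data[i::num_partitions] for i in range(num_partitions)]
    with ThreadPoolExecutor(max_workers=num_executors) as pool:
        partial_sums = list(pool.map(run_task, partitions))
    return sum(partial_sums)


total = run_job(list(range(1, 101)), num_partitions=4, num_executors=2)
print(total)  # 5050
```

Here four tasks are shared between two workers, which loosely mirrors how a job with more tasks than executors is processed in several rounds.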
