#106 Databricks notebook with pip install

Hang Nguyen
3 min read · Oct 18, 2024

I could not believe that a command as simple as pip install could cause so much trauma in my data engineering career. Period.

We all know that to use a Python package in a Databricks notebook, we are usually told to run %pip install. That is true in theory, but in practice we can get errors from putting it in the wrong place. With Databricks Serverless, it can also be a different story. Let's dive right into the precautions to take when using pip install in a Databricks notebook, so you can avoid the kind of trauma I went through.

Difference between !pip install and %pip install

It is easy to be confused about which symbol to put before pip install, since a single character makes a difference. Luckily, we usually need only one of them.

The preferred option in a Databricks notebook is %pip install, since it installs packages into the Python environment of the running notebook kernel. !pip install, by contrast, offers no such guarantee: depending on the configuration, it does not always interact with the kernel.

An example of use is as follows:

%pip install pandas
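If you are not sure whether an install actually reached the notebook kernel, a quick check from a separate Python cell can help. The snippet below is only a minimal sketch using the standard library. The helper name package_status is made up for illustration and is not part of Databricks. It also assumes the import name matches the distribution name, which is true for pandas but not for every package.

```python
import importlib.util
import sys
from importlib import metadata


def package_status(name: str) -> str:
    """Report whether `name` is importable by the interpreter running this cell.

    Hypothetical helper for illustration; assumes the import name and the
    distribution name are the same (true for pandas, not for every package).
    """
    # find_spec returns None when the running interpreter cannot locate the module
    if importlib.util.find_spec(name) is None:
        return f"{name}: not importable from {sys.executable}"
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        # Importable, but no distribution metadata under this exact name
        version = "unknown version"
    return f"{name} {version}: importable from {sys.executable}"


print(package_status("pandas"))
```

If the message says the package is not importable right after an install, that is a hint the install went into a different environment than the one the kernel is using.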

What can go wrong with this %pip install

It is okay to place this %pip install command with other commands in the same Python code cell in…
