Some example plots that seaborn can create, if everything goes well

Three common seaborn difficulties

Explaining some aspects of using seaborn that most often confound newcomers

You (might) need to reformat your data

Seaborn’s plotting functions are most expressive when provided with a “tidy” long-form dataset. With data formatted this way, you can pass the full dataset and select the columns that you want to visualize by assigning the column names to different roles (x, y, hue, etc.).

A “messy” data table
A “messy” table representing a household budget.
A “tidy” data table, in long-form
The same budget, but represented in a “tidy” long-form table.
budget_long = budget.melt(
    id_vars="Category",
    var_name="Year",
    value_name="Expense",
)
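As a minimal sketch of the whole round trip, the snippet below builds a small wide table, melts it, and then assigns the resulting columns to plot roles by name. The figures in the table are invented for illustration, and the year columns are assumed to be stored as strings.

```python
import pandas as pd
import seaborn as sns

# Hypothetical "messy" budget: one row per category, one column per year.
budget = pd.DataFrame({
    "Category": ["Rent", "Food", "Travel"],
    "2019": [12000, 4800, 1500],
    "2020": [12600, 5200, 300],
    "2021": [13200, 5000, 900],
})

# Reshape to long form: one row per (Category, Year) observation.
budget_long = budget.melt(
    id_vars="Category",
    var_name="Year",
    value_name="Expense",
)

# Each column can now be assigned to a role by name.
ax = sns.barplot(data=budget_long, x="Year", y="Expense", hue="Category")
```

Swapping the roles (for example, x="Category" and hue="Year") needs no further reshaping, which is the main payoff of the long-form layout.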
The rules that lineplot and boxplot use for wide-form data
There are many options for passing wide-form data, but different functions will interpret it differently.

There are two kinds of plotting functions

The second difficulty is typically encountered when you try to combine a seaborn plot with a matplotlib figure that has multiple axes.

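import matplotlib.pyplot as plt
import seaborn as sns

# matplotlib's implicit interface
plt.plot(x, y)  # Plots on the "current" axes, creating it if needed
f, axs = plt.subplots(ncols=2)  # Creates a new figure with two axes
axs[0].plot(x, y)  # Plots on the first axes of the new figure
plt.plot(x, y)  # Plots on the second axes of the new figure

# seaborn axes-level functions follow the same rules and accept ax=
sns.lineplot(x=x, y=y)  # Plots on the "current" axes
f, axs = plt.subplots(ncols=2)  # Creates a new figure
sns.lineplot(x=x, y=y, ax=axs[0])  # Plots on the first new axes
sns.lineplot(x=x, y=y)  # Plots on the second new axes

# Figure-level functions such as displot do not accept ax=: seaborn warns,
# ignores the argument, and each call draws on its own new figure
f, ax = plt.subplots()
sns.displot(data, x="a", ax=ax)
sns.displot(data, x="b", ax=ax)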

Categorical plots will always be categorical

Several seaborn functions specialize in creating plots where one of the axes corresponds to a categorical variable: a variable whose values do not (necessarily) bear a quantitative relationship to each other. Examples would include country of origin (which is both categorical and unordered) and age group (which is ordered, but still categorical). Such variables are often encoded with strings, and at the time these functions were created, matplotlib was not able to interpret string data. So the seaborn functions internally map from the data values to ordinal indices (0, 1, …, n), which are then passed to matplotlib.

Sensible output from pointplot with a numeric x variable
Sometimes it makes sense to treat a numeric variable as categorical…
Nonsensical output from pointplot with a numeric x variable
…but other times, it makes a huge mess.
Nonsensical output when layering a lineplot onto a stripplot
The stripplot treats size as categorical, but the lineplot doesn’t, so the line is shifted to the right.

Michael Waskom


Computational cognitive neuroscientist and creator of the seaborn data visualization library