10 Examples to Master Distribution Plots with Python Seaborn



Distribution plots are of crucial importance for EDA

The first step of any data product should be understanding the raw data. For successful and efficient products, this step occupies a substantial part of the entire workflow.

There are several methods used for understanding and exploring the data. One of them is creating data visualizations. They help us both explore and explain the data.

By creating appropriate and well-designed visualizations, we can discover the underlying structure and the relationships within the data.

Distribution plots are of crucial importance for exploratory data analysis. They help us detect outliers and skewness, or get an overview of the measures of central tendency (mean, median, and mode).

In this article, we will go over 10 examples that show how to create distribution plots with the Seaborn library for Python. For the examples, we will use a small sample from the Melbourne housing dataset available on Kaggle.

Let’s start by importing the libraries and reading the dataset into a pandas DataFrame.

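import pandas as pd
import seaborn as sns

# Apply a global plot style and slightly larger fonts
sns.set(style="darkgrid", font_scale=1.2)

# Read only the columns used in the examples
df = pd.read_csv(
    "/content/melb_housing.csv",
    usecols=["Regionname", "Type", "Rooms", "Distance", "Price"]
)

# Show the first five rows
df.head()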

The dataset contains some features of the houses in Melbourne along with their prices.
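Before plotting, a quick numeric check can hint at what the distribution plots should reveal. The following is a minimal sketch, not part of the original walkthrough: it uses a synthetic, right-skewed stand-in for the Price column (an assumption, since the CSV is not bundled here) and the common 1.5 × IQR rule for flagging potential outliers.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Price column (assumption: not the real Melbourne data).
rng = np.random.default_rng(0)
prices = pd.Series(rng.lognormal(mean=13.7, sigma=0.5, size=1000), name="Price")

# Central tendency and skewness
summary = {
    "mean": prices.mean(),
    "median": prices.median(),
    "skewness": prices.skew(),
}

# Flag potential outliers with the 1.5 * IQR rule
q1, q3 = prices.quantile([0.25, 0.75])
iqr = q3 - q1
outliers = prices[(prices < q1 - 1.5 * iqr) | (prices > q3 + 1.5 * iqr)]

print(summary)
print(f"{len(outliers)} potential outliers out of {len(prices)} rows")
```

With the real data, the same checks could be run on df["Price"]. A mean noticeably above the median, together with positive skewness, suggests a long right tail that a distribution plot would make visible.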

The displot function of Seaborn can create three types of distribution plots:

  • Histogram
  • KDE (kernel density estimate) plot
  • ECDF (empirical cumulative distribution function) plot

We only need to set the kind parameter ("hist", "kde", or "ecdf") to choose the type of plot. The default is "hist".
