The Visual Interpretation of Decision Trees




One can refer to an article by Michael Galarnyk to get a hands-on implementation of decision tree visualization with the scikit-learn package.

In this article, we will discuss how to create more efficient and better-formatted decision tree visualizations using the dtreeviz library.

dtreeviz:

dtreeviz is an open-source Python library used to visualize the decisions or rules of a decision tree model. Install the library from PyPI using pip install dtreeviz and import it with from dtreeviz.trees import *. The dtreeviz library can visualize decision trees for both classification and regression tasks.

Classification Tasks:

The code snippet below calls the dtreeviz function to build the visualization for a decision tree classifier trained on the Iris dataset.

viz = dtreeviz(clf,
               x_data=X_train,
               y_data=y_train,
               target_name='class',
               feature_names=iris.feature_names,
               class_names=list(iris.target_names),
               title="Decision Tree - Iris Data")
(Image by Author), Decision Tree Visualizer for Iris Dataset
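
The snippet above assumes that clf, X_train, y_train, and iris already exist. A minimal sketch of one possible setup follows; training on the full dataset, max_depth=3, and random_state=42 are assumptions for illustration, not settings taken from the original code.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

# Load the Iris data bundled with scikit-learn.
iris = load_iris()
X_train, y_train = iris.data, iris.target  # assumption: train on all 150 rows

# Assumption: a shallow tree keeps the resulting plot readable.
clf = DecisionTreeClassifier(max_depth=3, random_state=42)
clf.fit(X_train, y_train)

# Quick sanity check on the fitted tree before visualizing it.
print(clf.get_depth(), clf.get_n_leaves())
```

In versions of the library that expose the dtreeviz.trees API used in this article, the returned viz object can then be displayed with viz.view() or written to disk with viz.save("iris_tree.svg").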

The decision tree generated by the dtreeviz library is better formatted and easier to interpret. Each decision node in the plot above shows a stacked histogram of the feature used to split at that level, color-coded by class, so one can see from the histogram how the split separates the classes. For example, the first node splits at petal_length=2.45: records with petal_length<=2.45 are predicted as setosa, while for petal_length>2.45 the tree continues to split.
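
To make the root split concrete, here is a small sketch that applies the petal_length threshold of 2.45 to a handful of records. The records are hand-picked illustrative values, not the full Iris dataset, and the variable names are hypothetical.

```python
from collections import Counter

# Hand-picked (petal_length, species) pairs for illustration only,
# not the full Iris dataset.
records = [
    (1.4, "setosa"), (1.3, "setosa"), (1.7, "setosa"),
    (4.7, "versicolor"), (3.3, "versicolor"),
    (6.0, "virginica"), (5.1, "virginica"),
]

SPLIT = 2.45  # root split point quoted in the text

# Records at or below the split go left; the rest go right.
left = Counter(label for value, label in records if value <= SPLIT)
right = Counter(label for value, label in records if value > SPLIT)

print("left :", dict(left))
print("right:", dict(right))
```

The per-class counts on each side are what the stacked histogram shows visually: a left side that contains a single class becomes a leaf, while a mixed right side needs further splits.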

The plots above were for a classification dataset. The same kind of visualization is available for regression trees, demonstrated next with the Boston housing dataset.

Regression Tasks:

The code snippet below visualizes a decision tree regressor trained on the Boston Housing dataset.

viz = dtreeviz(reg,
               x_data=X_train,
               y_data=y_train,
               target_name='price',
               feature_names=boston.feature_names,
               title="Decision Tree - Boston housing",
               show_node_labels=True)
(Image by Author), Decision Tree Visualizer for Boston Housing dataset
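
The regression snippet above assumes that reg, X_train, and y_train already exist. The Boston housing loader was removed from scikit-learn in version 1.2, so the sketch below swaps in the bundled diabetes dataset as a stand-in; that substitution, max_depth=3, and random_state=42 are assumptions for illustration and not part of the original code.

```python
from sklearn.datasets import load_diabetes
from sklearn.tree import DecisionTreeRegressor

# Stand-in dataset (assumption): diabetes ships with scikit-learn,
# so no download is needed.
data = load_diabetes()
X_train, y_train = data.data, data.target

# Assumption: a shallow tree keeps the per-node scatterplots legible.
reg = DecisionTreeRegressor(max_depth=3, random_state=42)
reg.fit(X_train, y_train)

print(reg.get_depth(), len(data.feature_names))
```

With this stand-in, feature_names=data.feature_names and a descriptive target_name of your choosing would replace the Boston-specific arguments in the call above.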

For regression decision tree plots, each decision node shows a scatterplot of the target value against the feature used to split at that level. One can interpret the model by observing the dashed lines in the scatterplots.

  • The vertical line in each scatterplot denotes the split point at that level (same as the histogram split for classification).
  • The horizontal dashed lines in each scatterplot are the target means on the left and right sides of the split.
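
The two kinds of lines can be reproduced by hand. The sketch below uses hypothetical (feature, target) pairs and a hypothetical split point to compute the values that the horizontal dashed lines would mark on either side of the vertical split line.

```python
from statistics import mean

# Hypothetical (feature_value, target_value) pairs for illustration only.
points = [(1.0, 10.0), (2.0, 12.0), (3.0, 14.0), (6.0, 30.0), (7.0, 34.0)]

split = 4.5  # hypothetical split point (the vertical line)

# Samples at or below the split fall on the left, the rest on the right.
left_targets = [y for x, y in points if x <= split]
right_targets = [y for x, y in points if x > split]

# These means correspond to the horizontal dashed lines.
left_mean = mean(left_targets)
right_mean = mean(right_targets)

print(left_mean, right_mean)
```

A large gap between the two means indicates that the split separates low and high target values well at that node.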

Some interesting features of dtreeviz:

  • Orientation: By default, decision tree plots are drawn from top to bottom. One can switch to a left-to-right layout with the orientation='LR' parameter of the dtreeviz function.
  • Remove Scatterplots or Histograms: For a decision tree model with large depth, the scatterplots or histograms can make the plot very large. One can leave them out of the plot with the fancy=False parameter of the dtreeviz function.
  • Feature Importance: To get the feature importance plot, one can use the explanation_type='sklearn_default' parameter of the explain_prediction_path function.

The dtreeviz library has various other features; to learn more about them, read this article from the explained.ai site.

Conclusion:

This article discussed how dtreeviz can be used to plot decision tree classification and regression models. Data scientists and analysts can use it to understand their decision tree models by observing the set of rules that lead to a prediction.

There are certain limitations: a decision tree with large depth is very difficult to interpret. Also, dtreeviz generates only SVG plots, which keeps its dependencies to a minimum.

References:

[1] Dtreeviz GitHub repository: https://github.com/parrt/dtreeviz

Thank You for Reading
