# The DataFrame Type Conversions You Should Know as a Pandas User

Pandas DataFrames are widely popular among Machine Learning Engineers and Data Scientists for tabular data analysis, management, and processing tasks.

While the Pandas library is self-sufficient for numerous use cases, and other Python libraries such as Matplotlib, Plotly, and Sklearn provide built-in support for Pandas, there may be situations where you need to convert your Pandas DataFrame to other supported data types (or data structures) in Python.

Even if not, I firmly believe that awareness of these conversions can be helpful in the general usage of Pandas DataFrames.

Therefore, in this post, I will demonstrate different ways to convert a Pandas DataFrame to other data types widely used by developers in the Python community.

The highlights of the article are listed below:

· **Understanding the Pandas DataFrame**

· **Converting to NumPy Array**

· **Converting to Python List**

· **Converting to Dictionary**

· **Conclusion**

Let’s begin 🚀!

# Understanding the Pandas DataFrame

Before we proceed with various type-conversions of a Pandas DataFrame, let’s briefly understand this data structure.

Simply put, a Pandas DataFrame is a tabular data structure residing inside a Python environment.

It can proficiently perform a wide variety of tabular data operations such as filtering operations, I/O operations, data grouping and aggregation, table joins, column distribution methods, rolling window analysis, and many more.

Of course, one can perform the above operations only when they have a Pandas DataFrame loaded in an existing Python environment/session.

One of the most rudimentary techniques to create a DataFrame is to call the `pd.DataFrame()` constructor, as demonstrated below.

First, we import the required libraries.

Next, we create a DataFrame `df` from a list of lists `data` using the `pd.DataFrame()` constructor as follows:
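The original code for these two steps is not reproduced here, so the following is a minimal sketch. The values in `data` and the column names `col1` and `col2` are assumptions chosen purely for illustration.

```python
import pandas as pd

# Hypothetical sample data: each inner list becomes one row
data = [[1, "A"], [2, "B"], [3, "C"]]

# The column names are illustrative assumptions
df = pd.DataFrame(data, columns=["col1", "col2"])
print(df)
```

Each inner list of `data` maps to one row, and `columns` labels the positions within those rows.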

We can verify the class of the DataFrame using Python's built-in `type()` function:
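A small sketch of this check, assuming a hypothetical two-column DataFrame (the data is illustrative, not taken from the original post):

```python
import pandas as pd

# Hypothetical sample DataFrame
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# type() reports the class of the object
print(type(df))

# isinstance() is an equivalent programmatic check
is_dataframe = isinstance(df, pd.DataFrame)
print(is_dataframe)  # True
```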

You can read about various techniques to create a Pandas DataFrame in my previous blog.

# Converting to NumPy Array

First and foremost, let’s understand how you can convert a Pandas data object to a NumPy array.

Here, we shall consider the following DataFrame:

## Method 1:

You can use the `values` attribute to convert a Pandas DataFrame to a NumPy array.

We can verify the data type of the `result` object, which, indeed, is a NumPy array.
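As a rough sketch of Method 1, assuming a hypothetical DataFrame with the illustrative columns `col1` and `col2`:

```python
import numpy as np
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# The values attribute exposes the data as a NumPy array
result = df.values
print(result)

# Confirm that the result is a NumPy array
print(type(result))  # <class 'numpy.ndarray'>
```

The index and column labels are not part of the array; only the cell values are carried over.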

## Method 2:

Another option available in Pandas is the `to_numpy()` method.
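A minimal sketch of this method, again on a hypothetical sample DataFrame whose contents are assumed for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# to_numpy() returns the DataFrame's data as a NumPy array
result = df.to_numpy()
print(result)
print(type(result))  # <class 'numpy.ndarray'>
```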

Note: The official Pandas documentation recommends using `df.to_numpy()` over the `values` attribute discussed in Method 1. (Source: here)

## Method 3:

Lastly, you can also use NumPy's basic `np.array()` function to convert a Pandas DataFrame to a NumPy array as follows:
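One way this could look, assuming the same kind of hypothetical two-column DataFrame (the data is illustrative):

```python
import numpy as np
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# np.array() accepts a DataFrame and builds an array from its values
result = np.array(df)
print(result)
print(type(result))  # <class 'numpy.ndarray'>
```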

If you want to learn about various methods to create NumPy arrays, you can find my blog below:

# Converting to Python List

Next, we shall learn some methods to convert a Pandas DataFrame to a Python list.

Unfortunately, Pandas does not offer a direct method to convert a Pandas DataFrame to a Python list.

Therefore, to achieve this, we should first convert the DataFrame to a NumPy array, followed by the conversion to a list using the `tolist()` method in NumPy.
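A minimal sketch of this two-step conversion, assuming a hypothetical sample DataFrame (the data and column names are illustrative):

```python
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# Step 1: df.values gives a NumPy array
# Step 2: tolist() turns that array into a nested Python list (one inner list per row)
result = df.values.tolist()
print(result)        # [[1, 'A'], [2, 'B'], [3, 'C']]
print(type(result))  # <class 'list'>
```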

As demonstrated above, the approach first converts the DataFrame to a NumPy array using the `values` attribute discussed in the previous section, after which we call the `tolist()` method of NumPy.

# Converting to Dictionary

Another popular conversion of the Pandas DataFrame is generating a Python dictionary from it.

As a quick recap, we are using the following DataFrame in this blog:

In Pandas, we can convert a DataFrame to a dictionary using the `to_dict()` method. Below, we’ll discuss the various formats of the Python dictionary that we can generate using this method.

These formats primarily differ in the type of key-value pairs returned by the method. The structure of the dictionary is determined by the `orient` parameter of the `to_dict()` method.

## Method 1:

With `orient='dict'` (which is also the default value of the parameter), the method returns a nested dictionary, in which the keys of the outer dictionary are the column names, and the keys of each inner dictionary are the index values.

A diagrammatic illustration of the default behavior (`orient='dict'`) is shown below:

The code block below demonstrates the output of the `to_dict()` method.
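A short sketch of the default behavior, assuming a hypothetical sample DataFrame with a default integer index (the data is illustrative):

```python
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# Default orient='dict': column name -> {index value -> cell value}
result = df.to_dict()
print(result)
# {'col1': {0: 1, 1: 2, 2: 3}, 'col2': {0: 'A', 1: 'B', 2: 'C'}}
```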

## Method 2:

In contrast to having nested dictionaries as in Method 1, you can generate a dictionary from a DataFrame in which each key is a column name and each value is that column represented as a list.

You can achieve this by passing `orient="list"` to the `to_dict()` method.

This is depicted in the diagram below:

The corresponding implementation is shown below:
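As a minimal sketch, assuming a hypothetical sample DataFrame (the data and column names are illustrative):

```python
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# orient='list': column name -> list of that column's values
result = df.to_dict(orient="list")
print(result)
# {'col1': [1, 2, 3], 'col2': ['A', 'B', 'C']}
```

Note that the index values are dropped in this format; only the column contents remain.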

## Method 3:

Another interesting way of generating a dictionary using the `to_dict()` method is by specifying the parameter `orient="split"`.

The dictionary returned has three key-value pairs. These are:

1. **'index'**: The value holds the index of the DataFrame as a Python list.

2. **'columns'**: This is also a list, which specifies the names of the columns.

3. **'data'**: The value of this key is a list of lists that represents the rows of the DataFrame. It is the same as the result we obtained in the **'Converting to Python List'** section.

The output of this conversion is shown below:
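A brief sketch of this format, assuming a hypothetical sample DataFrame with a default integer index (the data is illustrative):

```python
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# orient='split': separate entries for the index, the column names, and the row data
result = df.to_dict(orient="split")
print(result["index"])    # [0, 1, 2]
print(result["columns"])  # ['col1', 'col2']
print(result["data"])     # [[1, 'A'], [2, 'B'], [3, 'C']]
```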

Additionally, this method provides four more representations to obtain a dictionary from a DataFrame.

These are `orient='series'`, `orient='tight'`, `orient='records'`, and `orient='index'`. You can read about them in the official documentation here. Additionally, this answer on StackOverflow is an excellent resource for learning about them.
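For a quick feel of three of these orientations, here is a hedged sketch on a hypothetical sample DataFrame (the data is illustrative). `orient='tight'` is left out because it is only available in more recent Pandas releases.

```python
import pandas as pd

# Hypothetical sample DataFrame (values are illustrative)
df = pd.DataFrame([[1, "A"], [2, "B"], [3, "C"]], columns=["col1", "col2"])

# orient='records': a list with one dictionary per row
records = df.to_dict(orient="records")
print(records)
# [{'col1': 1, 'col2': 'A'}, {'col1': 2, 'col2': 'B'}, {'col1': 3, 'col2': 'C'}]

# orient='index': index value -> {column name -> cell value}
by_index = df.to_dict(orient="index")
print(by_index)
# {0: {'col1': 1, 'col2': 'A'}, 1: {'col1': 2, 'col2': 'B'}, 2: {'col1': 3, 'col2': 'C'}}

# orient='series': column name -> Pandas Series holding that column
as_series = df.to_dict(orient="series")
print(type(as_series["col1"]))
```

Note that `orient='records'` returns a list of dictionaries rather than a single dictionary.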

# Conclusion

To conclude, in this post, I demonstrated various ways to convert a Pandas DataFrame to different Python Data objects.

More specifically, I discussed the conversion of a Pandas DataFrame to a NumPy array, Python List, and a Dictionary.

Note that various other data classes (or data types/frameworks etc.) support conversion to and from a Pandas DataFrame, such as a DataTable DataFrame, Dask DataFrame, and Spark DataFrame, which I will demonstrate in another post.

**Thanks for reading!**
