NumPy arrays offer computational advantages, especially for numerical operations. They are more memory-efficient and faster for certain types of calculations, making them ideal for machine learning algorithms, mathematical computations, and array-focused operations.
Converting DataFrame to NumPy Array
Using the values Property
The simplest way to convert a DataFrame to a NumPy array is the values property. It returns the DataFrame's data as a NumPy array, without the index or column labels.
Creating a Sample DataFrame
Let’s create a DataFrame with some sample data:
import pandas as pd
# Sample DataFrame
data = {
'Name': ['Sachin', 'Manju', 'Ram', 'Raju', 'David', 'Wilson'],
'Age': [32, 29, 35, 40, 28, 33],
'City': ['Mumbai', 'Bangalore', 'Chennai', 'Delhi', 'New York', 'San Francisco']
}
df = pd.DataFrame(data)
Conversion to NumPy Array
array = df.values
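To see what the conversion produces, it can help to inspect the result. The following is a minimal, self-contained sketch that uses a shortened two-row version of the sample data (an assumption made only to keep the example brief):

```python
import numpy as np
import pandas as pd

# Shortened, illustrative version of the sample DataFrame
df = pd.DataFrame({
    'Name': ['Sachin', 'Manju'],
    'Age': [32, 29],
})

array = df.values

print(type(array))   # a NumPy ndarray
print(array.shape)   # (number of rows, number of columns)
print(array.dtype)   # object, because text and integers are mixed
```

The index and column labels are not part of the result; only the cell values are carried over.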
Using the to_numpy() Method
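Another approach is the to_numpy() method, which provides more flexibility: its optional dtype, copy, and na_value parameters control the element type of the result, whether a copy is forced, and the value substituted for missing data.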
Example:
array = df.to_numpy()
array
array([['Sachin', 32, 'Mumbai'],
       ['Manju', 29, 'Bangalore'],
       ['Ram', 35, 'Chennai'],
       ['Raju', 40, 'Delhi'],
       ['David', 28, 'New York'],
       ['Wilson', 33, 'San Francisco']], dtype=object)
Compared with values, this method is more explicit and self-documenting, making the code easier to understand. The pandas documentation also recommends to_numpy() over values.
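The extra flexibility of to_numpy() comes from its optional parameters. The sketch below uses a small numeric DataFrame with one missing value; the column names and numbers are hypothetical and chosen only for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical numeric data with one missing Age value
df = pd.DataFrame({
    'Age': [32, 29, None],
    'Score': [88.5, 92.0, 79.5],
})

# dtype: request a specific element type for the result
as_float32 = df.to_numpy(dtype='float32')

# na_value: substitute a value for missing entries
filled = df.to_numpy(na_value=0.0)

# copy=True: the result is guaranteed not to share memory with the DataFrame
independent = df.to_numpy(copy=True)
independent[0, 0] = -1.0  # does not change df

print(as_float32.dtype)
print(filled)
print(df.loc[0, 'Age'])
```

Without copy=True, the returned array may or may not be a view of the DataFrame's data, so it is safer to request a copy before modifying the array in place.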
Handling Different Data Types
One thing to keep in mind is that NumPy arrays should ideally have homogeneous data types for efficient computation. If a DataFrame contains multiple data types, the resulting array uses the most general compatible type: integers and floats together become floats, while a mix of numbers and strings, as in the sample DataFrame above, falls back to dtype=object, which gives up most of NumPy’s speed and memory advantages.
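The dtype rules can be checked directly. This is a small sketch with made-up frames; selecting only the numeric columns with select_dtypes before converting is one possible way to avoid an object array, not the only one:

```python
import numpy as np
import pandas as pd

# Integers and floats together are upcast to a floating-point dtype
ints_and_floats = pd.DataFrame({'a': [1, 2], 'b': [0.5, 1.5]})
float_array = ints_and_floats.to_numpy()
print(float_array.dtype)

# Strings and numbers together fall back to the generic object dtype
mixed = pd.DataFrame({'Name': ['Ram', 'Raju'], 'Age': [35, 40]})
object_array = mixed.to_numpy()
print(object_array.dtype)

# Keeping only the numeric columns yields a numeric array again
numeric_only = mixed.select_dtypes(include='number').to_numpy()
print(numeric_only.dtype)
```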
Use Cases for Conversion
- Machine Learning: Many ML libraries prefer or require input data in the form of NumPy arrays.
- Mathematical Operations: NumPy’s powerful mathematical functions work efficiently with arrays.
- Data Visualization: Some plotting libraries work better with NumPy arrays or specifically require them.
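As an illustration of the mathematical use case, the sketch below converts the Age column of the sample data to a NumPy array and standardizes it (zero mean, unit standard deviation), a common preprocessing step before feeding data to ML code. The choice of standardization here is an assumption made for the example:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Name': ['Sachin', 'Manju', 'Ram', 'Raju', 'David', 'Wilson'],
    'Age': [32, 29, 35, 40, 28, 33],
})

# Convert a single numeric column to a float array
ages = df['Age'].to_numpy(dtype='float64')

# Vectorized NumPy arithmetic: subtract the mean, divide by the standard deviation
standardized = (ages - ages.mean()) / ages.std()

print(standardized)
```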