In pandas, a DataFrame’s header is its set of column names. You don’t usually add it the way you would add a row. Instead, you specify the names when you create the DataFrame or reassign them afterwards. However, if your data arrives without a header and its first row actually holds the column names, you need to promote that row to the header of your DataFrame.
Here’s an example of how to do this. We’ll create a DataFrame without headers first and then set the headers using the first row.
Creating DataFrame without headers:
First, we simulate the scenario where we have data without headers.
import pandas as pd
# Sample data
data = [
['Name', 'Age', 'Occupation'],
['Sachin', 25, 'Data Scientist'],
['Ram', 30, 'Software Engineer'],
['Jose', 22, 'Doctor']
]
# Create a DataFrame without headers
df = pd.DataFrame(data)
print(df)
This will print:
        0    1                  2
0    Name  Age         Occupation
1  Sachin   25     Data Scientist
2     Ram   30  Software Engineer
3    Jose   22             Doctor
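The 0, 1, 2 labels in the first line of that output are the default integer column labels pandas assigns when no names are supplied. A minimal sketch to confirm this, using a shortened copy of the sample data (the variable names are only illustrative):

```python
import pandas as pd

# Shortened copy of the sample data: header row plus one record
data = [
    ['Name', 'Age', 'Occupation'],
    ['Sachin', 25, 'Data Scientist'],
]

df = pd.DataFrame(data)

# With no column names given, pandas falls back to integer labels
column_labels = list(df.columns)
print(column_labels)

# The intended header is still sitting in the first data row
first_row = df.iloc[0].tolist()
print(first_row)
```

Seeing the real names in `first_row` rather than in `df.columns` is the sign that the first row needs to be promoted.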
Promoting the first row to the header:
If the first row of your DataFrame is the header, you can set it as such and remove it from the data like this:
# Take the first row to use as the header
header = df.iloc[0]
# Keep every row except the first
df = df[1:]
# Assign the saved row as the column names
df.columns = header
print(df)
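This sets the first row as the DataFrame’s header. The 0 at the start of the header line is the name of the columns index, carried over from the label of the promoted row, and the row labels now start at 1: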
0    Name  Age         Occupation
1  Sachin   25     Data Scientist
2     Ram   30  Software Engineer
3    Jose   22             Doctor
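The promoted DataFrame still has a few leftovers: the row labels start at 1, the columns index is named 0, and every column keeps the generic object dtype because the header strings were originally mixed in with the values. One possible way to tidy this up is sketched below. The helper name `promote_first_row` is hypothetical, not a pandas function, and the approach assumes the first row really does contain unique column names.

```python
import pandas as pd

data = [
    ['Name', 'Age', 'Occupation'],
    ['Sachin', 25, 'Data Scientist'],
    ['Ram', 30, 'Software Engineer'],
    ['Jose', 22, 'Doctor'],
]
raw = pd.DataFrame(data)


def promote_first_row(frame):
    """Hypothetical helper: use the first row as column names and tidy up."""
    # Drop the first row and renumber the remaining rows from 0
    promoted = frame.iloc[1:].reset_index(drop=True)
    # A plain list carries no name, so the columns index stays unnamed
    promoted.columns = frame.iloc[0].tolist()
    # Let pandas re-infer better dtypes for the object columns
    return promoted.infer_objects()


tidy = promote_first_row(raw)
print(tidy)
print(tidy.dtypes)
```

Passing the header as a list rather than as the row itself is what avoids the stray 0 in the header line, and `infer_objects()` lets the Age column become numeric again.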
Creating DataFrame with headers directly:
In practice, if you’re creating a DataFrame from scratch or from a file and you know the headers, you’d usually specify or include them upon creation:
# Create a DataFrame with headers
df_with_headers = pd.DataFrame(data[1:], columns=data[0])
print(df_with_headers)
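For the file case mentioned above, `pd.read_csv` can handle the header while loading. The sketch below uses in-memory text through `io.StringIO` as a stand-in for a real file, so the CSV contents and variable names are assumptions made for illustration:

```python
import io

import pandas as pd

# Stand-in for a file whose first line is the header
csv_with_header = (
    "Name,Age,Occupation\n"
    "Sachin,25,Data Scientist\n"
    "Ram,30,Software Engineer\n"
    "Jose,22,Doctor\n"
)
# By default, read_csv takes the first line as the column names
from_header = pd.read_csv(io.StringIO(csv_with_header))

# Stand-in for a file that has no header line at all
csv_without_header = (
    "Sachin,25,Data Scientist\n"
    "Ram,30,Software Engineer\n"
    "Jose,22,Doctor\n"
)
# header=None says there is no header line; names supplies the column names
from_names = pd.read_csv(
    io.StringIO(csv_without_header),
    header=None,
    names=['Name', 'Age', 'Occupation'],
)

print(from_header)
print(from_names)
```

Both calls should produce the same DataFrame here, with Age parsed as a numeric column, so no promotion step is needed afterwards.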