In pandas, the Python data analysis library, the DataFrame is a table-like data structure. In this article, we’ll show you how to create a DataFrame and use the handy head() method to preview your data.
▶️ For reference, see the official head documentation:
pandas DataFrame head Documentation
Importing pandas
import pandas as pd
Creating a DataFrame
data = {
"Name": ["Taro", "Hanako", "Jiro", "Mika", "Kenichi", "Keiko", "Sho", "Akane", "Takashi", "Aoi"],
"Age": [23, 29, 35, 42, 18, 33, 27, 24, 31, 30],
"Occupation": ["Engineer", "Designer", "Teacher", "Doctor", "Student", "Nurse", "Programmer", "Sales", "Lawyer", "Researcher"],
"Annual Income (¥)": [4500000, 5500000, 4900000, 7300000, 0, 4000000, 6000000, 3200000, 8000000, 5800000],
"Location": ["Tokyo", "Osaka", "Nagoya", "Sapporo", "Fukuoka", "Tokyo", "Kobe", "Sendai", "Yokohama", "Chiba"],
"Years Employed": [2, 4, 10, 15, 1, 5, 3, 1, 12, 8]
}
df = pd.DataFrame(data)
df
| | Name | Age | Occupation | Annual Income (¥) | Location | Years Employed |
---|---|---|---|---|---|---|
0 | Taro | 23 | Engineer | 4500000 | Tokyo | 2 |
1 | Hanako | 29 | Designer | 5500000 | Osaka | 4 |
2 | Jiro | 35 | Teacher | 4900000 | Nagoya | 10 |
3 | Mika | 42 | Doctor | 7300000 | Sapporo | 15 |
4 | Kenichi | 18 | Student | 0 | Fukuoka | 1 |
5 | Keiko | 33 | Nurse | 4000000 | Tokyo | 5 |
6 | Sho | 27 | Programmer | 6000000 | Kobe | 3 |
7 | Akane | 24 | Sales | 3200000 | Sendai | 1 |
8 | Takashi | 31 | Lawyer | 8000000 | Yokohama | 12 |
9 | Aoi | 30 | Researcher | 5800000 | Chiba | 8 |
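
After building a DataFrame from a dictionary, a quick sanity check on its shape and column names can confirm it came out as expected. The sketch below uses an abbreviated three-row, three-column subset of the data above, and the variable name `small` is only illustrative.

```python
import pandas as pd

# Abbreviated subset of the article's data (shortened here for brevity)
data = {
    "Name": ["Taro", "Hanako", "Jiro"],
    "Age": [23, 29, 35],
    "Occupation": ["Engineer", "Designer", "Teacher"],
}
small = pd.DataFrame(data)

# Each dictionary key becomes a column; each list element becomes a row
print(small.shape)          # (number of rows, number of columns)
print(list(small.columns))  # column names, in dictionary order
```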
Using head()
Display the First 5 Rows
df.head()
| | Name | Age | Occupation | Annual Income (¥) | Location | Years Employed |
---|---|---|---|---|---|---|
0 | Taro | 23 | Engineer | 4500000 | Tokyo | 2 |
1 | Hanako | 29 | Designer | 5500000 | Osaka | 4 |
2 | Jiro | 35 | Teacher | 4900000 | Nagoya | 10 |
3 | Mika | 42 | Doctor | 7300000 | Sapporo | 15 |
4 | Kenichi | 18 | Student | 0 | Fukuoka | 1 |
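
Calling head() with no argument uses the default of n=5. The minimal sketch below, built on a hypothetical ten-row stand-in for df, checks that this matches both head(5) and a positional slice of the first five rows.

```python
import pandas as pd

# Hypothetical ten-row frame standing in for the article's df
df = pd.DataFrame({"Name": list("ABCDEFGHIJ"), "Age": range(20, 30)})

default_preview = df.head()     # no argument: n defaults to 5
explicit_preview = df.head(5)   # same request, spelled out
sliced_preview = df.iloc[:5]    # positional slice of the first five rows

print(len(default_preview))                      # 5
print(default_preview.equals(explicit_preview))  # True
print(default_preview.equals(sliced_preview))    # True
```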
Display a Custom Number of Rows
df.head(3) # First 3 rows
| | Name | Age | Occupation | Annual Income (¥) | Location | Years Employed |
---|---|---|---|---|---|---|
0 | Taro | 23 | Engineer | 4500000 | Tokyo | 2 |
1 | Hanako | 29 | Designer | 5500000 | Osaka | 4 |
2 | Jiro | 35 | Teacher | 4900000 | Nagoya | 10 |
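
Two edge cases of the n argument are worth knowing: asking for more rows than exist simply returns everything, and a negative n returns all rows except the last |n|. The sketch below illustrates both on a hypothetical ten-row frame (not the article's data).

```python
import pandas as pd

# Hypothetical ten-row frame standing in for the article's df
df = pd.DataFrame({"Name": list("ABCDEFGHIJ"), "Age": range(20, 30)})

# n larger than the number of rows: no error, all rows come back
everything = df.head(100)
print(len(everything))      # 10

# Negative n: every row except the last two
all_but_last_two = df.head(-2)
print(len(all_but_last_two))            # 8
print(all_but_last_two.index.tolist())  # [0, 1, 2, 3, 4, 5, 6, 7]
```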
View Only Column Names
df.head(0)
| | Name | Age | Occupation | Annual Income (¥) | Location | Years Employed |
---|---|---|---|---|---|---|
Note: df.head(0) returns no data rows and displays only the column names. It’s useful for checking your DataFrame’s structure and column order.
[Personal Experience] The first time I used head(0), I thought it was an error because no rows appeared. Later I learned it’s a valid way to see just the headers, and now I always start with it to verify columns.
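
To see why an empty-looking result is not an error, it can help to inspect what head(0) actually returns. The sketch below uses a small hypothetical frame; the name `header_only` is illustrative.

```python
import pandas as pd

# Small hypothetical frame, not the article's full data
df = pd.DataFrame({"Name": ["Taro", "Hanako"], "Age": [23, 29]})

header_only = df.head(0)

print(header_only.shape)          # (0, 2): zero rows, two columns
print(header_only.empty)          # True
print(list(header_only.columns))  # ['Name', 'Age']

# The column dtypes survive even though there are no rows
print(header_only.dtypes.equals(df.dtypes))  # True
```

If only the names are needed, list(df.columns) gives the same information without building an empty DataFrame.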
Combining with print()
print(df.head(3))
This prints the rows as plain text, which can be harder to read than a rendered table. In Jupyter or Colab, evaluating df.head() on its own gives a nicely formatted table instead.
Name Age Occupation Annual Income (¥) Location Years Employed
0 Taro 23 Engineer 4500000 Tokyo 2
1 Hanako 29 Designer 5500000 Osaka 4
2 Jiro 35 Teacher 4900000 Nagoya 10
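
Outside a notebook, for example in a script or a log file, plain text is the only option. One way to tidy it is to_string(), which can drop the index column. The sketch below assumes a small hypothetical frame rather than the full data above.

```python
import pandas as pd

# Small hypothetical frame, not the article's full data
df = pd.DataFrame({
    "Name": ["Taro", "Hanako", "Jiro", "Mika"],
    "Age": [23, 29, 35, 42],
})

# to_string() returns the text that print() would show;
# index=False leaves out the row labels on the left
text = df.head(3).to_string(index=False)
print(text)
```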
Saving the Result as a Variable
first_rows = df.head(3)
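
The saved preview is an ordinary DataFrame, so it can be handed to other pandas methods. As a minimal sketch, assuming a small hypothetical frame, the preview below is exported as CSV text; calling to_csv() without a path returns a string instead of writing a file.

```python
import pandas as pd

# Small hypothetical frame, not the article's full data
df = pd.DataFrame({
    "Name": ["Taro", "Hanako", "Jiro", "Mika"],
    "Age": [23, 29, 35, 42],
})

first_rows = df.head(3)

# With no path argument, to_csv() returns the CSV content as a string
csv_text = first_rows.to_csv(index=False)
print(csv_text)
```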
Assigning the head() output to a variable makes it easy to pass directly into plotting or CSV-export steps.
Comparing with tail()
df.tail() # Last 5 rows
| | Name | Age | Occupation | Annual Income (¥) | Location | Years Employed |
---|---|---|---|---|---|---|
5 | Keiko | 33 | Nurse | 4000000 | Tokyo | 5 |
6 | Sho | 27 | Programmer | 6000000 | Kobe | 3 |
7 | Akane | 24 | Sales | 3200000 | Sendai | 1 |
8 | Takashi | 31 | Lawyer | 8000000 | Yokohama | 12 |
9 | Aoi | 30 | Researcher | 5800000 | Chiba | 8 |
df.tail(2) # Last 2 rows
| | Name | Age | Occupation | Annual Income (¥) | Location | Years Employed |
---|---|---|---|---|---|---|
8 | Takashi | 31 | Lawyer | 8000000 | Yokohama | 12 |
9 | Aoi | 30 | Researcher | 5800000 | Chiba | 8 |
Use tail() to check for anomalies or missing values at the end of your dataset. I once used a CSV with a trailing NaN row that skewed my stats, so now I always verify with tail().
▶️ For reference, see the official tail documentation:
pandas DataFrame tail Documentation
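
The trailing-row check can also be done in code. The sketch below assumes a hypothetical frame whose last row is entirely missing, flags it with isna() on the tail, and then drops it with dropna(how="all").

```python
import pandas as pd

# Hypothetical frame whose last row is entirely missing
df = pd.DataFrame({
    "Name": ["Taro", "Hanako", None],
    "Age": [23, 29, float("nan")],
})

# Look only at the last two rows and keep those with any missing value
last_rows = df.tail(2)
rows_with_missing = last_rows[last_rows.isna().any(axis=1)]
print(rows_with_missing)

# Drop rows where every column is missing
cleaned = df.dropna(how="all")
print(len(cleaned))  # 2
```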
Summary
Syntax | Description |
---|---|
df.head() | Display first 5 rows |
df.head(n) | Display first n rows |
df.head(0) | Display only column names (same for df.tail(0)) |
df.tail() | Display last 5 rows |
df.tail(n) | Display last n rows |
The head() method is invaluable for quickly inspecting large datasets. Next time, we’ll cover info() and describe() to understand your data’s structure and summary statistics!