【Vol.3】 Pandas Series vs DataFrame – Illustrated Guide

Before you start analyzing data with pandas, you need to understand its two fundamental data structures: the Series and the DataFrame. Once you are comfortable with both, you can manipulate and analyze data efficiently. This post walks through how to create a pandas Series and a pandas DataFrame.
[Personal Experience] I once confused the two and passed a 2D list into a Series, which caused errors. Remember: a Series is 1D and a DataFrame is 2D. Keeping that in mind saved me from many early headaches.

1. Install pandas

If you don’t have pandas installed, run:

pip install pandas

2. Import pandas

To use pandas in your code, import it—commonly with the alias pd:

import pandas as pd
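
As a quick sanity check that the install and import steps worked, you can print the installed version. This is only a small sketch; the version string you see depends on your environment.

```python
import pandas as pd

# pd.__version__ holds the installed pandas version as a string.
print(pd.__version__)
```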

3. What Is a pandas Series?

A pandas Series is a one-dimensional labeled data structure. It behaves much like a list or a NumPy array, but every element also carries an index label, so you can look up values by label rather than only by position.

▶️ See the official docs: pandas Series Documentation

3.1 Creating a Series

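You can create a Series from a list or a dictionary. With a list, pandas keeps the element order and assigns a default integer index (0, 1, 2, ...). With a dictionary, the keys become the index labels and the values become the data.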

From a List

Here is an example that creates a Series from a list:
[Failure Story] The first time I called print() on a Series, I was surprised by the vertical, indexed format. It looks different from a plain list at first, but I soon found it clearer.

import pandas as pd

data = [1, 2, 3, 4, 5]
series = pd.Series(data)

print(series)

Output:

0    1
1    2
2    3
3    4
4    5
dtype: int64

With Custom Index Labels

Pass the index argument to attach your own labels, which give each value meaningful context:

index_labels = ['a', 'b', 'c', 'd', 'e']
series_with_index = pd.Series(data, index=index_labels)

print(series_with_index)

Output:

a    1
b    2
c    3
d    4
e    5
dtype: int64
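
Once a Series has labels, you can use them to read values back. The snippet below is a minimal sketch that rebuilds the labeled Series from this section; `subset` and `doubled` are illustrative names that do not appear in the original example.

```python
import pandas as pd

series_with_index = pd.Series([1, 2, 3, 4, 5],
                              index=['a', 'b', 'c', 'd', 'e'])

# Look up a single value by its label.
print(series_with_index['c'])     # 3

# Look up a value by position with .iloc.
print(series_with_index.iloc[0])  # 1

# Selecting several labels at once returns another Series.
subset = series_with_index[['a', 'e']]

# Arithmetic is applied element by element, as with a NumPy array.
doubled = series_with_index * 2
print(doubled['e'])               # 10
```

Label lookups keep working even if the rows are later reordered, which is the main practical benefit over position-based access.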

From a Dictionary

Create a Series directly from key–value pairs:

data_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
series_from_dict = pd.Series(data_dict)

print(series_from_dict)

Output:

a    1
b    2
c    3
d    4
e    5
dtype: int64
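
The keys of the dictionary end up as the index. As a small sketch of what that implies, you can also combine a dictionary with an explicit index argument, in which case pandas matches values by key. The label 'z' and the name `picked` below are hypothetical and used only for illustration.

```python
import pandas as pd

data_dict = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}

# The dictionary keys become the index labels.
series_from_dict = pd.Series(data_dict)
print(list(series_from_dict.index))  # ['a', 'b', 'c', 'd', 'e']

# With an explicit index, values are matched by key.
# 'z' is not a key in data_dict, so its value is missing (NaN).
picked = pd.Series(data_dict, index=['c', 'a', 'z'])
print(picked)
```

Because a missing value is represented as NaN, the resulting Series will usually no longer have an integer dtype.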

4. What Is a pandas DataFrame?

A pandas DataFrame is a two-dimensional labeled data structure. It arranges data in rows and columns—similar to an Excel sheet or SQL table.

▶️ See the official docs: pandas DataFrame Documentation

4.1 Creating a DataFrame

You can build a DataFrame from a list of lists or from a dictionary. Let's start with a list of lists:

From a List of Lists

Each inner list becomes a row, and the columns argument supplies the column names:

import pandas as pd

data = [[1, 'Alice', 24],
        [2, 'Bob', 27],
        [3, 'Charlie', 22],
        [4, 'David', 32],
        [5, 'Eve', 29]]

df = pd.DataFrame(data, columns=['ID', 'Name', 'Age'])

print(df)

Output:

   ID     Name  Age
0   1    Alice   24
1   2      Bob   27
2   3  Charlie   22
3   4    David   32
4   5      Eve   29

From a Dictionary

Each key becomes a column name, and its list of values becomes that column's data:

data = {'ID': [1, 2, 3, 4, 5],
        'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Eve'],
        'Age': [24, 27, 22, 32, 29]}

df = pd.DataFrame(data)

print(df)

Output:

   ID     Name  Age
0   1    Alice   24
1   2      Bob   27
2   3  Charlie   22
3   4    David   32
4   5      Eve   29
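
The two structures are closely connected: pulling a single column out of a DataFrame gives you a Series. The following is a minimal sketch using a shortened version of the table above; `ages` and `sub` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [24, 27, 22]})

# Selecting one column by name returns a Series
# that shares the DataFrame's row index.
ages = df['Age']
print(type(ages).__name__)  # Series

# Selecting with a list of column names returns a DataFrame,
# even when the list has only one entry.
sub = df[['Age']]
print(type(sub).__name__)   # DataFrame
```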

5. Setting and Changing the Index

You can set or change the index of a Series or DataFrame to give rows meaningful labels. For example, set_index() promotes an existing column to the index and returns a new DataFrame:

# Set the 'ID' column as index
df = df.set_index('ID')
print(df)

Output:

       Name  Age
ID
1     Alice   24
2       Bob   27
3   Charlie   22
4     David   32
5       Eve   29
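
With 'ID' as the index, rows can be looked up by ID, and the change can be undone. The snippet below is a sketch on a shortened table; the names `indexed`, `restored`, and `s`, and the labels 'x', 'y', 'z', are illustrative assumptions rather than part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'ID': [1, 2, 3],
                   'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [24, 27, 22]})

# set_index() returns a new DataFrame; df itself is left unchanged.
indexed = df.set_index('ID')

# Look up a row and column by label with .loc.
print(indexed.loc[2, 'Name'])  # Bob

# reset_index() turns the index back into an ordinary column.
restored = indexed.reset_index()
print(list(restored.columns))  # ['ID', 'Name', 'Age']

# For a Series, you can replace the index by assigning new labels.
s = pd.Series([10, 20, 30])
s.index = ['x', 'y', 'z']
print(s['y'])                  # 20
```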

6. Difference Between 1D (Series) and 2D (DataFrame)

The illustration below visualizes how a Series (1D) and a DataFrame (2D) organize their indexes and data:

[Figure: Structure of Series vs. DataFrame]
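
The same difference can be checked in code through the ndim and shape attributes. This is a minimal sketch; the nested list at the end is a hypothetical stand-in for the 2D-list mix-up mentioned in the introduction. In the pandas versions I am aware of, passing it to pd.Series does not fail immediately but stores each inner list as a single element.

```python
import pandas as pd

s = pd.Series([24, 27, 22], index=['Alice', 'Bob', 'Charlie'])
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                   'Age': [24, 27, 22]})

# A Series has one axis (the index);
# a DataFrame has two (the index and the columns).
print(s.ndim, s.shape)    # 1 (3,)
print(df.ndim, df.shape)  # 2 (3, 2)
print(list(df.columns))   # ['Name', 'Age']

# A nested list passed to pd.Series stays 1D:
# each inner list is kept as one element.
nested = [[1, 2], [3, 4]]
as_series = pd.Series(nested)
print(as_series.shape)    # (2,)

# The same data passed to pd.DataFrame becomes a 2 x 2 table.
as_frame = pd.DataFrame(nested)
print(as_frame.shape)     # (2, 2)
```

If you notice a Series whose elements are themselves lists, that is usually a sign the data was meant to be a DataFrame.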

7. Conclusion

We’ve covered how to create pandas Series and DataFrames, the key structures for data analysis. A Series is a 1D labeled array; a DataFrame is a 2D labeled table. Once you can use both confidently, data manipulation and analysis with pandas become much easier.
[Tip] I drew my own diagrams to see the differences between a Series and a DataFrame clearly. Having that visual reference helped me make far fewer mistakes.

Next time, we’ll learn how to display DataFrame contents effectively!
