Pandas Data Structures
The two primary data structures in Pandas are the Series and the DataFrame. These are the fundamental building blocks for efficient data manipulation and analysis in Python.
1. The Pandas Series
The Series is the most basic Pandas data structure.
- Definition: A one-dimensional labeled array capable of holding data of any type (integer, string, float, Python objects, etc.).
- Analogy: You can think of a Series as a single column in a spreadsheet or a SQL table.
- Key Feature (Index): Every Series has an index (row labels) associated with its data, which enables label-based lookups and automatic data alignment during operations.
| Feature | Description |
|---|---|
| Dimensionality | 1-Dimensional |
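| Mutability | Values are mutable (can be changed in place). The size is conventionally described as immutable: operations that add or remove elements generally return a new Series rather than resizing the original. |
| Data Type | Homogeneous (a Series has a single dtype; mixed values are stored under the generic `object` dtype). |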
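The properties in the table can be checked with a short sketch. The Series contents are hypothetical, and the exact integer dtype printed may vary by platform.

```python
import pandas as pd

# Hypothetical values for illustration.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
print(s.ndim)   # number of dimensions
print(s.dtype)  # one dtype for the whole Series

# Values can be changed in place by label.
s["b"] = 25

# Removing an element with drop() returns a new Series;
# the original keeps its length.
shorter = s.drop("a")
print(len(s), len(shorter))

# Mixed Python values still share one dtype: the generic object dtype.
mixed = pd.Series([1, "two", 3.0])
print(mixed.dtype)
```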
2. The Pandas DataFrame
The DataFrame is the most commonly used and powerful Pandas data structure.
- Definition: A two-dimensional, size-mutable, and potentially heterogeneous tabular data structure with labeled axes (rows and columns).
- Analogy: A DataFrame is essentially a spreadsheet or a collection of Series objects where each Series is a column and they all share the same index (the row labels).
- Structure: It has both a **row index** and distinct **column labels**.
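To illustrate the "collection of Series" view, the sketch below builds a DataFrame from two Series that share an index and then pulls one column back out. The column names, row labels, and values are hypothetical.

```python
import pandas as pd

# Two hypothetical Series sharing the same row labels.
names = pd.Series(["Ada", "Grace"], index=["r1", "r2"])
ages = pd.Series([36, 45], index=["r1", "r2"])

# Each dict key becomes a column label; the shared index becomes the row index.
df = pd.DataFrame({"name": names, "age": ages})
print(df)

# Selecting a single column returns a Series with the DataFrame's row index.
age_column = df["age"]
print(type(age_column).__name__)
print(age_column.index.equals(df.index))
```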
| Feature | Description |
|---|---|
| Dimensionality | 2-Dimensional |
| Mutability | Both data and size are mutable (you can add and remove columns). |
| Data Type | Heterogeneous (each column/Series can hold a different data type). |
| Core Components | Composed of: **Data** (the values), **Index** (row labels), and **Columns** (column labels). |
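The following sketch touches each row of the table: per-column dtypes, the three core components, and adding or removing columns. All names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data with a string, an integer, and a float column.
df = pd.DataFrame(
    {"item": ["pen", "book"], "qty": [3, 1], "price": [1.5, 12.0]},
    index=["r1", "r2"],
)

print(df.ndim)        # number of dimensions
print(df.dtypes)      # one dtype per column
print(df.index)       # row labels
print(df.columns)     # column labels
print(df.to_numpy())  # the underlying values

# Size is mutable: add a derived column, then remove an existing one.
df["total"] = df["qty"] * df["price"]
df = df.drop(columns=["qty"])

# Data is mutable: overwrite a single cell by row and column label.
df.loc["r1", "price"] = 2.0
print(df)
```

Note that `drop` returns a new DataFrame here, which is why the result is assigned back to `df`.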
Data Structure Hierarchy
The relationship between the two structures can be visualized as:
DataFrame → Collection of aligned Series
Series → Collection of labeled scalar values
Example: Creating Series and DataFrames
Data Structures Example
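The original interactive example is not reproduced here, so the following is a minimal stand-in sketch showing a few common ways to construct both structures. The variable names and values are hypothetical.

```python
import pandas as pd

# Series from a list: pandas assigns a default integer index starting at 0.
s_default = pd.Series([10, 20, 30])

# Series from a dict: the keys become the index labels.
s_labeled = pd.Series({"x": 1.0, "y": 2.0})

# DataFrame from a dict of lists: each key becomes a column.
df_from_columns = pd.DataFrame({"a": [1, 2], "b": ["u", "v"]})

# DataFrame from a list of dicts: each dict becomes a row.
df_from_rows = pd.DataFrame([{"a": 1, "b": "u"}, {"a": 2, "b": "v"}])

print(s_default)
print(s_labeled)
print(df_from_columns)
print(df_from_rows)
```

Both DataFrame constructions describe the same two-row, two-column table; which form is more convenient depends on whether the data arrives column by column or record by record.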
Series Features
- One-dimensional
- Homogeneous data
- Size immutable
- Values mutable
DataFrame Features
- Two-dimensional
- Potentially heterogeneous data
- Size mutable
- Data mutable