
Python Pandas - Indexing with MultiIndex
Indexing with MultiIndex refers to accessing and selecting data in a Pandas DataFrame that has multiple levels of indexing. Unlike a standard DataFrame, which has a single index level per axis, a MultiIndexed DataFrame uses hierarchical indexing, where rows, columns, or both are labeled using multiple keys.
This type of indexing is useful for handling structured datasets, making it easier to perform operations like grouping, slicing, and advanced selections. Instead of using a single label or position-based indexing, you can use tuples of labels to access data at different levels.
In this tutorial, you will learn how to use MultiIndex for indexing and selection, including basic and advanced selection with .loc[], Boolean indexing, and slicing.
Basic Indexing with MultiIndex
Indexing with MultiIndex is similar to single-index DataFrames, but here you can also use tuples to index by multiple levels.
Example
Here is a basic example of selecting a subset of data by passing a first-level label to the .loc[] indexer. The result contains all rows under that label, indexed by the remaining level.
    import pandas as pd

    # Create a MultiIndex object
    index = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')])

    # Create a DataFrame
    data = [[1, 2], [3, 4], [5, 6], [7, 8]]
    df = pd.DataFrame(data, index=index, columns=['X', 'Y'])

    # Display the input DataFrame
    print('Original MultiIndexed DataFrame:\n', df)

    # Select all rows based on the level label
    print('Selected Subset:\n', df.loc['A'])
Following is the output of the above code −
Original MultiIndexed DataFrame:
| | | X | Y |
|---|---|---|---|
| A | one | 1 | 2 |
| | two | 3 | 4 |
| B | one | 5 | 6 |
| | two | 7 | 8 |
Selected Subset:
| | X | Y |
|---|---|---|
| one | 1 | 2 |
| two | 3 | 4 |
Example
Here is another example demonstrating indexing a MultiIndexed DataFrame using a tuple of level labels with the .loc[] indexer.
    import pandas as pd

    # Create a MultiIndex object
    index = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')])

    # Create a DataFrame
    data = [[1, 2], [3, 4], [5, 6], [7, 8]]
    df = pd.DataFrame(data, index=index, columns=['X', 'Y'])

    # Display the input DataFrame
    print('Original MultiIndexed DataFrame:\n', df)

    # Index the data based on the tuple of level labels
    print('Selected Subset:')
    print(df.loc[('B', 'one')])
Following is the output of the above code −
Original MultiIndexed DataFrame:
| | | X | Y |
|---|---|---|---|
| A | one | 1 | 2 |
| | two | 3 | 4 |
| B | one | 5 | 6 |
| | two | 7 | 8 |
Selected Subset:
X    5
Y    6
Name: (B, one), dtype: int64
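The examples above select by the first level or by a full tuple of labels. To select by an inner level alone, one option is the DataFrame.xs() method. The following is a minimal sketch that reuses the same sample data; the variable name inner is only illustrative.

```python
import pandas as pd

# Same sample data as in the examples above
index = pd.MultiIndex.from_tuples(
    [('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')]
)
df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], index=index, columns=['X', 'Y'])

# Select every row whose second level (position 1) is 'one'.
# By default xs() drops the selected level, so the result is indexed by 'A' and 'B'.
inner = df.xs('one', level=1)
print(inner)
```

Passing drop_level=False to xs() keeps the selected level in the result if the full MultiIndex is needed.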
Advanced Indexing with MultiIndexed Data
Advanced indexing on a MultiIndexed DataFrame also uses the .loc[] indexer. By combining a tuple of row labels with a column label, you can select specific rows, columns, or individual values.
Example
The following example selects a single value from a MultiIndexed DataFrame by passing both a row tuple and a column label to the .loc[] indexer.
    import pandas as pd

    # Create a MultiIndex object
    index = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')])

    # Create a DataFrame
    data = [[1, 2], [3, 4], [5, 6], [7, 8]]
    df = pd.DataFrame(data, index=index, columns=['X', 'Y'])

    # Display the input DataFrame
    print('Original MultiIndexed DataFrame:\n', df)

    # Select specific element
    print('Selected data:')
    print(df.loc[('A', 'two'), 'Y'])
Following is the output of the above code −
Original MultiIndexed DataFrame:
| | | X | Y |
|---|---|---|---|
| A | one | 1 | 2 |
| | two | 3 | 4 |
| B | one | 5 | 6 |
| | two | 7 | 8 |
Selected data:
4
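The same pattern extends to several rows at once. As a hedged sketch, assuming the same sample data, a list of full-label tuples can be passed as the row selector together with a list of columns; the variable name rows is only illustrative.

```python
import pandas as pd

# Same sample data as in the example above
index = pd.MultiIndex.from_tuples(
    [('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')]
)
df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], index=index, columns=['X', 'Y'])

# Each tuple names one row by its full set of level labels.
# Passing the column as a list keeps the result a DataFrame.
rows = df.loc[[('A', 'two'), ('B', 'one')], ['Y']]
print(rows)
```

Note the difference between a list of tuples (a list of exact rows) and a tuple of lists (one set of labels per level).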
Boolean Indexing with MultiIndex
Pandas MultiIndexed objects support Boolean indexing to filter data based on conditions. A condition produces a Boolean mask aligned with the rows, and applying that mask to the DataFrame keeps only the rows where it is True.
Example
The following example demonstrates applying the boolean indexing to the MultiIndexed DataFrame to select the rows where 'X' is greater than 2.
    import pandas as pd

    # Create a MultiIndex object
    index = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')])

    # Create a DataFrame
    data = [[1, 2], [3, 4], [5, 6], [7, 8]]
    df = pd.DataFrame(data, index=index, columns=['X', 'Y'])

    # Display the input DataFrame
    print('Original MultiIndexed DataFrame:\n', df)

    # Select data based on the boolean indexing
    print('Selected data:')
    mask = df['X'] > 2
    print(df[mask])
Following is the output of the above code −
Original MultiIndexed DataFrame:
| | | X | Y |
|---|---|---|---|
| A | one | 1 | 2 |
| | two | 3 | 4 |
| B | one | 5 | 6 |
| | two | 7 | 8 |
Selected data:
| | | X | Y |
|---|---|---|---|
| A | two | 3 | 4 |
| B | one | 5 | 6 |
| | two | 7 | 8 |
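A mask can also be built from the index labels themselves rather than from column values. The sketch below assumes the same sample data and uses Index.get_level_values() to test the second level, then combines that mask with a column condition; the names level_mask and combined are only illustrative.

```python
import pandas as pd

# Same sample data as in the example above
index = pd.MultiIndex.from_tuples(
    [('A', 'one'), ('A', 'two'), ('B', 'one'), ('B', 'two')]
)
df = pd.DataFrame([[1, 2], [3, 4], [5, 6], [7, 8]], index=index, columns=['X', 'Y'])

# Boolean array that is True where the second index level equals 'one'
level_mask = df.index.get_level_values(1) == 'one'

# Combine the column condition with the index-level condition using &
combined = df[(df['X'] > 2) & level_mask]
print(combined)
```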
Slicing with MultiIndex
Slicing with MultiIndex works similarly to single-index DataFrames, but to select across several levels you pass a tuple containing one selector per level.
Example
This example demonstrates how to select a subset of labels from each level of a MultiIndexed DataFrame using the .loc[] indexer.
    import pandas as pd

    # Create a MultiIndex object
    index = pd.MultiIndex.from_tuples([('A', 'one'), ('A', 'two'), ('A', 'three'),
                                       ('B', 'one'), ('B', 'two'), ('B', 'three')])

    # Create a DataFrame
    data = [[1, 2], [3, 4], [1, 1], [5, 6], [7, 8], [2, 2]]
    df = pd.DataFrame(data, index=index, columns=['X', 'Y'])

    # Display the input DataFrame
    print('Original MultiIndexed DataFrame:\n', df)

    # Select rows with first level 'A' or 'B' and second level 'one' or 'three'.
    # The row selector is a tuple holding one list of labels per level.
    print('Sliced data:')
    print(df.loc[(['A', 'B'], ['one', 'three']), :])
Following is the output of the above code −
Original MultiIndexed DataFrame:
| | | X | Y |
|---|---|---|---|
| A | one | 1 | 2 |
| | two | 3 | 4 |
| | three | 1 | 1 |
| B | one | 5 | 6 |
| | two | 7 | 8 |
| | three | 2 | 2 |
Sliced data:
| | | X | Y |
|---|---|---|---|
| A | one | 1 | 2 |
| | three | 1 | 1 |
| B | one | 5 | 6 |
| | three | 2 | 2 |
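For label ranges rather than explicit label lists, pandas provides the pd.IndexSlice helper. The following is a minimal sketch, assuming the same sample data. Range slicing on a MultiIndex requires a sorted index, so the DataFrame is sorted first; the names df_sorted and sliced are only illustrative.

```python
import pandas as pd

# Same sample data as in the example above
index = pd.MultiIndex.from_tuples(
    [('A', 'one'), ('A', 'two'), ('A', 'three'),
     ('B', 'one'), ('B', 'two'), ('B', 'three')]
)
df = pd.DataFrame([[1, 2], [3, 4], [1, 1], [5, 6], [7, 8], [2, 2]],
                  index=index, columns=['X', 'Y'])

# Label-range slicing needs a lexicographically sorted index.
# After sorting, the second level is ordered 'one', 'three', 'two'.
df_sorted = df.sort_index()

# IndexSlice lets you write one slice per level inside .loc[]
idx = pd.IndexSlice
sliced = df_sorted.loc[idx['A':'B', 'one':'three'], :]
print(sliced)
```

Because the labels sort alphabetically, the range 'one':'three' covers 'one' and 'three' but not 'two'. Label slices in .loc[] include both endpoints.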