In this post, I am going to explain how to access DataFrame rows and columns by their labels.
Using the 'loc' accessor, we can access and manipulate data in a DataFrame with label-based indexing (unlike 'iloc', which works on integer positions).
I am going to use the dataset below to demonstrate the examples.
      Name  Age       City  Gender  Rating
0  Krishna   34  Bangalore    Male      81
1    Sailu   35  Hyderabad  Female      76
2     Joel   29  Hyderabad    Male      67
3    Chamu   35    Chennai  Female     100
4  Krishna   52  Bangalore    Male      87
5      Raj   34    Chennai    Male      89
Example 1: Access a single row by its label
df.loc[row_label_1]
Example 2: Access multiple rows by their labels
df.loc[[row_label_1, row_label_2]]
Example 3: Access specific column of the row
df.loc[row_label_1, column_label_1]
Example 4: Access specific columns of the row
df.loc[row_label_1, [column_label_1, column_label_2]]
Example 5: Access multiple rows and columns
df.loc[[row_label_1, row_label_2], [column_label_1, column_label_2]]
Example 6: Access all rows for specific columns
df.loc[:, [column_label_1, column_label_2]]
The complete working application is given below.
loc_hello_world.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Krishna', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
print('\nSet "Name" column as index column')
df.set_index('Name', inplace=True)
print(df)
row_label_1 = 'Krishna'
row_label_2 = 'Chamu'
column_label_1 = 'City'
column_label_2 = 'Age'
# Access a single row by its label
result = df.loc[row_label_1]
print('\nAccess a single row by its label :\n',result)
# Access multiple rows by their labels
result = df.loc[[row_label_1, row_label_2]]
print('\nAccess multiple rows by their labels :\n',result)
# Access specific column of the row
result = df.loc[row_label_1, column_label_1]
print('\nAccess specific column of the row :\n',result)
# Access specific columns of the row
result = df.loc[row_label_1, [column_label_1, column_label_2]]
print('\nAccess specific columns of the row :\n',result)
# Access multiple rows and columns
result = df.loc[[row_label_1, row_label_2], [column_label_1, column_label_2]]
print('\nAccess multiple rows and columns :\n',result)
# Access all rows for specific columns
result = df.loc[:, [column_label_1, column_label_2]]
print('\nAccess all rows for specific columns :\n',result)
Output
Original DataFrame
      Name  Age       City  Gender  Rating
0  Krishna   34  Bangalore    Male      81
1    Sailu   35  Hyderabad  Female      76
2     Joel   29  Hyderabad    Male      67
3    Chamu   35    Chennai  Female     100
4  Krishna   52  Bangalore    Male      87
5      Raj   34    Chennai    Male      89

Set "Name" column as index column
         Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35  Hyderabad  Female      76
Joel      29  Hyderabad    Male      67
Chamu     35    Chennai  Female     100
Krishna   52  Bangalore    Male      87
Raj       34    Chennai    Male      89

Access a single row by its label :
          Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Krishna   52  Bangalore    Male      87

Access multiple rows by their labels :
          Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Krishna   52  Bangalore    Male      87
Chamu     35    Chennai  Female     100

Access specific column of the row :
 Name
Krishna    Bangalore
Krishna    Bangalore
Name: City, dtype: object

Access specific columns of the row :
               City  Age
Name
Krishna  Bangalore   34
Krishna  Bangalore   52

Access multiple rows and columns :
               City  Age
Name
Krishna  Bangalore   34
Krishna  Bangalore   52
Chamu      Chennai   35

Access all rows for specific columns :
               City  Age
Name
Krishna  Bangalore   34
Sailu    Hyderabad   35
Joel     Hyderabad   29
Chamu      Chennai   35
Krishna  Bangalore   52
Raj        Chennai   34
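One detail in the output above is worth calling out: the label 'Krishna' occurs twice in the index, so df.loc['Krishna'] returns a DataFrame with two rows rather than a single row. For a label that occurs only once, loc returns a Series, and a row label plus a column label returns a single value. The snippet below is a minimal sketch of this behaviour on a smaller, hypothetical subset of the dataset; the variable names are only for illustration.

```python
import pandas as pd

# Smaller, hypothetical subset of the dataset used in this post
df = pd.DataFrame(
    {'Name': ['Krishna', 'Sailu', 'Krishna'],
     'Age': [34, 35, 52],
     'City': ['Bangalore', 'Hyderabad', 'Bangalore']}
).set_index('Name')

unique_row = df.loc['Sailu']            # label occurs once  -> Series
duplicate_rows = df.loc['Krishna']      # label occurs twice -> DataFrame
one_row_frame = df.loc[['Sailu']]       # list of labels     -> always a DataFrame
single_cell = df.loc['Sailu', 'City']   # unique row + column -> single value

print(type(unique_row).__name__)        # Series
print(type(duplicate_rows).__name__)    # DataFrame
print(type(one_row_frame).__name__)     # DataFrame
print(single_cell)                      # Hyderabad
```

If you always want a DataFrame back, regardless of how many times the label occurs, passing the label inside a list (as in one_row_frame) is one way to get that.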
The following are the common uses of the loc accessor:
a. Access rows and columns
b. Slicing with index labels
c. Boolean indexing
d. Assigning values
Access rows and columns
This is already covered in the introduction of this post.
Slicing with index labels
Example 1: Slicing rows based on index labels. Unlike regular Python slicing, both row_label_1 and row_label_2 are included in the result.
df.loc[row_label_1:row_label_2]
Example 2: Slicing rows and selecting specific columns. Here row_label_1, row_label_2, column_label_1 and column_label_2 are all included in the result.
df.loc[row_label_1:row_label_2, column_label_1:column_label_2]
slicing.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Ram', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
print('\nSet "Name" column as index column')
df.set_index('Name', inplace=True)
print(df)
row_label_1 = 'Krishna'
row_label_2 = 'Chamu'
column_label_1 = 'Age'
column_label_2 = 'Gender'
# Slicing rows based on index labels
result = df.loc[row_label_1:row_label_2]
print('\nSlicing rows based on index labels:\n',result)
# Slicing rows and selecting specific columns
result = df.loc[row_label_1:row_label_2, column_label_1:column_label_2]
print('\nSlicing rows and selecting specific columns\n',result)
Output
Original DataFrame
      Name  Age       City  Gender  Rating
0  Krishna   34  Bangalore    Male      81
1    Sailu   35  Hyderabad  Female      76
2     Joel   29  Hyderabad    Male      67
3    Chamu   35    Chennai  Female     100
4      Ram   52  Bangalore    Male      87
5      Raj   34    Chennai    Male      89

Set "Name" column as index column
         Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35  Hyderabad  Female      76
Joel      29  Hyderabad    Male      67
Chamu     35    Chennai  Female     100
Ram       52  Bangalore    Male      87
Raj       34    Chennai    Male      89

Slicing rows based on index labels:
          Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35  Hyderabad  Female      76
Joel      29  Hyderabad    Male      67
Chamu     35    Chennai  Female     100

Slicing rows and selecting specific columns
          Age       City  Gender
Name
Krishna   34  Bangalore    Male
Sailu     35  Hyderabad  Female
Joel      29  Hyderabad    Male
Chamu     35    Chennai  Female
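Two points about label slicing may be easier to see in code. First, a loc slice includes its end label, while an iloc slice excludes its end position. Second, the slicing dataset above uses 'Ram' instead of a second 'Krishna': when a slice boundary label is duplicated in an unsorted index, pandas cannot decide where the slice starts or stops and raises a KeyError. The sketch below illustrates both points on small, hypothetical DataFrames; the names label_slice, position_slice and dup are assumptions made for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu'],
     'Age': [34, 35, 29, 35]}
).set_index('Name')

# loc: both ends of the slice are included
label_slice = df.loc['Krishna':'Joel']
# iloc: the end position (2) is excluded
position_slice = df.iloc[0:2]

print(list(label_slice.index))     # ['Krishna', 'Sailu', 'Joel']
print(list(position_slice.index))  # ['Krishna', 'Sailu']

# Index with a duplicated, non-adjacent label
dup = pd.DataFrame(
    {'Name': ['Krishna', 'Sailu', 'Krishna'],
     'Age': [34, 35, 52]}
).set_index('Name')

try:
    dup.loc['Krishna':'Sailu']
    slice_failed = False
except KeyError as err:
    # The slice bound for 'Krishna' is ambiguous
    slice_failed = True
    print('KeyError:', err)
```

Selecting by a list of labels, such as dup.loc[['Krishna', 'Sailu']], still works with duplicated labels; only the slice form needs an unambiguous boundary.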
Boolean indexing
Example 1: Access rows based on a condition
age_greater_34 = df.loc[df['Age'] > 34]
Example 2: Access rows based on multiple conditions
age_greater_34_city_hyderabad = df.loc[(df['Age'] > 34) & (df['City'] == 'Hyderabad')]
boolean_indexing.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Ram', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
print('\nSet "Name" column as index column')
df.set_index('Name', inplace=True)
print(df)
# Access rows based on a condition
age_greater_34 = df.loc[df['Age'] > 34]
print('\nage_greater_34\n',age_greater_34)
# Access rows based on multiple conditions
age_greater_34_city_hyderabad = df.loc[(df['Age'] > 34) & (df['City'] == 'Hyderabad')]
print('\nage_greater_34_city_hyderabad:\n',age_greater_34_city_hyderabad)
Output
Original DataFrame
      Name  Age       City  Gender  Rating
0  Krishna   34  Bangalore    Male      81
1    Sailu   35  Hyderabad  Female      76
2     Joel   29  Hyderabad    Male      67
3    Chamu   35    Chennai  Female     100
4      Ram   52  Bangalore    Male      87
5      Raj   34    Chennai    Male      89

Set "Name" column as index column
         Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35  Hyderabad  Female      76
Joel      29  Hyderabad    Male      67
Chamu     35    Chennai  Female     100
Ram       52  Bangalore    Male      87
Raj       34    Chennai    Male      89

age_greater_34
        Age       City  Gender  Rating
Name
Sailu   35  Hyderabad  Female      76
Chamu   35    Chennai  Female     100
Ram     52  Bangalore    Male      87

age_greater_34_city_hyderabad:
        Age       City  Gender  Rating
Name
Sailu   35  Hyderabad  Female      76
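The examples above combine conditions with & (and). Conditions can also be combined with | (or) and negated with ~ (not), and a column label can be passed along with the condition to return only that column for the matching rows. The snippet below is a small sketch using the same dataset; the result variable names are made up for illustration. Note that each condition needs its own parentheses, because &, | and ~ bind more tightly than the comparison operators.

```python
import pandas as pd

data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Ram', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data).set_index('Name')

# OR: rows where the city is Chennai or the age is below 30
chennai_or_under_30 = df.loc[(df['City'] == 'Chennai') | (df['Age'] < 30)]

# NOT: rows where the gender is not Male
not_male = df.loc[~(df['Gender'] == 'Male')]

# Condition plus column label: only the City of rows with Rating above 80
cities_with_high_rating = df.loc[df['Rating'] > 80, 'City']

print(list(chennai_or_under_30.index))      # ['Joel', 'Chamu', 'Raj']
print(list(not_male.index))                 # ['Sailu', 'Chamu']
print(list(cities_with_high_rating.index))  # ['Krishna', 'Chamu', 'Ram', 'Raj']
```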
Assigning values
Example 1: Assign a value to a specific cell
df.loc[row_label, column_label] = new_value
Example 2: Assign a value to multiple cells based on a condition
df.loc[df['City'] == 'Hyderabad', 'City'] = 'Mumbai'
assign_values.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Sailu', 'Joel', 'Chamu', 'Ram', 'Raj'],
        'Age': [34, 35, 29, 35, 52, 34],
        'City': ['Bangalore', 'Hyderabad', 'Hyderabad', 'Chennai', 'Bangalore', 'Chennai'],
        'Gender': ['Male', 'Female', 'Male', 'Female', 'Male', 'Male'],
        'Rating': [81, 76, 67, 100, 87, 89]}
df = pd.DataFrame(data)
print('Original DataFrame')
print(df)
print('\nSet "Name" column as index column')
df.set_index('Name', inplace=True)
print(df)
row_label = 'Joel'
column_label = 'City'
new_value = 'Delhi'
# Assign a value to a specific cell
df.loc[row_label, column_label] = new_value
print('\nSetting Joel\'s City to Delhi.\n', df)
# Assign a value to multiple cells based on a condition
df.loc[df['City'] == 'Hyderabad', 'City'] = 'Mumbai'
print('\nSetting the city to Mumbai where the City is Hyderabad.\n',df)
Output
Original DataFrame
      Name  Age       City  Gender  Rating
0  Krishna   34  Bangalore    Male      81
1    Sailu   35  Hyderabad  Female      76
2     Joel   29  Hyderabad    Male      67
3    Chamu   35    Chennai  Female     100
4      Ram   52  Bangalore    Male      87
5      Raj   34    Chennai    Male      89

Set "Name" column as index column
         Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35  Hyderabad  Female      76
Joel      29  Hyderabad    Male      67
Chamu     35    Chennai  Female     100
Ram       52  Bangalore    Male      87
Raj       34    Chennai    Male      89

Setting Joel's City to Delhi.
          Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35  Hyderabad  Female      76
Joel      29      Delhi    Male      67
Chamu     35    Chennai  Female     100
Ram       52  Bangalore    Male      87
Raj       34    Chennai    Male      89

Setting the city to Mumbai where the City is Hyderabad.
          Age       City  Gender  Rating
Name
Krishna   34  Bangalore    Male      81
Sailu     35     Mumbai  Female      76
Joel      29      Delhi    Male      67
Chamu     35    Chennai  Female     100
Ram       52  Bangalore    Male      87
Raj       34    Chennai    Male      89
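Assignment through loc is not limited to replacing a value with a constant. As a rough sketch, the snippet below updates a numeric column in place for the rows that match a condition, and then adds a new row by assigning to a label that does not exist yet. The smaller DataFrame and the new row 'Meera' are hypothetical and only serve the illustration.

```python
import pandas as pd

# Smaller, hypothetical subset of the dataset used in this post
df = pd.DataFrame(
    {'Name': ['Joel', 'Sailu', 'Chamu'],
     'Age': [29, 35, 35],
     'City': ['Hyderabad', 'Hyderabad', 'Chennai'],
     'Rating': [67, 76, 100]}
).set_index('Name')

# Increase the Rating by 5 only for rows where Age is above 34
df.loc[df['Age'] > 34, 'Rating'] += 5

# Assigning to a label that is not in the index appends a new row;
# the values follow the column order: Age, City, Rating
df.loc['Meera'] = [28, 'Pune', 72]

print(df)
```

Writing the row selection and the column label in a single loc call, as above, modifies the original DataFrame directly. Chaining two separate selections, for example df[df['Age'] > 34]['Rating'] = 0, may operate on a temporary copy and leave df unchanged.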