DataFrame index
A DataFrame index is the set of labels that identify the rows of the DataFrame. When a DataFrame is created without an explicit index, pandas assigns a default index that starts at 0 and increments by 1 for each subsequent row. This default index is known as a RangeIndex.
dataframe_index.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', "Raj"],
'Age': [34, 25, 29, 41, 52, 23],
'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai']}
df = pd.DataFrame(data)
print(df)
Output
Name Age City
0 Krishna 34 Bangalore
1 Ram 25 Chennai
2 Joel 29 Hyderabad
3 Gopi 41 Hyderabad
4 Jitendra 52 Bangalore
5 Raj 23 Chennai
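To confirm that the default index really is a RangeIndex, we can inspect df.index directly. The following is a minimal sketch on a shortened version of the sample data; the variable name index_labels is only illustrative.

```python
import pandas as pd

# A shortened version of the sample data
data = {'Name': ['Krishna', 'Ram', 'Joel'],
        'Age': [34, 25, 29]}
df = pd.DataFrame(data)

# The default index is a RangeIndex described by start, stop and step
print(type(df.index).__name__)
print(df.index.start, df.index.stop, df.index.step)

# Converting it to a list shows the row labels 0, 1, 2
index_labels = list(df.index)
print(index_labels)
```

A RangeIndex only stores its start, stop and step values, so it stays small even for a DataFrame with many rows.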
Assign a new index to an existing DataFrame
By updating the df.index attribute, we can assign a new index to an existing DataFrame. The new index must contain exactly one label per row.
Example
df.index = ['a', 'b', 'c', 'd', 'e', 'f']
add_new_index_to_dataframe.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', "Raj"],
'Age': [34, 25, 29, 41, 52, 23],
'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai']}
df = pd.DataFrame(data)
print("with default index")
print(df)
print("\nwith updated index\n")
df.index = ['a', 'b', 'c', 'd', 'e', 'f']
print(df)
print("\nprint with the index label\n")
for label in df.index:
    print("For the index : ", label)
    print(df.loc[label], "\n")
Output
with default index
Name Age City
0 Krishna 34 Bangalore
1 Ram 25 Chennai
2 Joel 29 Hyderabad
3 Gopi 41 Hyderabad
4 Jitendra 52 Bangalore
5 Raj 23 Chennai
with updated index
Name Age City
a Krishna 34 Bangalore
b Ram 25 Chennai
c Joel 29 Hyderabad
d Gopi 41 Hyderabad
e Jitendra 52 Bangalore
f Raj 23 Chennai
print with the index label
For the index : a
Name Krishna
Age 34
City Bangalore
Name: a, dtype: object
For the index : b
Name Ram
Age 25
City Chennai
Name: b, dtype: object
For the index : c
Name Joel
Age 29
City Hyderabad
Name: c, dtype: object
For the index : d
Name Gopi
Age 41
City Hyderabad
Name: d, dtype: object
For the index : e
Name Jitendra
Age 52
City Bangalore
Name: e, dtype: object
For the index : f
Name Raj
Age 23
City Chennai
Name: f, dtype: object
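One thing to watch when assigning to df.index is the number of labels. The sketch below, on a shortened version of the sample data, shows that pandas raises a ValueError when the list has fewer labels than the DataFrame has rows. The variable name mismatch_error is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Krishna', 'Ram', 'Joel'],
                   'Age': [34, 25, 29]})

mismatch_error = None
try:
    # Only two labels for three rows, so the assignment is rejected
    df.index = ['a', 'b']
except ValueError as err:
    mismatch_error = err
    print("Could not assign index:", err)

# A list with one label per row is accepted
df.index = ['a', 'b', 'c']
print(df.loc['b', 'Name'])
```

The failed assignment leaves the DataFrame unchanged, so the second assignment starts from the original default index.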
Use existing column as an index
Using the set_index() method, we can set one or more columns as the DataFrame index.
Example
df.set_index("Name", inplace=True)
The above snippet uses the ‘Name’ column as the DataFrame index. ‘inplace=True’ modifies the DataFrame in place instead of returning a new DataFrame.
set_existing_column_as_index.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', "Raj"],
'Age': [34, 25, 29, 41, 52, 23],
'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai']}
df = pd.DataFrame(data)
print("with default index")
print(df)
print("\n")
print("with Name as index")
df.set_index("Name", inplace=True)
print(df)
Output
with default index
Name Age City
0 Krishna 34 Bangalore
1 Ram 25 Chennai
2 Joel 29 Hyderabad
3 Gopi 41 Hyderabad
4 Jitendra 52 Bangalore
5 Raj 23 Chennai
with Name as index
Age City
Name
Krishna 34 Bangalore
Ram 25 Chennai
Joel 29 Hyderabad
Gopi 41 Hyderabad
Jitendra 52 Bangalore
Raj 23 Chennai
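If inplace=True is left out, set_index() returns a new DataFrame and the original keeps its default index. The following is a small sketch of that behaviour on a shortened version of the sample data; the variable name indexed is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Krishna', 'Ram', 'Joel'],
                   'Age': [34, 25, 29]})

# Without inplace=True, the result must be captured in a variable
indexed = df.set_index("Name")

# The original DataFrame still has Name as an ordinary column
print(list(df.columns))

# In the new DataFrame, Name has moved from the columns to the index
print(list(indexed.columns))
print(indexed.index.name)

# Rows can now be looked up by name
print(indexed.loc["Ram", "Age"])
```

Capturing the returned DataFrame keeps the original data available, which can be convenient when the same data is needed with different indexes.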
Set multiple columns as index
The snippet below uses the Name and City columns as the index.
df.set_index(["Name", "City"], inplace=True)
The snippet below accesses a DataFrame row by its Name and City.
row = df.loc[("Krishna", "Bangalore")]
multi_column_index.py
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', "Raj"],
'Age': [34, 25, 29, 41, 52, 23],
'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai']}
df = pd.DataFrame(data)
print("with default index")
print(df)
print("\n")
print("with Name and City as index")
df.set_index(["Name", "City"], inplace=True)
print(df)
# Access row with index ("Krishna", "Bangalore")
print('\nAccess row with index ("Krishna", "Bangalore")')
row = df.loc[("Krishna", "Bangalore")]
print(row)
Output
with default index
Name Age City
0 Krishna 34 Bangalore
1 Ram 25 Chennai
2 Joel 29 Hyderabad
3 Gopi 41 Hyderabad
4 Jitendra 52 Bangalore
5 Raj 23 Chennai
with Name and City as index
Age
Name City
Krishna Bangalore 34
Ram Chennai 25
Joel Hyderabad 29
Gopi Hyderabad 41
Jitendra Bangalore 52
Raj Chennai 23
Access row with index ("Krishna", "Bangalore")
Age 34
Name: (Krishna, Bangalore), dtype: int64
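When several columns are set as the index, pandas builds a MultiIndex with one level per column. The sketch below looks at the level names, reads a single value with a full (Name, City) key, and selects rows by City alone using the xs() method. The variable names age and hyderabad_rows are only illustrative.

```python
import pandas as pd

data = {'Name': ['Krishna', 'Ram', 'Joel', 'Gopi', 'Jitendra', 'Raj'],
        'Age': [34, 25, 29, 41, 52, 23],
        'City': ['Bangalore', 'Chennai', 'Hyderabad', 'Hyderabad', 'Bangalore', 'Chennai']}
df = pd.DataFrame(data).set_index(["Name", "City"])

# Each column used in set_index() becomes one level of the index
print(list(df.index.names))

# A full (Name, City) key together with a column label gives a single value
age = df.loc[("Krishna", "Bangalore"), "Age"]
print(age)

# xs() selects the rows that match a label on one level, here the City level
hyderabad_rows = df.xs("Hyderabad", level="City")
print(hyderabad_rows)
```

By default, xs() drops the level that was used for the selection, so the result here is indexed by Name only.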