Table of Contents

- Pandas DataFrame
- Sort DataFrame rows based on index
- Sort DataFrame rows based on a single column
- Sort DataFrame rows based on multiple columns
- Sort DataFrame rows based on columns in Descending Order
- Sort DataFrame columns based on index
- Sort columns of a DataFrame based on a single row
- Sort columns of a DataFrame in Descending Order based on a single row
- Sort columns of a DataFrame based on multiple rows
- N largest value
- N smallest value
- Conclusion

# Pandas DataFrame

DataFrame is the two-dimensional data structure of Pandas. It consists of labeled rows and columns.

Let's create a simple one to understand the basic features of a DataFrame. To create a DataFrame, we can pass a Python dictionary to its constructor.

Code:

```
import pandas as pd
df = pd.DataFrame({
"Name": ["Arjuna", "Bhishma", "Krishna", "Karna"],
"Age": [30, 100, 80, 35]
})
print(df)
```

| | Name | Age |
|---|---|---|
| 0 | Arjuna | 30 |
| 1 | Bhishma | 100 |
| 2 | Krishna | 80 |
| 3 | Karna | 35 |

Dictionary keys become the column names and the values become the data stored in the DataFrame. In the above example, we have a DataFrame with two columns and four rows. The leftmost column of the output is the index, which holds the row labels.
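
As a quick check of the description above, the following sketch inspects the shape, column labels, and index of the same DataFrame. It assumes a reasonably recent pandas and Python, where dictionary insertion order is preserved in the column order.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Arjuna", "Bhishma", "Krishna", "Karna"],
    "Age": [30, 100, 80, 35],
})

# shape is (number_of_rows, number_of_columns)
print(df.shape)          # (4, 2)

# Column labels come from the dictionary keys
print(list(df.columns))  # ['Name', 'Age']

# With no index given, pandas assigns the default labels 0..n-1
print(list(df.index))    # [0, 1, 2, 3]
```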

# Sort DataFrame rows based on index

To sort the rows of a DataFrame by their index labels, call **sort_index()**. By default it returns a new, sorted DataFrame and leaves the original unchanged.

Code:

```
import pandas as pd
df = pd.DataFrame([(2,3),(4,2),(1,8),(9,1)], index=[3,1,0,2], columns = ['c1','c2'])
sorted_df=df.sort_index()
print(sorted_df)
```

In the above example, the rows of the original DataFrame are sorted by index. The sorted DataFrame now looks like this:

| | c1 | c2 |
|---|---|---|
| 0 | 1 | 8 |
| 1 | 4 | 2 |
| 2 | 9 | 1 |
| 3 | 2 | 3 |
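
The same call can also sort the index from largest to smallest. The sketch below reuses the example data and passes `ascending=False`; it also shows that the original DataFrame keeps its order, since `sort_index()` returns a new object by default.

```python
import pandas as pd

df = pd.DataFrame([(2, 3), (4, 2), (1, 8), (9, 1)],
                  index=[3, 1, 0, 2], columns=["c1", "c2"])

# ascending=False sorts the row labels from largest to smallest
desc_df = df.sort_index(ascending=False)
print(list(desc_df.index))  # [3, 2, 1, 0]

# The original DataFrame is not modified
print(list(df.index))       # [3, 1, 0, 2]
```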

# Sort DataFrame rows based on a single column

We can sort all the rows of a DataFrame by the values in a single column by passing the column name to the **by** argument of **sort_values()**.

Code:

```
import pandas as pd
k = {"a": [4, 1, 1, 2], "b": [1, 4, 2, 6], "c": [3, 1, 6, 5]}
df = pd.DataFrame(k)
# Sort rows by the values in column "a" (ascending by default)
sorted_df = df.sort_values(by="a")
print(sorted_df)
```

**sort_values** sorts a DataFrame by the values in one or more of its columns.

In the above example, the rows of the DataFrame are sorted by column 'a', which was created from the dictionary key of the same name. Each row keeps its original index label. The sorted DataFrame now looks like this:

| | a | b | c |
|---|---|---|---|
| 1 | 1 | 4 | 1 |
| 2 | 1 | 2 | 6 |
| 3 | 2 | 6 | 5 |
| 0 | 4 | 1 | 3 |
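
Two optional arguments of `sort_values()` are often useful here. The sketch below is an illustration, not part of the original example: `kind="mergesort"` requests a stable sort, so rows with equal values keep their original relative order, and `ignore_index=True` (available in pandas 1.0 and later) renumbers the result from 0.

```python
import pandas as pd

df = pd.DataFrame({"a": [4, 1, 1, 2], "b": [1, 4, 2, 6], "c": [3, 1, 6, 5]})

# A stable sort keeps tied rows (a == 1) in their original order
stable_df = df.sort_values(by="a", kind="mergesort")
print(list(stable_df.index))   # [1, 2, 3, 0]

# ignore_index=True replaces the old labels with 0..n-1
renumbered = df.sort_values(by="a", kind="mergesort", ignore_index=True)
print(list(renumbered.index))  # [0, 1, 2, 3]
print(renumbered["b"].tolist())  # [4, 2, 6, 1]
```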

# Sort DataFrame rows based on multiple columns

What if several rows share the same value in one column? Can a second column decide the order of those rows?

Yes. We can sort the rows of a DataFrame by multiple columns by passing a list of column names to the **by** argument. Rows are ordered by the first column, and ties are broken by the next one.

Code:

```
import pandas as pd
k = {"a": [4, 1, 1, 2], "b": [1, 4, 2, 6], "c": [3, 1, 6, 5]}
df = pd.DataFrame(k)
# Sort by "a" first; rows with equal "a" are ordered by "b"
sorted_df = df.sort_values(by=["a", "b"])
print(sorted_df)
```

In the above example, the rows are sorted by column 'a' first and then by column 'b'. Rows 1 and 2 both have a = 1, so their order is decided by 'b'. The sorted DataFrame now looks like this:

| | a | b | c |
|---|---|---|---|
| 2 | 1 | 2 | 6 |
| 1 | 1 | 4 | 1 |
| 3 | 2 | 6 | 5 |
| 0 | 4 | 1 | 3 |
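
The order of the names in the **by** list matters, because the first name is the primary sort key. The following small sketch compares both orders on the same data.

```python
import pandas as pd

df = pd.DataFrame({"a": [4, 1, 1, 2], "b": [1, 4, 2, 6], "c": [3, 1, 6, 5]})

# "a" is the primary key, "b" only breaks ties in "a"
a_then_b = df.sort_values(by=["a", "b"])
print(list(a_then_b.index))  # [2, 1, 3, 0]

# "b" is the primary key; it has no ties, so "a" is never consulted
b_then_a = df.sort_values(by=["b", "a"])
print(list(b_then_a.index))  # [0, 2, 1, 3]
```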

# Sort DataFrame rows based on columns in descending order

We can sort all the rows of a DataFrame in descending order by passing **ascending=False** along with the **by** argument.

Code:

```
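import pandas as pd
k = {"a": [4, 1, 1, 2], "b": [1, 4, 2, 6], "c": [3, 1, 6, 5]}
df = pd.DataFrame(k)
# ascending=False sorts from the largest value of "a" to the smallest
sorted_df = df.sort_values(by="a", ascending=False)
print(sorted_df)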
```

In the above example, the rows of the DataFrame are sorted in descending order by column 'a'. The sorted DataFrame now looks like this:

| | a | b | c |
|---|---|---|---|
| 0 | 4 | 1 | 3 |
| 3 | 2 | 6 | 5 |
| 1 | 1 | 4 | 1 |
| 2 | 1 | 2 | 6 |
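
When sorting by several columns, **ascending** can also be a list with one boolean per column, so each column gets its own direction. A minimal sketch on the same data:

```python
import pandas as pd

df = pd.DataFrame({"a": [4, 1, 1, 2], "b": [1, 4, 2, 6], "c": [3, 1, 6, 5]})

# "a" ascending, ties in "a" ordered by "b" descending
mixed = df.sort_values(by=["a", "b"], ascending=[True, False])
print(list(mixed.index))     # [1, 2, 3, 0]

# "a" descending, ties in "a" ordered by "b" ascending
reversed_mix = df.sort_values(by=["a", "b"], ascending=[False, True])
print(list(reversed_mix.index))  # [0, 3, 2, 1]
```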

# Sort DataFrame columns based on index

To sort the columns of a DataFrame by their labels, call **sort_index()** with the argument **axis=1**.

Code:

```
import pandas as pd
df = pd.DataFrame([(2,3),(4,2),(1,8),(9,1)], index=[3,1,0,2], columns = ['c2','c1'])
sorted_df=df.sort_index(axis=1)
print(sorted_df)
```

In the above example, the columns of the original DataFrame are sorted by their labels, while the row order is unchanged. The sorted DataFrame now looks like this:

| | c1 | c2 |
|---|---|---|
| 3 | 3 | 2 |
| 1 | 2 | 4 |
| 0 | 8 | 1 |
| 2 | 1 | 9 |
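
One detail worth keeping in mind is that string labels are compared as text, not as numbers. The sketch below uses a made-up one-row DataFrame (the labels 'c2', 'c10', 'c1' are only for illustration) to show that 'c10' sorts before 'c2', and that `ascending=False` also works along `axis=1`.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=["c2", "c10", "c1"])

# Labels are compared character by character: "c10" < "c2"
asc = df.sort_index(axis=1)
print(list(asc.columns))   # ['c1', 'c10', 'c2']

# Descending order of column labels
desc = df.sort_index(axis=1, ascending=False)
print(list(desc.columns))  # ['c2', 'c10', 'c1']
```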

# Sort columns of a DataFrame based on a single row

We can sort all the columns of a DataFrame by the values in a single row by passing the row's index label to the **by** argument together with **axis=1**.

Code:

```
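import pandas as pd
matrix = [(5, 4, 3, 2), (1, 4, 2, 6), (3, 1, 6, 5)]
df = pd.DataFrame(matrix, index=list('abc'))
# axis=1 reorders the columns; by='b' uses the values in row 'b'
sorted_df = df.sort_values(by='b', axis=1)
print(sorted_df)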
```

In the above example, all the columns of the DataFrame are sorted by the values in the row with index label 'b'. The sorted DataFrame now looks like this:

| | 0 | 2 | 1 | 3 |
|---|---|---|---|---|
| a | 5 | 3 | 4 | 2 |
| b | 1 | 2 | 4 | 6 |
| c | 3 | 6 | 1 | 5 |
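
One way to build intuition for `axis=1` is to note that it behaves like transposing the DataFrame, sorting its rows by column 'b', and transposing back. The sketch below compares the two approaches on the same data; it is meant as an illustration, not as a recommended replacement.

```python
import pandas as pd

matrix = [(5, 4, 3, 2), (1, 4, 2, 6), (3, 1, 6, 5)]
df = pd.DataFrame(matrix, index=list("abc"))

# Direct column sort using the values in row 'b'
via_axis = df.sort_values(by="b", axis=1)

# Same idea: flip rows and columns, sort rows by column 'b', flip back
via_transpose = df.T.sort_values(by="b").T

print(list(via_axis.columns))       # [0, 2, 1, 3]
print(list(via_transpose.columns))  # [0, 2, 1, 3]
```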

# Sort columns of a DataFrame in descending order based on a single row

We can sort all the columns of a DataFrame in descending order by the values in a single row by passing the row's index label to the **by** argument, together with **axis=1** and **ascending=False**.

Code:

```
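import pandas as pd
matrix = [(5, 4, 3, 2), (1, 4, 2, 6), (3, 1, 6, 5)]
df = pd.DataFrame(matrix, index=list('abc'))
# Reorder columns so that the values in row 'b' go from largest to smallest
sorted_df = df.sort_values(by='b', axis=1, ascending=False)
print(sorted_df)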
```

In the above example, all the columns of the DataFrame are sorted in descending order by the values in the row with index label 'b'. The sorted DataFrame now looks like this:

| | 3 | 1 | 2 | 0 |
|---|---|---|---|---|
| a | 2 | 4 | 3 | 5 |
| b | 6 | 4 | 2 | 1 |
| c | 5 | 1 | 6 | 3 |
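
If only the new column order is needed, it can be computed from the row itself. The sketch below sorts row 'b' as a Series and uses the resulting labels to reorder the columns; the names `order` and `reordered` are illustrative only.

```python
import pandas as pd

matrix = [(5, 4, 3, 2), (1, 4, 2, 6), (3, 1, 6, 5)]
df = pd.DataFrame(matrix, index=list("abc"))

# Sorting row 'b' gives a Series whose index is the new column order
order = df.loc["b"].sort_values(ascending=False).index
print(list(order))                  # [3, 1, 2, 0]

# Select the columns in that order
reordered = df.loc[:, order]
print(reordered.loc["a"].tolist())  # [2, 4, 3, 5]
```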

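# Sort columns of a DataFrame based on multiple rows

We can sort all the columns of a DataFrame by the values in multiple rows by passing a list of row index labels to the **by** argument together with **axis=1**. Columns are ordered by the first row, and ties are broken by the next one.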

Code:

```
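import pandas as pd
matrix = [(5, 4, 3, 2), (1, 4, 2, 6), (3, 1, 6, 5)]
df = pd.DataFrame(matrix, index=list('abc'))
# Order columns by row 'a' first, then by row 'b' where 'a' has ties
sorted_df = df.sort_values(by=['a', 'b'], axis=1)
print(sorted_df)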
```

In the above example, all the columns of the DataFrame are sorted by the rows with index labels 'a' and 'b'. Row 'a' has no repeated values here, so row 'b' is never needed to break a tie. The sorted DataFrame now looks like this:

| | 3 | 2 | 1 | 0 |
|---|---|---|---|---|
| a | 2 | 3 | 4 | 5 |
| b | 6 | 2 | 4 | 1 |
| c | 5 | 6 | 1 | 3 |
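
To see the second row actually break ties, the first row needs repeated values. The sketch below uses a different, made-up matrix in which row 'a' contains duplicates, so the order within each group of equal 'a' values is decided by row 'b'.

```python
import pandas as pd

# Illustrative data: row 'a' has the values 1 and 2 twice each
matrix = [(2, 1, 2, 1), (4, 3, 1, 2), (9, 8, 7, 6)]
df = pd.DataFrame(matrix, index=list("abc"))

sorted_df = df.sort_values(by=["a", "b"], axis=1)

# Columns 3 and 1 share a == 1 and are ordered by b (2 before 3);
# columns 2 and 0 share a == 2 and are ordered by b (1 before 4)
print(list(sorted_df.columns))       # [3, 1, 2, 0]
print(sorted_df.loc["b"].tolist())   # [2, 3, 1, 4]
```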

# N largest value

DataFrame provides the **nlargest()** function to get the n rows with the largest values in a given column.

```
import pandas as pd
d = {"a": [1,2,3,4,5], "b":[2,3,4,5,6]}
df = pd.DataFrame(d)
result = df.nlargest(3, "a")
print(result)
```

The result contains the three rows with the largest values in column 'a' of this DataFrame:

| | a | b |
|---|---|---|
| 4 | 5 | 6 |
| 3 | 4 | 5 |
| 2 | 3 | 4 |
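
For data without ties, `nlargest(n, column)` selects the same rows as sorting in descending order and taking the first n rows with `head(n)`. The following sketch checks this on the example data.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4, 5], "b": [2, 3, 4, 5, 6]})

top3 = df.nlargest(3, "a")

# Equivalent result via a full descending sort followed by head(3)
same = df.sort_values(by="a", ascending=False).head(3)

print(list(top3.index))  # [4, 3, 2]
print(list(same.index))  # [4, 3, 2]
```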

# N smallest value

DataFrame provides the **nsmallest()** function to get the n rows with the smallest values in a given column.

```
import pandas as pd
d = {"a": [1,2,3,4,5], "b":[2,3,4,5,6]}
df = pd.DataFrame(d)
result = df.nsmallest(2, "b")
print(result)
```

The result contains the two rows with the smallest values in column 'b' of this DataFrame:

| | a | b |
|---|---|---|
| 0 | 1 | 2 |
| 1 | 2 | 3 |
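
When the column contains repeated values, the **keep** argument controls which tied rows are returned. The sketch below uses made-up data with three equal values in column 'b': the default `keep="first"` returns exactly n rows, while `keep="all"` returns every row tied with the selected values, which can be more than n.

```python
import pandas as pd

# Illustrative data: the value 2 appears three times in column "b"
df = pd.DataFrame({"a": [10, 20, 30, 40], "b": [2, 2, 3, 2]})

# Default keep="first": the earliest tied rows are chosen
first = df.nsmallest(2, "b")
print(list(first.index))       # [0, 1]

# keep="all": no tied row is dropped, so three rows come back
all_ties = df.nsmallest(2, "b", keep="all")
print(sorted(all_ties.index))  # [0, 1, 3]
```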

# Conclusion

In this article at OpenGenus, we learned how to sort the rows and columns of a DataFrame object in Pandas using sort_index and sort_values, and how to select the n largest or smallest rows using nlargest and nsmallest.