This article covers the Pandas DataFrame and shows how to add and remove rows in a few simple steps.
Table of contents:
- Adding a Row using a Dictionary
- Adding a Row using a List
- Adding a Row using a Series
- Removing a Row by Name
- Removing a Row by Position
- Removing a Row by the boolean index
Adding a Row using a Dictionary
Let's say we have a Pandas DataFrame named df that looks like this:
first | last | email |
---|---|---|
Lena | Robert | lena@gmail |
Ashley | John | ashley@gmail |
We have two rows and three different columns, which describe a person's first name, last name & email.
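To follow along, the example DataFrame can be built as in the minimal sketch below. The article does not show how df was created, so the construction from a dictionary of columns is an assumption.

```python
import pandas as pd

# Build the two-row example DataFrame described above.
# Assumption: the third column is named "email", as used in the later code.
df = pd.DataFrame({
    "first": ["Lena", "Ashley"],
    "last": ["Robert", "John"],
    "email": ["lena@gmail", "ashley@gmail"],
})

print(df.shape)          # (number of rows, number of columns)
print(list(df.columns))  # column names
```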
To add a new row to the above DataFrame, we can use the Pandas .append() method. Just as we append items to a list, this method appends rows to a DataFrame. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so the .append() examples in this article only run on older versions; on current versions, pd.concat() does the same job.
To add another row we use a dictionary whose keys match the column names of the DataFrame and whose values hold the contents of the new row. When appending a dictionary, the extra parameter ignore_index must be set to True.
Code:
df = df.append({'first':'Jill', 'last':'Cooper', 'email':'jill@gmail'}, ignore_index=True)
This adds a new row to the existing DataFrame, which now looks like this:
first | last | email |
---|---|---|
Lena | Robert | lena@gmail |
Ashley | John | ashley@gmail |
Jill | Cooper | jill@gmail |
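Since DataFrame.append() no longer exists in pandas 2.0 and later, the same dictionary row can be added with pd.concat(). The sketch below is one possible equivalent, not the method used in this article; the variable name new_row is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Lena", "Ashley"],
    "last": ["Robert", "John"],
    "email": ["lena@gmail", "ashley@gmail"],
})

# Hypothetical name for the dictionary holding the new row.
new_row = {"first": "Jill", "last": "Cooper", "email": "jill@gmail"}

# Wrap the dictionary in a one-row DataFrame, then concatenate.
# ignore_index=True renumbers the rows from 0 to n-1.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

print(df)
```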
Adding a Row using a List
Using the same DataFrame as in the example above, we now add a new row by using a list.
To add a row from a list, we use the Pandas .loc indexer. Assigning to df.loc[len(df)] creates a row whose label is the current length of the DataFrame, which is the next free label when the index runs from 0 to len(df) - 1.
The data of the new row goes into a list, with one value per column in column order.
Code:
df.loc[len(df)] = ['Jay','Bion', 'jay@gmail']
This adds a new row to the existing DataFrame, which now looks like this:
first | last | email |
---|---|---|
Lena | Robert | lena@gmail |
Ashley | John | ashley@gmail |
Jill | Cooper | jill@gmail |
Jay | Bion | jay@gmail |
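The df.loc[len(df)] pattern relies on the row labels running from 0 to len(df) - 1. The sketch below shows the normal case and one pitfall: when a label equal to len(df) already exists, the assignment overwrites that row instead of adding one. The names gapped and safe, and the placeholder values, are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Lena", "Ashley", "Jill"],
    "last": ["Robert", "John", "Cooper"],
    "email": ["lena@gmail", "ashley@gmail", "jill@gmail"],
})

# len(df) is 3 and labels 0..2 are taken, so label 3 creates a new row.
df.loc[len(df)] = ["Jay", "Bion", "jay@gmail"]

# Pitfall: after dropping row 0 the labels are 1, 2, 3 but len() is 3,
# so the assignment below overwrites the existing row labeled 3.
gapped = df.drop(0)
gapped.loc[len(gapped)] = ["X", "Y", "x@gmail"]

# Resetting the index first restores labels 0..n-1, so a row is added.
safe = df.drop(0).reset_index(drop=True)
safe.loc[len(safe)] = ["X", "Y", "x@gmail"]

print(len(df), len(gapped), len(safe))
```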
Adding a Row using a Series
Now that we have seen two ways to do this, the third method uses a Pandas Series. A Series is a one-dimensional labeled array capable of holding data of any type.
This process uses the same .append() method as before. The pd.Series() constructor converts the list holding the new row's data into a Series. The index of the Series must match the column names of the DataFrame; otherwise the values end up in new columns instead of the existing ones.
Code:
df = df.append(pd.Series(['Shaun','Salp', 'shaun@gmail'], index=df.columns), ignore_index=True)
This adds a new row to the existing DataFrame, which now looks like this:
first | last | email |
---|---|---|
Lena | Robert | lena@gmail |
Ashley | John | ashley@gmail |
Jill | Cooper | jill@gmail |
Jay | Bion | jay@gmail |
Shaun | Salp | shaun@gmail |
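On pandas 2.0 and later, where .append() is not available, a Series row can be added with pd.concat() after turning the Series into a one-row DataFrame. This is a sketch of one way to do it, assuming the Series is indexed by the DataFrame's column names; the variable name row is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Lena", "Ashley"],
    "last": ["Robert", "John"],
    "email": ["lena@gmail", "ashley@gmail"],
})

# Label the Series values with the column names so they line up.
row = pd.Series(["Shaun", "Salp", "shaun@gmail"], index=df.columns)

# to_frame() gives a single column; .T transposes it into a single row.
df = pd.concat([df, row.to_frame().T], ignore_index=True)

print(df)
```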
Removing a Row by Name
To delete a row from the DataFrame below,
first | last | email |
---|---|---|
Lena | Robert | lena@gmail |
Ashley | John | ashley@gmail |
Jill | Cooper | jill@gmail |
Jay | Bion | jay@gmail |
Shaun | Salp | shaun@gmail |
we use the Pandas .drop() method. The .drop() method has several parameters, the most important being:
- labels: The label or list of labels of the rows or columns to drop.
- axis: Defaults to 0, which indicates rows, so there is no need to set it when deleting rows.
- inplace: When set to True, the existing DataFrame is modified directly and nothing is returned. When set to False (the default), Pandas returns a new DataFrame and leaves the original unchanged.
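The three parameters above can be compared side by side in the short sketch below. The variable names no_row, no_col and result are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Lena", "Ashley", "Jill"],
    "last": ["Robert", "John", "Cooper"],
    "email": ["lena@gmail", "ashley@gmail", "jill@gmail"],
})

# axis defaults to 0, so this drops the row labeled 0 and returns a new frame.
no_row = df.drop(labels=0)

# axis=1 switches to columns, so this drops the "email" column.
no_col = df.drop(labels="email", axis=1)

# inplace=True modifies df itself; the call returns None.
result = df.drop(labels=1, inplace=True)

print(list(no_row.index), list(no_col.columns), result, list(df.index))
```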
Deleting a row by name means passing a row label to the .drop() method. For this to work, the rows must be labeled by name, so we first make the first column the index with .set_index(). We then drop the first entry in the DataFrame by passing its label and setting inplace to True.
Code:
df = df.set_index('first')
df.drop('Lena', inplace=True)
The above code deletes the row labeled 'Lena' from the DataFrame, which now looks like this:
first | last | email |
---|---|---|
Ashley | John | ashley@gmail |
Jill | Cooper | jill@gmail |
Jay | Bion | jay@gmail |
Shaun | Salp | shaun@gmail |
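Dropping by name depends on what the row labels are. The sketch below shows two hedged variants: one that labels the rows by first name before dropping, and one that leaves the default integer index in place and looks up the matching labels from the first column. The names by_name and kept are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Lena", "Ashley", "Jill", "Jay", "Shaun"],
    "last": ["Robert", "John", "Cooper", "Bion", "Salp"],
    "email": ["lena@gmail", "ashley@gmail", "jill@gmail",
              "jay@gmail", "shaun@gmail"],
})

# Variant 1: use the first names as row labels, then drop by label.
by_name = df.set_index("first").drop("Lena")

# Variant 2: keep the integer index and drop the labels of matching rows.
kept = df.drop(df[df["first"] == "Lena"].index)

print(list(by_name.index))
print(kept["first"].tolist())
```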
Removing a Row by Position
Using the same DataFrame as in the example above, we now delete a row by position.
Deleting by position uses the position of the row to select it. The expression df.index[[0]] looks up the label at position 0, and the .drop() method then removes the row with that label. This approach is convenient for deleting multiple rows, since several positions can be listed at once.
Code:
df = df.drop(df.index[[0]])
The above code deletes the row at position 0, which is the row for 'Ashley', so the DataFrame now looks like this:
first | last | email |
---|---|---|
Jill | Cooper | jill@gmail |
Jay | Bion | jay@gmail |
Shaun | Salp | shaun@gmail |
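Deleting several rows by position works the same way with a longer list of positions. A minimal sketch, using a fresh copy of the four-row DataFrame with a default integer index:

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Ashley", "Jill", "Jay", "Shaun"],
    "last": ["John", "Cooper", "Bion", "Salp"],
    "email": ["ashley@gmail", "jill@gmail", "jay@gmail", "shaun@gmail"],
})

# df.index[[0, 2]] looks up the labels at positions 0 and 2;
# drop() then removes the rows carrying those labels.
df = df.drop(df.index[[0, 2]])

print(df["first"].tolist())
```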
Removing a Row by the boolean index
Using the same DataFrame as in the example above, we now delete a row by using a boolean index.
A boolean index is normally used to filter the rows of a Pandas DataFrame, and it can be used to delete rows in the same way. It removes every row that matches a condition at once, instead of requiring the label of each row. As in the delete-by-name example, the index is assumed to hold the first names. The mask is used to select rows directly rather than being passed to .drop(), because .drop() expects row labels, not a boolean mask.
Code:
df = df[df.index != "Jill"]
The above code keeps only the rows whose index label is not 'Jill', which deletes every row named 'Jill' from the DataFrame. It now looks like this:
first | last | email |
---|---|---|
Jay | Bion | jay@gmail |
Shaun | Salp | shaun@gmail |
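The same idea also works on a column instead of the index. The sketch below builds a mask from the first column, assuming a default integer index, and shows that filtering with the inverted mask and dropping the matching labels give the same result. The names mask, kept and dropped are only for illustration, and a second 'Jill' row is added to show that all matches are removed.

```python
import pandas as pd

df = pd.DataFrame({
    "first": ["Jill", "Jay", "Jill", "Shaun"],
    "last": ["Cooper", "Bion", "Smith", "Salp"],
    "email": ["jill@gmail", "jay@gmail", "jill2@gmail", "shaun@gmail"],
})

# True for every row whose first name is "Jill".
mask = df["first"] == "Jill"

# Option 1: keep the rows where the mask is False.
kept = df[~mask]

# Option 2: look up the labels of the matching rows and drop them.
dropped = df.drop(df[mask].index)

print(kept["first"].tolist())
```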
Conclusion
In this article at OpenGenus, we learned how to add rows to and delete rows from a Pandas DataFrame. We covered several ways to add rows, using dictionaries, lists, and Pandas Series. We then learned how to delete rows with the .drop() method by name and by position, and finally how to delete rows with a boolean index.