zaport.blogg.se - Pandas drop duplicate rows

How to drop duplicate rows in pandas

The pandas package is widely used to analyse data in Data Science and Machine Learning applications. When analysing data, deleting duplicate values is a common data cleaning task. To remove duplicate values from a pandas series object, we can use the drop_duplicates() method.

This method returns a series with the duplicate rows deleted, and it won't alter the original series object. Instead, it returns a new one. By setting inplace=True, we can apply the changes to the original series object.

The other important parameter of the drop_duplicates() method is keep. Its default value is "first", which drops the duplicate values except for the first occurrence. We can also set it to "last", which keeps only the last occurrence, or to False, which drops every value that has a duplicate.

Example 1

In the following example, we create a pandas series with duplicate values from a list of strings, and we assign the index labels through the index parameter. After creating the series object, we apply the drop_duplicates() method without changing the default parameters. The method returns a new series object with the duplicate rows deleted, and the original series object is not affected. The resulting series begins with the row East John.

Example 2

For the same example, we change the inplace parameter from its default value False to True.

# delete duplicate values with inplace=True
result = series.drop_duplicates(inplace=True)

By setting inplace to True, we modify the original series object itself, and the method returns None. In the output, the value None is what the drop_duplicates() method returned, and the original series object has been updated with the duplicate rows deleted. The updated series again begins with the row East John.

The same method is also available on a pandas DataFrame, where it deletes duplicate observations (rows).
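The behaviour described in this post can be sketched in a few lines. This is a minimal illustration, not the original example: the names and index labels in the series, and the DataFrame columns, are hypothetical sample data chosen only to show the effect of the keep and inplace parameters.

```python
import pandas as pd

# Hypothetical sample data: the labels and names are illustrative only.
series = pd.Series(
    ["John", "Mary", "John", "Kiran", "Mary"],
    index=["East", "West", "North", "South", "Central"],
)

# Default (keep="first"): returns a new series, the original is unchanged.
first_kept = series.drop_duplicates()

# keep="last" keeps only the last occurrence of each duplicated value.
last_kept = series.drop_duplicates(keep="last")

# keep=False drops every value that occurs more than once.
none_kept = series.drop_duplicates(keep=False)

print(first_kept)
print(last_kept)
print(none_kept)
print(len(series))  # the original series still has all 5 rows

# inplace=True modifies the series itself and returns None.
result = series.drop_duplicates(inplace=True)
print(result)
print(series)

# The DataFrame method works the same way, comparing whole rows by default.
frame = pd.DataFrame(
    {"name": ["John", "Mary", "John"], "city": ["Oslo", "Rome", "Oslo"]}
)
deduped = frame.drop_duplicates()
print(deduped)
```

Assigning the result of a call with inplace=True is rarely useful, since it is always None. In practice one would either reassign the returned series (series = series.drop_duplicates()) or call the method with inplace=True without capturing the return value.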