To drop rows with NaN (null) values in a Pandas DataFrame:
df.dropna()
To drop only the rows where all the values are NaN:
df.dropna(how="all")
Steps to Drop Rows with NaN Values in Pandas DataFrame
Step 1: Create a DataFrame with NaN Values
Create a DataFrame with NaN values:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

print(df)
As can be observed, the second and third rows now have NaN values:
   col_a  col_b  col_c
0    1.0    5.0      9
1    2.0    NaN     10
2    NaN    NaN     11
3    4.0    8.0     12
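Before dropping anything, it can help to confirm where the missing values are. The following is a minimal sketch that assumes the same DataFrame as above; the names nan_per_column and rows_with_nan are illustrative only and are not part of the original steps.

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

# Count the NaN values in each column (illustrative variable name)
nan_per_column = df.isna().sum()
print(nan_per_column)

# Index labels of the rows that contain at least one NaN
rows_with_nan = df.index[df.isna().any(axis=1)].tolist()
print(rows_with_nan)
```

For this data, the rows flagged are the ones labeled 1 and 2, which are the rows that df.dropna() removes in the next step.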
Step 2: Drop the Rows with the NaN Values in Pandas DataFrame
Use df.dropna() to drop all the rows with the NaN values in the DataFrame:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna()

print(df_dropped)
The result is two rows without any NaN values:
   col_a  col_b  col_c
0    1.0    5.0      9
3    4.0    8.0     12
Notice that those two rows no longer have a sequential index: the remaining index labels are 0 and 3. You can reset the index so that it starts from 0 and increases sequentially.
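It is also worth knowing that df.dropna() returns a new DataFrame and leaves the original untouched, which is why the result is assigned to df_dropped. A short sketch to check this, assuming the same data as above (original_rows, remaining_rows and remaining_labels are illustrative names):

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

# dropna() returns a new DataFrame; df itself is not modified
df_dropped = df.dropna()

original_rows = len(df)            # the original still has every row
remaining_rows = len(df_dropped)   # only the rows without NaN remain
remaining_labels = df_dropped.index.tolist()

print(original_rows, remaining_rows)
print(remaining_labels)
```

The surviving rows keep their original labels, so the index of df_dropped has a gap where rows were removed.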
Step 3 (Optional): Reset the Index
The general syntax to reset the index of a Pandas DataFrame:
df.reset_index(drop=True)
The complete script to drop the rows with the NaN values, and then reset the index:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna()
df_reset = df_dropped.reset_index(drop=True)

print(df_reset)
The index now starts from 0 and increases sequentially:
   col_a  col_b  col_c
0    1.0    5.0      9
1    4.0    8.0     12
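The two calls can also be chained into a single expression. The sketch below assumes the same data as above; df_clean and with_old_index are illustrative names. It also shows what happens when drop=True is omitted: the old index labels are kept as a new column named "index".

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, 11, 12],
}
df = pd.DataFrame(data)

# Drop the rows with NaN values and reset the index in one chained expression
df_clean = df.dropna().reset_index(drop=True)
print(df_clean)

# Without drop=True, the old index labels are kept in an "index" column
with_old_index = df.dropna().reset_index()
print(with_old_index.columns.tolist())
```

Whether to chain or keep the intermediate df_dropped is a matter of preference; the intermediate variable is handy when the un-reset labels are still needed to trace rows back to the original data.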
Drop Rows Where All the Values Are NaN
Here is an example of a DataFrame where all the values are NaN for the third row:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, np.nan, 12],
}
df = pd.DataFrame(data)

print(df)
As can be seen, all the values are NaN for the third row:
   col_a  col_b  col_c
0    1.0    5.0    9.0
1    2.0    NaN   10.0
2    NaN    NaN    NaN
3    4.0    8.0   12.0
Use df.dropna(how="all") to drop only the rows where all the values are NaN:
import pandas as pd
import numpy as np

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, np.nan, 12],
}
df = pd.DataFrame(data)

df_dropped = df.dropna(how="all")

print(df_dropped)
The result:
   col_a  col_b  col_c
0    1.0    5.0    9.0
1    2.0    NaN   10.0
3    4.0    8.0   12.0
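To see the difference between the two modes side by side, the sketch below applies both to the same DataFrame. It assumes the data from this section; dropped_any and dropped_all are illustrative names. Note that how="any" is the default, so df.dropna(how="any") behaves the same as df.dropna().

```python
import numpy as np
import pandas as pd

data = {
    "col_a": [1, 2, np.nan, 4],
    "col_b": [5, np.nan, np.nan, 8],
    "col_c": [9, 10, np.nan, 12],
}
df = pd.DataFrame(data)

# how="any" (the default): drop a row if at least one value is NaN
dropped_any = df.dropna(how="any")

# how="all": drop a row only if every value in it is NaN
dropped_all = df.dropna(how="all")

print(dropped_any.index.tolist())  # [0, 3]
print(dropped_all.index.tolist())  # [0, 1, 3]
```

The row labeled 1 has a single NaN, so it is removed under how="any" but kept under how="all".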