How to Fix: KeyError in Pandas?

The KeyError in Pandas occurs when you try to access a column that does not exist in the DataFrame, most often because the column name is misspelled.

Typically, we import data from an Excel file, which brings the column names in with it, and it is easy to misspell a column name or to overlook an unwanted space before or after it.

Column names are case-sensitive, so if the key you pass does not match a column name exactly, Python raises the exception KeyError: 'column_name'.
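Stray spaces and inconsistent casing can be cleaned up right after the data is loaded. The snippet below is a minimal sketch, assuming a small hand-built DataFrame whose headers (" Name", "Age ") stand in for messy imported column names; those headers are made up for illustration.

```python
import pandas

# Hypothetical frame whose headers carry stray spaces and mixed case,
# as can happen after importing a spreadsheet.
df = pandas.DataFrame(
    [["Jack", 22, "US"], ["Chandler", 55, "Canada"]],
    columns=[" Name", "Age ", "country"],
)

# Strip surrounding whitespace and lowercase every column label.
df.columns = df.columns.str.strip().str.lower()

print(df.columns.tolist())  # ['name', 'age', 'country']
```

After this step, lookups such as df["name"] no longer depend on how the headers were typed in the source file.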

Let us take a simple example to demonstrate KeyError in Pandas. In this example, we create a pandas DataFrame of employee data and try to print all the employee names.

# import pandas library
import pandas
import numpy as np

# create pandas DataFrame
df = pandas.DataFrame(np.array([["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]]),
                      columns=['name', 'age', 'country'])

# print names of employee
print(df["Name"])

Output

    raise KeyError(key) from err
KeyError: 'Name'

When we run the program, Python raises KeyError, since we have misspelled the “name” column as “Name”.

Solution to KeyError in Pandas

We can fix the issue by correcting the spelling of the key. If we are not sure what the column names are, we can inspect df.columns, for example with print(df.columns.tolist()), to see the exact spelling. The corrected program is shown below.

# import pandas library
import pandas
import numpy as np

# create pandas DataFrame
df = pandas.DataFrame(np.array([["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]]),
                      columns=['name', 'age', 'country'])

# print names of employee
print(df["name"])

Output

0        Jack
1    Chandler
2        Ross
Name: name, dtype: object

The output now shows the values of the “name” column, because the key matches the column name exactly.
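If the exact column names are not known, they can be listed before any lookup. The sketch below reuses the employee data from the example above and shows one way to print the column labels and test whether a given key is present; the variable name column_names is only illustrative.

```python
import pandas

df = pandas.DataFrame(
    [["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]],
    columns=["name", "age", "country"],
)

# List every column label so the exact spelling can be checked.
column_names = df.columns.tolist()
print(column_names)  # ['name', 'age', 'country']

# Membership tests on the columns index are exact and case-sensitive.
print("name" in df.columns)  # True
print("Name" in df.columns)  # False
```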

We can also avoid the KeyError that is raised when an invalid key is passed. The DataFrame has a get method that takes a column name and returns that column's values.

Syntax: DataFrame.get('column_name', default=default_value_if_column_is_not_present)

If the column name is misspelled or invalid, get returns the default value instead of raising a KeyError. Let’s look at an example to demonstrate how this works.

# import pandas library
import pandas
import numpy as np

# create pandas DataFrame
df = pandas.DataFrame(np.array([["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]]),
                      columns=['name', 'age', 'country'])

# print names of employee
print(df.get("Name", default="Name is not present"))

Output

Name is not present
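When no default is passed, get returns None for a missing column, which makes it easy to branch on the result. The following is a minimal sketch of that pattern using the same employee data; the message text is arbitrary.

```python
import pandas

df = pandas.DataFrame(
    [["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]],
    columns=["name", "age", "country"],
)

# With no default supplied, get() returns None for a missing column.
names = df.get("Name")

if names is None:
    print("Column not found. Available columns:", df.columns.tolist())
else:
    print(names)
```

Comparing with "is None" matters here: a found column comes back as a Series, and a Series cannot be used directly as a truth value in an if statement.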

If we provide the correct column name to the DataFrame.get() method, it returns all the values in that column.

# import pandas library
import pandas
import numpy as np

# create pandas DataFrame
df = pandas.DataFrame(np.array([["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]]),
                      columns=['name', 'age', 'country'])

# print names of employee
print(df.get("name", default="Name is not present"))

Output

0        Jack
1    Chandler
2        Ross
Name: name, dtype: object
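If the bracket syntax is preferred over get, the KeyError can also be caught explicitly. This is a sketch of one possible approach, not the only way to handle it; the variable missing and the message wording are illustrative.

```python
import pandas

df = pandas.DataFrame(
    [["Jack", 22, "US"], ["Chandler", 55, "Canada"], ["Ross", 48, "India"]],
    columns=["name", "age", "country"],
)

missing = None
try:
    print(df["Name"])
except KeyError as err:
    # The first argument of the exception is the key that was not found.
    missing = err.args[0]
    print(f"Column {missing!r} not found. Available columns: {df.columns.tolist()}")
```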

Srinivas Ramakrishna
