Pandas: How to see the variety of values in a DataFrame column

To see the variety of values in an individual column of your DataFrame, select the column by name (in this case ‘Y’), pass it to set() so that only the distinct values remain, then print the set. For example:

# Check the variety of values in the Y column
prop_variety = set(my_dataframe["Y"])
print(prop_variety)

which yields:

{0, 1}
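
If you would rather stay within pandas, the built-in Series methods unique() and value_counts() give similar information. The following is only a minimal sketch: the small DataFrame here is a hypothetical stand-in for my_dataframe, built just so the example runs on its own.

```python
import pandas as pd

# Hypothetical stand-in for my_dataframe, for illustration only
my_dataframe = pd.DataFrame({"Y": [0, 1, 1, 0, 1]})

# Distinct values as an array, in order of first appearance
distinct_values = my_dataframe["Y"].unique()
print(distinct_values)

# How many times each distinct value occurs
value_counts = my_dataframe["Y"].value_counts()
print(value_counts)
```

value_counts() is handy when you want to know how often each value occurs, not only which values exist.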

To see the variety of values for every column in the DataFrame, I wrote a function that prints each column label followed by the set of distinct values in that column. You just pass it the DataFrame itself. For example:

def view_column_variety(df):
    """See the variety of values held in each column."""
    # Iterating over df.columns yields each column label in turn
    for prop in df.columns:
        # A set keeps only the distinct values in the column
        prop_variety = set(df[prop])
        print(prop)
        print(prop_variety, "\n")

Then call it like so:

view_column_variety(my_dataframe)

Example of output:

layerObject:properties:dir
{'T', 'F', 'B'} 

layerObject:properties:fc
{1, 2, 3, 4, 5, 6} 

layerObject:properties:laneCat
{1, 2, 3} 

layerObject:properties:lanes
{'1', 'None'} 

layerObject:properties:nmc_roadtype
{'6', '1', '5', 'None', '2', '3'} 

layerObject:properties:pc
{'6', '1', 'None', '0', '5', '4', '2', '3', '7', '8'} 

layerObject:properties:roadQuality
{'0', '1', 'None'} 

layerObject:properties:speedCat
{1, 2, 3, 4, 5, 6, 7, 8}
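
If you want to work with the results instead of only printing them, one option is to return them in a dictionary. This is a sketch under stated assumptions, not part of the original function: column_variety and sample_frame are hypothetical names, and the sample data is made up to resemble a few of the columns above.

```python
import pandas as pd


def column_variety(df):
    """Return a dict mapping each column label to its set of distinct values.

    Hypothetical variant of view_column_variety that returns the sets
    instead of printing them.
    """
    return {label: set(df[label]) for label in df.columns}


# Made-up sample data, for illustration only
sample_frame = pd.DataFrame(
    {
        "dir": ["T", "F", "B", "T"],
        "laneCat": [1, 2, 3, 1],
        "lanes": ["1", "None", "1", "None"],
    }
)

variety = column_variety(sample_frame)

# Look up the distinct values of one column by its label
lanes_values = variety["lanes"]
print(lanes_values)
```

Note that in the output above, 'None' is shown in quotes alongside quoted digits, which suggests those columns hold strings, including the literal text "None", and not numbers or true missing values. Looking at the sets this way is a quick way to spot columns that may need cleaning.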
