Use .apply to send every value in a column to a function
You can use .apply to send each value in a single column to a function. This is useful when
cleaning up data - converting formats, altering values, etc.
# What's our data look like?
import pandas as pd

df = pd.read_csv("../Civil_List_2014.csv").head(3)
df
   DPT  NAME            ADDRESS                     TTL #  PC  SAL-RATE
0  868  B J SANDIFORD   DEPARTMENT OF CITYWIDE ADM  12702  X   $5.00
1  868  C A WIGFALL     DEPARTMENT OF CITYWIDE ADM  12702  X   $5.00
2  69   A E A-AWOSOGBA  HRA/DEPARTMENT OF SOCIAL S  52311  A   $51955.00
# Get rid of $ and , in the SAL-RATE, then convert it to a float
def money_to_float(money_str):
    return float(money_str.replace("$", "").replace(",", ""))

df['SAL-RATE'].apply(money_to_float)

# Save the result in a new column
df['salary'] = df['SAL-RATE'].apply(money_to_float)

# Take a peek
df
   DPT  NAME            ADDRESS                     TTL #  PC  SAL-RATE   salary
0  868  B J SANDIFORD   DEPARTMENT OF CITYWIDE ADM  12702  X   $5.00      5.0
1  868  C A WIGFALL     DEPARTMENT OF CITYWIDE ADM  12702  X   $5.00      5.0
2  69   A E A-AWOSOGBA  HRA/DEPARTMENT OF SOCIAL S  52311  A   $51955.00  51955.0
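The CSV above may not be on hand, so here is a minimal self-contained sketch of the same cleanup. The `salaries_df` frame and its values are made up for illustration (including one value with a thousands separator, which the three sample rows do not show); only `money_to_float` comes from the example above.

```python
import pandas as pd

# Hypothetical stand-in for the SAL-RATE column of the real CSV
salaries_df = pd.DataFrame({"SAL-RATE": ["$5.00", "$51955.00", "$1,234.50"]})


def money_to_float(money_str):
    # Drop the dollar sign and any commas, then convert what is left to a float
    return float(money_str.replace("$", "").replace(",", ""))


# Series.apply calls money_to_float once for each value in the column
salaries_df["salary"] = salaries_df["SAL-RATE"].apply(money_to_float)
print(salaries_df)
```

The original column is left untouched, so you can compare the raw strings against the converted numbers before deciding whether to drop it.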
Use .apply with axis=1 to send every single row to a function
You can also send an entire row at a time instead of just a single column.
Use this if you need to use multiple columns to get a result.
# Create a dataframe from a list of dictionaries
rectangles = [
    {'height': 40, 'width': 10},
    {'height': 20, 'width': 9},
    {'height': 3.4, 'width': 4}
]
rectangles_df = pd.DataFrame(rectangles)
rectangles_df
   height  width
0  40.0    10
1  20.0    9
2  3.4     4
# Use the height and width to calculate the area
def calculate_area(row):
    return row['height'] * row['width']

rectangles_df.apply(calculate_area, axis=1)
0 400.0
1 180.0
2 13.6
dtype: float64
# Use .apply to save the new column if we'd like
rectangles_df['area'] = rectangles_df.apply(calculate_area, axis=1)
rectangles_df
   height  width  area
0  40.0    10     400.0
1  20.0    9      180.0
2  3.4     4      13.6
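As a side note, for simple arithmetic like this one you can usually skip .apply and multiply the columns directly. The sketch below rebuilds the rectangles example and checks that both approaches give the same areas. The names `row_wise` and `column_wise` are just illustrative.

```python
import pandas as pd

rectangles_df = pd.DataFrame([
    {"height": 40, "width": 10},
    {"height": 20, "width": 9},
    {"height": 3.4, "width": 4},
])


def calculate_area(row):
    # With axis=1, each row arrives as a Series indexed by column name
    return row["height"] * row["width"]


# Row-by-row: calculate_area is called once per row
row_wise = rectangles_df.apply(calculate_area, axis=1)

# Column-by-column: pandas multiplies the two columns element by element
column_wise = rectangles_df["height"] * rectangles_df["width"]

print(row_wise.tolist() == column_wise.tolist())
```

Row-wise .apply is still the more flexible choice when the logic needs branching or several steps that do not map neatly onto whole-column operations.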