4 Defining our terms

Assaults come in two major categories:

  • Aggravated assault, a more serious Part I crime
  • Simple assault, a less serious Part II crime

While each category has subsets (children and partners get different classifications, for example), we’ll group all aggravated assaults as Part I and all simple assaults as Part II.

df.CCDESC.value_counts()
## BATTERY - SIMPLE ASSAULT                          14483
## INTIMATE PARTNER - SIMPLE ASSAULT                  8143
## ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT     7352
## CHILD ABUSE (PHYSICAL) - SIMPLE ASSAULT             822
## INTIMATE PARTNER - AGGRAVATED ASSAULT               247
## CHILD ABUSE (PHYSICAL) - AGGRAVATED ASSAULT         178
## ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER        116
## OTHER ASSAULT                                       111
## Name: CCDESC, dtype: int64

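After we categorize them as Part I or Part II (aka “not Part I”), we see only about 25% of our cases (7,777 of 31,452) are the more violent aggravated assault. Note that the flag is a plain text match on “AGGRAVATED”, so the two descriptions that carry neither label, ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER and OTHER ASSAULT, end up in the “not Part I” group.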

# Flag Part I cases: 1 if the description mentions AGGRAVATED, otherwise 0
df['is_part_i'] = df.CCDESC.str.contains("AGGRAVATED").astype(int)
df.is_part_i.value_counts()
## 0    23675
## 1     7777
## Name: is_part_i, dtype: int64
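
As a quick sanity check on that split, here is a minimal sketch that rebuilds a stand-in dataset from the category counts printed above and recomputes the flag. The `demo` frame, `counts` dict, `share`, and `by_desc` are names made up for this illustration; the real `df` has one row per incident with many more columns.

```python
import pandas as pd

# Counts copied from the CCDESC value_counts() output above.
counts = {
    "BATTERY - SIMPLE ASSAULT": 14483,
    "INTIMATE PARTNER - SIMPLE ASSAULT": 8143,
    "ASSAULT WITH DEADLY WEAPON, AGGRAVATED ASSAULT": 7352,
    "CHILD ABUSE (PHYSICAL) - SIMPLE ASSAULT": 822,
    "INTIMATE PARTNER - AGGRAVATED ASSAULT": 247,
    "CHILD ABUSE (PHYSICAL) - AGGRAVATED ASSAULT": 178,
    "ASSAULT WITH DEADLY WEAPON ON POLICE OFFICER": 116,
    "OTHER ASSAULT": 111,
}

# Hypothetical stand-in for df: one row per incident, CCDESC column only.
demo = pd.DataFrame(
    {"CCDESC": [desc for desc, n in counts.items() for _ in range(n)]}
)

# Same rule as in the text: 1 if the description contains "AGGRAVATED".
demo["is_part_i"] = demo.CCDESC.str.contains("AGGRAVATED").astype(int)

# Share of incidents in each group, as a proportion rather than a count.
share = demo.is_part_i.value_counts(normalize=True)
print(share.round(3))

# Which group each description lands in (the flag is constant within a description).
by_desc = demo.groupby("CCDESC").is_part_i.first().sort_values(ascending=False)
print(by_desc)
```

Printing `by_desc` makes it easy to spot descriptions that do not carry either the "SIMPLE" or "AGGRAVATED" label, which is worth checking before relying on a text match to define the groups.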