6.3 Making predictions with our model

Now that our model has read through almost 40,000 narratives, it should have a decent idea of what separates an aggravated assault from a simple assault. We can test how good it is with a couple of fake sentences:

# we already learned the words above, so we just use .transform to count them
sample_X = vec.transform([
  "S SHOT AND STABBED V WITH A GUN AND THE GUN HAD A KNIFE ON IT",
  "S PUNCHED THEIR NEIGHBOR"
])
clf.predict(sample_X)

## array([1, 0])

For the first one our model saw words associated with Part I crimes, so it predicted 1 - an aggravated assault - and for the second one it saw less serious words, so it predicted 0 - a simple assault. Seems reasonable to me!
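If you want to poke at this outside of the full dataset, here is a minimal, self-contained sketch of the same fit-then-transform-then-predict pattern. The four training sentences and their labels are made up for illustration, and I'm assuming a LogisticRegression as the classifier. The model trained earlier may be a different kind, and not every scikit-learn classifier offers predict_proba.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in for the real narratives (invented for this sketch)
train_texts = [
    "S SHOT V WITH A GUN",
    "S STABBED V WITH A KNIFE",
    "S PUNCHED V IN THE FACE",
    "S SLAPPED V DURING AN ARGUMENT",
]
train_labels = [1, 1, 0, 0]  # 1 = aggravated assault, 0 = simple assault

# .fit_transform learns the vocabulary AND counts the words
vec = CountVectorizer()
X = vec.fit_transform(train_texts)

# LogisticRegression is an assumption here, not necessarily the model used above
clf = LogisticRegression()
clf.fit(X, train_labels)

# .transform reuses the learned vocabulary; unseen words are simply ignored
sample_X = vec.transform([
    "S SHOT AND STABBED V WITH A GUN AND THE GUN HAD A KNIFE ON IT",
    "S PUNCHED THEIR NEIGHBOR",
])

preds = clf.predict(sample_X)        # hard 0/1 labels
probs = clf.predict_proba(sample_X)  # one row per sentence, columns follow clf.classes_

print(preds)
print(probs.round(2))
```

The probabilities are handy for spotting borderline narratives: a prediction where the two columns are close to each other is one the model isn't very sure about, even though .predict still has to pick a side.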