10 Discussion topics

  • We don’t know how any of those classifiers work, but we’re still using them. How do we feel about that?
  • When we removed the high_success_rate_agency column and performance plummeted for some of the classifiers and remained decent for others, it wasn’t something we were expecting at all.
  • In the previous topic I called their performance “decent,” but without high_success_rate_agency the random forest only predicted around 200 of 454 successfully. Is that really “decent,” or just better than 7 out of 454?
  • If you were building a system like this, what kind of success rate would you want in your predictions before releasing it to the world? Would that rate differ between predicting successful and unsuccessful requests?
  • Which do you feel is more important in the FOIA Predictor: correctly predicting successful requests or correctly predicting unsuccessful requests?
  • Rachel said she was hoping other people would contribute to the FOIA Predictor to improve it, as it was released as an open-source project. Although it got some press and people did use it, no one seems to have examined it or made their own contributions. Is sunlight really the best disinfectant if no one is looking at or commenting on published code?
  • Do you think we should have left in high_success_rate_agency?
  • While we could determine the features that were important to a classifier, we couldn’t turn those into actionable items. How useful or useless does that make our classifier?
  • It’s easy to throw some garbage into the FOIA Predictor and get a high-percent-chance result. Try mashing keys and see what happens! Does that mean the FOIA Predictor is useless?
  • Let’s say the FOIA Predictor was much more accurate on both unsuccessful and successful requests, around 80% or so. If entering garbage still predicted a successful request, does that mean the FOIA Predictor is useless?