3.2 Success metrics (editorial choice)
Let’s read this data in! Since we’re using the same dataset that FOIA Predictor uses, it comes with a lot of extras. Along with the text of each request, it includes some precalculated features, like average sentence length and a readability score.
import pandas as pd
pd.set_option("display.max_columns", 20)
pd.set_option("display.max_colwidth", 100)
columns = ['trackingID', 'title', 'agency', 'date_submitted',
'closed_date', 'url', 'status', 'char_count', 'word_count', 'ref_data',
'sen_count', 'avg_sen_len', 'closed_datetime', 'ref_foia',
'ref_fees', 'phone_number', 'hyperlink', 'email_address', 'ref_date',
'readability', 'specificity', 'high_success_rate_agency', 'request']
df = pd.read_csv("data/recent-requests-data-for-model.csv", usecols=columns)
df.head(3)
trackingID | title | agency | date_submitted | closed_date | url | status | request | char_count | word_count | sen_count | avg_sen_len | closed_datetime | ref_foia | ref_fees | phone_number | hyperlink | email_address | ref_date | readability | ref_data | specificity | high_success_rate_agency |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
24540 | “02/29/16 - SLCPD Abdi Mohamed Protest Action Plan and Debrief Docs” | 4223 | 2016-03-18 | 2016-04-11 | /foi/salt-lake-city-359/022916-slcpd-abdi-mohamed-protest-action-plan-and-debrief-docs-24540/ | no_docs | “- Action Plan(s) for policing demonstrations, protests and other events on February 29, 2016]. Please include all draft and final versions along with any associated metadata. - After Action Reports and/or written debriefs of demonstrations, protests and other events on the day. Please include all draft and final versions along with any associated metadata.. Please include in your search, any and all documents, correspondence within and outside the department, emails, training presentations, PowerPoint slides, and Microsoft Word documents, which discuss the demonstrations and or protests through the date of this request. The requested documents will be made available to the general public free of charge as part of the public information service at MuckRock.com, and is not being made for commercial usage. In the event that fees cannot be waived, I would be grateful if you would inform me of the total charges in advance of fulfilling my request. I would prefer the request filled electronically, by e-mail attachment if available or CD-ROM if not. Thank you in advance for your anticipated cooperation in this matter. I look forward to receiving your response to this request within 5 business days, as the statute requires.” | 1008 | 194 | 9 | 21.55556 | 2016-04-11 | 0 | 1 | 0 | 0 | 0 | 1 | 13.820355 | 1 | 8 | 0 |
34051 | “1026812 documents” | 503 | 2017-02-26 | 2017-03-20 | /foi/chicago-169/1026812-documents-34051/ | done | “The following documents from the IAD investigation under the CR number 1026812, identified by their attachment number and description: 6. Synoptic report of Sgt. Richard Downs 19. Handwritten statement of Lt. John Brundage 24-30. Interviews with Cmdr. Leo Schmitz, Lt. John Brundage, Sgt. Patrick Quinn, P.O. Brenda Gomez-Sanchez, P.P.O Milton Kinnison, and Sgt. Sean Ronan 54-55. Results of email account searches of Cmdr. Leo Schmitz and Sgt. Sean Ronan 56-59. All in-car camera footage 78. Cmdr. Leo Schmitz’s Blackberry log 80. Cmdr. Leo Schmitz’s response to OCIC report 81. OCIC report retrieved from Sgt. Sean Ronan’s email 87. Cmdr. Leo Schmitz’s disciplinary history 89. Lt. John Brundage’s disciplinary history” | 568 | 114 | 12 | 9.50000 | 2017-03-20 | 0 | 0 | 1 | 0 | 0 | 0 | 6.787368 | 0 | 32 | 1 |
31682 | “1033 MOU and annual (2015/16) inventory form (Illinois Dept of CMS)” | 4074 | 2017-01-05 | 2017-01-20 | /foi/illinois-168/1033-mou-and-annual-201516-inventory-form-illinois-dept-of-cms-31682/ | done | "-The current memorandum of agreement (MOA) or memorandum of understanding (MOU) with the Defense Logistics Agency, Disposition Services regarding the 1033 equipment surplus program administered by the DLA Law Enforcement Support Office -The annual inventory form for years 2015 and 2016 required to be completed by the state coordinator of the 1033 program According to the DLA FAQ page regarding the 1033 program, CMS is the agency in charge of coordinating the 1033 program for Illinois. See http://www.dispositionservices.dla.mil/leso/Pages/StateCoordinatorList.aspx"; | 473 | 79 | 2 | 39.50000 | 2017-01-20 | 0 | 0 | 0 | 1 | 0 | 1 | 20.000000 | 0 | 9 | 1 |
When you read in the dataset of FOIA requests, you immediately need to make some editorial decisions. The thing we’re looking for, whether a request was fulfilled or denied, is not recorded directly in the dataset.
What our dataset has instead is a `status` column. The statuses look like this:
## done 2749
## no_docs 1879
## processed 1491
## ack 945
## rejected 739
## fix 455
## abandoned 304
## payment 247
## appealing 176
## partial 82
## submitted 27
## Name: status, dtype: int64
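A listing like the one above is what pandas' `value_counts()` returns for a column. Here is a minimal sketch on a toy frame. The rows are invented for illustration, and only the status names come from the real data.

```python
import pandas as pd

# Toy stand-in for the real data: the status names match the listing above,
# but these rows are made up for illustration.
toy = pd.DataFrame({
    "status": ["done", "no_docs", "done", "rejected", "ack", "done", "no_docs"]
})

# Count how many requests fall under each status, most common first.
status_counts = toy["status"].value_counts()
print(status_counts)

# The same counts expressed as a share of all requests.
status_shares = toy["status"].value_counts(normalize=True)
print(status_shares.round(2))
```

Passing `normalize=True` is a handy way to see at a glance how much of the dataset each status covers.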
We see `rejected` in there, and `done` technically means fulfilled, but we also see a lot of other things. We see requests where the agency says the documents don’t exist, or the agency isn’t responding to emails, or the requester isn’t responding to emails, plus other statuses whose meaning might not be totally clear.
The question is what counts as fulfilled, and whether we use all of these requests. Let’s look at some of our options.
- Option One: Remove everything except requests that were marked as either accepted (`done`) or denied (`rejected`). Accepted counts as fulfilled, denied counts as not fulfilled.
- Option Two: Keep everything. Accepted counts as fulfilled, and everything that is not accepted counts as denied.
- Option Three: Accepted counts as fulfilled, denied counts as not fulfilled. “Documents don’t exist” also counts as denied, because maybe you were just too vague, or too specific. “Abandoned” counts as denied, because again, maybe you were too vague or too specific in your original request.
- Options Four through One Hundred: You have a lot of options here: picking which statuses matter, picking what counts as fulfilled, and picking what counts as denied. There might not be a right answer, and you and someone else may have different ideas about what counts as a fulfilled request.
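To make the first three options concrete, here is a rough sketch of how each could be written as a labeling rule. It rests on a few assumptions: that "accepted" maps to the `done` status and "denied" to `rejected`, that Option Three drops every status it does not mention, and that the label lives in a column called `successful`. The function names and the toy rows are hypothetical.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration.
toy = pd.DataFrame({
    "status": ["done", "rejected", "no_docs", "abandoned", "ack", "done"]
})

def label_option_one(df):
    """Keep only done/rejected rows; done is 1, rejected is 0."""
    kept = df[df["status"].isin(["done", "rejected"])].copy()
    kept["successful"] = (kept["status"] == "done").astype(int)
    return kept

def label_option_two(df):
    """Keep every row; done is 1, every other status is 0."""
    out = df.copy()
    out["successful"] = (out["status"] == "done").astype(int)
    return out

def label_option_three(df):
    """Keep done, rejected, no_docs and abandoned; only done is 1.

    Assumption: statuses the option does not mention are dropped.
    """
    keep = ["done", "rejected", "no_docs", "abandoned"]
    kept = df[df["status"].isin(keep)].copy()
    kept["successful"] = (kept["status"] == "done").astype(int)
    return kept

# Compare how many rows survive and how many count as successes.
for name, fn in [("one", label_option_one),
                 ("two", label_option_two),
                 ("three", label_option_three)]:
    labeled = fn(toy)
    print(name, len(labeled), labeled["successful"].sum())
```

The number of successes is the same under all three rules. What changes is how many rows end up on the denied side, which is exactly the editorial choice being made.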
The FOIA Predictor uses Option Two, which casts the narrowest net for accepted and the widest net for denied. Because machine learning loves to do things with numbers, we’re now going to count `1` as a success and `0` as a denial.
## 0 6345
## 1 2749
## Name: successful, dtype: int64
More denials than successful requests, but I’m actually impressed at how many successes we have!
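One way to produce that `successful` column is sketched below. To keep the example self-contained it rebuilds a status column from the counts listed earlier in this section. With the real data, `df` would already be loaded from the CSV and only the labeling line would be needed. Treat it as an illustration of Option Two, not necessarily the exact code FOIA Predictor runs.

```python
import pandas as pd

# Status counts copied from the listing earlier in this section.
status_counts = pd.Series({
    "done": 2749, "no_docs": 1879, "processed": 1491, "ack": 945,
    "rejected": 739, "fix": 455, "abandoned": 304, "payment": 247,
    "appealing": 176, "partial": 82, "submitted": 27,
})

# Rebuild a one-column frame with one row per request (stand-in for the CSV).
df = pd.DataFrame({"status": status_counts.index.repeat(status_counts.values)})

# Option Two: "done" is a success (1), every other status is a denial (0).
df["successful"] = (df["status"] == "done").astype(int)

label_counts = df["successful"].value_counts()
print(label_counts)

# Share of all requests that end up labeled as successful.
success_rate = df["successful"].mean()
print(f"{success_rate:.1%}")
```

Because the labels are 0 and 1, the mean of the column is the success rate, which works out to roughly 30% of the requests here.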