Analyzing online safety through app store reviews

After downloading over a hundred thousand reviews of "random chat apps," how do you find reports of bullying, racism, and unwanted sexual behavior?

natural language processing, text analysis, classification, reading lots of documents

Summary

The Washington Post downloaded over 130,000 reviews of "random chat apps" to investigate reports of bullying, racism, and unwanted sexual behavior. Instead of reading each and every review, they used machine learning to flag likely offenders, narrowing the pile to several thousand candidates worth reading by hand. In the end they found 1,500 reviews that included "uncomfortable sexual situations." As a result of this research, they could make statements like "at least 19 percent of the reviews on ChatLive mentioned unwanted sexual approaches."

This project is very similar to the Takata airbags project. We'll repeat the same sort of steps (a code sketch follows the list):

  • Obtain the reviews
  • Read a sample of reviews, labeling each as "interesting" or not
  • Convert the words in each review to features
  • Use the features and labels to train a classifier
  • Use the classifier on the unread reviews
  • Manually read the ones predicted as interesting
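
Here is a minimal sketch of those steps with pandas and scikit-learn. The filenames and column names (labeled_reviews.csv, all_reviews.csv, review, interesting) are hypothetical stand-ins for whatever the notebooks actually produce; treat it as the shape of the pipeline, not the exact implementation.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical files: a small hand-labeled sample plus the full pile of reviews
labeled = pd.read_csv("labeled_reviews.csv")   # columns: review, interesting (0/1)
unread = pd.read_csv("all_reviews.csv")        # column: review

# Convert the words in each review to features
vectorizer = TfidfVectorizer(max_features=5000, stop_words="english")
X_train = vectorizer.fit_transform(labeled.review)

# Train a classifier on the labeled sample
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, labeled.interesting)

# Use the classifier on the unread reviews, then manually read
# the ones it predicts as interesting
unread["predicted"] = clf.predict(vectorizer.transform(unread.review))
print(unread[unread.predicted == 1].review.head())
```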

This project also introduces the concept of probability and decision functions: instead of paying attention to the yes/no predicted class, we pay attention to how certain the classifier is about each prediction.
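
Continuing the hypothetical sketch above, scikit-learn exposes that certainty through predict_proba (or decision_function on some models). Sorting by it lets you read the most suspicious reviews first, or loosen the cutoff to catch more borderline cases:

```python
# Instead of the 0/1 prediction, ask for the classifier's certainty
probs = clf.predict_proba(vectorizer.transform(unread.review))

# predict_proba returns one column per class; [:, 1] is the "interesting" class
unread["prob_interesting"] = probs[:, 1]

# Read the most-likely offenders first, or lower the cutoff below 0.5
# to catch more borderline reviews at the cost of more manual reading
print(unread.sort_values("prob_interesting", ascending=False).head(10))
```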

If you'd like to increase the performance of the classifier, a good first step would be to manually label more reviews for training.

Notebooks, Assignments, and Walkthroughs

Scrape and combine app store reviews

Using a website devoted to phone app marketing, we'll download over 50,000 reviews for various apps and save them to a CSV.
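
The actual site, URL patterns, and page structure come from the walkthrough; the sketch below only illustrates the general scrape-and-combine shape, with made-up app IDs, URLs, and CSS selectors.

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

# Hypothetical app IDs and review pages; the real notebook's URLs and
# selectors will differ
app_ids = ["chatlive", "randomchat", "meetnewpeople"]
rows = []

for app_id in app_ids:
    for page in range(1, 6):
        url = f"https://example-app-reviews.com/{app_id}/reviews?page={page}"
        soup = BeautifulSoup(requests.get(url).text, "html.parser")
        for review in soup.select(".review"):
            rows.append({
                "app": app_id,
                "rating": review.select_one(".rating").text.strip(),
                "text": review.select_one(".body").text.strip(),
            })

# Combine everything into one CSV for the classifier notebook
pd.DataFrame(rows).to_csv("all_reviews.csv", index=False)
```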

Build a classifier to detect reviews about bad behavior

Using a small dataset of tagged reviews, can we detect reviews that mention bullying, racism, or unwanted sexual behavior?
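
One way to sanity-check a classifier like this, sketched here with hypothetical filenames and column names, is to hold out part of the tagged dataset and look at precision and recall on it. That held-out score also feeds into the testing question in the discussion topics below.

```python
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hypothetical columns: 'review' text and a 0/1 'bad_behavior' label
labeled = pd.read_csv("labeled_reviews.csv")

X_train, X_test, y_train, y_test = train_test_split(
    labeled.review, labeled.bad_behavior, test_size=0.25, random_state=42)

vectorizer = TfidfVectorizer(stop_words="english")
clf = LogisticRegression(max_iter=1000)
clf.fit(vectorizer.fit_transform(X_train), y_train)

# Held-out precision/recall tells us how much to trust the flags
print(classification_report(y_test, clf.predict(vectorizer.transform(X_test))))
```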

Discussion topics

Is it okay to "steal" all of those reviews from Apple?

What's the difference between using interesting/not interesting flags and using the probability that a comment is interesting?

What counts as unwanted sexual behavior or bullying? Find a few examples you feel are borderline.

In line with the last question, whether many of these reviews count as suggestive or bullying is borderline, or a personal judgment call. In the Takata airbags story, the main goal was to find people to interview. In this case, it's writing sentences like "At least 19 percent of the reviews on ChatLive mentioned unwanted sexual approaches." Is there a difference between the two? Compared to the New York Times airbags story, should The Washington Post have taken any additional precautions as a result?

How can we test whether our classifier does a good job or not? We spent a lot of time testing in the airbags example, but not here. Is there a difference?