A very short introduction to sentiment analysis

A critical look at sentiment analysis libraries and a walkthrough on how to train your own sentiment-analyzing algorithm. Alternatively titled, "Sentiment analysis is very very bad complicated."

sentiment analysis, natural language processing, classification

Summary

Sentiment analysis is simple enough in concept - flagging content as "positive" or "negative" - but look beyond a quick glance and it becomes an excellent example of the tradeoffs you encounter when using easy-to-use tools that lean on machine learning.

We'll examine a handful of sentiment analysis tools, identifying how and when they might disagree, as well as how they come to their positive/negative conclusions. To drive the point home we'll design our own sentiment analysis algorithm, seeing how well it performs and what tradeoffs might come with easy access to large amounts of data.

Notebooks, Assignments, and Walkthroughs

Comparing sentiment analysis tools

Different sentiment analysis tools can give you different results when given the same piece of text. Let's examine a few and see the differences.
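To make the disagreement concrete, here is a minimal sketch with two toy word-score lists standing in for two different tools. The lexicons, scores, and function names are invented for illustration and are not taken from any real library; the only point is that two scorers built on different assumptions can label the same sentence differently.

```python
# Two hypothetical word-score lists standing in for two different tools.
# They agree on everything except the slang use of "sick".
LEXICON_A = {"good": 1.0, "great": 2.0, "bad": -1.0, "sick": -1.5, "terrible": -2.0}
LEXICON_B = {"good": 1.0, "great": 2.0, "bad": -1.0, "sick": 1.5, "terrible": -2.0}


def score(text, lexicon):
    """Sum the scores of known words; unknown words count as zero."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(lexicon.get(w, 0.0) for w in words)


def label(text, lexicon):
    """Turn a numeric score into a positive/negative/neutral label."""
    s = score(text, lexicon)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"


texts = ["That concert was sick!", "The food was terrible.", "Not bad at all"]
for t in texts:
    print(f"{t!r}: A={label(t, LEXICON_A)}, B={label(t, LEXICON_B)}")
```

The two scorers split on the first sentence because they disagree about "sick", and both call "Not bad at all" negative because neither understands negation. Real tools are more sophisticated, but they can disagree for similar reasons.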

Designing your own sentiment analysis tool

Does it really make sense to see whether a tweet is positive or negative based on words we learned from product reviews? Let's build our own sentiment analysis tool.
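As a rough idea of what "building our own" can look like, here is a small sketch of a word-count classifier (multinomial naive Bayes with add-one smoothing) written with only the standard library. The choice of naive Bayes, the six training sentences, and all function names are assumptions made for illustration; the notebook itself may use a different model and a real dataset.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; real tweets need more care."""
    return text.lower().split()


def train(examples):
    """Count words per label. `examples` is a list of (text, label) pairs."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    return word_counts, label_counts


def predict(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-probability."""
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    total_docs = sum(label_counts.values())
    best_label, best_logp = None, -math.inf
    for label, n_docs in label_counts.items():
        logp = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word not in vocab:
                continue  # words never seen in training carry no signal
            logp += math.log((word_counts[label][word] + 1) / denom)
        if logp > best_logp:
            best_label, best_logp = label, logp
    return best_label


# Invented training data, purely for illustration.
examples = [
    ("i love this song", "positive"),
    ("what a great day", "positive"),
    ("so happy with my new phone", "positive"),
    ("i hate mondays", "negative"),
    ("this is the worst day", "negative"),
    ("so sad my phone broke", "negative"),
]
word_counts, label_counts = train(examples)
print(predict("i love my new phone", word_counts, label_counts))
print(predict("worst day", word_counts, label_counts))
```

A model like this only knows the words it was trained on, which is why the source of the training text (tweets vs. product reviews) matters so much.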

How much does more data matter?

If we weren't satisfied with the performance of our sentiment analysis tool in the last round, let's increase the amount of data we use to teach it what makes a tweet positive or negative.

Cleaning the Sentiment140 data

Sentiment140 is a set of 1.6 million tweets, each tagged as positive or negative. This notebook covers the cleaning performed for the custom sentiment analysis tool we made above.
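The exact cleaning steps live in the notebook, so the following is only a sketch of the kind of work involved. It assumes the commonly distributed Sentiment140 CSV layout: no header row, six columns (polarity, id, date, query, user, text), with polarity 0 for negative and 4 for positive. The two sample rows and the `clean` helper are made up for illustration.

```python
import csv
import io

# Two invented rows in the assumed Sentiment140 layout (no header row).
SAMPLE = (
    '"0","1467810369","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","some_user","my phone broke again"\n'
    '"4","1467822272","Mon Apr 06 22:22:45 PDT 2009","NO_QUERY","another_user","loving the sunshine today"\n'
)

COLUMNS = ["polarity", "id", "date", "query", "user", "text"]


def clean(rows):
    """Keep only the label and the text, recoding 0/4 as 0/1."""
    cleaned = []
    for row in rows:
        record = dict(zip(COLUMNS, row))
        polarity = int(record["polarity"])
        if polarity not in (0, 4):
            continue  # skip anything that is not clearly negative or positive
        cleaned.append({"polarity": 1 if polarity == 4 else 0, "text": record["text"]})
    return cleaned


cleaned = clean(csv.reader(io.StringIO(SAMPLE)))
for record in cleaned:
    print(record)
```

With the real file you would open it from disk instead of using `io.StringIO`; it is usually read with a Latin-1 encoding rather than UTF-8, though that is worth checking against your copy.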