Classification
A collection of simple notebooks and guides to cover concepts in isolation, without the context of a journalistic project. Useful for review or to prep for topics used in the actual projects.
Readings and links
Summary
Notebooks, Assignments, and Walkthroughs
Introduction to Classification
If you'd like to have a computer help you put things into categories, classification is for you!
Evaluating Classifiers
Classifiers aren't always exactly right. Let's take a look at a few approaches to evaluating their performance.
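As a rough illustration of what evaluating a classifier can look like, here is a minimal sketch using scikit-learn's metrics functions. The labels are made up for the example and are not taken from the notebook.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Made-up labels for illustration: 1 = positive class, 0 = negative class
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are the true class, columns are the predicted class
cm = confusion_matrix(y_true, y_pred)

accuracy = accuracy_score(y_true, y_pred)    # share of all predictions that were right
precision = precision_score(y_true, y_pred)  # of predicted positives, share that were real
recall = recall_score(y_true, y_pred)        # of real positives, share that were found

print(cm)
print(accuracy, precision, recall)
```

Accuracy alone can hide a lot, which is why the confusion matrix, precision, and recall are usually read together.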
Scikit-learn and categorical features
Unlike statsmodels, scikit-learn doesn't play very well with categorical features (plane type, race, road condition, etc.). Learn to bend it to your will when performing classification tasks.
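One common way to handle this is one-hot encoding. The sketch below is only an assumption about how the task might be approached, not necessarily what the notebook does; the column names and values are invented for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: column names and values are invented for this example
df = pd.DataFrame({
    "road_condition": ["dry", "wet", "icy", "dry", "wet", "icy", "dry", "icy"],
    "speed": [30, 45, 50, 25, 60, 55, 35, 40],
    "injury": [0, 1, 1, 0, 1, 1, 0, 1],
})
features = df[["road_condition", "speed"]]

# One-hot encode the text column and pass the numeric column through unchanged
preprocess = ColumnTransformer(
    [("onehot", OneHotEncoder(handle_unknown="ignore"), ["road_condition"])],
    remainder="passthrough",
)

model = make_pipeline(preprocess, LogisticRegression(max_iter=1000))
model.fit(features, df["injury"])

predictions = model.predict(features)
```

Keeping the encoder inside the pipeline means new data gets the same encoding at prediction time.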
Comparing classifiers
There are many different classifiers out there. Let's give a few of our options a spin.
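A comparison might be sketched like this: fit several scikit-learn classifiers on the same training split and score each on the same held-out split. The synthetic dataset and the choice of three classifiers are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for a real dataset
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Arbitrary selection of classifiers to compare
classifiers = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k-nearest neighbors": KNeighborsClassifier(),
}

# Fit each classifier on the same split and record its test accuracy
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```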
Using classification algorithms with text
We've been writing classifiers with numbers so far, but more often than not they're used with text! Let's give it a shot before we start in on reproducing published projects.
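To give a feel for the idea, here is a minimal sketch that turns text into word counts and feeds them to a classifier. The tiny corpus and its labels are invented, and pairing `CountVectorizer` with `MultinomialNB` is one common choice rather than the notebook's confirmed method.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Invented example texts and labels
texts = [
    "win a free prize now",
    "free money claim your prize",
    "meeting moved to tuesday afternoon",
    "please review the budget report",
    "claim your free gift today",
    "notes from the staff meeting",
]
labels = ["spam", "spam", "ham", "ham", "spam", "ham"]

# CountVectorizer converts each text into word counts;
# MultinomialNB learns which words go with which label
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

new_texts = ["free prize inside", "budget meeting on tuesday"]
predicted = model.predict(new_texts)
print(predicted)
```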
Correcting for imbalanced datasets in classification problems
You don't always have an even split between your two (or more) classes in a classification problem. An imbalance like this tends to cause problems (a classifier can look accurate just by always predicting the majority class), but it can be mitigated with a little careful thought.
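One possible mitigation, sketched here under assumptions, is scikit-learn's `class_weight="balanced"` option, which weights errors on the rarer class more heavily. The synthetic 95/5 split is invented for the example, and other approaches such as resampling may suit a given dataset better.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Synthetic data where roughly 5% of rows belong to the positive class
X, y = make_classification(
    n_samples=2000, n_features=10, weights=[0.95, 0.05], random_state=0
)
# stratify=y keeps the class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Same model twice: once unweighted, once with balanced class weights
plain = LogisticRegression(max_iter=1000).fit(X_train, y_train)
weighted = LogisticRegression(max_iter=1000, class_weight="balanced").fit(
    X_train, y_train
)

# Recall on the rare class shows how many of its cases each model finds
recall_plain = recall_score(y_test, plain.predict(X_test))
recall_weighted = recall_score(y_test, weighted.predict(X_test))
print(recall_plain, recall_weighted)
```

Balanced weights typically raise recall on the rare class at the cost of more false positives, so it is worth checking precision as well.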