A collection of simple notebooks and guides to cover concepts in isolation, without the context of a journalistic project. Useful for review or to prep for topics used in the actual projects.

linear regression

Notebooks, Assignments, and Walkthroughs

Introduction to Classification

If you'd like to have a computer help you put things into categories, classification is for you!

Evaluating Classifiers

Classifiers aren't always exactly right. Let's take a look at a few approaches to evaluating their performance.
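To make "evaluating performance" concrete, here is a minimal sketch that counts the four kinds of outcomes by hand and turns them into accuracy, precision, and recall. The labels are made up for illustration, and `confusion_counts` is a hypothetical helper, not something from the notebooks.

```python
# Hypothetical labels for illustration: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0, 0, 0]


def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for truth, guess in zip(y_true, y_pred):
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


tp, fp, fn, tn = confusion_counts(y_true, y_pred)

accuracy = (tp + tn) / len(y_true)   # share of all predictions that were right
precision = tp / (tp + fp)           # of the predicted positives, how many were real
recall = tp / (tp + fn)              # of the real positives, how many were found

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

Looking at precision and recall alongside accuracy shows which kind of mistake a classifier tends to make, not just how often it is wrong.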

Scikit-learn and categorical features

Unlike statsmodels, scikit-learn doesn't play very well with categorical features (plane type, race, road condition, etc.). Learn to bend it to your will when performing classification tasks.
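One common way to bend it to your will is one-hot encoding, sketched below with scikit-learn's `OneHotEncoder`. The road condition values are invented for illustration, and this is only one of several encoding approaches, not necessarily the one the notebook uses.

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical categorical column: one road condition per row.
road_condition = [["dry"], ["wet"], ["icy"], ["wet"]]

# handle_unknown="ignore" encodes categories unseen during fit as all zeros
# instead of raising an error.
encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(road_condition).toarray()

# Categories are stored in sorted order: one output column per category.
print(list(encoder.categories_[0]))
print(encoded)

# A category the encoder never saw becomes a row of zeros.
unseen = encoder.transform([["snowy"]]).toarray()
print(unseen)
```

Each category gets its own 0/1 column, which gives the classifier numbers to work with without implying that "wet" is somehow greater than "dry".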

Comparing classifiers

There are many different classifiers out there. Let's give a few of our options a spin.
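As a rough sketch of what giving a few options a spin can look like, the snippet below runs three scikit-learn classifiers through the same 5-fold cross-validation. The dataset is synthetic (generated with `make_classification`), and the three models are an arbitrary pick for illustration, not the notebook's actual lineup.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 300 rows, 8 numeric features, 2 classes.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# An arbitrary handful of classifiers to compare on equal footing.
candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "k-nearest neighbors": KNeighborsClassifier(n_neighbors=5),
}

# Score every model with the same 5-fold cross-validation and average the folds.
mean_accuracy = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}

for name, score in mean_accuracy.items():
    print(f"{name}: {score:.3f}")
```

Using the same folds and the same metric for every model keeps the comparison fair. Which one comes out ahead depends heavily on the data.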

Using classification algorithms with text

We've been writing classifiers with numbers so far, but more often than not they're used with text! Let's give it a shot before we start in on reproducing published projects.
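Here is a minimal sketch of a text classifier, assuming a bag-of-words approach: `CountVectorizer` turns each document into word counts and `MultinomialNB` learns from them. The four headlines and their labels are made up, and a real project would need far more training data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set, invented for illustration.
texts = [
    "city council approves budget",
    "mayor vetoes council budget plan",
    "team wins championship game",
    "coach praises team after game",
]
labels = ["politics", "politics", "sports", "sports"]

# Step 1 converts text to word counts; step 2 classifies those counts.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Words the model never saw in training ("vote", "tonight") are simply ignored.
predictions = model.predict(["council budget vote", "team game tonight"])
print(list(predictions))
```

The classifier itself still only sees numbers. The vectorizer is what turns words into features.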

Correcting for imbalanced datasets in classification problems

You don't always have an even split between your two (or more) classes when working on a classification problem. This sort of imbalance tends to cause problems, but it can be mitigated with a little careful thought.
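To see why an uneven split causes trouble, and one way to push back, here is a small sketch with a hypothetical 90/10 dataset. It shows how a do-nothing classifier can still score well on accuracy, then computes per-class weights that make the rare class count for more. Class weighting is only one possible remedy, chosen here for illustration.

```python
from collections import Counter

# Hypothetical labels: 90 examples of class 0 and only 10 of class 1.
y = [0] * 90 + [1] * 10
counts = Counter(y)

# A "classifier" that always guesses the majority class still looks accurate,
# even though it never finds a single member of the rare class.
majority_class = counts.most_common(1)[0][0]
baseline_accuracy = counts[majority_class] / len(y)

# Weight each class by n_samples / (n_classes * class_count), so mistakes on
# the rare class cost more during training.
class_weights = {
    label: len(y) / (len(counts) * n)
    for label, n in counts.items()
}

print(f"always guessing {majority_class}: accuracy {baseline_accuracy:.2f}")
print(class_weights)
```

This is the same formula scikit-learn documents for `class_weight="balanced"`, and a dictionary like this one can be passed as the `class_weight` argument of estimators that support it, such as `LogisticRegression`.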