Analyzing whether larger cars cause more deadly crashes
Reproducing a research paper on the impact of weight on car accidents, along with a look at a state-based car crash database.
logistic regression, feature engineering, confidence intervals, seaborn
Readings and links
This chapter reproduces an academic study on car weight and fatalities in car crashes. While it isn't an actual piece of journalism, it's a complicated piece of data finding, cleaning, and combining, with many decisions made along the way.
Notebooks, Assignments, and Walkthroughs
Use car crash data from the state of Maryland to learn about feature engineering and feature selection (with a logistic regression classifier).
Open a folder full of Excel files from the Maryland DOT, then extract and combine the data into a series of CSV files.
Before we can analyze our data, we'll need to combine vehicle weights with makes and models, as well as clean up the results a bit.
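As a rough illustration of combining weights with makes and models, here is a minimal pure-Python sketch of a left join. The field names (`make`, `model`, `year`, `weight_lbs`) and the sample rows, including the weights, are made up for the example and are not taken from the notebooks.

```python
def normalize(text):
    """Lowercase and collapse whitespace so 'HONDA ' matches 'Honda'."""
    return " ".join(str(text).lower().split())


def add_weights(vehicles, weights):
    """Attach a weight to each vehicle by matching on make, model, and year."""
    # Build a lookup table keyed on the cleaned-up (make, model, year) triple.
    lookup = {
        (normalize(w["make"]), normalize(w["model"]), int(w["year"])): float(w["weight_lbs"])
        for w in weights
    }
    merged = []
    for v in vehicles:
        key = (normalize(v["make"]), normalize(v["model"]), int(v["year"]))
        # Left join: every vehicle is kept, and the weight is None when no match exists.
        merged.append({**v, "weight_lbs": lookup.get(key)})
    return merged


# Hypothetical sample data: inconsistent capitalization, spacing, and year types.
vehicles = [
    {"vin": "VIN-A", "make": "HONDA ", "model": "Civic", "year": "2010"},
    {"vin": "VIN-B", "make": "Ford", "model": "F-150", "year": 2012},
    {"vin": "VIN-C", "make": "Acme", "model": "Rocket", "year": 1999},
]
weights = [
    {"make": "Honda", "model": "CIVIC", "year": 2010, "weight_lbs": 2600},
    {"make": "ford", "model": "f-150", "year": 2012, "weight_lbs": 4700},
]

merged = add_weights(vehicles, weights)
unmatched = [v["vin"] for v in merged if v["weight_lbs"] is None]
print(unmatched)
```

Listing the unmatched rows after a join like this is one way to see which makes and models still need cleaning.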
With a car's VIN (vehicle identification number), we can look up its make, model, and year in a government database.
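The lookup might be sketched as follows, assuming the government database is NHTSA's vPIC service and its `DecodeVinValues` endpoint; the text does not name the database, so treat that choice as an assumption. The helper names and the `sample_payload` are hand-written stand-ins, not a real API response.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# Assumption: the NHTSA vPIC "DecodeVinValues" endpoint is the database in question.
VPIC_URL = "https://vpic.nhtsa.dot.gov/api/vehicles/DecodeVinValues/{vin}?format=json"


def build_url(vin):
    """Clean up the VIN and place it into the request URL."""
    return VPIC_URL.format(vin=quote(vin.strip().upper()))


def extract_fields(payload):
    """Pull make, model, and year out of a decoded JSON payload."""
    results = payload.get("Results") or [{}]
    first = results[0]
    # Empty strings become None so missing values are easy to spot later.
    return {
        "make": first.get("Make") or None,
        "model": first.get("Model") or None,
        "year": first.get("ModelYear") or None,
    }


def decode_vin(vin):
    """Fetch and parse one VIN (needs network access, so it is not called here)."""
    with urlopen(build_url(vin), timeout=10) as response:
        return extract_fields(json.load(response))


# Hand-written stand-in shaped like the flat response this sketch expects.
sample_payload = {"Results": [{"Make": "HONDA", "Model": "Accord", "ModelYear": "2003"}]}
decoded = extract_fields(sample_payload)
print(decoded)
```

Keeping the parsing separate from the network call makes it possible to check the parsing on saved responses without hitting the service for every VIN.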
A simple bit of data wrangling.
After combining data from so many sources, we need to narrow the dataset down to the car crashes we're interested in: two-car accidents between light vehicles.
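A filter for two-car crashes between light vehicles could look something like this minimal sketch. The column names (`crash_id`, `body_type`) and the set of body types counted as "light" are assumptions for illustration; the real data may code these differently.

```python
from collections import defaultdict

# Assumption: these labels stand in for whatever the dataset calls light vehicles.
LIGHT_TYPES = {"passenger car", "suv", "pickup truck", "van"}


def two_light_vehicle_crashes(vehicles):
    """Keep only crashes with exactly two vehicles, both of a light body type."""
    # Group the one-row-per-vehicle records by the crash they belong to.
    by_crash = defaultdict(list)
    for v in vehicles:
        by_crash[v["crash_id"]].append(v)
    return {
        crash_id: rows
        for crash_id, rows in by_crash.items()
        if len(rows) == 2
        and all(r["body_type"].lower() in LIGHT_TYPES for r in rows)
    }


# Hypothetical records, one row per vehicle involved in a crash.
vehicles = [
    {"crash_id": 1, "body_type": "Passenger Car"},
    {"crash_id": 1, "body_type": "SUV"},
    {"crash_id": 2, "body_type": "Passenger Car"},    # single-vehicle crash
    {"crash_id": 3, "body_type": "Passenger Car"},
    {"crash_id": 3, "body_type": "Tractor Trailer"},  # heavy vehicle involved
    {"crash_id": 4, "body_type": "Van"},
    {"crash_id": 4, "body_type": "Van"},
    {"crash_id": 4, "body_type": "SUV"},              # three-vehicle crash
]

kept = two_light_vehicle_crashes(vehicles)
print(sorted(kept))
```

Grouping by crash first matters because the vehicle count and the body-type check both apply to the crash as a whole, not to individual rows.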