Analyzing mortgage rejections for racial bias
Based on government-mandated data collection on mortgage granting, are certain banks or areas discriminatory in their lending practices?
logistic regression, classification, odds ratio, Home Mortgage Disclosure Act, race
Readings and links
Summary
Working from a massive trove of public records, Reveal analyzed lending disparities across racial and ethnic groups. Home Mortgage Disclosure Act data on individual borrowers is too large to handle comfortably as a CSV or even to open in pandas, so this project jumps directly into managing a SQL database populated through scripts provided by the Consumer Financial Protection Bureau.
Reveal's whitepaper is one of the most easily reproducible I've ever seen, so it's simple to follow in their footsteps and use logistic regression to pull back the mask on the mortgage industry.
Reporting and analysis by Aaron Glantz and Emmanuel Martinez.
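To give a feel for why the project works in SQL rather than pandas, here is a minimal sketch of aggregating inside the database so that only summary rows reach Python. It uses an in-memory SQLite database as a stand-in for the real one, and the `loans` table, its columns, and the rows are all made up for illustration; they are not the actual HMDA schema or data.

```python
import sqlite3

# In-memory SQLite stands in for the real database. The table, columns,
# and rows are hypothetical, not the actual HMDA schema or data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loans (race TEXT, income INTEGER, denied INTEGER)")
conn.executemany(
    "INSERT INTO loans VALUES (?, ?, ?)",
    [
        ("white", 80, 0), ("white", 60, 0), ("white", 50, 1), ("white", 90, 0),
        ("black", 80, 1), ("black", 60, 0), ("black", 50, 1), ("black", 90, 0),
    ],
)

# Aggregate in the database: one row per group comes back, not one per loan.
query = """
    SELECT race, COUNT(*) AS applications, AVG(denied) AS denial_rate
    FROM loans
    GROUP BY race
    ORDER BY race
"""
rows = conn.execute(query).fetchall()
for race, applications, denial_rate in rows:
    print(f"{race}: {applications} applications, {denial_rate:.0%} denied")
conn.close()
```

Raw denial rates like these are only a starting point; the regression is what lets you hold income, loan size, and other factors constant.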
Notebooks, Assignments, and Walkthroughs
Complete walkthrough
Start-to-finish walkthrough of a reproduction of the Reveal analysis. Requires a bit of technical heavy lifting.
Cleaning and combining data for the Reveal Mortgage Analysis
A full logistic regression using lending data and demographic data, following the whitepaper published by Reveal.
Wild formulas in statsmodels using Patsy (short version)
Using R-style/Patsy formulas in statsmodels opens up a lot of interesting opportunities for tweaking your regression at execution time.
Reveal Mortgage Analysis - Logistic Regression using statsmodels formulas
An introduction to using R-style/Patsy formulas in statsmodels, along with specially-created columns in your dataframe.
Reveal Mortgage Analysis - Logistic Regression
Use logistic regression to investigate lending disparities.
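As a rough preview of what the notebooks above build toward, here is a minimal sketch of a logistic regression written with an R-style/Patsy formula in statsmodels, with coefficients converted to odds ratios. The data is synthetic and the column names (`race`, `income`, `loan_amount`, `denied`) are assumptions made for illustration; the real analysis uses the variables described in Reveal's whitepaper.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic applications for illustration only; column names are hypothetical.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "race": rng.choice(["white", "black", "latino"], size=n),
    "income": rng.normal(70, 20, size=n).clip(10, None),
    "loan_amount": rng.normal(200, 60, size=n).clip(20, None),
})

# Simulate denials with a built-in disparity so the model has something to find.
log_odds = (
    -1.0
    + 0.7 * (df["race"] == "black")
    + 0.4 * (df["race"] == "latino")
    - 0.01 * (df["income"] - 70)
)
df["denied"] = (rng.random(n) < 1 / (1 + np.exp(-log_odds))).astype(int)

# The formula does the tweaking at execution time: C(...) sets the reference
# category for race, and np.log(...) transforms income without a new column.
model = smf.logit(
    "denied ~ C(race, Treatment(reference='white')) + np.log(income) + loan_amount",
    data=df,
)
result = model.fit(disp=False)

# Exponentiating a logit coefficient gives an odds ratio relative to the
# reference group, holding the other variables constant.
odds_ratios = np.exp(result.params)
print(odds_ratios)
```

An odds ratio above 1 on a race term means that group's odds of denial are higher than the reference group's once the other variables are accounted for. Swapping the reference category or adding a transformed column only requires editing the formula string.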