Analyzing mortgage rejections for racial bias
Based on government-mandated data collection on mortgage granting, are certain banks or areas discriminatory in their lending practices?
logistic regression, classification, odds ratio, Home Mortgage Disclosure Act, race
Readings and links
Working from a massive trove of public records, Reveal analyzed lending disparities between racial and ethnic groups. Home Mortgage Disclosure Act data on individual borrowers is too unwieldy to keep as a CSV or even open in pandas, so this project jumps directly into managing a SQL database populated through scripts provided by the Consumer Financial Protection Bureau.
Reveal's whitepaper is one of the most easily reproducible I've ever seen, so it's simple to follow in their footsteps and use logistic regression to pull back the curtain on the mortgage industry.
Reporting and analysis by Aaron Glantz and Emmanuel Martinez.
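The database-first approach described above can be sketched in a few lines: let SQL do the counting and pull only a small summary into Python. This is a minimal sketch, not Reveal's actual setup. It uses an in-memory SQLite database as a stand-in, and the `applications` table, its `race` and `denied` columns, and the rows are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real mortgage database.
conn = sqlite3.connect(":memory:")

# Hypothetical schema: one row per application, denied = 1 if rejected.
conn.execute("CREATE TABLE applications (race TEXT, denied INTEGER)")

# Made-up rows for illustration only; NOT real HMDA records.
rows = [
    ("white", 0), ("white", 0), ("white", 0), ("white", 1),
    ("black", 0), ("black", 1), ("black", 1), ("black", 0),
]
conn.executemany("INSERT INTO applications VALUES (?, ?)", rows)

# Aggregate inside the database so only one row per group reaches Python.
query = """
    SELECT race,
           COUNT(*)    AS applications,
           SUM(denied) AS denials,
           AVG(denied) AS denial_rate
    FROM applications
    GROUP BY race
    ORDER BY race
"""
summary = {
    race: (n, denials, rate)
    for race, n, denials, rate in conn.execute(query)
}
conn.close()

for race, (n, denials, rate) in summary.items():
    print(f"{race}: {denials}/{n} denied ({rate:.0%})")
```

The same pattern applies to a larger database: the heavy grouping stays in SQL, and only the aggregated result needs to fit in memory.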
Notebooks, Assignments, and Walkthroughs
Start-to-finish walkthrough of a reproduction of the Reveal analysis. Requires a bit of technical heavy lifting.
A full logistic regression using lending data and demographic data, following the whitepaper published by Reveal.
Using R-style/Patsy formulas in statsmodels lets you tweak your regression at execution time, for example by marking a column as categorical or changing its reference category right inside the formula string.
An introduction to using R-style/Patsy formulas in statsmodels, along with specially created columns in your dataframe.
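As a rough illustration of the formula approach, the sketch below fits a logistic regression with `statsmodels.formula.api.logit` and sets the reference category inside the formula instead of building dummy columns by hand. The data is synthetic, and the column names (`race`, `income`, `denied`) and the effect sizes used to generate it are assumptions made up for this example, not values from the Reveal analysis.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic data for illustration only; NOT real HMDA records.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "race": rng.choice(["white", "black", "latino"], size=n),
    "income": rng.normal(80, 20, size=n),  # hypothetical, in thousands
})

# Generate denials from arbitrary, made-up effects so there is something to fit.
log_odds = (
    -1.0
    + 0.7 * (df["race"] == "black")
    + 0.4 * (df["race"] == "latino")
    - 0.01 * (df["income"] - 80)
)
df["denied"] = (rng.random(n) < 1 / (1 + np.exp(-log_odds))).astype(int)

# C(...) marks race as categorical; Treatment(reference=...) picks the baseline
# group that every other group is compared against.
formula = "denied ~ C(race, Treatment(reference='white')) + income"
result = smf.logit(formula, data=df).fit(disp=False)

# Exponentiating the coefficients turns log odds into odds ratios.
odds_ratios = np.exp(result.params)
print(odds_ratios)
```

Changing `reference='white'` to another group reruns the comparison against a different baseline without touching the dataframe.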
Use logistic regression to investigate lending disparities.
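When reading the output of a regression like this, it helps to remember that the coefficient on a group indicator is a log odds ratio. For a model with a single yes/no predictor, that value can be checked by hand from a two-by-two table, as in the sketch below. The counts and group labels are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

# Made-up counts for illustration only; NOT real HMDA figures.
denied_a, approved_a = 30, 70  # hypothetical group A
denied_b, approved_b = 15, 85  # hypothetical reference group B

# Odds of denial are denials divided by approvals within each group.
odds_a = denied_a / approved_a
odds_b = denied_b / approved_b

# The odds ratio compares group A's odds of denial to group B's.
odds_ratio = odds_a / odds_b

# A logistic regression containing only a group-A indicator would report
# this value as the indicator's coefficient.
log_odds_ratio = math.log(odds_ratio)

print(f"odds ratio: {odds_ratio:.2f}")
print(f"log odds ratio: {log_odds_ratio:.3f}")
```

An odds ratio above 1 means group A has higher odds of denial than the reference group. Once more predictors are added to the model, the coefficient becomes the odds ratio after controlling for them and no longer matches the raw table.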