Combining car VIN data with vehicle weights#

A simple bit of data wrangling.

VIN data#

vin_df = pd.read_csv("data/vin_data.csv")
vin_df.head()
VIN Make Model ModelYear
0 2FMDA5143TBB45576 FORD WINDSTAR 1996
1 2G1WC5E37E1120089 CHEVROLET IMPALA 2014
2 5J6RE4H55AL053951 HONDA CR-V 2010
3 1N4AA5AP0EC435185 NISSAN MAXIMA 2014
4 JTHCK262075010440 LEXUS IS 2007

Weight data#

weights_df = pd.read_csv("data/weights.csv")
weights_df.head()
Make Model ModelYear weight
0 ACURA CL 1988 3009.0
1 ACURA CL 1989 3009.0
2 ACURA CL 1990 3009.0
3 ACURA CL 1991 3009.0
4 ACURA CL 1992 3009.0
vins_weights = vin_df.merge(weights_df, 
           left_on=['Make', 'Model', 'ModelYear'], 
           right_on=['Make', 'Model', 'ModelYear'], 
           how='left')
vins_weights.head(5)
VIN Make Model ModelYear weight
0 2FMDA5143TBB45576 FORD WINDSTAR 1996 3733.0
1 2G1WC5E37E1120089 CHEVROLET IMPALA 2014 3618.0
2 5J6RE4H55AL053951 HONDA CR-V 2010 3389.0
3 1N4AA5AP0EC435185 NISSAN MAXIMA 2014 3556.0
4 JTHCK262075010440 LEXUS IS 2007 3527.0
vins_weights = vins_weights.dropna(subset=['weight'])
vins_weights.shape
(500661, 5)
vins_weights.to_csv("data/vins_and_weights.csv", index=False)