{
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Finding surveillance planes using random forests\n",
                "\n",
                "**The story:**\n",
                "\n",
                "- https://www.buzzfeednews.com/article/peteraldhous/spies-in-the-skies\n",
                "- https://www.buzzfeednews.com/article/peteraldhous/hidden-spy-planes\n",
                "    \n",
                "This story, done by Peter Aldhous at Buzzfeed News, involved training a machine learning algorithm to recognize government surveillance planes based on what their flight patterns look like.\n",
                "\n",
                "**Topics:** Random Forests\n",
                "\n",
                "**Datasets**\n",
                "\n",
                "* **feds.csv:** Transponder codes of planes operated by the federal government\n",
                "* **planes_features.csv:** various features describing each plane's flight patterns\n",
                "* **train.csv:** a labeled dataset of transponder codes and whether each plane is a surveillance plane or not\n",
                "    - The `label` column was originally `class`, but I renamed it because pandas freaks out a bit with a column named `class`\n",
                "    - This was created by Buzzfeed `feds.csv`\n",
                "* **data dictionary:** You can find the data dictionary published with their analysis [here](https://buzzfeednews.github.io/2016-04-federal-surveillance-planes/analysis.html)\n",
                "* **a few other files**\n",
                "\n",
                "## What's the goal?\n",
                "\n",
                "The FBI and Department of Homeland Security operate many planes that are not directly labeled as belonging to the government. If we can uncover these planes, we have a better idea of the surveillance activities they are undertaking."
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Prep work: Downloading necessary files\n",
                "Before we get started, we need to download all of the data we'll be using.\n",
                "* **planes_features.csv:** BuzzFeed plane features - as provided by BuzzFeed\n",
                "* **train.csv:** BuzzFeed labeled plane data - as provided by BuzzFeed\n",
                "* **feds.csv:** BuzzFeed federal planes list - as provided by BuzzFeed\n"
            ]
        },
        {
            "cell_type": "code",
            "metadata": {},
            "source": [
                "# Make data directory if it doesn't exist\n",
                "!mkdir -p data\n",
                "!wget -nc https://nyc3.digitaloceanspaces.com/ml-files-distro/v1/buzzfeed-spy-planes/data/planes_features.csv -P data\n",
                "!wget -nc https://nyc3.digitaloceanspaces.com/ml-files-distro/v1/buzzfeed-spy-planes/data/train.csv -P data\n",
                "!wget -nc https://nyc3.digitaloceanspaces.com/ml-files-distro/v1/buzzfeed-spy-planes/data/feds.csv -P data"
            ],
            "outputs": [],
            "execution_count": null
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Imports\n",
                "\n",
                "Also set a large number of maximum columns."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [],
            "source": [
                "import pandas as pd\n",
                "\n",
                "pd.set_option(\"display.max_columns\", 100)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Read in our data\n",
                "\n",
                "Almost all classification problems start with a set of labeled features. In this case, the features are in one CSV file and the labels are in another. **Read both files in and merge them on `adshex`, the transpoder code.**"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 2,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>A</td>\n",
                            "      <td>0.120253</td>\n",
                            "      <td>0.075949</td>\n",
                            "      <td>0.183544</td>\n",
                            "      <td>0.335443</td>\n",
                            "      <td>0.284810</td>\n",
                            "      <td>0.088608</td>\n",
                            "      <td>0.044304</td>\n",
                            "      <td>0.069620</td>\n",
                            "      <td>0.120253</td>\n",
                            "      <td>0.677215</td>\n",
                            "      <td>0.021824</td>\n",
                            "      <td>0.020550</td>\n",
                            "      <td>0.062330</td>\n",
                            "      <td>0.100713</td>\n",
                            "      <td>0.794582</td>\n",
                            "      <td>0.042374</td>\n",
                            "      <td>0.060971</td>\n",
                            "      <td>0.066831</td>\n",
                            "      <td>0.106403</td>\n",
                            "      <td>0.723421</td>\n",
                            "      <td>0.020211</td>\n",
                            "      <td>0.048913</td>\n",
                            "      <td>0.270550</td>\n",
                            "      <td>0.344090</td>\n",
                            "      <td>0.097317</td>\n",
                            "      <td>0.186651</td>\n",
                            "      <td>0.011379</td>\n",
                            "      <td>0.009426</td>\n",
                            "      <td>158</td>\n",
                            "      <td>0</td>\n",
                            "      <td>11776</td>\n",
                            "      <td>GRND</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>A00000</td>\n",
                            "      <td>0.211735</td>\n",
                            "      <td>0.155612</td>\n",
                            "      <td>0.181122</td>\n",
                            "      <td>0.198980</td>\n",
                            "      <td>0.252551</td>\n",
                            "      <td>0.204082</td>\n",
                            "      <td>0.183673</td>\n",
                            "      <td>0.168367</td>\n",
                            "      <td>0.173469</td>\n",
                            "      <td>0.267857</td>\n",
                            "      <td>0.107348</td>\n",
                            "      <td>0.143410</td>\n",
                            "      <td>0.208139</td>\n",
                            "      <td>0.177013</td>\n",
                            "      <td>0.364090</td>\n",
                            "      <td>0.177318</td>\n",
                            "      <td>0.114457</td>\n",
                            "      <td>0.129648</td>\n",
                            "      <td>0.197694</td>\n",
                            "      <td>0.380882</td>\n",
                            "      <td>0.034976</td>\n",
                            "      <td>0.048127</td>\n",
                            "      <td>0.240732</td>\n",
                            "      <td>0.356314</td>\n",
                            "      <td>0.116116</td>\n",
                            "      <td>0.159325</td>\n",
                            "      <td>0.012828</td>\n",
                            "      <td>0.013628</td>\n",
                            "      <td>392</td>\n",
                            "      <td>0</td>\n",
                            "      <td>52465</td>\n",
                            "      <td>TBM7</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2</th>\n",
                            "      <td>A00002</td>\n",
                            "      <td>0.517241</td>\n",
                            "      <td>0.103448</td>\n",
                            "      <td>0.103448</td>\n",
                            "      <td>0.103448</td>\n",
                            "      <td>0.172414</td>\n",
                            "      <td>0.862069</td>\n",
                            "      <td>0.137931</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.990792</td>\n",
                            "      <td>0.000921</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.008287</td>\n",
                            "      <td>0.599448</td>\n",
                            "      <td>0.400552</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.105893</td>\n",
                            "      <td>0.090239</td>\n",
                            "      <td>0.174954</td>\n",
                            "      <td>0.244015</td>\n",
                            "      <td>0.034070</td>\n",
                            "      <td>0.202578</td>\n",
                            "      <td>0.021179</td>\n",
                            "      <td>0.068140</td>\n",
                            "      <td>29</td>\n",
                            "      <td>0</td>\n",
                            "      <td>1086</td>\n",
                            "      <td>SHIP</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>3</th>\n",
                            "      <td>A00008</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.208333</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.458333</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.187960</td>\n",
                            "      <td>0.278952</td>\n",
                            "      <td>0.221048</td>\n",
                            "      <td>0.190257</td>\n",
                            "      <td>0.121783</td>\n",
                            "      <td>0.014706</td>\n",
                            "      <td>0.053309</td>\n",
                            "      <td>0.149816</td>\n",
                            "      <td>0.279871</td>\n",
                            "      <td>0.502298</td>\n",
                            "      <td>0.029871</td>\n",
                            "      <td>0.044118</td>\n",
                            "      <td>0.202665</td>\n",
                            "      <td>0.380515</td>\n",
                            "      <td>0.094669</td>\n",
                            "      <td>0.182904</td>\n",
                            "      <td>0.014706</td>\n",
                            "      <td>0.020221</td>\n",
                            "      <td>24</td>\n",
                            "      <td>0</td>\n",
                            "      <td>2176</td>\n",
                            "      <td>PA46</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>4</th>\n",
                            "      <td>A0001E</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.007937</td>\n",
                            "      <td>0.026984</td>\n",
                            "      <td>0.084127</td>\n",
                            "      <td>0.179365</td>\n",
                            "      <td>0.701587</td>\n",
                            "      <td>0.041270</td>\n",
                            "      <td>0.085714</td>\n",
                            "      <td>0.039683</td>\n",
                            "      <td>0.111111</td>\n",
                            "      <td>0.722222</td>\n",
                            "      <td>0.019048</td>\n",
                            "      <td>0.049206</td>\n",
                            "      <td>0.249206</td>\n",
                            "      <td>0.326984</td>\n",
                            "      <td>0.112698</td>\n",
                            "      <td>0.206349</td>\n",
                            "      <td>0.012698</td>\n",
                            "      <td>0.011111</td>\n",
                            "      <td>10</td>\n",
                            "      <td>1135</td>\n",
                            "      <td>630</td>\n",
                            "      <td>C56X</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   adshex  duration1  duration2  duration3  duration4  duration5    boxes1  \\\n",
                            "0       A   0.120253   0.075949   0.183544   0.335443   0.284810  0.088608   \n",
                            "1  A00000   0.211735   0.155612   0.181122   0.198980   0.252551  0.204082   \n",
                            "2  A00002   0.517241   0.103448   0.103448   0.103448   0.172414  0.862069   \n",
                            "3  A00008   0.125000   0.041667   0.208333   0.166667   0.458333  0.125000   \n",
                            "4  A0001E   0.100000   0.200000   0.200000   0.400000   0.100000  0.100000   \n",
                            "\n",
                            "     boxes2    boxes3    boxes4    boxes5    speed1    speed2    speed3  \\\n",
                            "0  0.044304  0.069620  0.120253  0.677215  0.021824  0.020550  0.062330   \n",
                            "1  0.183673  0.168367  0.173469  0.267857  0.107348  0.143410  0.208139   \n",
                            "2  0.137931  0.000000  0.000000  0.000000  0.990792  0.000921  0.000000   \n",
                            "3  0.083333  0.125000  0.166667  0.500000  0.187960  0.278952  0.221048   \n",
                            "4  0.000000  0.100000  0.400000  0.400000  0.007937  0.026984  0.084127   \n",
                            "\n",
                            "     speed4    speed5  altitude1  altitude2  altitude3  altitude4  altitude5  \\\n",
                            "0  0.100713  0.794582   0.042374   0.060971   0.066831   0.106403   0.723421   \n",
                            "1  0.177013  0.364090   0.177318   0.114457   0.129648   0.197694   0.380882   \n",
                            "2  0.000000  0.008287   0.599448   0.400552   0.000000   0.000000   0.000000   \n",
                            "3  0.190257  0.121783   0.014706   0.053309   0.149816   0.279871   0.502298   \n",
                            "4  0.179365  0.701587   0.041270   0.085714   0.039683   0.111111   0.722222   \n",
                            "\n",
                            "     steer1    steer2    steer3    steer4    steer5    steer6    steer7  \\\n",
                            "0  0.020211  0.048913  0.270550  0.344090  0.097317  0.186651  0.011379   \n",
                            "1  0.034976  0.048127  0.240732  0.356314  0.116116  0.159325  0.012828   \n",
                            "2  0.105893  0.090239  0.174954  0.244015  0.034070  0.202578  0.021179   \n",
                            "3  0.029871  0.044118  0.202665  0.380515  0.094669  0.182904  0.014706   \n",
                            "4  0.019048  0.049206  0.249206  0.326984  0.112698  0.206349  0.012698   \n",
                            "\n",
                            "     steer8  flights  squawk_1  observations  type  \n",
                            "0  0.009426      158         0         11776  GRND  \n",
                            "1  0.013628      392         0         52465  TBM7  \n",
                            "2  0.068140       29         0          1086  SHIP  \n",
                            "3  0.020221       24         0          2176  PA46  \n",
                            "4  0.011111       10      1135           630  C56X  "
                        ]
                    },
                    "execution_count": 2,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "# Read in your features\n",
                "features = pd.read_csv(\"data/planes_features.csv\")\n",
                "features.head()"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>label</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>A00C4B</td>\n",
                            "      <td>surveil</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>A0AB21</td>\n",
                            "      <td>surveil</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2</th>\n",
                            "      <td>A0AE77</td>\n",
                            "      <td>surveil</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>3</th>\n",
                            "      <td>A0AE7C</td>\n",
                            "      <td>surveil</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>4</th>\n",
                            "      <td>A0C462</td>\n",
                            "      <td>surveil</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   adshex    label\n",
                            "0  A00C4B  surveil\n",
                            "1  A0AB21  surveil\n",
                            "2  A0AE77  surveil\n",
                            "3  A0AE7C  surveil\n",
                            "4  A0C462  surveil"
                        ]
                    },
                    "execution_count": 3,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "# Read in your labels\n",
                "labeled = pd.read_csv(\"data/train.csv\").rename(columns={'class': 'label'})\n",
                "labeled.head()"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 4,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>label</th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>A00C4B</td>\n",
                            "      <td>surveil</td>\n",
                            "      <td>0.450000</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.375000</td>\n",
                            "      <td>0.475000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.337128</td>\n",
                            "      <td>0.408286</td>\n",
                            "      <td>0.185431</td>\n",
                            "      <td>0.053026</td>\n",
                            "      <td>0.016129</td>\n",
                            "      <td>0.010226</td>\n",
                            "      <td>0.168564</td>\n",
                            "      <td>0.793274</td>\n",
                            "      <td>0.027936</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.151697</td>\n",
                            "      <td>0.203774</td>\n",
                            "      <td>0.303922</td>\n",
                            "      <td>0.154544</td>\n",
                            "      <td>0.033312</td>\n",
                            "      <td>0.088024</td>\n",
                            "      <td>0.010858</td>\n",
                            "      <td>0.010753</td>\n",
                            "      <td>40</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>9486</td>\n",
                            "      <td>C182</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>A0AB21</td>\n",
                            "      <td>surveil</td>\n",
                            "      <td>0.523810</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.714286</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.142857</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.703329</td>\n",
                            "      <td>0.144543</td>\n",
                            "      <td>0.114201</td>\n",
                            "      <td>0.026549</td>\n",
                            "      <td>0.011378</td>\n",
                            "      <td>0.007164</td>\n",
                            "      <td>0.580700</td>\n",
                            "      <td>0.374210</td>\n",
                            "      <td>0.037927</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.141593</td>\n",
                            "      <td>0.152550</td>\n",
                            "      <td>0.166456</td>\n",
                            "      <td>0.309313</td>\n",
                            "      <td>0.008007</td>\n",
                            "      <td>0.078382</td>\n",
                            "      <td>0.021492</td>\n",
                            "      <td>0.064054</td>\n",
                            "      <td>21</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>2373</td>\n",
                            "      <td>C182</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2</th>\n",
                            "      <td>A0AE77</td>\n",
                            "      <td>surveil</td>\n",
                            "      <td>0.262295</td>\n",
                            "      <td>0.196721</td>\n",
                            "      <td>0.081967</td>\n",
                            "      <td>0.114754</td>\n",
                            "      <td>0.344262</td>\n",
                            "      <td>0.639344</td>\n",
                            "      <td>0.295082</td>\n",
                            "      <td>0.032787</td>\n",
                            "      <td>0.032787</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.703037</td>\n",
                            "      <td>0.181262</td>\n",
                            "      <td>0.066502</td>\n",
                            "      <td>0.030956</td>\n",
                            "      <td>0.018244</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000118</td>\n",
                            "      <td>0.034134</td>\n",
                            "      <td>0.923376</td>\n",
                            "      <td>0.042373</td>\n",
                            "      <td>0.121234</td>\n",
                            "      <td>0.256709</td>\n",
                            "      <td>0.279779</td>\n",
                            "      <td>0.209981</td>\n",
                            "      <td>0.009416</td>\n",
                            "      <td>0.037900</td>\n",
                            "      <td>0.011064</td>\n",
                            "      <td>0.027778</td>\n",
                            "      <td>61</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>8496</td>\n",
                            "      <td>T206</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>3</th>\n",
                            "      <td>A0AE7C</td>\n",
                            "      <td>surveil</td>\n",
                            "      <td>0.521739</td>\n",
                            "      <td>0.086957</td>\n",
                            "      <td>0.043478</td>\n",
                            "      <td>0.043478</td>\n",
                            "      <td>0.304348</td>\n",
                            "      <td>0.565217</td>\n",
                            "      <td>0.043478</td>\n",
                            "      <td>0.260870</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.130435</td>\n",
                            "      <td>0.129674</td>\n",
                            "      <td>0.291088</td>\n",
                            "      <td>0.384954</td>\n",
                            "      <td>0.098159</td>\n",
                            "      <td>0.096126</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.004631</td>\n",
                            "      <td>0.200723</td>\n",
                            "      <td>0.722806</td>\n",
                            "      <td>0.071840</td>\n",
                            "      <td>0.159494</td>\n",
                            "      <td>0.256636</td>\n",
                            "      <td>0.238111</td>\n",
                            "      <td>0.168305</td>\n",
                            "      <td>0.023043</td>\n",
                            "      <td>0.086073</td>\n",
                            "      <td>0.014007</td>\n",
                            "      <td>0.014797</td>\n",
                            "      <td>23</td>\n",
                            "      <td>4415</td>\n",
                            "      <td>8853</td>\n",
                            "      <td>T206</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>4</th>\n",
                            "      <td>A0C462</td>\n",
                            "      <td>surveil</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.208333</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.208333</td>\n",
                            "      <td>0.040691</td>\n",
                            "      <td>0.002466</td>\n",
                            "      <td>0.041924</td>\n",
                            "      <td>0.170160</td>\n",
                            "      <td>0.744760</td>\n",
                            "      <td>0.011097</td>\n",
                            "      <td>0.007398</td>\n",
                            "      <td>0.023428</td>\n",
                            "      <td>0.090012</td>\n",
                            "      <td>0.868064</td>\n",
                            "      <td>0.019729</td>\n",
                            "      <td>0.020962</td>\n",
                            "      <td>0.199753</td>\n",
                            "      <td>0.478422</td>\n",
                            "      <td>0.119605</td>\n",
                            "      <td>0.118372</td>\n",
                            "      <td>0.006165</td>\n",
                            "      <td>0.011097</td>\n",
                            "      <td>24</td>\n",
                            "      <td>1731</td>\n",
                            "      <td>811</td>\n",
                            "      <td>P8</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   adshex    label  duration1  duration2  duration3  duration4  duration5  \\\n",
                            "0  A00C4B  surveil   0.450000   0.125000   0.025000   0.025000   0.375000   \n",
                            "1  A0AB21  surveil   0.523810   0.000000   0.047619   0.095238   0.333333   \n",
                            "2  A0AE77  surveil   0.262295   0.196721   0.081967   0.114754   0.344262   \n",
                            "3  A0AE7C  surveil   0.521739   0.086957   0.043478   0.043478   0.304348   \n",
                            "4  A0C462  surveil   0.250000   0.083333   0.500000   0.083333   0.083333   \n",
                            "\n",
                            "     boxes1    boxes2    boxes3    boxes4    boxes5    speed1    speed2  \\\n",
                            "0  0.475000  0.250000  0.250000  0.025000  0.000000  0.337128  0.408286   \n",
                            "1  0.714286  0.095238  0.047619  0.142857  0.000000  0.703329  0.144543   \n",
                            "2  0.639344  0.295082  0.032787  0.032787  0.000000  0.703037  0.181262   \n",
                            "3  0.565217  0.043478  0.260870  0.000000  0.130435  0.129674  0.291088   \n",
                            "4  0.208333  0.041667  0.041667  0.500000  0.208333  0.040691  0.002466   \n",
                            "\n",
                            "     speed3    speed4    speed5  altitude1  altitude2  altitude3  altitude4  \\\n",
                            "0  0.185431  0.053026  0.016129   0.010226   0.168564   0.793274   0.027936   \n",
                            "1  0.114201  0.026549  0.011378   0.007164   0.580700   0.374210   0.037927   \n",
                            "2  0.066502  0.030956  0.018244   0.000000   0.000118   0.034134   0.923376   \n",
                            "3  0.384954  0.098159  0.096126   0.000000   0.004631   0.200723   0.722806   \n",
                            "4  0.041924  0.170160  0.744760   0.011097   0.007398   0.023428   0.090012   \n",
                            "\n",
                            "   altitude5    steer1    steer2    steer3    steer4    steer5    steer6  \\\n",
                            "0   0.000000  0.151697  0.203774  0.303922  0.154544  0.033312  0.088024   \n",
                            "1   0.000000  0.141593  0.152550  0.166456  0.309313  0.008007  0.078382   \n",
                            "2   0.042373  0.121234  0.256709  0.279779  0.209981  0.009416  0.037900   \n",
                            "3   0.071840  0.159494  0.256636  0.238111  0.168305  0.023043  0.086073   \n",
                            "4   0.868064  0.019729  0.020962  0.199753  0.478422  0.119605  0.118372   \n",
                            "\n",
                            "     steer7    steer8  flights  squawk_1  observations  type  \n",
                            "0  0.010858  0.010753       40      4414          9486  C182  \n",
                            "1  0.021492  0.064054       21      4414          2373  C182  \n",
                            "2  0.011064  0.027778       61      4414          8496  T206  \n",
                            "3  0.014007  0.014797       23      4415          8853  T206  \n",
                            "4  0.006165  0.011097       24      1731           811    P8  "
                        ]
                    },
                    "execution_count": 4,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df = labeled.merge(features, on='adshex')\n",
                "df.head()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### No wait, merge them again!\n",
                "\n",
                "We have features for about 20,000 planes and labels for about 600 planes. When you merge, the planes you have features for but not labels for will disappear.\n",
                "\n",
                "We want to keep those in the dataframe so we can play detective with them later, and try to find surveillance planes using the features. When you merge, you should use `how='left'` or `how='right'` to keep unmatched columns from the left (or right) dataframe."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 5,
            "metadata": {},
            "outputs": [],
            "source": [
                "df = labeled.merge(features, on='adshex', how='right')"
            ]
        },
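        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "To see what `how=` changes, here is a minimal sketch on made-up data. The two tiny dataframes and their values are hypothetical stand-ins for `labeled` and `features`, not the real files. `indicator=True` is an optional extra that adds a `_merge` column showing which side each row came from.\n",
                "\n",
                "```python\n",
                "import pandas as pd\n",
                "\n",
                "# Hypothetical stand-ins: two labeled planes, three planes with features\n",
                "toy_labeled = pd.DataFrame({'adshex': ['A1', 'A2'], 'label': ['surveil', 'other']})\n",
                "toy_features = pd.DataFrame({'adshex': ['A1', 'A2', 'A3'], 'flights': [40, 21, 61]})\n",
                "\n",
                "# Default (inner) merge: only planes present in both dataframes survive\n",
                "inner = toy_labeled.merge(toy_features, on='adshex')\n",
                "\n",
                "# Right merge: every row of toy_features is kept, missing labels become NaN\n",
                "right = toy_labeled.merge(toy_features, on='adshex', how='right', indicator=True)\n",
                "\n",
                "print(inner.shape)\n",
                "print(right)\n",
                "```\n",
                "\n",
                "In the right merge, the unlabeled plane shows up with `NaN` in `label` and `right_only` in `_merge`. Those are the rows we want to hang on to for classifying later."
            ]
        },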
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Confirm you have 19,799 rows and 34 columns."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 6,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "(19799, 34)"
                        ]
                    },
                    "execution_count": 6,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df.shape"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Cleaning up our data\n",
                "\n",
                "## Number-izing our labels\n",
                "\n",
                "Each row is a plane, and it's marked as either a surveillance plane or not. How many do we have in each category?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 7,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "other      500\n",
                            "surveil     97\n",
                            "Name: label, dtype: int64"
                        ]
                    },
                    "execution_count": 7,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df.label.value_counts()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "How do you feel about that split?\n",
                "\n",
                "**Prepare this column for machine learning.** What's wrong with it as `\"surveil\"` and `\"other\"`? Add a new column that we can use for classification."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 8,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>label</th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>A00C4B</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.450000</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.375000</td>\n",
                            "      <td>0.475000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.337128</td>\n",
                            "      <td>0.408286</td>\n",
                            "      <td>0.185431</td>\n",
                            "      <td>0.053026</td>\n",
                            "      <td>0.016129</td>\n",
                            "      <td>0.010226</td>\n",
                            "      <td>0.168564</td>\n",
                            "      <td>0.793274</td>\n",
                            "      <td>0.027936</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.151697</td>\n",
                            "      <td>0.203774</td>\n",
                            "      <td>0.303922</td>\n",
                            "      <td>0.154544</td>\n",
                            "      <td>0.033312</td>\n",
                            "      <td>0.088024</td>\n",
                            "      <td>0.010858</td>\n",
                            "      <td>0.010753</td>\n",
                            "      <td>40</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>9486</td>\n",
                            "      <td>C182</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>A0AB21</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.523810</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.714286</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.142857</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.703329</td>\n",
                            "      <td>0.144543</td>\n",
                            "      <td>0.114201</td>\n",
                            "      <td>0.026549</td>\n",
                            "      <td>0.011378</td>\n",
                            "      <td>0.007164</td>\n",
                            "      <td>0.580700</td>\n",
                            "      <td>0.374210</td>\n",
                            "      <td>0.037927</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.141593</td>\n",
                            "      <td>0.152550</td>\n",
                            "      <td>0.166456</td>\n",
                            "      <td>0.309313</td>\n",
                            "      <td>0.008007</td>\n",
                            "      <td>0.078382</td>\n",
                            "      <td>0.021492</td>\n",
                            "      <td>0.064054</td>\n",
                            "      <td>21</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>2373</td>\n",
                            "      <td>C182</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2</th>\n",
                            "      <td>A0AE77</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.262295</td>\n",
                            "      <td>0.196721</td>\n",
                            "      <td>0.081967</td>\n",
                            "      <td>0.114754</td>\n",
                            "      <td>0.344262</td>\n",
                            "      <td>0.639344</td>\n",
                            "      <td>0.295082</td>\n",
                            "      <td>0.032787</td>\n",
                            "      <td>0.032787</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.703037</td>\n",
                            "      <td>0.181262</td>\n",
                            "      <td>0.066502</td>\n",
                            "      <td>0.030956</td>\n",
                            "      <td>0.018244</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000118</td>\n",
                            "      <td>0.034134</td>\n",
                            "      <td>0.923376</td>\n",
                            "      <td>0.042373</td>\n",
                            "      <td>0.121234</td>\n",
                            "      <td>0.256709</td>\n",
                            "      <td>0.279779</td>\n",
                            "      <td>0.209981</td>\n",
                            "      <td>0.009416</td>\n",
                            "      <td>0.037900</td>\n",
                            "      <td>0.011064</td>\n",
                            "      <td>0.027778</td>\n",
                            "      <td>61</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>8496</td>\n",
                            "      <td>T206</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>3</th>\n",
                            "      <td>A0AE7C</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.521739</td>\n",
                            "      <td>0.086957</td>\n",
                            "      <td>0.043478</td>\n",
                            "      <td>0.043478</td>\n",
                            "      <td>0.304348</td>\n",
                            "      <td>0.565217</td>\n",
                            "      <td>0.043478</td>\n",
                            "      <td>0.260870</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.130435</td>\n",
                            "      <td>0.129674</td>\n",
                            "      <td>0.291088</td>\n",
                            "      <td>0.384954</td>\n",
                            "      <td>0.098159</td>\n",
                            "      <td>0.096126</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.004631</td>\n",
                            "      <td>0.200723</td>\n",
                            "      <td>0.722806</td>\n",
                            "      <td>0.071840</td>\n",
                            "      <td>0.159494</td>\n",
                            "      <td>0.256636</td>\n",
                            "      <td>0.238111</td>\n",
                            "      <td>0.168305</td>\n",
                            "      <td>0.023043</td>\n",
                            "      <td>0.086073</td>\n",
                            "      <td>0.014007</td>\n",
                            "      <td>0.014797</td>\n",
                            "      <td>23</td>\n",
                            "      <td>4415</td>\n",
                            "      <td>8853</td>\n",
                            "      <td>T206</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>4</th>\n",
                            "      <td>A0C462</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.208333</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.208333</td>\n",
                            "      <td>0.040691</td>\n",
                            "      <td>0.002466</td>\n",
                            "      <td>0.041924</td>\n",
                            "      <td>0.170160</td>\n",
                            "      <td>0.744760</td>\n",
                            "      <td>0.011097</td>\n",
                            "      <td>0.007398</td>\n",
                            "      <td>0.023428</td>\n",
                            "      <td>0.090012</td>\n",
                            "      <td>0.868064</td>\n",
                            "      <td>0.019729</td>\n",
                            "      <td>0.020962</td>\n",
                            "      <td>0.199753</td>\n",
                            "      <td>0.478422</td>\n",
                            "      <td>0.119605</td>\n",
                            "      <td>0.118372</td>\n",
                            "      <td>0.006165</td>\n",
                            "      <td>0.011097</td>\n",
                            "      <td>24</td>\n",
                            "      <td>1731</td>\n",
                            "      <td>811</td>\n",
                            "      <td>P8</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   adshex  label  duration1  duration2  duration3  duration4  duration5  \\\n",
                            "0  A00C4B    1.0   0.450000   0.125000   0.025000   0.025000   0.375000   \n",
                            "1  A0AB21    1.0   0.523810   0.000000   0.047619   0.095238   0.333333   \n",
                            "2  A0AE77    1.0   0.262295   0.196721   0.081967   0.114754   0.344262   \n",
                            "3  A0AE7C    1.0   0.521739   0.086957   0.043478   0.043478   0.304348   \n",
                            "4  A0C462    1.0   0.250000   0.083333   0.500000   0.083333   0.083333   \n",
                            "\n",
                            "     boxes1    boxes2    boxes3    boxes4    boxes5    speed1    speed2  \\\n",
                            "0  0.475000  0.250000  0.250000  0.025000  0.000000  0.337128  0.408286   \n",
                            "1  0.714286  0.095238  0.047619  0.142857  0.000000  0.703329  0.144543   \n",
                            "2  0.639344  0.295082  0.032787  0.032787  0.000000  0.703037  0.181262   \n",
                            "3  0.565217  0.043478  0.260870  0.000000  0.130435  0.129674  0.291088   \n",
                            "4  0.208333  0.041667  0.041667  0.500000  0.208333  0.040691  0.002466   \n",
                            "\n",
                            "     speed3    speed4    speed5  altitude1  altitude2  altitude3  altitude4  \\\n",
                            "0  0.185431  0.053026  0.016129   0.010226   0.168564   0.793274   0.027936   \n",
                            "1  0.114201  0.026549  0.011378   0.007164   0.580700   0.374210   0.037927   \n",
                            "2  0.066502  0.030956  0.018244   0.000000   0.000118   0.034134   0.923376   \n",
                            "3  0.384954  0.098159  0.096126   0.000000   0.004631   0.200723   0.722806   \n",
                            "4  0.041924  0.170160  0.744760   0.011097   0.007398   0.023428   0.090012   \n",
                            "\n",
                            "   altitude5    steer1    steer2    steer3    steer4    steer5    steer6  \\\n",
                            "0   0.000000  0.151697  0.203774  0.303922  0.154544  0.033312  0.088024   \n",
                            "1   0.000000  0.141593  0.152550  0.166456  0.309313  0.008007  0.078382   \n",
                            "2   0.042373  0.121234  0.256709  0.279779  0.209981  0.009416  0.037900   \n",
                            "3   0.071840  0.159494  0.256636  0.238111  0.168305  0.023043  0.086073   \n",
                            "4   0.868064  0.019729  0.020962  0.199753  0.478422  0.119605  0.118372   \n",
                            "\n",
                            "     steer7    steer8  flights  squawk_1  observations  type  \n",
                            "0  0.010858  0.010753       40      4414          9486  C182  \n",
                            "1  0.021492  0.064054       21      4414          2373  C182  \n",
                            "2  0.011064  0.027778       61      4414          8496  T206  \n",
                            "3  0.014007  0.014797       23      4415          8853  T206  \n",
                            "4  0.006165  0.011097       24      1731           811    P8  "
                        ]
                    },
                    "execution_count": 8,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "# Replace label with numbers\n",
                "df['label'] = df.label.replace({\n",
                "    'surveil': 1,\n",
                "    'other': 0\n",
                "})\n",
                "df.head()"
            ]
        },
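        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Notice that `label` shows up as `1.0` rather than `1`. That's because the unlabeled planes have a missing value in that column, and pandas stores a numeric column containing `NaN` as floats. Below is a small sketch of the same idea using `.map`, which is an alternative to `.replace` and not what the cell above uses. The dataframe and the `label_num` column name are made up for illustration.\n",
                "\n",
                "```python\n",
                "import pandas as pd\n",
                "\n",
                "# Hypothetical labels: two labeled planes and one unlabeled plane (missing value)\n",
                "toy = pd.DataFrame({'label': ['surveil', 'other', None]})\n",
                "\n",
                "# map() looks each value up in the dictionary; anything not in it becomes NaN\n",
                "toy['label_num'] = toy['label'].map({'surveil': 1, 'other': 0})\n",
                "\n",
                "print(toy)\n",
                "print(toy['label_num'].isna().sum())\n",
                "```\n",
                "\n",
                "The missing label stays missing, which is handy: later on, those `NaN` rows are exactly the planes we don't have an answer for."
            ]
        },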
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Categorical variables\n",
                "\n",
                "Do we have any variables that count as categories? Yes, we do! ...but how many different categories does it have?\n",
                "\n",
                "* **Tip:** You can use `.unique()` or `.value_counts()` to count unique items, depending on what you're looking for"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 9,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "unknown    2528\n",
                            "C172       1014\n",
                            "SR22        799\n",
                            "BE36        699\n",
                            "C182        693\n",
                            "           ... \n",
                            "MS76          1\n",
                            "FGT           1\n",
                            "SC7           1\n",
                            "E35L          1\n",
                            "M20J          1\n",
                            "Name: type, Length: 455, dtype: int64"
                        ]
                    },
                    "execution_count": 9,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df.type.value_counts()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Most of those types of plane only have one appearance, which means they wouldn't be very helpful identifiers in the final analysis. For example, if I only see one GLF5 and it's a surveillance plane, does that mean the next one I see is probably a surveillance plane? With such a small sample size, I have no idea!\n",
                "\n",
                "We have a few options:\n",
                "\n",
                "1. Create a very large set of dummy variables out of all 455 types of planes\n",
                "2. Create `0`/`1` columns for common plane types (C182, T206, SR22) and ignore the less common ones\n",
                "3. Interview someone who knows something about planes and put these into a few broader categories\n",
                "4. Keep them as one column and just turn them into numbers. The numbers don't make sense in terms of order, but if one or two plane types are very indicative of a surveillance plane, the forest might pick up on it\n",
                "\n",
                "Oddly enough, **the last one is a common approach.** Let's use it!\n",
                "\n",
                "If you want to convert a list of categories into numbers, an easy way is to use the `Categorical` data type."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 10,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "0    C182\n",
                            "1    C182\n",
                            "2    T206\n",
                            "3    T206\n",
                            "4      P8\n",
                            "Name: type, dtype: category\n",
                            "Categories (455, object): [208, A109, A119, A139, ..., WW24, XL2, ZZZZ, unknown]"
                        ]
                    },
                    "execution_count": 10,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df.type = df.type.astype('category')\n",
                "df.type.head()"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "It looks like a normal bunch of strings, but pandas is secretly using a number for each one! You can find the number with `.cat.codes`.\n",
                "\n",
                "**Use `df.type.cat.codes` to make a new column called `type_code`.**"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 11,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>type</th>\n",
                            "      <th>type_code</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2</th>\n",
                            "      <td>T206</td>\n",
                            "      <td>417</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>3</th>\n",
                            "      <td>T206</td>\n",
                            "      <td>417</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>4</th>\n",
                            "      <td>P8</td>\n",
                            "      <td>337</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>5</th>\n",
                            "      <td>BE20</td>\n",
                            "      <td>48</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>6</th>\n",
                            "      <td>BE20</td>\n",
                            "      <td>48</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>7</th>\n",
                            "      <td>BE20</td>\n",
                            "      <td>48</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>8</th>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>9</th>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   type  type_code\n",
                            "0  C182         91\n",
                            "1  C182         91\n",
                            "2  T206        417\n",
                            "3  T206        417\n",
                            "4    P8        337\n",
                            "5  BE20         48\n",
                            "6  BE20         48\n",
                            "7  BE20         48\n",
                            "8  C182         91\n",
                            "9  C182         91"
                        ]
                    },
                    "execution_count": 11,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df['type_code'] = df.type.cat.codes\n",
                "df[['type', 'type_code']].head(10)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "We'll use `type_code` for machine learning since sklearn needs a number, and `type` for reading since we like text."
            ]
        },
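        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "For comparison, here is a rough sketch of two of the options from the list above, run on a tiny made-up table: `0`/`1` columns for a few common types versus a single integer code. The `toy` dataframe, the `common_types` list and the `is_...` column names are assumptions made up for this example.\n",
                "\n",
                "```python\n",
                "import pandas as pd\n",
                "\n",
                "# Made-up rows, only for illustration\n",
                "toy = pd.DataFrame({'type': ['C182', 'C182', 'T206', 'P8', 'SR22', 'C182']})\n",
                "\n",
                "# 0/1 columns for a hand-picked list of common types\n",
                "common_types = ['C182', 'T206', 'SR22']\n",
                "for plane_type in common_types:\n",
                "    toy['is_' + plane_type] = (toy['type'] == plane_type).astype(int)\n",
                "\n",
                "# One integer code per type (codes follow the sorted category order)\n",
                "toy['type_code'] = toy['type'].astype('category').cat.codes\n",
                "\n",
                "print(toy)\n",
                "```\n",
                "\n",
                "The `0`/`1` version adds one column per type you care about, while the code version stays a single column no matter how many types there are."
            ]
        },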
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Building our classifier\n",
                "\n",
                "When we're about to classify, we usually just drop our target column to build our inputs and outputs:\n",
                "\n",
                "```python\n",
                "X = train_df.drop(columns='column_you_are_predicting')\n",
                "y = train_df.column_you_are_predicting\n",
                "```\n",
                "\n",
                "This time is a little different. First, we have unlabeled data in there! Use `.dropna()` to filter your training data so we only have labeled data.\n",
                "\n",
                "Confirm `train_df` has 597 rows and 35 columns."
            ]
        },
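        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Here is a minimal sketch of that filtering step on a made-up table (`toy`, `labeled`, `X_toy` and `y_toy` are invented names). It assumes the only thing you want to filter on is a missing `label`, so it passes `subset=['label']`. A plain `.dropna()` drops rows with a missing value in any column.\n",
                "\n",
                "```python\n",
                "import numpy as np\n",
                "import pandas as pd\n",
                "\n",
                "# Made-up rows: two labeled planes, two unlabeled\n",
                "toy = pd.DataFrame({\n",
                "    'adshex': ['A1', 'A2', 'A3', 'A4'],\n",
                "    'label': [1.0, np.nan, 0.0, np.nan],\n",
                "    'flights': [40, 21, 61, 23],\n",
                "})\n",
                "\n",
                "# Keep only rows where the label is present\n",
                "labeled = toy.dropna(subset=['label'])\n",
                "\n",
                "# Inputs are everything except the identifier and the label\n",
                "X_toy = labeled.drop(columns=['adshex', 'label'])\n",
                "y_toy = labeled['label']\n",
                "\n",
                "print(labeled.shape)\n",
                "```"
            ]
        },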
        {
            "cell_type": "code",
            "execution_count": 12,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "(597, 35)"
                        ]
                    },
                    "execution_count": 12,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "train_df = df.dropna()\n",
                "train_df.shape"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "We also have a few extra columns that we aren't using for classification (like the text version of the type column and the transponder code). It's fine to drop multiple columns here that you aren't using, just a little bit messier. You also have to make sure you're dropping all the right ones.\n",
                "\n",
                "Do a `.head()` to double-check all of the columns you need to drop when creating your `X`."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 13,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>label</th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type</th>\n",
                            "      <th>type_code</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>A00C4B</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.45000</td>\n",
                            "      <td>0.125</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.375000</td>\n",
                            "      <td>0.475000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.337128</td>\n",
                            "      <td>0.408286</td>\n",
                            "      <td>0.185431</td>\n",
                            "      <td>0.053026</td>\n",
                            "      <td>0.016129</td>\n",
                            "      <td>0.010226</td>\n",
                            "      <td>0.168564</td>\n",
                            "      <td>0.793274</td>\n",
                            "      <td>0.027936</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.151697</td>\n",
                            "      <td>0.203774</td>\n",
                            "      <td>0.303922</td>\n",
                            "      <td>0.154544</td>\n",
                            "      <td>0.033312</td>\n",
                            "      <td>0.088024</td>\n",
                            "      <td>0.010858</td>\n",
                            "      <td>0.010753</td>\n",
                            "      <td>40</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>9486</td>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>A0AB21</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.52381</td>\n",
                            "      <td>0.000</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.714286</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.142857</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.703329</td>\n",
                            "      <td>0.144543</td>\n",
                            "      <td>0.114201</td>\n",
                            "      <td>0.026549</td>\n",
                            "      <td>0.011378</td>\n",
                            "      <td>0.007164</td>\n",
                            "      <td>0.580700</td>\n",
                            "      <td>0.374210</td>\n",
                            "      <td>0.037927</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.141593</td>\n",
                            "      <td>0.152550</td>\n",
                            "      <td>0.166456</td>\n",
                            "      <td>0.309313</td>\n",
                            "      <td>0.008007</td>\n",
                            "      <td>0.078382</td>\n",
                            "      <td>0.021492</td>\n",
                            "      <td>0.064054</td>\n",
                            "      <td>21</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>2373</td>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   adshex  label  duration1  duration2  duration3  duration4  duration5  \\\n",
                            "0  A00C4B    1.0    0.45000      0.125   0.025000   0.025000   0.375000   \n",
                            "1  A0AB21    1.0    0.52381      0.000   0.047619   0.095238   0.333333   \n",
                            "\n",
                            "     boxes1    boxes2    boxes3    boxes4  boxes5    speed1    speed2  \\\n",
                            "0  0.475000  0.250000  0.250000  0.025000     0.0  0.337128  0.408286   \n",
                            "1  0.714286  0.095238  0.047619  0.142857     0.0  0.703329  0.144543   \n",
                            "\n",
                            "     speed3    speed4    speed5  altitude1  altitude2  altitude3  altitude4  \\\n",
                            "0  0.185431  0.053026  0.016129   0.010226   0.168564   0.793274   0.027936   \n",
                            "1  0.114201  0.026549  0.011378   0.007164   0.580700   0.374210   0.037927   \n",
                            "\n",
                            "   altitude5    steer1    steer2    steer3    steer4    steer5    steer6  \\\n",
                            "0        0.0  0.151697  0.203774  0.303922  0.154544  0.033312  0.088024   \n",
                            "1        0.0  0.141593  0.152550  0.166456  0.309313  0.008007  0.078382   \n",
                            "\n",
                            "     steer7    steer8  flights  squawk_1  observations  type  type_code  \n",
                            "0  0.010858  0.010753       40      4414          9486  C182         91  \n",
                            "1  0.021492  0.064054       21      4414          2373  C182         91  "
                        ]
                    },
                    "execution_count": 13,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "df.head(2)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Create your `X` and `y`.\n",
                "\n",
                "When you do `train_df.drop`, you'll want to remove more than just your `0`/`1` surveillance label. What other columns do you not want to use as input? Maybe some categories you converted into codes?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 14,
            "metadata": {},
            "outputs": [],
            "source": [
                "X = train_df.drop(columns=['adshex', 'type', 'label'])\n",
                "y = train_df.label"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "Triple-check that `X` is a list of numeric features and `y` is a numeric label."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 15,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type_code</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>0.45000</td>\n",
                            "      <td>0.125</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.375000</td>\n",
                            "      <td>0.475000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.025000</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.337128</td>\n",
                            "      <td>0.408286</td>\n",
                            "      <td>0.185431</td>\n",
                            "      <td>0.053026</td>\n",
                            "      <td>0.016129</td>\n",
                            "      <td>0.010226</td>\n",
                            "      <td>0.168564</td>\n",
                            "      <td>0.793274</td>\n",
                            "      <td>0.027936</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.151697</td>\n",
                            "      <td>0.203774</td>\n",
                            "      <td>0.303922</td>\n",
                            "      <td>0.154544</td>\n",
                            "      <td>0.033312</td>\n",
                            "      <td>0.088024</td>\n",
                            "      <td>0.010858</td>\n",
                            "      <td>0.010753</td>\n",
                            "      <td>40</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>9486</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>0.52381</td>\n",
                            "      <td>0.000</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.714286</td>\n",
                            "      <td>0.095238</td>\n",
                            "      <td>0.047619</td>\n",
                            "      <td>0.142857</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.703329</td>\n",
                            "      <td>0.144543</td>\n",
                            "      <td>0.114201</td>\n",
                            "      <td>0.026549</td>\n",
                            "      <td>0.011378</td>\n",
                            "      <td>0.007164</td>\n",
                            "      <td>0.580700</td>\n",
                            "      <td>0.374210</td>\n",
                            "      <td>0.037927</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.141593</td>\n",
                            "      <td>0.152550</td>\n",
                            "      <td>0.166456</td>\n",
                            "      <td>0.309313</td>\n",
                            "      <td>0.008007</td>\n",
                            "      <td>0.078382</td>\n",
                            "      <td>0.021492</td>\n",
                            "      <td>0.064054</td>\n",
                            "      <td>21</td>\n",
                            "      <td>4414</td>\n",
                            "      <td>2373</td>\n",
                            "      <td>91</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "   duration1  duration2  duration3  duration4  duration5    boxes1    boxes2  \\\n",
                            "0    0.45000      0.125   0.025000   0.025000   0.375000  0.475000  0.250000   \n",
                            "1    0.52381      0.000   0.047619   0.095238   0.333333  0.714286  0.095238   \n",
                            "\n",
                            "     boxes3    boxes4  boxes5    speed1    speed2    speed3    speed4  \\\n",
                            "0  0.250000  0.025000     0.0  0.337128  0.408286  0.185431  0.053026   \n",
                            "1  0.047619  0.142857     0.0  0.703329  0.144543  0.114201  0.026549   \n",
                            "\n",
                            "     speed5  altitude1  altitude2  altitude3  altitude4  altitude5    steer1  \\\n",
                            "0  0.016129   0.010226   0.168564   0.793274   0.027936        0.0  0.151697   \n",
                            "1  0.011378   0.007164   0.580700   0.374210   0.037927        0.0  0.141593   \n",
                            "\n",
                            "     steer2    steer3    steer4    steer5    steer6    steer7    steer8  \\\n",
                            "0  0.203774  0.303922  0.154544  0.033312  0.088024  0.010858  0.010753   \n",
                            "1  0.152550  0.166456  0.309313  0.008007  0.078382  0.021492  0.064054   \n",
                            "\n",
                            "   flights  squawk_1  observations  type_code  \n",
                            "0       40      4414          9486         91  \n",
                            "1       21      4414          2373         91  "
                        ]
                    },
                    "execution_count": 15,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "X.head(2)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 16,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "0    1.0\n",
                            "1    1.0\n",
                            "Name: label, dtype: float64"
                        ]
                    },
                    "execution_count": 16,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "y.head(2)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Split into test and train datasets\n",
                "\n",
                "We could be nice and lazy and use all our data for training, but it just isn't right! Taking a test using the exact same questions you studied is just cheating. Split your data into test and train.\n",
                "\n",
                "* **Tip:** Don't do this manually! There's a method for it in sklearn"
            ]
        },
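        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "As a sketch of what the split does, here is `train_test_split` on made-up data (all the `toy_...` names are invented for this example). The `random_state` and `stratify` arguments are optional extras, not something this notebook requires: the first makes the split repeatable, and the second keeps the share of each label roughly the same in both halves, which can help when one label is rare.\n",
                "\n",
                "```python\n",
                "import pandas as pd\n",
                "from sklearn.model_selection import train_test_split\n",
                "\n",
                "# Made-up data: 20 rows, 5 of them labeled 1\n",
                "toy_X = pd.DataFrame({'flights': range(20)})\n",
                "toy_y = pd.Series([1] * 5 + [0] * 15)\n",
                "\n",
                "toy_X_train, toy_X_test, toy_y_train, toy_y_test = train_test_split(\n",
                "    toy_X,\n",
                "    toy_y,\n",
                "    test_size=0.2,     # hold out 20% for testing (the default is 25%)\n",
                "    random_state=42,   # same split every time you run it\n",
                "    stratify=toy_y,    # keep the 1-to-0 ratio similar in both halves\n",
                ")\n",
                "\n",
                "print(len(toy_X_train), len(toy_X_test))\n",
                "```"
            ]
        },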
        {
            "cell_type": "code",
            "execution_count": 17,
            "metadata": {},
            "outputs": [],
            "source": [
                "from sklearn.model_selection import train_test_split\n",
                "\n",
                "X_train, X_test, y_train, y_test = train_test_split(X, y)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Classify using a logistic classifier\n",
                "\n",
                "## Train your classifier\n",
                "\n",
                "Build a `LogisticRegression` and fit it to your data, making sure you're training using only `X_train` and `y_train`.\n",
                "\n",
                "* **Tip:** You'll want to give `LogisticRegression` an extra argument of `max_iter=4000` - it means \"work a little harder than you expect,\" because otherwise it won't find an answer (by default it only has a `max_iter` of 100)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 18,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "LogisticRegression(C=1000000000.0, class_weight=None, dual=False,\n",
                            "                   fit_intercept=True, intercept_scaling=1, l1_ratio=None,\n",
                            "                   max_iter=4000, multi_class='warn', n_jobs=None, penalty='l2',\n",
                            "                   random_state=None, solver='lbfgs', tol=0.0001, verbose=0,\n",
                            "                   warm_start=False)"
                        ]
                    },
                    "execution_count": 18,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn.linear_model import LogisticRegression\n",
                "\n",
                "clf = LogisticRegression(C=1e9, solver='lbfgs', max_iter=4000)\n",
                "\n",
                "clf.fit(X_train, y_train)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Examine the coefficients\n",
                "\n",
                "What does it mean? What features is the classifier using? Do you care about the odds ratio? **What is even the point of this `LogisticRegression` thing?**"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 19,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>feature</th>\n",
                            "      <th>coefficient (log odds ratio)</th>\n",
                            "      <th>odds ratio</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>10</th>\n",
                            "      <td>speed1</td>\n",
                            "      <td>0.622477</td>\n",
                            "      <td>1.863538</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>21</th>\n",
                            "      <td>steer2</td>\n",
                            "      <td>0.507835</td>\n",
                            "      <td>1.661689</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>5</th>\n",
                            "      <td>boxes1</td>\n",
                            "      <td>0.403334</td>\n",
                            "      <td>1.496807</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>6</th>\n",
                            "      <td>boxes2</td>\n",
                            "      <td>0.339208</td>\n",
                            "      <td>1.403836</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>20</th>\n",
                            "      <td>steer1</td>\n",
                            "      <td>0.304670</td>\n",
                            "      <td>1.356177</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>17</th>\n",
                            "      <td>altitude3</td>\n",
                            "      <td>0.251857</td>\n",
                            "      <td>1.286412</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>0</th>\n",
                            "      <td>duration1</td>\n",
                            "      <td>0.114182</td>\n",
                            "      <td>1.120956</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>27</th>\n",
                            "      <td>steer8</td>\n",
                            "      <td>0.002105</td>\n",
                            "      <td>1.002107</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>29</th>\n",
                            "      <td>squawk_1</td>\n",
                            "      <td>0.000745</td>\n",
                            "      <td>1.000745</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>31</th>\n",
                            "      <td>type_code</td>\n",
                            "      <td>0.000180</td>\n",
                            "      <td>1.000180</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>30</th>\n",
                            "      <td>observations</td>\n",
                            "      <td>0.000014</td>\n",
                            "      <td>1.000014</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>28</th>\n",
                            "      <td>flights</td>\n",
                            "      <td>-0.003415</td>\n",
                            "      <td>0.996590</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>26</th>\n",
                            "      <td>steer7</td>\n",
                            "      <td>-0.013651</td>\n",
                            "      <td>0.986441</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>18</th>\n",
                            "      <td>altitude4</td>\n",
                            "      <td>-0.022505</td>\n",
                            "      <td>0.977746</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>11</th>\n",
                            "      <td>speed2</td>\n",
                            "      <td>-0.090921</td>\n",
                            "      <td>0.913090</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1</th>\n",
                            "      <td>duration2</td>\n",
                            "      <td>-0.117767</td>\n",
                            "      <td>0.888903</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>7</th>\n",
                            "      <td>boxes3</td>\n",
                            "      <td>-0.391191</td>\n",
                            "      <td>0.676251</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>22</th>\n",
                            "      <td>steer3</td>\n",
                            "      <td>-0.397919</td>\n",
                            "      <td>0.671716</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>16</th>\n",
                            "      <td>altitude2</td>\n",
                            "      <td>-0.401974</td>\n",
                            "      <td>0.668998</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>24</th>\n",
                            "      <td>steer5</td>\n",
                            "      <td>-0.460506</td>\n",
                            "      <td>0.630964</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>25</th>\n",
                            "      <td>steer6</td>\n",
                            "      <td>-0.523748</td>\n",
                            "      <td>0.592297</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>12</th>\n",
                            "      <td>speed3</td>\n",
                            "      <td>-0.554705</td>\n",
                            "      <td>0.574241</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>4</th>\n",
                            "      <td>duration5</td>\n",
                            "      <td>-0.557450</td>\n",
                            "      <td>0.572667</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2</th>\n",
                            "      <td>duration3</td>\n",
                            "      <td>-0.638612</td>\n",
                            "      <td>0.528025</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>15</th>\n",
                            "      <td>altitude1</td>\n",
                            "      <td>-0.692721</td>\n",
                            "      <td>0.500213</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>3</th>\n",
                            "      <td>duration4</td>\n",
                            "      <td>-0.823567</td>\n",
                            "      <td>0.438863</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>8</th>\n",
                            "      <td>boxes4</td>\n",
                            "      <td>-0.826607</td>\n",
                            "      <td>0.437531</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>13</th>\n",
                            "      <td>speed4</td>\n",
                            "      <td>-0.974281</td>\n",
                            "      <td>0.377463</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>14</th>\n",
                            "      <td>speed5</td>\n",
                            "      <td>-1.025784</td>\n",
                            "      <td>0.358515</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>19</th>\n",
                            "      <td>altitude5</td>\n",
                            "      <td>-1.157871</td>\n",
                            "      <td>0.314154</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>23</th>\n",
                            "      <td>steer4</td>\n",
                            "      <td>-1.477424</td>\n",
                            "      <td>0.228225</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>9</th>\n",
                            "      <td>boxes5</td>\n",
                            "      <td>-1.547959</td>\n",
                            "      <td>0.212682</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "         feature  coefficient (log odds ratio)  odds ratio\n",
                            "10        speed1                      0.622477    1.863538\n",
                            "21        steer2                      0.507835    1.661689\n",
                            "5         boxes1                      0.403334    1.496807\n",
                            "6         boxes2                      0.339208    1.403836\n",
                            "20        steer1                      0.304670    1.356177\n",
                            "17     altitude3                      0.251857    1.286412\n",
                            "0      duration1                      0.114182    1.120956\n",
                            "27        steer8                      0.002105    1.002107\n",
                            "29      squawk_1                      0.000745    1.000745\n",
                            "31     type_code                      0.000180    1.000180\n",
                            "30  observations                      0.000014    1.000014\n",
                            "28       flights                     -0.003415    0.996590\n",
                            "26        steer7                     -0.013651    0.986441\n",
                            "18     altitude4                     -0.022505    0.977746\n",
                            "11        speed2                     -0.090921    0.913090\n",
                            "1      duration2                     -0.117767    0.888903\n",
                            "7         boxes3                     -0.391191    0.676251\n",
                            "22        steer3                     -0.397919    0.671716\n",
                            "16     altitude2                     -0.401974    0.668998\n",
                            "24        steer5                     -0.460506    0.630964\n",
                            "25        steer6                     -0.523748    0.592297\n",
                            "12        speed3                     -0.554705    0.574241\n",
                            "4      duration5                     -0.557450    0.572667\n",
                            "2      duration3                     -0.638612    0.528025\n",
                            "15     altitude1                     -0.692721    0.500213\n",
                            "3      duration4                     -0.823567    0.438863\n",
                            "8         boxes4                     -0.826607    0.437531\n",
                            "13        speed4                     -0.974281    0.377463\n",
                            "14        speed5                     -1.025784    0.358515\n",
                            "19     altitude5                     -1.157871    0.314154\n",
                            "23        steer4                     -1.477424    0.228225\n",
                            "9         boxes5                     -1.547959    0.212682"
                        ]
                    },
                    "execution_count": 19,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "import numpy as np\n",
                "\n",
                "feature_names = X.columns\n",
                "coefficients = clf.coef_[0]\n",
                "\n",
                "pd.DataFrame({\n",
                "    'feature': feature_names,\n",
                "    'coefficient (log odds ratio)': coefficients,\n",
                "    'odds ratio': np.exp(coefficients)\n",
                "}).sort_values(by='odds ratio', ascending=False)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "If we don't care about the odds ratio, using the `eli5` package can shrink our code by a lot (and give us color!)"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 20,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "\n",
                            "    <style>\n",
                            "    table.eli5-weights tr:hover {\n",
                            "        filter: brightness(85%);\n",
                            "    }\n",
                            "</style>\n",
                            "\n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "        \n",
                            "\n",
                            "    \n",
                            "\n",
                            "        \n",
                            "            \n",
                            "                \n",
                            "                \n",
                            "    \n",
                            "        <p style=\"margin-bottom: 0.5em; margin-top: 0em\">\n",
                            "            <b>\n",
                            "    \n",
                            "        y=1.0\n",
                            "    \n",
                            "</b>\n",
                            "\n",
                            "top features\n",
                            "        </p>\n",
                            "    \n",
                            "    <table class=\"eli5-weights\"\n",
                            "           style=\"border-collapse: collapse; border: none; margin-top: 0em; table-layout: auto; margin-bottom: 2em;\">\n",
                            "        <thead>\n",
                            "        <tr style=\"border: none;\">\n",
                            "            \n",
                            "                <th style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\" title=\"Feature weights. Note that weights do not account for feature value scales, so if feature values have different scales, features with highest weights might not be the most important.\">\n",
                            "                    Weight<sup>?</sup>\n",
                            "                </th>\n",
                            "            \n",
                            "            <th style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">Feature</th>\n",
                            "            \n",
                            "        </tr>\n",
                            "        </thead>\n",
                            "        <tbody>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(120, 100.00%, 91.24%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        +0.622\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        speed1\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(120, 100.00%, 92.40%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        +0.508\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        steer2\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(120, 100.00%, 93.53%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        +0.403\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        boxes1\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(120, 100.00%, 93.53%); border: none;\">\n",
                            "                <td colspan=\"2\" style=\"padding: 0 0.5em 0 0.5em; text-align: center; border: none; white-space: nowrap;\">\n",
                            "                    <i>&hellip; 8 more positive &hellip;</i>\n",
                            "                </td>\n",
                            "            </tr>\n",
                            "        \n",
                            "\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 93.67%); border: none;\">\n",
                            "                <td colspan=\"2\" style=\"padding: 0 0.5em 0 0.5em; text-align: center; border: none; white-space: nowrap;\">\n",
                            "                    <i>&hellip; 5 more negative &hellip;</i>\n",
                            "                </td>\n",
                            "            </tr>\n",
                            "        \n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 93.67%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.391\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        boxes3\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 93.59%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.398\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        steer3\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 93.55%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.402\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        altitude2\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 92.90%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.461\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        steer5\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 92.23%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.524\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        steer6\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 91.92%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.555\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        speed3\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 91.89%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.557\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        duration5\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 91.08%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.639\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        duration3\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 90.56%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.693\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        altitude1\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 89.34%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.824\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        duration4\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 89.31%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.827\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        boxes4\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 88.01%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -0.974\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        speed4\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 87.57%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -1.026\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        speed5\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 86.47%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -1.158\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        altitude5\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 83.95%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -1.477\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        steer4\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 83.42%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -1.548\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        boxes5\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 80.00%); border: none;\">\n",
                            "    <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "        -2.023\n",
                            "    </td>\n",
                            "    <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "        &lt;BIAS&gt;\n",
                            "    </td>\n",
                            "    \n",
                            "</tr>\n",
                            "        \n",
                            "\n",
                            "        </tbody>\n",
                            "    </table>\n",
                            "\n",
                            "            \n",
                            "        \n",
                            "\n",
                            "        \n",
                            "\n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "\n"
                        ],
                        "text/plain": [
                            "<IPython.core.display.HTML object>"
                        ]
                    },
                    "execution_count": 20,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "import eli5\n",
                "\n",
                "feature_names = list(X.columns)\n",
                "\n",
                "# Use this line instead for wonderful warnings about the results\n",
                "# eli5.show_weights(clf, feature_names=feature_names, show=eli5.formatters.fields.ALL)\n",
                "eli5.show_weights(clf, feature_names=feature_names)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## How well does our classifier perform?\n",
                "\n",
                "Let's take a look at the confusion matrix to see how well this classifier finds surveillance planes. Make sure you're using `y_test` and `X_test`, not the full dataset."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 21,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>Predicted not surveil</th>\n",
                            "      <th>Predicted surveil</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>Is not surveil</th>\n",
                            "      <td>120</td>\n",
                            "      <td>4</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>Is surveil</th>\n",
                            "      <td>12</td>\n",
                            "      <td>14</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "                Predicted not surveil  Predicted surveil\n",
                            "Is not surveil                    120                  4\n",
                            "Is surveil                         12                 14"
                        ]
                    },
                    "execution_count": 21,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn.metrics import confusion_matrix\n",
                "\n",
                "y_true = y_test\n",
                "y_pred = clf.predict(X_test)\n",
                "matrix = confusion_matrix(y_true, y_pred)\n",
                "\n",
                "label_names = pd.Series(['not surveil', 'surveil'])\n",
                "pd.DataFrame(matrix,\n",
                "     columns='Predicted ' + label_names,\n",
                "     index='Is ' + label_names)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Classify using a decision tree\n",
                "\n",
                "Now we'll use a decision tree. This is how you make one:\n",
                "\n",
                "```python\n",
                "from sklearn.tree import DecisionTreeClassifier\n",
                "\n",
                "clf = DecisionTreeClassifier()\n",
                "```\n",
                "\n",
                "But it's up to you to teach it what spy planes look like using your training data.\n",
                "\n",
                "If we use `max_depth=` to limit the depth of the tree, it will help us visualize it. For example, `max_depth=5` will only allow the tree to make five decisions.\n",
                "\n",
                "Make a decision tree and fit it to your data. Use a `max_depth=` of something between 2 to 5."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 22,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "DecisionTreeClassifier(class_weight=None, criterion='gini', max_depth=5,\n",
                            "                       max_features=None, max_leaf_nodes=None,\n",
                            "                       min_impurity_decrease=0.0, min_impurity_split=None,\n",
                            "                       min_samples_leaf=1, min_samples_split=2,\n",
                            "                       min_weight_fraction_leaf=0.0, presort=False,\n",
                            "                       random_state=None, splitter='best')"
                        ]
                    },
                    "execution_count": 22,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn.tree import DecisionTreeClassifier\n",
                "\n",
                "clf = DecisionTreeClassifier(max_depth=5)\n",
                "clf.fit(X_train, y_train)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## What are the important features?\n",
                "\n",
                "We'll use slighyl different code for a decision tree, as it likes to draw big pictures if we don't stop it. The code looks like this:\n",
                "\n",
                "```python\n",
                "import eli5\n",
                "\n",
                "feature_names=list(X.columns)\n",
                "eli5.show_weights(clf, feature_names=feature_names, show=['description', 'feature_importances'])\n",
                "```"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 23,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "\n",
                            "    <style>\n",
                            "    table.eli5-weights tr:hover {\n",
                            "        filter: brightness(85%);\n",
                            "    }\n",
                            "</style>\n",
                            "\n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "        \n",
                            "        <pre>\n",
                            "Decision tree feature importances; values are numbers 0 <= x <= 1;\n",
                            "all values sum to 1.\n",
                            "</pre>\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "        <table class=\"eli5-weights eli5-feature-importances\" style=\"border-collapse: collapse; border: none; margin-top: 0em; table-layout: auto;\">\n",
                            "    <thead>\n",
                            "    <tr style=\"border: none;\">\n",
                            "        <th style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">Weight</th>\n",
                            "        <th style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">Feature</th>\n",
                            "    </tr>\n",
                            "    </thead>\n",
                            "    <tbody>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 80.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.6489\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 92.94%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.1465\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                squawk_1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.50%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0538\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                altitude1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.86%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0461\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration4\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 98.11%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0223\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                speed2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 98.19%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0210\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes3\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 98.21%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0206\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 98.70%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0131\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                altitude2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 98.78%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0120\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 98.80%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0117\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes4\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 99.43%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0040\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                altitude5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                flights\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                speed1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration3\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                observations\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "    \n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(0, 100.00%, 100.00%); border: none;\">\n",
                            "                <td colspan=\"2\" style=\"padding: 0 0.5em 0 0.5em; text-align: center; border: none; white-space: nowrap;\">\n",
                            "                    <i>&hellip; 12 more &hellip;</i>\n",
                            "                </td>\n",
                            "            </tr>\n",
                            "        \n",
                            "    \n",
                            "    </tbody>\n",
                            "</table>\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "\n"
                        ],
                        "text/plain": [
                            "<IPython.core.display.HTML object>"
                        ]
                    },
                    "execution_count": 23,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "import eli5\n",
                "\n",
                "feature_names=list(X.columns)\n",
                "eli5.show_weights(clf, feature_names=feature_names, show=['description', 'feature_importances'])"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Understanding the output\n",
                "\n",
                "**Why is the feature importance difference than for logistic regression?**\n",
                "\n",
                "Also, if you don't specify a `max_depth`, that's a LOT of zeroes! It doesn't even use most of the features! **Why not?**"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 24,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Because it's a different algorithm\n",
                "# Because the features aren't important"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## How well does the tree perform?\n",
                "\n",
                "Display another confusion matrix with your new classifier."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 25,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>Predicted not surveil</th>\n",
                            "      <th>Predicted surveil</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>Is not surveil</th>\n",
                            "      <td>120</td>\n",
                            "      <td>4</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>Is surveil</th>\n",
                            "      <td>2</td>\n",
                            "      <td>24</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "                Predicted not surveil  Predicted surveil\n",
                            "Is not surveil                    120                  4\n",
                            "Is surveil                          2                 24"
                        ]
                    },
                    "execution_count": 25,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn.metrics import confusion_matrix\n",
                "\n",
                "y_true = y_test\n",
                "y_pred = clf.predict(X_test)\n",
                "matrix = confusion_matrix(y_true, y_pred)\n",
                "\n",
                "label_names = pd.Series(['not surveil', 'surveil'])\n",
                "pd.DataFrame(matrix,\n",
                "     columns='Predicted ' + label_names,\n",
                "     index='Is ' + label_names)"
            ]
        },
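        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "The counts in the matrix can be turned into the usual summary scores. This is a small sketch with the four numbers copied by hand from the output above (`matrix_demo` is a made-up name, chosen so it doesn't overwrite `matrix`):\n",
                "\n",
                "```python\n",
                "import numpy as np\n",
                "\n",
                "# Counts copied from the confusion matrix above\n",
                "# rows = actual class, columns = predicted class\n",
                "matrix_demo = np.array([[120, 4],\n",
                "                        [2, 24]])\n",
                "tn, fp, fn, tp = matrix_demo.ravel()\n",
                "\n",
                "accuracy = (tp + tn) / matrix_demo.sum()   # share of all planes classified correctly\n",
                "precision = tp / (tp + fp)                 # share of predicted surveil that really are\n",
                "recall = tp / (tp + fn)                    # share of actual surveil that were caught\n",
                "\n",
                "print(f'accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}')\n",
                "# accuracy=0.960 precision=0.857 recall=0.923\n",
                "```\n",
                "\n",
                "Precision is 24 / 28, so about 86% of the planes the tree flags as surveillance really are. Recall is 24 / 26, so it catches about 92% of the labeled surveillance planes in the test set."
            ]
        },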
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Visualize the tree\n",
                "\n",
                "You can use `eli5` to visualize the decision tree itself! It usually takes up too much space, but since it's a special occasion we'll let it go."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 26,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "\n",
                            "    <style>\n",
                            "    table.eli5-weights tr:hover {\n",
                            "        filter: brightness(85%);\n",
                            "    }\n",
                            "</style>\n",
                            "\n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "        \n",
                            "        <br>\n",
                            "        <pre><svg width=\"1436pt\" height=\"642pt\"\n",
                            " viewBox=\"0.00 0.00 1436.13 642.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
                            "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 638)\">\n",
                            "<title>Tree</title>\n",
                            "<polygon fill=\"#ffffff\" stroke=\"transparent\" points=\"-4,4 -4,-638 1432.1338,-638 1432.1338,4 -4,4\"/>\n",
                            "<!-- 0 -->\n",
                            "<g id=\"node1\" class=\"node\">\n",
                            "<title>0</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"897.2007,-634 749.933,-634 749.933,-556 897.2007,-556 897.2007,-634\"/>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-618.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">steer2 &lt;= 0.111</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-604.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.267</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-590.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 100.0%</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-576.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.841, 0.159]</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-562.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 1 -->\n",
                            "<g id=\"node2\" class=\"node\">\n",
                            "<title>1</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"738.2007,-520 590.933,-520 590.933,-442 738.2007,-442 738.2007,-520\"/>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-504.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">squawk_1 &lt;= 4380.5</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-490.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.093</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-476.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 87.2%</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-462.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.951, 0.049]</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-448.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 0&#45;&gt;1 -->\n",
                            "<g id=\"edge1\" class=\"edge\">\n",
                            "<title>0&#45;&gt;1</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M768.8481,-555.7677C755.5322,-546.2204 741.1857,-535.9342 727.5215,-526.1373\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"729.2835,-523.0939 719.1171,-520.1115 725.2046,-528.7828 729.2835,-523.0939\"/>\n",
                            "<text text-anchor=\"middle\" x=\"723.1831\" y=\"-540.5786\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">True</text>\n",
                            "</g>\n",
                            "<!-- 20 -->\n",
                            "<g id=\"node21\" class=\"node\">\n",
                            "<title>20</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1045.6048,-520 905.529,-520 905.529,-442 1045.6048,-442 1045.6048,-520\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-504.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">duration4 &lt;= 0.207</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-490.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.16</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-476.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 12.8%</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-462.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.088, 0.912]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-448.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 0&#45;&gt;20 -->\n",
                            "<g id=\"edge20\" class=\"edge\">\n",
                            "<title>0&#45;&gt;20</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M875.8767,-555.7677C888.6063,-546.2204 902.3213,-535.9342 915.3839,-526.1373\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"917.5183,-528.9115 923.4183,-520.1115 913.3183,-523.3115 917.5183,-528.9115\"/>\n",
                            "<text text-anchor=\"middle\" x=\"919.8828\" y=\"-540.6602\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">False</text>\n",
                            "</g>\n",
                            "<!-- 2 -->\n",
                            "<g id=\"node3\" class=\"node\">\n",
                            "<title>2</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"568.2007,-406 420.933,-406 420.933,-328 568.2007,-328 568.2007,-406\"/>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-390.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">duration1 &lt;= 0.371</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-376.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.045</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-362.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 77.4%</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-348.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.977, 0.023]</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-334.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 1&#45;&gt;2 -->\n",
                            "<g id=\"edge2\" class=\"edge\">\n",
                            "<title>1&#45;&gt;2</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M606.0625,-441.7677C591.6911,-432.1303 576.1968,-421.7401 561.4636,-411.8601\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"563.1458,-408.7741 552.891,-406.1115 559.2471,-414.5879 563.1458,-408.7741\"/>\n",
                            "</g>\n",
                            "<!-- 13 -->\n",
                            "<g id=\"node14\" class=\"node\">\n",
                            "<title>13</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"738.2007,-406 590.933,-406 590.933,-328 738.2007,-328 738.2007,-406\"/>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-390.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">squawk_1 &lt;= 4465.5</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-376.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.375</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-362.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 9.8%</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-348.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.75, 0.25]</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-334.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 1&#45;&gt;13 -->\n",
                            "<g id=\"edge13\" class=\"edge\">\n",
                            "<title>1&#45;&gt;13</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M664.5669,-441.7677C664.5669,-433.6172 664.5669,-424.9283 664.5669,-416.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"668.067,-416.3046 664.5669,-406.3046 661.067,-416.3047 668.067,-416.3046\"/>\n",
                            "</g>\n",
                            "<!-- 3 -->\n",
                            "<g id=\"node4\" class=\"node\">\n",
                            "<title>3</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"389.2007,-292 241.933,-292 241.933,-214 389.2007,-214 389.2007,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">speed2 &lt;= 0.003</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.006</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 69.4%</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.997, 0.003]</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 2&#45;&gt;3 -->\n",
                            "<g id=\"edge3\" class=\"edge\">\n",
                            "<title>2&#45;&gt;3</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M433.2157,-327.9272C417.8452,-318.1381 401.2415,-307.5637 385.4887,-297.5312\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"387.1528,-294.4415 376.8379,-292.0218 383.3925,-300.3458 387.1528,-294.4415\"/>\n",
                            "</g>\n",
                            "<!-- 8 -->\n",
                            "<g id=\"node9\" class=\"node\">\n",
                            "<title>8</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"568.2007,-292 420.933,-292 420.933,-214 568.2007,-214 568.2007,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude1 &lt;= 0.122</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.313</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 8.1%</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.806, 0.194]</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 2&#45;&gt;8 -->\n",
                            "<g id=\"edge8\" class=\"edge\">\n",
                            "<title>2&#45;&gt;8</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M494.5669,-327.7677C494.5669,-319.6172 494.5669,-310.9283 494.5669,-302.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"498.067,-302.3046 494.5669,-292.3046 491.067,-302.3047 498.067,-302.3046\"/>\n",
                            "</g>\n",
                            "<!-- 4 -->\n",
                            "<g id=\"node5\" class=\"node\">\n",
                            "<title>4</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"224.2007,-178 76.933,-178 76.933,-100 224.2007,-100 224.2007,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">boxes4 &lt;= 0.379</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.444</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.667, 0.333]</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 3&#45;&gt;4 -->\n",
                            "<g id=\"edge4\" class=\"edge\">\n",
                            "<title>3&#45;&gt;4</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M258.7833,-213.7677C244.8345,-204.1303 229.796,-193.7401 215.496,-183.8601\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"217.3924,-180.9163 207.1756,-178.1115 213.4134,-186.6754 217.3924,-180.9163\"/>\n",
                            "</g>\n",
                            "<!-- 7 -->\n",
                            "<g id=\"node8\" class=\"node\">\n",
                            "<title>7</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"389.2007,-171 241.933,-171 241.933,-107 389.2007,-107 389.2007,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 68.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 3&#45;&gt;7 -->\n",
                            "<g id=\"edge7\" class=\"edge\">\n",
                            "<title>3&#45;&gt;7</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M315.5669,-213.7677C315.5669,-203.3338 315.5669,-192.0174 315.5669,-181.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"319.067,-181.1252 315.5669,-171.1252 312.067,-181.1252 319.067,-181.1252\"/>\n",
                            "</g>\n",
                            "<!-- 5 -->\n",
                            "<g id=\"node6\" class=\"node\">\n",
                            "<title>5</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"147.2007,-64 -.067,-64 -.067,0 147.2007,0 147.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.4%</text>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 4&#45;&gt;5 -->\n",
                            "<g id=\"edge5\" class=\"edge\">\n",
                            "<title>4&#45;&gt;5</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M122.3322,-99.7647C115.957,-90.9057 109.1767,-81.4838 102.7629,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"105.433,-70.2893 96.751,-64.2169 99.7512,-74.3781 105.433,-70.2893\"/>\n",
                            "</g>\n",
                            "<!-- 6 -->\n",
                            "<g id=\"node7\" class=\"node\">\n",
                            "<title>6</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"290.3113,-64 164.8225,-64 164.8225,0 290.3113,0 290.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.2%</text>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.0, 1.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 4&#45;&gt;6 -->\n",
                            "<g id=\"edge6\" class=\"edge\">\n",
                            "<title>4&#45;&gt;6</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M178.8016,-99.7647C185.1768,-90.9057 191.9571,-81.4838 198.3709,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"201.3826,-74.3781 204.3828,-64.2169 195.7008,-70.2893 201.3826,-74.3781\"/>\n",
                            "</g>\n",
                            "<!-- 9 -->\n",
                            "<g id=\"node10\" class=\"node\">\n",
                            "<title>9</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"554.2007,-178 406.933,-178 406.933,-100 554.2007,-100 554.2007,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude1 &lt;= 0.046</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.444</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 4.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.667, 0.333]</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 8&#45;&gt;9 -->\n",
                            "<g id=\"edge9\" class=\"edge\">\n",
                            "<title>8&#45;&gt;9</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M489.7489,-213.7677C488.748,-205.6172 487.6809,-196.9283 486.6415,-188.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"490.0867,-187.8034 485.3938,-178.3046 483.1389,-188.6567 490.0867,-187.8034\"/>\n",
                            "</g>\n",
                            "<!-- 12 -->\n",
                            "<g id=\"node13\" class=\"node\">\n",
                            "<title>12</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"719.2007,-171 571.933,-171 571.933,-107 719.2007,-107 719.2007,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 3.4%</text>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 8&#45;&gt;12 -->\n",
                            "<g id=\"edge12\" class=\"edge\">\n",
                            "<title>8&#45;&gt;12</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M546.5325,-213.7677C562.2243,-201.9209 579.4231,-188.9364 595.0209,-177.1606\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"597.143,-179.9439 603.0151,-171.1252 592.9253,-174.3572 597.143,-179.9439\"/>\n",
                            "</g>\n",
                            "<!-- 10 -->\n",
                            "<g id=\"node11\" class=\"node\">\n",
                            "<title>10</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"505.2007,-64 357.933,-64 357.933,0 505.2007,0 505.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.231</text>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 3.4%</text>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.867, 0.133]</text>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 9&#45;&gt;10 -->\n",
                            "<g id=\"edge10\" class=\"edge\">\n",
                            "<title>9&#45;&gt;10</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M462.5993,-99.7647C458.6679,-91.1797 454.4943,-82.066 450.5255,-73.3994\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"453.6663,-71.8516 446.3204,-64.2169 447.3019,-74.7662 453.6663,-71.8516\"/>\n",
                            "</g>\n",
                            "<!-- 11 -->\n",
                            "<g id=\"node12\" class=\"node\">\n",
                            "<title>11</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"663.6048,-64 523.529,-64 523.529,0 663.6048,0 663.6048,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"593.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.278</text>\n",
                            "<text text-anchor=\"middle\" x=\"593.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 1.3%</text>\n",
                            "<text text-anchor=\"middle\" x=\"593.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.167, 0.833]</text>\n",
                            "<text text-anchor=\"middle\" x=\"593.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 9&#45;&gt;11 -->\n",
                            "<g id=\"edge11\" class=\"edge\">\n",
                            "<title>9&#45;&gt;11</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M522.0023,-99.7647C531.8403,-90.4491 542.3357,-80.5109 552.1719,-71.197\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"554.6887,-73.634 559.5435,-64.2169 549.8757,-68.5511 554.6887,-73.634\"/>\n",
                            "</g>\n",
                            "<!-- 14 -->\n",
                            "<g id=\"node15\" class=\"node\">\n",
                            "<title>14</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"719.3113,-285 593.8225,-285 593.8225,-221 719.3113,-221 719.3113,-285\"/>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-269.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-255.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 2.0%</text>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-241.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.0, 1.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-227.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 13&#45;&gt;14 -->\n",
                            "<g id=\"edge14\" class=\"edge\">\n",
                            "<title>13&#45;&gt;14</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M661.8137,-327.7677C661.0815,-317.3338 660.2874,-306.0174 659.5438,-295.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"663.0128,-294.8556 658.8213,-285.1252 656.03,-295.3457 663.0128,-294.8556\"/>\n",
                            "</g>\n",
                            "<!-- 15 -->\n",
                            "<g id=\"node16\" class=\"node\">\n",
                            "<title>15</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"884.2007,-292 736.933,-292 736.933,-214 884.2007,-214 884.2007,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">steer1 &lt;= 0.027</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.108</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 7.8%</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.943, 0.057]</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 13&#45;&gt;15 -->\n",
                            "<g id=\"edge15\" class=\"edge\">\n",
                            "<title>13&#45;&gt;15</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M714.8118,-327.7677C726.856,-318.3633 739.8184,-308.242 752.1957,-298.5775\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"754.5015,-301.2177 760.2294,-292.3046 750.1934,-295.7004 754.5015,-301.2177\"/>\n",
                            "</g>\n",
                            "<!-- 16 -->\n",
                            "<g id=\"node17\" class=\"node\">\n",
                            "<title>16</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"884.2007,-171 736.933,-171 736.933,-107 884.2007,-107 884.2007,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 6.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 15&#45;&gt;16 -->\n",
                            "<g id=\"edge16\" class=\"edge\">\n",
                            "<title>15&#45;&gt;16</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M810.5669,-213.7677C810.5669,-203.3338 810.5669,-192.0174 810.5669,-181.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"814.067,-181.1252 810.5669,-171.1252 807.067,-181.1252 814.067,-181.1252\"/>\n",
                            "</g>\n",
                            "<!-- 17 -->\n",
                            "<g id=\"node18\" class=\"node\">\n",
                            "<title>17</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1049.2007,-178 901.933,-178 901.933,-100 1049.2007,-100 1049.2007,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">boxes3 &lt;= 0.113</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.48</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 1.1%</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.6, 0.4]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 15&#45;&gt;17 -->\n",
                            "<g id=\"edge17\" class=\"edge\">\n",
                            "<title>15&#45;&gt;17</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M867.3505,-213.7677C881.2993,-204.1303 896.3378,-193.7401 910.6378,-183.8601\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"912.7204,-186.6754 918.9582,-178.1115 908.7414,-180.9163 912.7204,-186.6754\"/>\n",
                            "</g>\n",
                            "<!-- 18 -->\n",
                            "<g id=\"node19\" class=\"node\">\n",
                            "<title>18</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"937.2007,-64 789.933,-64 789.933,0 937.2007,0 937.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"863.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"863.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"863.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"863.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 17&#45;&gt;18 -->\n",
                            "<g id=\"edge18\" class=\"edge\">\n",
                            "<title>17&#45;&gt;18</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M934.4982,-99.7647C924.7472,-90.4491 914.3447,-80.5109 904.5955,-71.197\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"906.9376,-68.594 897.2892,-64.2169 902.1021,-73.6555 906.9376,-68.594\"/>\n",
                            "</g>\n",
                            "<!-- 19 -->\n",
                            "<g id=\"node20\" class=\"node\">\n",
                            "<title>19</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1080.3113,-64 954.8225,-64 954.8225,0 1080.3113,0 1080.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1017.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1017.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.4%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1017.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.0, 1.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1017.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 17&#45;&gt;19 -->\n",
                            "<g id=\"edge19\" class=\"edge\">\n",
                            "<title>17&#45;&gt;19</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M990.9677,-99.7647C994.3016,-91.271 997.8387,-82.2599 1001.208,-73.6762\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1004.5251,-74.8043 1004.921,-64.2169 998.0091,-72.2466 1004.5251,-74.8043\"/>\n",
                            "</g>\n",
                            "<!-- 21 -->\n",
                            "<g id=\"node22\" class=\"node\">\n",
                            "<title>21</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1045.6048,-406 905.529,-406 905.529,-328 1045.6048,-328 1045.6048,-406\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-390.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">speed2 &lt;= 0.013</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-376.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.071</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-362.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 12.1%</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-348.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.037, 0.963]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-334.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 20&#45;&gt;21 -->\n",
                            "<g id=\"edge21\" class=\"edge\">\n",
                            "<title>20&#45;&gt;21</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M975.5669,-441.7677C975.5669,-433.6172 975.5669,-424.9283 975.5669,-416.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"979.067,-416.3046 975.5669,-406.3046 972.067,-416.3047 979.067,-416.3046\"/>\n",
                            "</g>\n",
                            "<!-- 28 -->\n",
                            "<g id=\"node29\" class=\"node\">\n",
                            "<title>28</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1211.2007,-399 1063.933,-399 1063.933,-335 1211.2007,-335 1211.2007,-399\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-383.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-369.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-355.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-341.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 20&#45;&gt;28 -->\n",
                            "<g id=\"edge28\" class=\"edge\">\n",
                            "<title>20&#45;&gt;28</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1031.3181,-441.7677C1048.153,-429.9209 1066.6047,-416.9364 1083.3387,-405.1606\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1085.7515,-407.7425 1091.9153,-399.1252 1081.723,-402.0178 1085.7515,-407.7425\"/>\n",
                            "</g>\n",
                            "<!-- 22 -->\n",
                            "<g id=\"node23\" class=\"node\">\n",
                            "<title>22</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1049.2007,-285 901.933,-285 901.933,-221 1049.2007,-221 1049.2007,-285\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-269.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-255.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.2%</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-241.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-227.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 21&#45;&gt;22 -->\n",
                            "<g id=\"edge22\" class=\"edge\">\n",
                            "<title>21&#45;&gt;22</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M975.5669,-327.7677C975.5669,-317.3338 975.5669,-306.0174 975.5669,-295.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"979.067,-295.1252 975.5669,-285.1252 972.067,-295.1252 979.067,-295.1252\"/>\n",
                            "</g>\n",
                            "<!-- 23 -->\n",
                            "<g id=\"node24\" class=\"node\">\n",
                            "<title>23</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1207.6048,-292 1067.529,-292 1067.529,-214 1207.6048,-214 1207.6048,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude5 &lt;= 0.261</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.037</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 11.9%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.019, 0.981]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1137.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 21&#45;&gt;23 -->\n",
                            "<g id=\"edge23\" class=\"edge\">\n",
                            "<title>21&#45;&gt;23</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1031.3181,-327.7677C1044.8853,-318.2204 1059.5025,-307.9342 1073.4245,-298.1373\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1075.8236,-300.7288 1081.9875,-292.1115 1071.7951,-295.0041 1075.8236,-300.7288\"/>\n",
                            "</g>\n",
                            "<!-- 24 -->\n",
                            "<g id=\"node25\" class=\"node\">\n",
                            "<title>24</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1196.3113,-171 1070.8225,-171 1070.8225,-107 1196.3113,-107 1196.3113,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1133.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1133.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 11.0%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1133.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.0, 1.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1133.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 23&#45;&gt;24 -->\n",
                            "<g id=\"edge24\" class=\"edge\">\n",
                            "<title>23&#45;&gt;24</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1136.1903,-213.7677C1135.8242,-203.3338 1135.4272,-192.0174 1135.0554,-181.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1138.5427,-180.9963 1134.6941,-171.1252 1131.547,-181.2418 1138.5427,-180.9963\"/>\n",
                            "</g>\n",
                            "<!-- 25 -->\n",
                            "<g id=\"node26\" class=\"node\">\n",
                            "<title>25</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1340.6048,-178 1214.5289,-178 1214.5289,-100 1340.6048,-100 1340.6048,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1277.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude2 &lt;= 0.01</text>\n",
                            "<text text-anchor=\"middle\" x=\"1277.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.375</text>\n",
                            "<text text-anchor=\"middle\" x=\"1277.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.9%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1277.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.25, 0.75]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1277.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 23&#45;&gt;25 -->\n",
                            "<g id=\"edge25\" class=\"edge\">\n",
                            "<title>23&#45;&gt;25</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1185.747,-213.7677C1197.1862,-204.4529 1209.4892,-194.4347 1221.2553,-184.8538\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1223.7537,-187.333 1229.298,-178.3046 1219.3336,-181.9049 1223.7537,-187.333\"/>\n",
                            "</g>\n",
                            "<!-- 26 -->\n",
                            "<g id=\"node27\" class=\"node\">\n",
                            "<title>26</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1263.3113,-64 1137.8225,-64 1137.8225,0 1263.3113,0 1263.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1200.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1200.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.7%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1200.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0.0, 1.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1200.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 25&#45;&gt;26 -->\n",
                            "<g id=\"edge26\" class=\"edge\">\n",
                            "<title>25&#45;&gt;26</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1249.3322,-99.7647C1242.957,-90.9057 1236.1767,-81.4838 1229.7629,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1232.433,-70.2893 1223.751,-64.2169 1226.7512,-74.3781 1232.433,-70.2893\"/>\n",
                            "</g>\n",
                            "<!-- 27 -->\n",
                            "<g id=\"node28\" class=\"node\">\n",
                            "<title>27</title>\n",
                            "<polygon fill=\"none\" stroke=\"#000000\" points=\"1428.2007,-64 1280.933,-64 1280.933,0 1428.2007,0 1428.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1354.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1354.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 0.2%</text>\n",
                            "<text text-anchor=\"middle\" x=\"1354.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1.0, 0.0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1354.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 25&#45;&gt;27 -->\n",
                            "<g id=\"edge27\" class=\"edge\">\n",
                            "<title>25&#45;&gt;27</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1305.8016,-99.7647C1312.1768,-90.9057 1318.9571,-81.4838 1325.3709,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1328.3826,-74.3781 1331.3828,-64.2169 1322.7008,-70.2893 1328.3826,-74.3781\"/>\n",
                            "</g>\n",
                            "</g>\n",
                            "</svg>\n",
                            "</pre>\n",
                            "    \n",
                            "\n",
                            "\n",
                            "\n"
                        ],
                        "text/plain": [
                            "<IPython.core.display.HTML object>"
                        ]
                    },
                    "execution_count": 26,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "feature_names=list(X.columns)\n",
                "label_names = ['not surveillance', 'surveillance']\n",
                "eli5.show_weights(clf, feature_names=feature_names, target_names=label_names, show=['decision_tree'])"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "If you'd like your graph to have colors colors, or to not use eli5, you can do it the old-fashioned way. You might need to `brew install graphviz` and `pip install graphviz`.\n",
                "\n",
                "```python\n",
                "from sklearn import tree\n",
                "import graphviz\n",
                "\n",
                "label_names = ['not surveillance', 'surveillance']\n",
                "feature_names = X.columns\n",
                "\n",
                "dot_data = tree.export_graphviz(clf,\n",
                "                    feature_names=feature_names,  \n",
                "                    filled=True,\n",
                "                    class_names=label_names)  \n",
                "graph = graphviz.Source(dot_data)  \n",
                "graph\n",
                "```\n",
                "\n",
                "* **Tip:** You'll probably need to scroll sideways a bit"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 27,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "image/svg+xml": [
                            "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\n",
                            "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\"\n",
                            " \"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">\n",
                            "<!-- Generated by graphviz version 2.40.1 (20161225.0304)\n",
                            " -->\n",
                            "<!-- Title: Tree Pages: 1 -->\n",
                            "<svg width=\"1432pt\" height=\"642pt\"\n",
                            " viewBox=\"0.00 0.00 1432.13 642.00\" xmlns=\"http://www.w3.org/2000/svg\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">\n",
                            "<g id=\"graph0\" class=\"graph\" transform=\"scale(1 1) rotate(0) translate(4 638)\">\n",
                            "<title>Tree</title>\n",
                            "<polygon fill=\"#ffffff\" stroke=\"transparent\" points=\"-4,4 -4,-638 1428.1338,-638 1428.1338,4 -4,4\"/>\n",
                            "<!-- 0 -->\n",
                            "<g id=\"node1\" class=\"node\">\n",
                            "<title>0</title>\n",
                            "<polygon fill=\"#ea995e\" stroke=\"#000000\" points=\"897.2007,-634 749.933,-634 749.933,-556 897.2007,-556 897.2007,-634\"/>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-618.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">steer2 &lt;= 0.111</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-604.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.267</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-590.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 447</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-576.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [376, 71]</text>\n",
                            "<text text-anchor=\"middle\" x=\"823.5669\" y=\"-562.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 1 -->\n",
                            "<g id=\"node2\" class=\"node\">\n",
                            "<title>1</title>\n",
                            "<polygon fill=\"#e68743\" stroke=\"#000000\" points=\"738.2007,-520 590.933,-520 590.933,-442 738.2007,-442 738.2007,-520\"/>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-504.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">squawk_1 &lt;= 4380.5</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-490.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.093</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-476.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 390</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-462.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [371, 19]</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-448.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 0&#45;&gt;1 -->\n",
                            "<g id=\"edge1\" class=\"edge\">\n",
                            "<title>0&#45;&gt;1</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M768.8481,-555.7677C755.5322,-546.2204 741.1857,-535.9342 727.5215,-526.1373\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"729.2835,-523.0939 719.1171,-520.1115 725.2046,-528.7828 729.2835,-523.0939\"/>\n",
                            "<text text-anchor=\"middle\" x=\"723.1831\" y=\"-540.5786\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">True</text>\n",
                            "</g>\n",
                            "<!-- 20 -->\n",
                            "<g id=\"node21\" class=\"node\">\n",
                            "<title>20</title>\n",
                            "<polygon fill=\"#4ca6e8\" stroke=\"#000000\" points=\"1038.3113,-520 912.8225,-520 912.8225,-442 1038.3113,-442 1038.3113,-520\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-504.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">duration4 &lt;= 0.207</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-490.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.16</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-476.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 57</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-462.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [5, 52]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-448.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 0&#45;&gt;20 -->\n",
                            "<g id=\"edge20\" class=\"edge\">\n",
                            "<title>0&#45;&gt;20</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M875.8767,-555.7677C888.6063,-546.2204 902.3213,-535.9342 915.3839,-526.1373\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"917.5183,-528.9115 923.4183,-520.1115 913.3183,-523.3115 917.5183,-528.9115\"/>\n",
                            "<text text-anchor=\"middle\" x=\"919.8828\" y=\"-540.6602\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">False</text>\n",
                            "</g>\n",
                            "<!-- 2 -->\n",
                            "<g id=\"node3\" class=\"node\">\n",
                            "<title>2</title>\n",
                            "<polygon fill=\"#e6843e\" stroke=\"#000000\" points=\"568.2007,-406 420.933,-406 420.933,-328 568.2007,-328 568.2007,-406\"/>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-390.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">duration1 &lt;= 0.371</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-376.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.045</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-362.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 346</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-348.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [338, 8]</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-334.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 1&#45;&gt;2 -->\n",
                            "<g id=\"edge2\" class=\"edge\">\n",
                            "<title>1&#45;&gt;2</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M606.0625,-441.7677C591.6911,-432.1303 576.1968,-421.7401 561.4636,-411.8601\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"563.1458,-408.7741 552.891,-406.1115 559.2471,-414.5879 563.1458,-408.7741\"/>\n",
                            "</g>\n",
                            "<!-- 13 -->\n",
                            "<g id=\"node14\" class=\"node\">\n",
                            "<title>13</title>\n",
                            "<polygon fill=\"#eeab7b\" stroke=\"#000000\" points=\"738.2007,-406 590.933,-406 590.933,-328 738.2007,-328 738.2007,-406\"/>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-390.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">squawk_1 &lt;= 4465.5</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-376.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.375</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-362.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 44</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-348.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [33, 11]</text>\n",
                            "<text text-anchor=\"middle\" x=\"664.5669\" y=\"-334.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 1&#45;&gt;13 -->\n",
                            "<g id=\"edge13\" class=\"edge\">\n",
                            "<title>1&#45;&gt;13</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M664.5669,-441.7677C664.5669,-433.6172 664.5669,-424.9283 664.5669,-416.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"668.067,-416.3046 664.5669,-406.3046 661.067,-416.3047 668.067,-416.3046\"/>\n",
                            "</g>\n",
                            "<!-- 3 -->\n",
                            "<g id=\"node4\" class=\"node\">\n",
                            "<title>3</title>\n",
                            "<polygon fill=\"#e5813a\" stroke=\"#000000\" points=\"389.2007,-292 241.933,-292 241.933,-214 389.2007,-214 389.2007,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">speed2 &lt;= 0.003</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.006</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 310</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [309, 1]</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 2&#45;&gt;3 -->\n",
                            "<g id=\"edge3\" class=\"edge\">\n",
                            "<title>2&#45;&gt;3</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M433.2157,-327.9272C417.8452,-318.1381 401.2415,-307.5637 385.4887,-297.5312\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"387.1528,-294.4415 376.8379,-292.0218 383.3925,-300.3458 387.1528,-294.4415\"/>\n",
                            "</g>\n",
                            "<!-- 8 -->\n",
                            "<g id=\"node9\" class=\"node\">\n",
                            "<title>8</title>\n",
                            "<polygon fill=\"#eb9f69\" stroke=\"#000000\" points=\"568.2007,-292 420.933,-292 420.933,-214 568.2007,-214 568.2007,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude1 &lt;= 0.122</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.313</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 36</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [29, 7]</text>\n",
                            "<text text-anchor=\"middle\" x=\"494.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 2&#45;&gt;8 -->\n",
                            "<g id=\"edge8\" class=\"edge\">\n",
                            "<title>2&#45;&gt;8</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M494.5669,-327.7677C494.5669,-319.6172 494.5669,-310.9283 494.5669,-302.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"498.067,-302.3046 494.5669,-292.3046 491.067,-302.3047 498.067,-302.3046\"/>\n",
                            "</g>\n",
                            "<!-- 4 -->\n",
                            "<g id=\"node5\" class=\"node\">\n",
                            "<title>4</title>\n",
                            "<polygon fill=\"#f2c09c\" stroke=\"#000000\" points=\"224.2007,-178 76.933,-178 76.933,-100 224.2007,-100 224.2007,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">boxes4 &lt;= 0.379</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.444</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 3</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [2, 1]</text>\n",
                            "<text text-anchor=\"middle\" x=\"150.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 3&#45;&gt;4 -->\n",
                            "<g id=\"edge4\" class=\"edge\">\n",
                            "<title>3&#45;&gt;4</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M258.7833,-213.7677C244.8345,-204.1303 229.796,-193.7401 215.496,-183.8601\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"217.3924,-180.9163 207.1756,-178.1115 213.4134,-186.6754 217.3924,-180.9163\"/>\n",
                            "</g>\n",
                            "<!-- 7 -->\n",
                            "<g id=\"node8\" class=\"node\">\n",
                            "<title>7</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"389.2007,-171 241.933,-171 241.933,-107 389.2007,-107 389.2007,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 307</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [307, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"315.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 3&#45;&gt;7 -->\n",
                            "<g id=\"edge7\" class=\"edge\">\n",
                            "<title>3&#45;&gt;7</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M315.5669,-213.7677C315.5669,-203.3338 315.5669,-192.0174 315.5669,-181.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"319.067,-181.1252 315.5669,-171.1252 312.067,-181.1252 319.067,-181.1252\"/>\n",
                            "</g>\n",
                            "<!-- 5 -->\n",
                            "<g id=\"node6\" class=\"node\">\n",
                            "<title>5</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"147.2007,-64 -.067,-64 -.067,0 147.2007,0 147.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 2</text>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [2, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"73.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 4&#45;&gt;5 -->\n",
                            "<g id=\"edge5\" class=\"edge\">\n",
                            "<title>4&#45;&gt;5</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M122.3322,-99.7647C115.957,-90.9057 109.1767,-81.4838 102.7629,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"105.433,-70.2893 96.751,-64.2169 99.7512,-74.3781 105.433,-70.2893\"/>\n",
                            "</g>\n",
                            "<!-- 6 -->\n",
                            "<g id=\"node7\" class=\"node\">\n",
                            "<title>6</title>\n",
                            "<polygon fill=\"#399de5\" stroke=\"#000000\" points=\"290.3113,-64 164.8225,-64 164.8225,0 290.3113,0 290.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 1</text>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0, 1]</text>\n",
                            "<text text-anchor=\"middle\" x=\"227.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 4&#45;&gt;6 -->\n",
                            "<g id=\"edge6\" class=\"edge\">\n",
                            "<title>4&#45;&gt;6</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M178.8016,-99.7647C185.1768,-90.9057 191.9571,-81.4838 198.3709,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"201.3826,-74.3781 204.3828,-64.2169 195.7008,-70.2893 201.3826,-74.3781\"/>\n",
                            "</g>\n",
                            "<!-- 9 -->\n",
                            "<g id=\"node10\" class=\"node\">\n",
                            "<title>9</title>\n",
                            "<polygon fill=\"#f2c09c\" stroke=\"#000000\" points=\"554.2007,-178 406.933,-178 406.933,-100 554.2007,-100 554.2007,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude1 &lt;= 0.046</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.444</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 21</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [14, 7]</text>\n",
                            "<text text-anchor=\"middle\" x=\"480.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 8&#45;&gt;9 -->\n",
                            "<g id=\"edge9\" class=\"edge\">\n",
                            "<title>8&#45;&gt;9</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M489.7489,-213.7677C488.748,-205.6172 487.6809,-196.9283 486.6415,-188.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"490.0867,-187.8034 485.3938,-178.3046 483.1389,-188.6567 490.0867,-187.8034\"/>\n",
                            "</g>\n",
                            "<!-- 12 -->\n",
                            "<g id=\"node13\" class=\"node\">\n",
                            "<title>12</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"719.2007,-171 571.933,-171 571.933,-107 719.2007,-107 719.2007,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 15</text>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [15, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"645.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 8&#45;&gt;12 -->\n",
                            "<g id=\"edge12\" class=\"edge\">\n",
                            "<title>8&#45;&gt;12</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M546.5325,-213.7677C562.2243,-201.9209 579.4231,-188.9364 595.0209,-177.1606\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"597.143,-179.9439 603.0151,-171.1252 592.9253,-174.3572 597.143,-179.9439\"/>\n",
                            "</g>\n",
                            "<!-- 10 -->\n",
                            "<g id=\"node11\" class=\"node\">\n",
                            "<title>10</title>\n",
                            "<polygon fill=\"#e99457\" stroke=\"#000000\" points=\"505.2007,-64 357.933,-64 357.933,0 505.2007,0 505.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.231</text>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 15</text>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [13, 2]</text>\n",
                            "<text text-anchor=\"middle\" x=\"431.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 9&#45;&gt;10 -->\n",
                            "<g id=\"edge10\" class=\"edge\">\n",
                            "<title>9&#45;&gt;10</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M462.5993,-99.7647C458.6679,-91.1797 454.4943,-82.066 450.5255,-73.3994\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"453.6663,-71.8516 446.3204,-64.2169 447.3019,-74.7662 453.6663,-71.8516\"/>\n",
                            "</g>\n",
                            "<!-- 11 -->\n",
                            "<g id=\"node12\" class=\"node\">\n",
                            "<title>11</title>\n",
                            "<polygon fill=\"#61b1ea\" stroke=\"#000000\" points=\"648.3113,-64 522.8225,-64 522.8225,0 648.3113,0 648.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"585.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.278</text>\n",
                            "<text text-anchor=\"middle\" x=\"585.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 6</text>\n",
                            "<text text-anchor=\"middle\" x=\"585.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1, 5]</text>\n",
                            "<text text-anchor=\"middle\" x=\"585.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 9&#45;&gt;11 -->\n",
                            "<g id=\"edge11\" class=\"edge\">\n",
                            "<title>9&#45;&gt;11</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M519.0688,-99.7647C528.1207,-90.5404 537.7715,-80.7057 546.8336,-71.4711\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"549.4462,-73.8058 553.9522,-64.2169 544.45,-68.9029 549.4462,-73.8058\"/>\n",
                            "</g>\n",
                            "<!-- 14 -->\n",
                            "<g id=\"node15\" class=\"node\">\n",
                            "<title>14</title>\n",
                            "<polygon fill=\"#399de5\" stroke=\"#000000\" points=\"719.3113,-285 593.8225,-285 593.8225,-221 719.3113,-221 719.3113,-285\"/>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-269.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-255.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 9</text>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-241.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0, 9]</text>\n",
                            "<text text-anchor=\"middle\" x=\"656.5669\" y=\"-227.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 13&#45;&gt;14 -->\n",
                            "<g id=\"edge14\" class=\"edge\">\n",
                            "<title>13&#45;&gt;14</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M661.8137,-327.7677C661.0815,-317.3338 660.2874,-306.0174 659.5438,-295.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"663.0128,-294.8556 658.8213,-285.1252 656.03,-295.3457 663.0128,-294.8556\"/>\n",
                            "</g>\n",
                            "<!-- 15 -->\n",
                            "<g id=\"node16\" class=\"node\">\n",
                            "<title>15</title>\n",
                            "<polygon fill=\"#e78945\" stroke=\"#000000\" points=\"884.2007,-292 736.933,-292 736.933,-214 884.2007,-214 884.2007,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">steer1 &lt;= 0.027</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.108</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 35</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [33, 2]</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 13&#45;&gt;15 -->\n",
                            "<g id=\"edge15\" class=\"edge\">\n",
                            "<title>13&#45;&gt;15</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M714.8118,-327.7677C726.856,-318.3633 739.8184,-308.242 752.1957,-298.5775\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"754.5015,-301.2177 760.2294,-292.3046 750.1934,-295.7004 754.5015,-301.2177\"/>\n",
                            "</g>\n",
                            "<!-- 16 -->\n",
                            "<g id=\"node17\" class=\"node\">\n",
                            "<title>16</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"884.2007,-171 736.933,-171 736.933,-107 884.2007,-107 884.2007,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 30</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [30, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"810.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 15&#45;&gt;16 -->\n",
                            "<g id=\"edge16\" class=\"edge\">\n",
                            "<title>15&#45;&gt;16</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M810.5669,-213.7677C810.5669,-203.3338 810.5669,-192.0174 810.5669,-181.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"814.067,-181.1252 810.5669,-171.1252 807.067,-181.1252 814.067,-181.1252\"/>\n",
                            "</g>\n",
                            "<!-- 17 -->\n",
                            "<g id=\"node18\" class=\"node\">\n",
                            "<title>17</title>\n",
                            "<polygon fill=\"#f6d5bd\" stroke=\"#000000\" points=\"1049.2007,-178 901.933,-178 901.933,-100 1049.2007,-100 1049.2007,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">boxes3 &lt;= 0.113</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.48</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 5</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [3, 2]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 15&#45;&gt;17 -->\n",
                            "<g id=\"edge17\" class=\"edge\">\n",
                            "<title>15&#45;&gt;17</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M867.3505,-213.7677C881.2993,-204.1303 896.3378,-193.7401 910.6378,-183.8601\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"912.7204,-186.6754 918.9582,-178.1115 908.7414,-180.9163 912.7204,-186.6754\"/>\n",
                            "</g>\n",
                            "<!-- 18 -->\n",
                            "<g id=\"node19\" class=\"node\">\n",
                            "<title>18</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"933.2007,-64 785.933,-64 785.933,0 933.2007,0 933.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"859.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"859.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 3</text>\n",
                            "<text text-anchor=\"middle\" x=\"859.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [3, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"859.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 17&#45;&gt;18 -->\n",
                            "<g id=\"edge18\" class=\"edge\">\n",
                            "<title>17&#45;&gt;18</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M933.0315,-99.7647C922.9322,-90.4491 912.1582,-80.5109 902.0608,-71.197\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"904.2172,-68.4244 894.4936,-64.2169 899.471,-73.5697 904.2172,-68.4244\"/>\n",
                            "</g>\n",
                            "<!-- 19 -->\n",
                            "<g id=\"node20\" class=\"node\">\n",
                            "<title>19</title>\n",
                            "<polygon fill=\"#399de5\" stroke=\"#000000\" points=\"1076.3113,-64 950.8225,-64 950.8225,0 1076.3113,0 1076.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1013.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1013.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 2</text>\n",
                            "<text text-anchor=\"middle\" x=\"1013.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0, 2]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1013.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 17&#45;&gt;19 -->\n",
                            "<g id=\"edge19\" class=\"edge\">\n",
                            "<title>17&#45;&gt;19</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M989.5009,-99.7647C992.5174,-91.271 995.7176,-82.2599 998.766,-73.6762\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1002.0769,-74.8116 1002.1254,-64.2169 995.4805,-72.4689 1002.0769,-74.8116\"/>\n",
                            "</g>\n",
                            "<!-- 21 -->\n",
                            "<g id=\"node22\" class=\"node\">\n",
                            "<title>21</title>\n",
                            "<polygon fill=\"#41a1e6\" stroke=\"#000000\" points=\"1038.3113,-406 912.8225,-406 912.8225,-328 1038.3113,-328 1038.3113,-406\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-390.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">speed2 &lt;= 0.013</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-376.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.071</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-362.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 54</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-348.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [2, 52]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-334.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 20&#45;&gt;21 -->\n",
                            "<g id=\"edge21\" class=\"edge\">\n",
                            "<title>20&#45;&gt;21</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M975.5669,-441.7677C975.5669,-433.6172 975.5669,-424.9283 975.5669,-416.4649\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"979.067,-416.3046 975.5669,-406.3046 972.067,-416.3047 979.067,-416.3046\"/>\n",
                            "</g>\n",
                            "<!-- 28 -->\n",
                            "<g id=\"node29\" class=\"node\">\n",
                            "<title>28</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"1203.2007,-399 1055.933,-399 1055.933,-335 1203.2007,-335 1203.2007,-399\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-383.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-369.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 3</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-355.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [3, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-341.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 20&#45;&gt;28 -->\n",
                            "<g id=\"edge28\" class=\"edge\">\n",
                            "<title>20&#45;&gt;28</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1028.565,-441.7677C1044.5685,-429.9209 1062.109,-416.9364 1078.0166,-405.1606\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1080.2147,-407.8881 1086.1697,-399.1252 1076.0498,-402.2619 1080.2147,-407.8881\"/>\n",
                            "</g>\n",
                            "<!-- 22 -->\n",
                            "<g id=\"node23\" class=\"node\">\n",
                            "<title>22</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"1049.2007,-285 901.933,-285 901.933,-221 1049.2007,-221 1049.2007,-285\"/>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-269.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-255.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 1</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-241.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"975.5669\" y=\"-227.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 21&#45;&gt;22 -->\n",
                            "<g id=\"edge22\" class=\"edge\">\n",
                            "<title>21&#45;&gt;22</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M975.5669,-327.7677C975.5669,-317.3338 975.5669,-306.0174 975.5669,-295.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"979.067,-295.1252 975.5669,-285.1252 972.067,-295.1252 979.067,-295.1252\"/>\n",
                            "</g>\n",
                            "<!-- 23 -->\n",
                            "<g id=\"node24\" class=\"node\">\n",
                            "<title>23</title>\n",
                            "<polygon fill=\"#3d9fe6\" stroke=\"#000000\" points=\"1192.3113,-292 1066.8225,-292 1066.8225,-214 1192.3113,-214 1192.3113,-292\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-276.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude5 &lt;= 0.261</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-262.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.037</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-248.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 53</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-234.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1, 52]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-220.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 21&#45;&gt;23 -->\n",
                            "<g id=\"edge23\" class=\"edge\">\n",
                            "<title>21&#45;&gt;23</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1028.565,-327.7677C1041.4621,-318.2204 1055.3575,-307.9342 1068.592,-298.1373\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1070.7771,-300.8744 1076.7321,-292.1115 1066.6122,-295.2482 1070.7771,-300.8744\"/>\n",
                            "</g>\n",
                            "<!-- 24 -->\n",
                            "<g id=\"node25\" class=\"node\">\n",
                            "<title>24</title>\n",
                            "<polygon fill=\"#399de5\" stroke=\"#000000\" points=\"1192.3113,-171 1066.8225,-171 1066.8225,-107 1192.3113,-107 1192.3113,-171\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-155.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-141.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 49</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-127.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0, 49]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1129.5669\" y=\"-113.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 23&#45;&gt;24 -->\n",
                            "<g id=\"edge24\" class=\"edge\">\n",
                            "<title>23&#45;&gt;24</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1129.5669,-213.7677C1129.5669,-203.3338 1129.5669,-192.0174 1129.5669,-181.4215\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1133.067,-181.1252 1129.5669,-171.1252 1126.067,-181.1252 1133.067,-181.1252\"/>\n",
                            "</g>\n",
                            "<!-- 25 -->\n",
                            "<g id=\"node26\" class=\"node\">\n",
                            "<title>25</title>\n",
                            "<polygon fill=\"#7bbeee\" stroke=\"#000000\" points=\"1336.3113,-178 1210.8225,-178 1210.8225,-100 1336.3113,-100 1336.3113,-178\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1273.5669\" y=\"-162.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">altitude2 &lt;= 0.01</text>\n",
                            "<text text-anchor=\"middle\" x=\"1273.5669\" y=\"-148.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.375</text>\n",
                            "<text text-anchor=\"middle\" x=\"1273.5669\" y=\"-134.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 4</text>\n",
                            "<text text-anchor=\"middle\" x=\"1273.5669\" y=\"-120.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1, 3]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1273.5669\" y=\"-106.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 23&#45;&gt;25 -->\n",
                            "<g id=\"edge25\" class=\"edge\">\n",
                            "<title>23&#45;&gt;25</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1179.1235,-213.7677C1191.0027,-204.3633 1203.7875,-194.242 1215.9953,-184.5775\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1218.2509,-187.2559 1223.9189,-178.3046 1213.906,-181.7675 1218.2509,-187.2559\"/>\n",
                            "</g>\n",
                            "<!-- 26 -->\n",
                            "<g id=\"node27\" class=\"node\">\n",
                            "<title>26</title>\n",
                            "<polygon fill=\"#399de5\" stroke=\"#000000\" points=\"1259.3113,-64 1133.8225,-64 1133.8225,0 1259.3113,0 1259.3113,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1196.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1196.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 3</text>\n",
                            "<text text-anchor=\"middle\" x=\"1196.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [0, 3]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1196.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = surveillance</text>\n",
                            "</g>\n",
                            "<!-- 25&#45;&gt;26 -->\n",
                            "<g id=\"edge26\" class=\"edge\">\n",
                            "<title>25&#45;&gt;26</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1245.3322,-99.7647C1238.957,-90.9057 1232.1767,-81.4838 1225.7629,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1228.433,-70.2893 1219.751,-64.2169 1222.7512,-74.3781 1228.433,-70.2893\"/>\n",
                            "</g>\n",
                            "<!-- 27 -->\n",
                            "<g id=\"node28\" class=\"node\">\n",
                            "<title>27</title>\n",
                            "<polygon fill=\"#e58139\" stroke=\"#000000\" points=\"1424.2007,-64 1276.933,-64 1276.933,0 1424.2007,0 1424.2007,-64\"/>\n",
                            "<text text-anchor=\"middle\" x=\"1350.5669\" y=\"-48.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">gini = 0.0</text>\n",
                            "<text text-anchor=\"middle\" x=\"1350.5669\" y=\"-34.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">samples = 1</text>\n",
                            "<text text-anchor=\"middle\" x=\"1350.5669\" y=\"-20.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">value = [1, 0]</text>\n",
                            "<text text-anchor=\"middle\" x=\"1350.5669\" y=\"-6.8\" font-family=\"Times,serif\" font-size=\"14.00\" fill=\"#000000\">class = not surveillance</text>\n",
                            "</g>\n",
                            "<!-- 25&#45;&gt;27 -->\n",
                            "<g id=\"edge27\" class=\"edge\">\n",
                            "<title>25&#45;&gt;27</title>\n",
                            "<path fill=\"none\" stroke=\"#000000\" d=\"M1301.8016,-99.7647C1308.1768,-90.9057 1314.9571,-81.4838 1321.3709,-72.571\"/>\n",
                            "<polygon fill=\"#000000\" stroke=\"#000000\" points=\"1324.3826,-74.3781 1327.3828,-64.2169 1318.7008,-70.2893 1324.3826,-74.3781\"/>\n",
                            "</g>\n",
                            "</g>\n",
                            "</svg>\n"
                        ],
                        "text/plain": [
                            "<graphviz.files.Source at 0x10b3f86a0>"
                        ]
                    },
                    "execution_count": 27,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn import tree\n",
                "import graphviz\n",
                "\n",
                "label_names = ['not surveillance', 'surveillance']\n",
                "feature_names = X.columns\n",
                "\n",
                "dot_data = tree.export_graphviz(clf,\n",
                "                    feature_names=feature_names,  \n",
                "                    filled=True,\n",
                "                    class_names=label_names)  \n",
                "graph = graphviz.Source(dot_data)  \n",
                "graph"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# One more classifier: Random forest\n",
                "\n",
                "## Build and train your classifier\n",
                "\n",
                "We can build a random forest classifier like this:\n",
                "\n",
                "```python\n",
                "from sklearn.ensemble import RandomForestClassifier\n",
                "clf = RandomForestClassifier()\n",
                "```\n",
                "\n",
                "But you're in charge of fitting it to your training data!\n",
                "\n",
                "* **Tip:** You can also set `max_depth` here, but you won't be able to visualize the result.\n",
                "* **Tip:** Increase `n_estimators` to 100 to make a better classifier."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 28,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
                            "                       max_depth=5, max_features='auto', max_leaf_nodes=None,\n",
                            "                       min_impurity_decrease=0.0, min_impurity_split=None,\n",
                            "                       min_samples_leaf=1, min_samples_split=2,\n",
                            "                       min_weight_fraction_leaf=0.0, n_estimators=100,\n",
                            "                       n_jobs=None, oob_score=False, random_state=None,\n",
                            "                       verbose=0, warm_start=False)"
                        ]
                    },
                    "execution_count": 28,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn.ensemble import RandomForestClassifier\n",
                "\n",
                "clf = RandomForestClassifier(n_estimators=100, max_depth=5)\n",
                "clf.fit(X_train, y_train)"
            ]
        },
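        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "A forest can't be drawn as one diagram, but each tree inside it can still be inspected on its own. The sketch below is a minimal illustration, not part of the original analysis: it trains a small forest on synthetic stand-in data (`X_demo`, `y_demo` and `forest_demo` are made-up names, and the settings are arbitrary) and prints the rules of the first tree with `export_text`, which assumes scikit-learn 0.21 or newer.\n",
                "\n",
                "```python\n",
                "from sklearn.datasets import make_classification\n",
                "from sklearn.ensemble import RandomForestClassifier\n",
                "from sklearn.tree import export_text\n",
                "\n",
                "# Synthetic stand-in data (an assumption, not the planes dataset)\n",
                "X_demo, y_demo = make_classification(n_samples=300, n_features=6,\n",
                "                                     n_informative=3, random_state=0)\n",
                "\n",
                "# A small, shallow forest so the printed rules stay readable\n",
                "forest_demo = RandomForestClassifier(n_estimators=10, max_depth=3, random_state=0)\n",
                "forest_demo.fit(X_demo, y_demo)\n",
                "\n",
                "# The fitted trees live in .estimators_, one DecisionTreeClassifier per tree\n",
                "first_tree = forest_demo.estimators_[0]\n",
                "rules = export_text(first_tree, feature_names=[f'feature_{i}' for i in range(6)])\n",
                "print(rules)\n",
                "```\n",
                "\n",
                "With the real model you could swap in `clf.estimators_[0]` and `list(X.columns)`. Different trees will usually show different rules, because each one was trained on a different random sample of the data."
            ]
        },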
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## What are the important features?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 29,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "\n",
                            "    <style>\n",
                            "    table.eli5-weights tr:hover {\n",
                            "        filter: brightness(85%);\n",
                            "    }\n",
                            "</style>\n",
                            "\n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "        \n",
                            "        <pre>\n",
                            "Random forest feature importances; values are numbers 0 <= x <= 1;\n",
                            "all values sum to 1.\n",
                            "</pre>\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "        <table class=\"eli5-weights eli5-feature-importances\" style=\"border-collapse: collapse; border: none; margin-top: 0em; table-layout: auto;\">\n",
                            "    <thead>\n",
                            "    <tr style=\"border: none;\">\n",
                            "        <th style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">Weight</th>\n",
                            "        <th style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">Feature</th>\n",
                            "    </tr>\n",
                            "    </thead>\n",
                            "    <tbody>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 80.00%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.1812\n",
                            "                \n",
                            "                    &plusmn; 0.5010\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 81.95%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.1566\n",
                            "                \n",
                            "                    &plusmn; 0.4234\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 84.35%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.1277\n",
                            "                \n",
                            "                    &plusmn; 0.3478\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                squawk_1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 87.42%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0935\n",
                            "                \n",
                            "                    &plusmn; 0.3463\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 94.29%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0303\n",
                            "                \n",
                            "                    &plusmn; 0.1279\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer6\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 94.42%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0293\n",
                            "                \n",
                            "                    &plusmn; 0.1362\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                speed1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 94.66%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0274\n",
                            "                \n",
                            "                    &plusmn; 0.1078\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer4\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 94.77%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0266\n",
                            "                \n",
                            "                    &plusmn; 0.1001\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                altitude1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 94.81%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0264\n",
                            "                \n",
                            "                    &plusmn; 0.1079\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                altitude3\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 95.32%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0228\n",
                            "                \n",
                            "                    &plusmn; 0.0785\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 95.44%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0219\n",
                            "                \n",
                            "                    &plusmn; 0.0915\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 95.74%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0199\n",
                            "                \n",
                            "                    &plusmn; 0.0866\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                speed4\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 95.76%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0198\n",
                            "                \n",
                            "                    &plusmn; 0.0819\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration4\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.16%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0172\n",
                            "                \n",
                            "                    &plusmn; 0.0924\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes5\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.24%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0167\n",
                            "                \n",
                            "                    &plusmn; 0.0502\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                duration1\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.37%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0158\n",
                            "                \n",
                            "                    &plusmn; 0.0725\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                boxes2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.55%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0147\n",
                            "                \n",
                            "                    &plusmn; 0.0670\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                type_code\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.63%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0143\n",
                            "                \n",
                            "                    &plusmn; 0.0502\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                speed2\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.67%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0140\n",
                            "                \n",
                            "                    &plusmn; 0.0632\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                observations\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "        <tr style=\"background-color: hsl(120, 100.00%, 96.77%); border: none;\">\n",
                            "            <td style=\"padding: 0 1em 0 0.5em; text-align: right; border: none;\">\n",
                            "                0.0134\n",
                            "                \n",
                            "                    &plusmn; 0.0595\n",
                            "                \n",
                            "            </td>\n",
                            "            <td style=\"padding: 0 0.5em 0 0.5em; text-align: left; border: none;\">\n",
                            "                steer8\n",
                            "            </td>\n",
                            "        </tr>\n",
                            "    \n",
                            "    \n",
                            "        \n",
                            "            <tr style=\"background-color: hsl(120, 100.00%, 96.77%); border: none;\">\n",
                            "                <td colspan=\"2\" style=\"padding: 0 0.5em 0 0.5em; text-align: center; border: none; white-space: nowrap;\">\n",
                            "                    <i>&hellip; 12 more &hellip;</i>\n",
                            "                </td>\n",
                            "            </tr>\n",
                            "        \n",
                            "    \n",
                            "    </tbody>\n",
                            "</table>\n",
                            "    \n",
                            "\n",
                            "    \n",
                            "\n",
                            "\n",
                            "\n"
                        ],
                        "text/plain": [
                            "<IPython.core.display.HTML object>"
                        ]
                    },
                    "execution_count": 29,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "feature_names = list(X.columns)\n",
                "eli5.show_weights(clf, feature_names=feature_names, show=['description', 'feature_importances'])"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Understanding the output\n",
                "\n",
                "What is a random forest, and **why is the feature importance difference than for the decision tree?** Isn't a random forest just like a decision tree or something?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 30,
            "metadata": {},
            "outputs": [],
            "source": [
                "# It's a lot of decision trees that all work together, so it'll even try to use less useful features"
            ]
        },
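        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "To see that spreading-out for yourself, here is a minimal sketch comparing a single decision tree with a random forest. It runs on synthetic data from `make_classification` as a stand-in, not on the planes dataset, and every `_demo` name is made up for illustration.\n",
                "\n",
                "```python\n",
                "from sklearn.datasets import make_classification\n",
                "from sklearn.ensemble import RandomForestClassifier\n",
                "from sklearn.tree import DecisionTreeClassifier\n",
                "\n",
                "# Synthetic stand-in data: 10 features, only 3 of them informative\n",
                "X_demo, y_demo = make_classification(n_samples=500, n_features=10,\n",
                "                                     n_informative=3, random_state=0)\n",
                "\n",
                "tree_demo = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_demo, y_demo)\n",
                "forest_demo = RandomForestClassifier(n_estimators=100, max_depth=5,\n",
                "                                     random_state=0).fit(X_demo, y_demo)\n",
                "\n",
                "# Count how many features each model gives a non-zero importance\n",
                "print('tree uses', (tree_demo.feature_importances_ > 0).sum(), 'features')\n",
                "print('forest uses', (forest_demo.feature_importances_ > 0).sum(), 'features')\n",
                "```\n",
                "\n",
                "The forest will usually give some importance to more features than the single tree does, because each of its trees is forced to work with a different random slice of the data."
            ]
        },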
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## How well does it perform?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 31,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>Predicted not surveil</th>\n",
                            "      <th>Predicted surveil</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>Is not surveil</th>\n",
                            "      <td>124</td>\n",
                            "      <td>0</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>Is surveil</th>\n",
                            "      <td>5</td>\n",
                            "      <td>21</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "                Predicted not surveil  Predicted surveil\n",
                            "Is not surveil                    124                  0\n",
                            "Is surveil                          5                 21"
                        ]
                    },
                    "execution_count": 31,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "from sklearn.metrics import confusion_matrix\n",
                "\n",
                "y_true = y_test\n",
                "y_pred = clf.predict(X_test)\n",
                "matrix = confusion_matrix(y_true, y_pred)\n",
                "\n",
                "label_names = pd.Series(['not surveil', 'surveil'])\n",
                "pd.DataFrame(matrix,\n",
                "     columns='Predicted ' + label_names,\n",
                "     index='Is ' + label_names)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### How confident do you feel in the model?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 32,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Very confident"
            ]
        },
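        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "One way to put a number on that confidence is to turn the confusion matrix above into precision and recall. This is a minimal sketch with the four counts typed in by hand from the table (`matrix_demo` is just an illustrative name).\n",
                "\n",
                "```python\n",
                "import numpy as np\n",
                "\n",
                "# Rows are the true labels, columns are the predictions (counts copied from the table)\n",
                "matrix_demo = np.array([[124, 0],\n",
                "                        [5, 21]])\n",
                "tn, fp, fn, tp = matrix_demo.ravel()\n",
                "\n",
                "precision = tp / (tp + fp)  # of the planes flagged as surveillance, how many really were\n",
                "recall = tp / (tp + fn)     # of the real surveillance planes, how many were flagged\n",
                "print(f'precision={precision:.2f} recall={recall:.2f}')\n",
                "```\n",
                "\n",
                "The model raised no false alarms on the test set, but it caught only 21 of the 26 surveillance planes, a recall of about 0.81."
            ]
        },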
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Actually finding spy planes\n",
                "\n",
                "Now let's try ot actually find our spy planes\n",
                "\n",
                "## Retrain our model\n",
                "\n",
                "When we did test/train split, we trained our model with only a subset of our data, so we could test with the rest. Now that we're working in the \"real world\" we want to re-train it using not just `_train` and `_test` data, but instead **everything we have labels for.**"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 33,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',\n",
                            "                       max_depth=5, max_features='auto', max_leaf_nodes=None,\n",
                            "                       min_impurity_decrease=0.0, min_impurity_split=None,\n",
                            "                       min_samples_leaf=1, min_samples_split=2,\n",
                            "                       min_weight_fraction_leaf=0.0, n_estimators=100,\n",
                            "                       n_jobs=None, oob_score=False, random_state=None,\n",
                            "                       verbose=0, warm_start=False)"
                        ]
                    },
                    "execution_count": 33,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "clf.fit(X, y)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Filter for planes we want to predict\n",
                "\n",
                "We have a dataframe of features that includes three types of planes:\n",
                "\n",
                "* Those that are labeled as surveillance planes\n",
                "* Those that are labeled as not surveillance\n",
                "* Those that aren't labeled\n",
                "\n",
                "Which do we want to predictions for? **Filter a new dataframe that's just those.**\n",
                "\n",
                "* **Tip:** Scroll up to see where you created your `train_df`, it's the opposite!"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 34,
            "metadata": {},
            "outputs": [],
            "source": [
                "real_df = df[df.label.isna()]"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "How many planes do you have in that list? **Confirm it's about 19,200.**"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 35,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "(19202, 35)"
                        ]
                    },
                    "execution_count": 35,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "real_df.shape"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## Predicting \n",
                "\n",
                "Build your `X` - remember you need to drop a few columns - and use that to make a prediction for each plane.\n",
                "\n",
                "**Assign the prediction into the `predicted` column**.\n",
                "\n",
                "* **Tip:** Scroll up to see where you created your features for training, it's similar\n",
                "* **Tip:** pandas will yell at us about setting values on copies of a slice but it's fine"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 36,
            "metadata": {},
            "outputs": [],
            "source": [
                "X = real_df.drop(columns=['label', 'adshex', 'type'])\n",
                "real_df['predicted'] = clf.predict(X)"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## How many planes did it predict to be surveillance planes?\n",
                "\n",
                "It should be roughly around 70-80 planes."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 37,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/plain": [
                            "(70, 36)"
                        ]
                    },
                    "execution_count": 37,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "real_df[real_df.predicted == 1].shape"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "## But.. what about those other ones? The ones that are just below the threshold?\n",
                "\n",
                "The cutoff for a prediction of `1` is 50%, but since we have a lot of time we're interested in investigating the top 150. To get the probability for each row, you will use `clf.predict_proba` instead of `clf.predict`. Also, to get the predicted probability for the `1` category, you'll need to add `[:,1]` to the end of the\n",
                "\n",
                "```python\n",
                "clf.predict_proba(***your features***)[:,1]\n",
                "```\n",
                "\n",
                "**Create a new column called `predicted_prob` that is the chance that the plane is a surveillance plane.**\n",
                "\n",
                "* **Tip:** You dropped three columns when using `clf.predict`, but if you drop the same three you'll get an error now. There's now an extra column that you'll need to drop! What is it?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 38,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>label</th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type</th>\n",
                            "      <th>type_code</th>\n",
                            "      <th>predicted</th>\n",
                            "      <th>predicted_prob</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>597</th>\n",
                            "      <td>A</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.120253</td>\n",
                            "      <td>0.075949</td>\n",
                            "      <td>0.183544</td>\n",
                            "      <td>0.335443</td>\n",
                            "      <td>0.284810</td>\n",
                            "      <td>0.088608</td>\n",
                            "      <td>0.044304</td>\n",
                            "      <td>0.069620</td>\n",
                            "      <td>0.120253</td>\n",
                            "      <td>0.677215</td>\n",
                            "      <td>0.021824</td>\n",
                            "      <td>0.020550</td>\n",
                            "      <td>0.062330</td>\n",
                            "      <td>0.100713</td>\n",
                            "      <td>0.794582</td>\n",
                            "      <td>0.042374</td>\n",
                            "      <td>0.060971</td>\n",
                            "      <td>0.066831</td>\n",
                            "      <td>0.106403</td>\n",
                            "      <td>0.723421</td>\n",
                            "      <td>0.020211</td>\n",
                            "      <td>0.048913</td>\n",
                            "      <td>0.270550</td>\n",
                            "      <td>0.344090</td>\n",
                            "      <td>0.097317</td>\n",
                            "      <td>0.186651</td>\n",
                            "      <td>0.011379</td>\n",
                            "      <td>0.009426</td>\n",
                            "      <td>158</td>\n",
                            "      <td>0</td>\n",
                            "      <td>11776</td>\n",
                            "      <td>GRND</td>\n",
                            "      <td>248</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.003261</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>598</th>\n",
                            "      <td>A00000</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.211735</td>\n",
                            "      <td>0.155612</td>\n",
                            "      <td>0.181122</td>\n",
                            "      <td>0.198980</td>\n",
                            "      <td>0.252551</td>\n",
                            "      <td>0.204082</td>\n",
                            "      <td>0.183673</td>\n",
                            "      <td>0.168367</td>\n",
                            "      <td>0.173469</td>\n",
                            "      <td>0.267857</td>\n",
                            "      <td>0.107348</td>\n",
                            "      <td>0.143410</td>\n",
                            "      <td>0.208139</td>\n",
                            "      <td>0.177013</td>\n",
                            "      <td>0.364090</td>\n",
                            "      <td>0.177318</td>\n",
                            "      <td>0.114457</td>\n",
                            "      <td>0.129648</td>\n",
                            "      <td>0.197694</td>\n",
                            "      <td>0.380882</td>\n",
                            "      <td>0.034976</td>\n",
                            "      <td>0.048127</td>\n",
                            "      <td>0.240732</td>\n",
                            "      <td>0.356314</td>\n",
                            "      <td>0.116116</td>\n",
                            "      <td>0.159325</td>\n",
                            "      <td>0.012828</td>\n",
                            "      <td>0.013628</td>\n",
                            "      <td>392</td>\n",
                            "      <td>0</td>\n",
                            "      <td>52465</td>\n",
                            "      <td>TBM7</td>\n",
                            "      <td>431</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.011371</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>599</th>\n",
                            "      <td>A00008</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.208333</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.458333</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.187960</td>\n",
                            "      <td>0.278952</td>\n",
                            "      <td>0.221048</td>\n",
                            "      <td>0.190257</td>\n",
                            "      <td>0.121783</td>\n",
                            "      <td>0.014706</td>\n",
                            "      <td>0.053309</td>\n",
                            "      <td>0.149816</td>\n",
                            "      <td>0.279871</td>\n",
                            "      <td>0.502298</td>\n",
                            "      <td>0.029871</td>\n",
                            "      <td>0.044118</td>\n",
                            "      <td>0.202665</td>\n",
                            "      <td>0.380515</td>\n",
                            "      <td>0.094669</td>\n",
                            "      <td>0.182904</td>\n",
                            "      <td>0.014706</td>\n",
                            "      <td>0.020221</td>\n",
                            "      <td>24</td>\n",
                            "      <td>0</td>\n",
                            "      <td>2176</td>\n",
                            "      <td>PA46</td>\n",
                            "      <td>350</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.008143</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>600</th>\n",
                            "      <td>A0001E</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.007937</td>\n",
                            "      <td>0.026984</td>\n",
                            "      <td>0.084127</td>\n",
                            "      <td>0.179365</td>\n",
                            "      <td>0.701587</td>\n",
                            "      <td>0.041270</td>\n",
                            "      <td>0.085714</td>\n",
                            "      <td>0.039683</td>\n",
                            "      <td>0.111111</td>\n",
                            "      <td>0.722222</td>\n",
                            "      <td>0.019048</td>\n",
                            "      <td>0.049206</td>\n",
                            "      <td>0.249206</td>\n",
                            "      <td>0.326984</td>\n",
                            "      <td>0.112698</td>\n",
                            "      <td>0.206349</td>\n",
                            "      <td>0.012698</td>\n",
                            "      <td>0.011111</td>\n",
                            "      <td>10</td>\n",
                            "      <td>1135</td>\n",
                            "      <td>630</td>\n",
                            "      <td>C56X</td>\n",
                            "      <td>126</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.010685</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>601</th>\n",
                            "      <td>A0002B</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.666667</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.666667</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.767405</td>\n",
                            "      <td>0.191456</td>\n",
                            "      <td>0.023734</td>\n",
                            "      <td>0.017405</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.150316</td>\n",
                            "      <td>0.113924</td>\n",
                            "      <td>0.178797</td>\n",
                            "      <td>0.534810</td>\n",
                            "      <td>0.022152</td>\n",
                            "      <td>0.001582</td>\n",
                            "      <td>0.009494</td>\n",
                            "      <td>0.281646</td>\n",
                            "      <td>0.416139</td>\n",
                            "      <td>0.112342</td>\n",
                            "      <td>0.169304</td>\n",
                            "      <td>0.001582</td>\n",
                            "      <td>0.001582</td>\n",
                            "      <td>6</td>\n",
                            "      <td>2356</td>\n",
                            "      <td>632</td>\n",
                            "      <td>C82S</td>\n",
                            "      <td>133</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.049944</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "     adshex  label  duration1  duration2  duration3  duration4  duration5  \\\n",
                            "597       A    NaN   0.120253   0.075949   0.183544   0.335443   0.284810   \n",
                            "598  A00000    NaN   0.211735   0.155612   0.181122   0.198980   0.252551   \n",
                            "599  A00008    NaN   0.125000   0.041667   0.208333   0.166667   0.458333   \n",
                            "600  A0001E    NaN   0.100000   0.200000   0.200000   0.400000   0.100000   \n",
                            "601  A0002B    NaN   0.166667   0.166667   0.000000   0.666667   0.000000   \n",
                            "\n",
                            "       boxes1    boxes2    boxes3    boxes4    boxes5    speed1    speed2  \\\n",
                            "597  0.088608  0.044304  0.069620  0.120253  0.677215  0.021824  0.020550   \n",
                            "598  0.204082  0.183673  0.168367  0.173469  0.267857  0.107348  0.143410   \n",
                            "599  0.125000  0.083333  0.125000  0.166667  0.500000  0.187960  0.278952   \n",
                            "600  0.100000  0.000000  0.100000  0.400000  0.400000  0.007937  0.026984   \n",
                            "601  0.333333  0.000000  0.000000  0.666667  0.000000  0.767405  0.191456   \n",
                            "\n",
                            "       speed3    speed4    speed5  altitude1  altitude2  altitude3  altitude4  \\\n",
                            "597  0.062330  0.100713  0.794582   0.042374   0.060971   0.066831   0.106403   \n",
                            "598  0.208139  0.177013  0.364090   0.177318   0.114457   0.129648   0.197694   \n",
                            "599  0.221048  0.190257  0.121783   0.014706   0.053309   0.149816   0.279871   \n",
                            "600  0.084127  0.179365  0.701587   0.041270   0.085714   0.039683   0.111111   \n",
                            "601  0.023734  0.017405  0.000000   0.150316   0.113924   0.178797   0.534810   \n",
                            "\n",
                            "     altitude5    steer1    steer2    steer3    steer4    steer5    steer6  \\\n",
                            "597   0.723421  0.020211  0.048913  0.270550  0.344090  0.097317  0.186651   \n",
                            "598   0.380882  0.034976  0.048127  0.240732  0.356314  0.116116  0.159325   \n",
                            "599   0.502298  0.029871  0.044118  0.202665  0.380515  0.094669  0.182904   \n",
                            "600   0.722222  0.019048  0.049206  0.249206  0.326984  0.112698  0.206349   \n",
                            "601   0.022152  0.001582  0.009494  0.281646  0.416139  0.112342  0.169304   \n",
                            "\n",
                            "       steer7    steer8  flights  squawk_1  observations  type  type_code  \\\n",
                            "597  0.011379  0.009426      158         0         11776  GRND        248   \n",
                            "598  0.012828  0.013628      392         0         52465  TBM7        431   \n",
                            "599  0.014706  0.020221       24         0          2176  PA46        350   \n",
                            "600  0.012698  0.011111       10      1135           630  C56X        126   \n",
                            "601  0.001582  0.001582        6      2356           632  C82S        133   \n",
                            "\n",
                            "     predicted  predicted_prob  \n",
                            "597        0.0        0.003261  \n",
                            "598        0.0        0.011371  \n",
                            "599        0.0        0.008143  \n",
                            "600        0.0        0.010685  \n",
                            "601        0.0        0.049944  "
                        ]
                    },
                    "execution_count": 38,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "# Predict the probability it's in the class represented by '1'\n",
                "real_df['predicted_prob'] = clf.predict_proba(real_df.drop(columns=['label', 'adshex', 'type', 'predicted']))[:,1]\n",
                "real_df.head()"
            ]
        },
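        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "For a two-class model, `clf.predict` effectively picks whichever class has the higher probability, which is where the 50% cutoff comes from. Here is a rough sketch of how lowering that cutoff pulls in more candidates. It uses synthetic data from `make_classification` as a stand-in for the planes, and the `_demo` names and the 30% cutoff are made up for illustration.\n",
                "\n",
                "```python\n",
                "from sklearn.datasets import make_classification\n",
                "from sklearn.ensemble import RandomForestClassifier\n",
                "\n",
                "# Synthetic stand-in data where about 90% of rows are in class 0\n",
                "X_demo, y_demo = make_classification(n_samples=500, weights=[0.9], random_state=0)\n",
                "clf_demo = RandomForestClassifier(n_estimators=100, max_depth=5,\n",
                "                                  random_state=0).fit(X_demo, y_demo)\n",
                "\n",
                "# Column 1 of predict_proba is the probability of the class labeled 1\n",
                "prob_1 = clf_demo.predict_proba(X_demo)[:, 1]\n",
                "\n",
                "print('above 50%:', (prob_1 > 0.5).sum())\n",
                "print('above 30%:', (prob_1 > 0.3).sum())\n",
                "```\n",
                "\n",
                "A lower cutoff can only add rows, never remove them, which is why sorting by probability is a handy way to build a longer list of leads."
            ]
        },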
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Get the top 200 predictions\n",
                "\n",
                "Take a look at what the probabilities look like, showing the top 200 planes that are **most likely to be surveillance planes.**\n",
                "\n",
                "Then save them to a file for later research."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 39,
            "metadata": {},
            "outputs": [
                {
                    "data": {
                        "text/html": [
                            "<div>\n",
                            "<style scoped>\n",
                            "    .dataframe tbody tr th:only-of-type {\n",
                            "        vertical-align: middle;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe tbody tr th {\n",
                            "        vertical-align: top;\n",
                            "    }\n",
                            "\n",
                            "    .dataframe thead th {\n",
                            "        text-align: right;\n",
                            "    }\n",
                            "</style>\n",
                            "<table border=\"1\" class=\"dataframe\">\n",
                            "  <thead>\n",
                            "    <tr style=\"text-align: right;\">\n",
                            "      <th></th>\n",
                            "      <th>adshex</th>\n",
                            "      <th>label</th>\n",
                            "      <th>duration1</th>\n",
                            "      <th>duration2</th>\n",
                            "      <th>duration3</th>\n",
                            "      <th>duration4</th>\n",
                            "      <th>duration5</th>\n",
                            "      <th>boxes1</th>\n",
                            "      <th>boxes2</th>\n",
                            "      <th>boxes3</th>\n",
                            "      <th>boxes4</th>\n",
                            "      <th>boxes5</th>\n",
                            "      <th>speed1</th>\n",
                            "      <th>speed2</th>\n",
                            "      <th>speed3</th>\n",
                            "      <th>speed4</th>\n",
                            "      <th>speed5</th>\n",
                            "      <th>altitude1</th>\n",
                            "      <th>altitude2</th>\n",
                            "      <th>altitude3</th>\n",
                            "      <th>altitude4</th>\n",
                            "      <th>altitude5</th>\n",
                            "      <th>steer1</th>\n",
                            "      <th>steer2</th>\n",
                            "      <th>steer3</th>\n",
                            "      <th>steer4</th>\n",
                            "      <th>steer5</th>\n",
                            "      <th>steer6</th>\n",
                            "      <th>steer7</th>\n",
                            "      <th>steer8</th>\n",
                            "      <th>flights</th>\n",
                            "      <th>squawk_1</th>\n",
                            "      <th>observations</th>\n",
                            "      <th>type</th>\n",
                            "      <th>type_code</th>\n",
                            "      <th>predicted</th>\n",
                            "      <th>predicted_prob</th>\n",
                            "    </tr>\n",
                            "  </thead>\n",
                            "  <tbody>\n",
                            "    <tr>\n",
                            "      <th>12275</th>\n",
                            "      <td>A7D925</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.121212</td>\n",
                            "      <td>0.141414</td>\n",
                            "      <td>0.070707</td>\n",
                            "      <td>0.070707</td>\n",
                            "      <td>0.595960</td>\n",
                            "      <td>0.212121</td>\n",
                            "      <td>0.515152</td>\n",
                            "      <td>0.242424</td>\n",
                            "      <td>0.030303</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.271168</td>\n",
                            "      <td>0.494554</td>\n",
                            "      <td>0.212671</td>\n",
                            "      <td>0.016859</td>\n",
                            "      <td>0.004747</td>\n",
                            "      <td>0.018678</td>\n",
                            "      <td>0.065840</td>\n",
                            "      <td>0.345793</td>\n",
                            "      <td>0.568557</td>\n",
                            "      <td>0.001131</td>\n",
                            "      <td>0.166840</td>\n",
                            "      <td>0.315047</td>\n",
                            "      <td>0.301537</td>\n",
                            "      <td>0.096653</td>\n",
                            "      <td>0.015661</td>\n",
                            "      <td>0.047095</td>\n",
                            "      <td>0.004015</td>\n",
                            "      <td>0.009250</td>\n",
                            "      <td>99</td>\n",
                            "      <td>230</td>\n",
                            "      <td>45079</td>\n",
                            "      <td>T206</td>\n",
                            "      <td>417</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.919753</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2828</th>\n",
                            "      <td>A144AF</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.328358</td>\n",
                            "      <td>0.134328</td>\n",
                            "      <td>0.074627</td>\n",
                            "      <td>0.029851</td>\n",
                            "      <td>0.432836</td>\n",
                            "      <td>0.492537</td>\n",
                            "      <td>0.328358</td>\n",
                            "      <td>0.164179</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.014925</td>\n",
                            "      <td>0.134059</td>\n",
                            "      <td>0.274446</td>\n",
                            "      <td>0.197484</td>\n",
                            "      <td>0.148554</td>\n",
                            "      <td>0.245457</td>\n",
                            "      <td>0.001251</td>\n",
                            "      <td>0.005371</td>\n",
                            "      <td>0.008167</td>\n",
                            "      <td>0.053271</td>\n",
                            "      <td>0.931940</td>\n",
                            "      <td>0.152969</td>\n",
                            "      <td>0.248841</td>\n",
                            "      <td>0.266132</td>\n",
                            "      <td>0.175116</td>\n",
                            "      <td>0.010448</td>\n",
                            "      <td>0.064013</td>\n",
                            "      <td>0.014495</td>\n",
                            "      <td>0.018247</td>\n",
                            "      <td>67</td>\n",
                            "      <td>5103</td>\n",
                            "      <td>13591</td>\n",
                            "      <td>unknown</td>\n",
                            "      <td>454</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.896639</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2720</th>\n",
                            "      <td>A13098</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.416667</td>\n",
                            "      <td>0.250000</td>\n",
                            "      <td>0.583333</td>\n",
                            "      <td>0.166667</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.866572</td>\n",
                            "      <td>0.071664</td>\n",
                            "      <td>0.035361</td>\n",
                            "      <td>0.020745</td>\n",
                            "      <td>0.005658</td>\n",
                            "      <td>0.053748</td>\n",
                            "      <td>0.123055</td>\n",
                            "      <td>0.665724</td>\n",
                            "      <td>0.157473</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.151344</td>\n",
                            "      <td>0.176803</td>\n",
                            "      <td>0.181047</td>\n",
                            "      <td>0.300802</td>\n",
                            "      <td>0.019331</td>\n",
                            "      <td>0.085809</td>\n",
                            "      <td>0.010372</td>\n",
                            "      <td>0.028289</td>\n",
                            "      <td>12</td>\n",
                            "      <td>4415</td>\n",
                            "      <td>2121</td>\n",
                            "      <td>unknown</td>\n",
                            "      <td>454</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.896194</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>8466</th>\n",
                            "      <td>A4FB3C</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.416667</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.041667</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.458333</td>\n",
                            "      <td>0.458333</td>\n",
                            "      <td>0.083333</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.562937</td>\n",
                            "      <td>0.226224</td>\n",
                            "      <td>0.138811</td>\n",
                            "      <td>0.056294</td>\n",
                            "      <td>0.015734</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.009091</td>\n",
                            "      <td>0.039860</td>\n",
                            "      <td>0.866434</td>\n",
                            "      <td>0.084615</td>\n",
                            "      <td>0.144406</td>\n",
                            "      <td>0.226923</td>\n",
                            "      <td>0.222378</td>\n",
                            "      <td>0.268182</td>\n",
                            "      <td>0.013986</td>\n",
                            "      <td>0.062587</td>\n",
                            "      <td>0.004196</td>\n",
                            "      <td>0.017483</td>\n",
                            "      <td>24</td>\n",
                            "      <td>5310</td>\n",
                            "      <td>2860</td>\n",
                            "      <td>P210</td>\n",
                            "      <td>322</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.890482</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>9204</th>\n",
                            "      <td>A565E6</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.333333</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.066667</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.400000</td>\n",
                            "      <td>0.600000</td>\n",
                            "      <td>0.266667</td>\n",
                            "      <td>0.133333</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.058968</td>\n",
                            "      <td>0.120803</td>\n",
                            "      <td>0.190418</td>\n",
                            "      <td>0.151515</td>\n",
                            "      <td>0.478296</td>\n",
                            "      <td>0.000819</td>\n",
                            "      <td>0.001229</td>\n",
                            "      <td>0.017199</td>\n",
                            "      <td>0.014742</td>\n",
                            "      <td>0.966011</td>\n",
                            "      <td>0.106880</td>\n",
                            "      <td>0.240377</td>\n",
                            "      <td>0.303440</td>\n",
                            "      <td>0.207617</td>\n",
                            "      <td>0.008190</td>\n",
                            "      <td>0.064701</td>\n",
                            "      <td>0.014333</td>\n",
                            "      <td>0.017199</td>\n",
                            "      <td>15</td>\n",
                            "      <td>5106</td>\n",
                            "      <td>2442</td>\n",
                            "      <td>unknown</td>\n",
                            "      <td>454</td>\n",
                            "      <td>1.0</td>\n",
                            "      <td>0.889338</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>...</th>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "      <td>...</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>15256</th>\n",
                            "      <td>AA3DAF</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.087719</td>\n",
                            "      <td>0.192982</td>\n",
                            "      <td>0.280702</td>\n",
                            "      <td>0.175439</td>\n",
                            "      <td>0.263158</td>\n",
                            "      <td>0.087719</td>\n",
                            "      <td>0.526316</td>\n",
                            "      <td>0.245614</td>\n",
                            "      <td>0.140351</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.256432</td>\n",
                            "      <td>0.545244</td>\n",
                            "      <td>0.195710</td>\n",
                            "      <td>0.001842</td>\n",
                            "      <td>0.000772</td>\n",
                            "      <td>0.005823</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.114491</td>\n",
                            "      <td>0.879033</td>\n",
                            "      <td>0.000654</td>\n",
                            "      <td>0.135702</td>\n",
                            "      <td>0.091854</td>\n",
                            "      <td>0.285426</td>\n",
                            "      <td>0.106530</td>\n",
                            "      <td>0.086448</td>\n",
                            "      <td>0.211514</td>\n",
                            "      <td>0.020914</td>\n",
                            "      <td>0.011467</td>\n",
                            "      <td>57</td>\n",
                            "      <td>362</td>\n",
                            "      <td>16831</td>\n",
                            "      <td>C182</td>\n",
                            "      <td>91</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.323519</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>14100</th>\n",
                            "      <td>A95959</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.279817</td>\n",
                            "      <td>0.619266</td>\n",
                            "      <td>0.077982</td>\n",
                            "      <td>0.018349</td>\n",
                            "      <td>0.004587</td>\n",
                            "      <td>0.160550</td>\n",
                            "      <td>0.532110</td>\n",
                            "      <td>0.293578</td>\n",
                            "      <td>0.013761</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.185068</td>\n",
                            "      <td>0.140729</td>\n",
                            "      <td>0.144585</td>\n",
                            "      <td>0.149317</td>\n",
                            "      <td>0.380301</td>\n",
                            "      <td>0.000175</td>\n",
                            "      <td>0.075535</td>\n",
                            "      <td>0.594988</td>\n",
                            "      <td>0.328952</td>\n",
                            "      <td>0.000351</td>\n",
                            "      <td>0.124956</td>\n",
                            "      <td>0.090256</td>\n",
                            "      <td>0.117946</td>\n",
                            "      <td>0.293200</td>\n",
                            "      <td>0.011917</td>\n",
                            "      <td>0.170172</td>\n",
                            "      <td>0.060813</td>\n",
                            "      <td>0.075359</td>\n",
                            "      <td>218</td>\n",
                            "      <td>1200</td>\n",
                            "      <td>5706</td>\n",
                            "      <td>C208</td>\n",
                            "      <td>97</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.323506</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>19765</th>\n",
                            "      <td>ADFF65</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.300000</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.100000</td>\n",
                            "      <td>0.200000</td>\n",
                            "      <td>0.016369</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.002976</td>\n",
                            "      <td>0.025298</td>\n",
                            "      <td>0.955357</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.034226</td>\n",
                            "      <td>0.014881</td>\n",
                            "      <td>0.074405</td>\n",
                            "      <td>0.876488</td>\n",
                            "      <td>0.098214</td>\n",
                            "      <td>0.092262</td>\n",
                            "      <td>0.159226</td>\n",
                            "      <td>0.245536</td>\n",
                            "      <td>0.032738</td>\n",
                            "      <td>0.218750</td>\n",
                            "      <td>0.043155</td>\n",
                            "      <td>0.059524</td>\n",
                            "      <td>10</td>\n",
                            "      <td>4552</td>\n",
                            "      <td>672</td>\n",
                            "      <td>unknown</td>\n",
                            "      <td>454</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.321905</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>2629</th>\n",
                            "      <td>A11FB5</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.090909</td>\n",
                            "      <td>0.272727</td>\n",
                            "      <td>0.090909</td>\n",
                            "      <td>0.363636</td>\n",
                            "      <td>0.181818</td>\n",
                            "      <td>0.181818</td>\n",
                            "      <td>0.454545</td>\n",
                            "      <td>0.181818</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.181818</td>\n",
                            "      <td>0.071307</td>\n",
                            "      <td>0.691002</td>\n",
                            "      <td>0.230900</td>\n",
                            "      <td>0.006791</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.000000</td>\n",
                            "      <td>0.044143</td>\n",
                            "      <td>0.723260</td>\n",
                            "      <td>0.232598</td>\n",
                            "      <td>0.112054</td>\n",
                            "      <td>0.101868</td>\n",
                            "      <td>0.190153</td>\n",
                            "      <td>0.134126</td>\n",
                            "      <td>0.028862</td>\n",
                            "      <td>0.317487</td>\n",
                            "      <td>0.039049</td>\n",
                            "      <td>0.028862</td>\n",
                            "      <td>11</td>\n",
                            "      <td>0</td>\n",
                            "      <td>589</td>\n",
                            "      <td>C82R</td>\n",
                            "      <td>132</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.321645</td>\n",
                            "    </tr>\n",
                            "    <tr>\n",
                            "      <th>1355</th>\n",
                            "      <td>A0519F</td>\n",
                            "      <td>NaN</td>\n",
                            "      <td>0.187500</td>\n",
                            "      <td>0.156250</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.500000</td>\n",
                            "      <td>0.031250</td>\n",
                            "      <td>0.093750</td>\n",
                            "      <td>0.156250</td>\n",
                            "      <td>0.062500</td>\n",
                            "      <td>0.125000</td>\n",
                            "      <td>0.562500</td>\n",
                            "      <td>0.049751</td>\n",
                            "      <td>0.072554</td>\n",
                            "      <td>0.153814</td>\n",
                            "      <td>0.171642</td>\n",
                            "      <td>0.552239</td>\n",
                            "      <td>0.046434</td>\n",
                            "      <td>0.071310</td>\n",
                            "      <td>0.115257</td>\n",
                            "      <td>0.198176</td>\n",
                            "      <td>0.568823</td>\n",
                            "      <td>0.041874</td>\n",
                            "      <td>0.080431</td>\n",
                            "      <td>0.250415</td>\n",
                            "      <td>0.250829</td>\n",
                            "      <td>0.055970</td>\n",
                            "      <td>0.252488</td>\n",
                            "      <td>0.024046</td>\n",
                            "      <td>0.016998</td>\n",
                            "      <td>32</td>\n",
                            "      <td>4610</td>\n",
                            "      <td>2412</td>\n",
                            "      <td>C501</td>\n",
                            "      <td>119</td>\n",
                            "      <td>0.0</td>\n",
                            "      <td>0.321207</td>\n",
                            "    </tr>\n",
                            "  </tbody>\n",
                            "</table>\n",
                            "<p>200 rows \u00d7 37 columns</p>\n",
                            "</div>"
                        ],
                        "text/plain": [
                            "       adshex  label  duration1  duration2  duration3  duration4  duration5  \\\n",
                            "12275  A7D925    NaN   0.121212   0.141414   0.070707   0.070707   0.595960   \n",
                            "2828   A144AF    NaN   0.328358   0.134328   0.074627   0.029851   0.432836   \n",
                            "2720   A13098    NaN   0.166667   0.166667   0.166667   0.083333   0.416667   \n",
                            "8466   A4FB3C    NaN   0.416667   0.125000   0.083333   0.041667   0.333333   \n",
                            "9204   A565E6    NaN   0.333333   0.200000   0.066667   0.000000   0.400000   \n",
                            "...       ...    ...        ...        ...        ...        ...        ...   \n",
                            "15256  AA3DAF    NaN   0.087719   0.192982   0.280702   0.175439   0.263158   \n",
                            "14100  A95959    NaN   0.279817   0.619266   0.077982   0.018349   0.004587   \n",
                            "19765  ADFF65    NaN   0.200000   0.000000   0.300000   0.500000   0.000000   \n",
                            "2629   A11FB5    NaN   0.090909   0.272727   0.090909   0.363636   0.181818   \n",
                            "1355   A0519F    NaN   0.187500   0.156250   0.125000   0.500000   0.031250   \n",
                            "\n",
                            "         boxes1    boxes2    boxes3    boxes4    boxes5    speed1    speed2  \\\n",
                            "12275  0.212121  0.515152  0.242424  0.030303  0.000000  0.271168  0.494554   \n",
                            "2828   0.492537  0.328358  0.164179  0.000000  0.014925  0.134059  0.274446   \n",
                            "2720   0.250000  0.583333  0.166667  0.000000  0.000000  0.866572  0.071664   \n",
                            "8466   0.458333  0.458333  0.083333  0.000000  0.000000  0.562937  0.226224   \n",
                            "9204   0.600000  0.266667  0.133333  0.000000  0.000000  0.058968  0.120803   \n",
                            "...         ...       ...       ...       ...       ...       ...       ...   \n",
                            "15256  0.087719  0.526316  0.245614  0.140351  0.000000  0.256432  0.545244   \n",
                            "14100  0.160550  0.532110  0.293578  0.013761  0.000000  0.185068  0.140729   \n",
                            "19765  0.100000  0.100000  0.500000  0.100000  0.200000  0.016369  0.000000   \n",
                            "2629   0.181818  0.454545  0.181818  0.000000  0.181818  0.071307  0.691002   \n",
                            "1355   0.093750  0.156250  0.062500  0.125000  0.562500  0.049751  0.072554   \n",
                            "\n",
                            "         speed3    speed4    speed5  altitude1  altitude2  altitude3  \\\n",
                            "12275  0.212671  0.016859  0.004747   0.018678   0.065840   0.345793   \n",
                            "2828   0.197484  0.148554  0.245457   0.001251   0.005371   0.008167   \n",
                            "2720   0.035361  0.020745  0.005658   0.053748   0.123055   0.665724   \n",
                            "8466   0.138811  0.056294  0.015734   0.000000   0.009091   0.039860   \n",
                            "9204   0.190418  0.151515  0.478296   0.000819   0.001229   0.017199   \n",
                            "...         ...       ...       ...        ...        ...        ...   \n",
                            "15256  0.195710  0.001842  0.000772   0.005823   0.000000   0.114491   \n",
                            "14100  0.144585  0.149317  0.380301   0.000175   0.075535   0.594988   \n",
                            "19765  0.002976  0.025298  0.955357   0.000000   0.034226   0.014881   \n",
                            "2629   0.230900  0.006791  0.000000   0.000000   0.000000   0.044143   \n",
                            "1355   0.153814  0.171642  0.552239   0.046434   0.071310   0.115257   \n",
                            "\n",
                            "       altitude4  altitude5    steer1    steer2    steer3    steer4    steer5  \\\n",
                            "12275   0.568557   0.001131  0.166840  0.315047  0.301537  0.096653  0.015661   \n",
                            "2828    0.053271   0.931940  0.152969  0.248841  0.266132  0.175116  0.010448   \n",
                            "2720    0.157473   0.000000  0.151344  0.176803  0.181047  0.300802  0.019331   \n",
                            "8466    0.866434   0.084615  0.144406  0.226923  0.222378  0.268182  0.013986   \n",
                            "9204    0.014742   0.966011  0.106880  0.240377  0.303440  0.207617  0.008190   \n",
                            "...          ...        ...       ...       ...       ...       ...       ...   \n",
                            "15256   0.879033   0.000654  0.135702  0.091854  0.285426  0.106530  0.086448   \n",
                            "14100   0.328952   0.000351  0.124956  0.090256  0.117946  0.293200  0.011917   \n",
                            "19765   0.074405   0.876488  0.098214  0.092262  0.159226  0.245536  0.032738   \n",
                            "2629    0.723260   0.232598  0.112054  0.101868  0.190153  0.134126  0.028862   \n",
                            "1355    0.198176   0.568823  0.041874  0.080431  0.250415  0.250829  0.055970   \n",
                            "\n",
                            "         steer6    steer7    steer8  flights  squawk_1  observations     type  \\\n",
                            "12275  0.047095  0.004015  0.009250       99       230         45079     T206   \n",
                            "2828   0.064013  0.014495  0.018247       67      5103         13591  unknown   \n",
                            "2720   0.085809  0.010372  0.028289       12      4415          2121  unknown   \n",
                            "8466   0.062587  0.004196  0.017483       24      5310          2860     P210   \n",
                            "9204   0.064701  0.014333  0.017199       15      5106          2442  unknown   \n",
                            "...         ...       ...       ...      ...       ...           ...      ...   \n",
                            "15256  0.211514  0.020914  0.011467       57       362         16831     C182   \n",
                            "14100  0.170172  0.060813  0.075359      218      1200          5706     C208   \n",
                            "19765  0.218750  0.043155  0.059524       10      4552           672  unknown   \n",
                            "2629   0.317487  0.039049  0.028862       11         0           589     C82R   \n",
                            "1355   0.252488  0.024046  0.016998       32      4610          2412     C501   \n",
                            "\n",
                            "       type_code  predicted  predicted_prob  \n",
                            "12275        417        1.0        0.919753  \n",
                            "2828         454        1.0        0.896639  \n",
                            "2720         454        1.0        0.896194  \n",
                            "8466         322        1.0        0.890482  \n",
                            "9204         454        1.0        0.889338  \n",
                            "...          ...        ...             ...  \n",
                            "15256         91        0.0        0.323519  \n",
                            "14100         97        0.0        0.323506  \n",
                            "19765        454        0.0        0.321905  \n",
                            "2629         132        0.0        0.321645  \n",
                            "1355         119        0.0        0.321207  \n",
                            "\n",
                            "[200 rows x 37 columns]"
                        ]
                    },
                    "execution_count": 39,
                    "metadata": {},
                    "output_type": "execute_result"
                }
            ],
            "source": [
                "top_predictions = real_df.sort_values(by='predicted_prob', ascending=False).head(200)\n",
                "top_predictions"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 40,
            "metadata": {},
            "outputs": [],
            "source": [
                "top_predictions.to_csv(\"planes-to-research.csv\")"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "# Questions"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 1\n",
                "\n",
                "What kind of machine learning are we doing here, and why are we doing it?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 41,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Classification (or supervised learning) because we have labels"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 2\n",
                "\n",
                "What are a few different ways you can deal with categorical data? Think about how we dealt with race in the reveal regression compared to how we dealt with type in this dataset."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 42,
            "metadata": {},
            "outputs": [],
            "source": [
                "# You can one-hot encode them if you have few\n",
                "# You can just make them numbers if you have a lot"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 3\n",
                "\n",
                "Every time we ran a machine learning algorithm on our dataset, we looked at feature importance.\n",
                "\n",
                "* When might it be important to explain what our model found important?\n",
                "* When might it not be important?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 43,
            "metadata": {},
            "outputs": [],
            "source": [
                "# If we're trying to understand what's going wrong or why it is/isn't working well\n",
                "# It's more important if we're presenting this to the public"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 4\n",
                "\n",
                "Using words and not column names, describe what the machine learning algorithm found to be important when identifying surveillance planes."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 44,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Slow speed, constant turning vs going straight"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 5\n",
                "\n",
                "Why did we use test/train split when it would have been more effective to give our model all of the data from the start?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 45,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Shouldn't test on things that it's already seen"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 6\n",
                "\n",
                "Why did we use a random forest instead of a decision tree or logistic regression? Was there something about the data?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 46,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Because it did a better job!!!"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 7\n",
                "\n",
                "Why did we use probability instead of just looking for planes with a predicted value of 1? It seems like we should have just trusted the algorithm, right?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 47,
            "metadata": {},
            "outputs": [],
            "source": [
                "# The 0/1 is an arbitrary cutoff of 50%, we're fine going lower because it gives us more to research"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 8\n",
                "\n",
                "What if our random forest or input dataset were flawed? What would be the repercussions?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 48,
            "metadata": {},
            "outputs": [],
            "source": [
                "# We'd be investigating a bunch of planes that didn't need to be investigated"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 9\n",
                "\n",
                "The government could claim that we're threatening national security by publishing this paper as well as publishing this code - now anyone could look for planes that are surveilling them. What do you think?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 49,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Up to you!"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 10\n",
                "\n",
                "We're using data from the past, but you can get real-time flight data from many services. Can you think of any uses for this algorithm using real-time instead of historical data?"
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 50,
            "metadata": {},
            "outputs": [],
            "source": [
                "# Finding out when something crazy is going on police-wise, maybe"
            ]
        },
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": [
                "### Question 11\n",
                "\n",
                "This isn't a question, but if you look at `candidates.csv` and `candidates-annotates.csv` you can see how Buzzfeed did their research after finding a list of suspicious planes."
            ]
        },
        {
            "cell_type": "code",
            "execution_count": 51,
            "metadata": {},
            "outputs": [],
            "source": [
                "# k"
            ]
        }
    ],
    "metadata": {
        "kernelspec": {
            "display_name": "Python 3",
            "language": "python",
            "name": "python3"
        },
        "language_info": {
            "codemirror_mode": {
                "name": "ipython",
                "version": 3
            },
            "file_extension": ".py",
            "mimetype": "text/x-python",
            "name": "python",
            "nbconvert_exporter": "python",
            "pygments_lexer": "ipython3",
            "version": "3.6.8"
        },
        "toc": {
            "base_numbering": 1,
            "nav_menu": {},
            "number_sections": true,
            "sideBar": true,
            "skip_h1_title": false,
            "title_cell": "Table of Contents",
            "title_sidebar": "Contents",
            "toc_cell": false,
            "toc_position": {},
            "toc_section_display": true,
            "toc_window_display": false
        }
    },
    "nbformat": 4,
    "nbformat_minor": 2
}