Monitor
Once our model is deployed, we want to make sure that it performs as well in production as it did in training. We can opt in to logging by calling logPrediction each time we make a prediction. Later on, as we get official diagnoses for patients, we can call logTrueValue with the same identifier we passed to logPrediction, so that each diagnosis is matched to the prediction it belongs to.
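To make the role of the identifier concrete, here is a minimal sketch of the pairing idea. It does not use the ModelFox client: `InMemoryMonitor`, its methods, the patient identifiers, and the "Positive"/"Negative" labels are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class PredictionRecord:
    """One logged prediction, joined with its true value once it is known."""
    features: Dict[str, object]
    output: str
    true_value: Optional[str] = None


@dataclass
class InMemoryMonitor:
    """Hypothetical stand-in for a monitoring backend (not the ModelFox API)."""
    records: Dict[str, PredictionRecord] = field(default_factory=dict)

    def log_prediction(self, identifier: str, features: Dict[str, object], output: str) -> None:
        self.records[identifier] = PredictionRecord(features=features, output=output)

    def log_true_value(self, identifier: str, true_value: str) -> None:
        # The identifier must match the one used when the prediction was logged.
        if identifier not in self.records:
            raise KeyError(f"no prediction logged for identifier {identifier!r}")
        self.records[identifier].true_value = true_value

    def accuracy(self) -> Optional[float]:
        # Only predictions that already have a true value can be scored.
        scored = [r for r in self.records.values() if r.true_value is not None]
        if not scored:
            return None
        return sum(r.output == r.true_value for r in scored) / len(scored)


monitor = InMemoryMonitor()
monitor.log_prediction("patient-001", {"age": 63, "chest_pain": "asymptomatic"}, "Positive")
monitor.log_prediction("patient-002", {"age": 41, "chest_pain": "typical angina"}, "Negative")

# The diagnosis for patient-001 arrives later and reuses the same identifier.
monitor.log_true_value("patient-001", "Positive")
print(monitor.accuracy())
```

Predictions without a true value yet, like the second one here, are simply left out of the accuracy until their diagnosis is logged.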
Back in the app, we can look up a prediction by its identifier, and get an explanation that shows how each feature affects the output.
Production Predictions
Now let's see how accurate our model has been in production. Let's open the app and choose Production Metrics in the sidebar.
Production Metrics
Uh oh! The production accuracy is a bit lower than we expected. Let's try to find the cause. Under "Production Stats", we see that the "chest_pain" column has an alert and a high invalid values count. Click on the column to view more details.
Production Stats
Column Stats
Status | Column | Type | Absent Count | Invalid Count |
---|---|---|---|---|
| | age | Number | 0 | 0 |
| | gender | Enum | 0 | 0 |
| | chest_pain | Enum | 0 | 0 |
| | resting_blood_pressure | Number | 0 | 0 |
| | cholesterol | Number | 0 | 0 |
| | fasting_blood_sugar_greater_than_120 | Enum | 0 | 0 |
| | resting_ecg_result | Enum | 0 | 0 |
| | exercise_max_heart_rate | Number | 0 | 0 |
| | exercise_induced_angina | Enum | 0 | 0 |
| | exercise_st_depression | Number | 0 | 0 |
| | exercise_st_slope | Enum | 0 | 0 |
| | fluoroscopy_vessels_colored | Enum | 0 | 0 |
| | thallium_stress_test | Enum | 0 | 0 |
There is a large discrepancy for the value "asymptomatic": it makes up 48.72% of the training data but only 23.11% of the production data. In the Invalid Values table below, we also see a high number of invalid values with the string "asx". It looks like our code is accidentally sending the string "asx" instead of "asymptomatic" for the chest pain column. We can update our code to use the correct value and follow the metrics going forward to confirm they recover.
chest_pain
Unique Values
Value | Training Count | Production Count | Training Fraction | Production Fraction |
---|---|---|---|---|
asymptomatic | 133 | 95 | 48.72% | 23.11% |
atypical angina | 43 | 76 | 15.75% | 18.49% |
non-angina pain | 76 | 122 | 27.84% | 29.68% |
typical angina | 21 | 25 | 7.69% | 6.08% |
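As a rough check on how the fractions in this table relate to the counts, the sketch below recomputes them in plain Python. The counts are copied from this guide; the assumption that invalid rows are included in the production total is inferred from the numbers (it is the only denominator that reproduces the percentages shown), not taken from ModelFox documentation.

```python
# Counts copied from the chest_pain tables in this guide.
training_counts = {
    "asymptomatic": 133,
    "atypical angina": 43,
    "non-angina pain": 76,
    "typical angina": 21,
}
production_counts = {
    "asymptomatic": 95,
    "atypical angina": 76,
    "non-angina pain": 122,
    "typical angina": 25,
}
invalid_counts = {"asx": 93}

training_total = sum(training_counts.values())
# Assumption: invalid rows still count toward the production total.
production_total = sum(production_counts.values()) + sum(invalid_counts.values())

training_fractions = {v: c / training_total for v, c in training_counts.items()}
production_fractions = {v: c / production_total for v, c in production_counts.items()}

# Print each value's training fraction, production fraction, and the shift.
for value in training_counts:
    shift = production_fractions[value] - training_fractions[value]
    print(
        f"{value}: {training_fractions[value]:.2%} -> "
        f"{production_fractions[value]:.2%} ({shift:+.2%})"
    )
```

If the 93 "asx" rows were counted as "asymptomatic", that value would account for 188 of 411 production rows, about 45.7%, which is much closer to its training fraction.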
Invalid Values
Value | Count | Production Fraction |
---|---|---|
asx | 93 | 22.63% |
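The real fix is to send "asymptomatic" wherever the code currently sends "asx". As an extra guard, a small validation step in front of the prediction call can catch this kind of mismatch early. The sketch below is one possible approach, not part of ModelFox: the function name, the alias table, and the decision to raise on unknown values are assumptions made for illustration. The set of valid values is taken from the Unique Values table.

```python
# Valid enum values for chest_pain, taken from the Unique Values table.
VALID_CHEST_PAIN = {
    "asymptomatic",
    "atypical angina",
    "non-angina pain",
    "typical angina",
}

# Hypothetical alias table: maps known bad spellings to the trained value.
CHEST_PAIN_ALIASES = {"asx": "asymptomatic"}


def normalize_chest_pain(raw: str) -> str:
    """Return a value the model was trained on, or raise if it is unknown."""
    value = CHEST_PAIN_ALIASES.get(raw, raw)
    if value not in VALID_CHEST_PAIN:
        raise ValueError(f"unexpected chest_pain value: {raw!r}")
    return value


print(normalize_chest_pain("asx"))
print(normalize_chest_pain("typical angina"))
```

Raising on unknown values makes a new mismatch fail loudly in our own code instead of showing up later as an invalid value count in production.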
Hooray! You made it to the end! In this guide, we learned how to train a model, make predictions from our code, tune our model, and monitor it in production. If you want help using ModelFox with your own data, send us an email at [email protected] or ask a question on GitHub Discussions.