Deep learning models used for predictive modeling may need to be updated. This might be because the data has changed since the model was built and deployed, or because more labeled data has become available since the model was developed, with the expectation that the additional data will improve the model's performance.
When updating neural network models with new data, it is critical to experiment with and evaluate a variety of approaches, especially if model updating will be automated, such as on a periodic schedule.
There are several ways to update neural network models, but the two basic approaches are to use the existing model as a starting point and retrain it, or to leave the existing model untouched and combine its predictions with those of a new model.
In this blog you will learn how to update models in response to continuous data updates.
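To make the two basic approaches concrete, here is a minimal sketch using Keras on synthetic data; the arrays, layer sizes, and learning rates are illustrative assumptions, not recommendations.

```python
import numpy as np
from tensorflow import keras

# Synthetic stand-ins for the original data and the newly collected, labeled data.
rng = np.random.default_rng(0)
X_old, y_old = rng.normal(size=(500, 10)), rng.integers(0, 2, size=500)
X_new, y_new = rng.normal(size=(200, 10)), rng.integers(0, 2, size=200)

def build_model():
    model = keras.Sequential([
        keras.Input(shape=(10,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="binary_crossentropy")
    return model

# The existing (deployed) model, trained on the old data.
old_model = build_model()
old_model.fit(X_old, y_old, epochs=5, verbose=0)

# Option 1: use the existing model as a starting point and keep training on the new data.
updated = keras.models.clone_model(old_model)
updated.set_weights(old_model.get_weights())
updated.compile(optimizer=keras.optimizers.Adam(1e-4), loss="binary_crossentropy")  # smaller learning rate for fine-tuning
updated.fit(X_new, y_new, epochs=5, verbose=0)

# Option 2: leave the old model alone, train a new model on the new data,
# and combine the two models' predictions (here, a simple average).
new_model = build_model()
new_model.fit(X_new, y_new, epochs=5, verbose=0)
X_query = rng.normal(size=(50, 10))
combined = 0.5 * (old_model.predict(X_query, verbose=0) + new_model.predict(X_query, verbose=0))
```

The fine-tuning path typically uses a smaller learning rate so that the new data adjusts, rather than overwrites, what the existing model has already learned.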
Model Stability
In dynamic systems analysis, a stable model is one that remains constant (or changes only minimally) in the presence of disturbances. Simply put, a stable system is resistant to outside influences.
The process of selecting and finalizing a deep learning neural network model for a predictive modeling project is only the start. The model can then be used to make predictions on new data. One issue you may run into is that the nature of the prediction problem can change over time, which often shows up as prediction accuracy that continues to deteriorate, because the assumptions the model was built on have changed or no longer hold. This is known as "concept drift": the underlying probability distributions of variables and the relationships between variables change over time, severely degrading the model built from the earlier data.
Concept drift can affect your model at different points in time, depending on the prediction problem you are solving and the model you have chosen to solve it. It can be helpful to track a model's performance over time and use a significant drop in performance as a trigger to modify the model, such as retraining it on new data.
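As a simple illustration of using a performance drop as a trigger, the sketch below compares made-up weekly accuracy figures against a baseline with a hypothetical tolerance; both the numbers and the threshold are assumptions for illustration only.

```python
def should_retrain(recent_accuracy, baseline_accuracy, tolerance=0.05):
    """Flag a retrain when recent accuracy falls noticeably below the baseline."""
    return recent_accuracy < baseline_accuracy - tolerance

# Hypothetical accuracy measured on fresh labeled data each week after deployment.
weekly_accuracy = [0.91, 0.90, 0.89, 0.86, 0.83, 0.80]
baseline = weekly_accuracy[0]

for week, acc in enumerate(weekly_accuracy, start=1):
    if should_retrain(acc, baseline):
        print(f"Week {week}: accuracy {acc:.2f} has drifted; retrain the model on recent data.")
```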
Alternatively, you may be aware that data in your domain changes often enough that a model update is necessary on a regular basis, such as weekly, monthly, or annually.
Finally, you may run the model for a period and collect new data with known outcomes to use to update the model, in the hope of improving predictive performance.
Measure of Model Stability
It should be noted that "stability" in this context refers to changes in the model's behavior observed under continuous updates to the data used to train the model (CDUs). To measure differences in model behavior corresponding to CDUs, jitter must be defined as a function of many "versions" of a given model pθ. Here, p represents a specific model architecture with a specific set of hyper-parameters θ that are held fixed across many training and testing runs. The experimental variable that changes across these train-test runs is the training data used to train model pθ. Given a "base" training data set D and model pθ, the model is trained N times, each time with a different version Di of the base data set D.
The N trained models, pθ1, pθ2, ..., pθN, are then applied to a test set X, generating predictions Yi corresponding to each learned model pθi.
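A sketch of this train-and-test protocol is shown below; the synthetic data, the way each version Di is derived (a bootstrap resample of D standing in for a data update), and the small Keras model are all illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)

# Synthetic stand-ins for the base training set D and the fixed test set X.
D_features, D_labels = rng.normal(size=(1000, 10)), rng.integers(0, 2, size=1000)
X_test = rng.normal(size=(200, 10))

def build_model():
    # One fixed architecture p with fixed hyper-parameters theta.
    model = keras.Sequential([
        keras.Input(shape=(10,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="binary_crossentropy")
    return model

N = 5
Y = []  # Y_1 ... Y_N: predicted labels from each trained version of the model
for i in range(N):
    # D_i: a perturbed version of D (here, a bootstrap resample of the base data).
    idx = rng.integers(0, len(D_features), size=len(D_features))
    model_i = build_model()
    model_i.fit(D_features[idx], D_labels[idx], epochs=5, verbose=0)
    Y.append((model_i.predict(X_test, verbose=0) > 0.5).astype(int).ravel())
```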
The difference between two models, pθi and pθj, is captured by Churn, which is essentially the proportion of outputs in X on which the two models disagree (i.e., differences between Yi and Yj). We use Churn to define the notion of "pairwise jitter":

Jitter(pθi, pθj) = (1 / |X|) Σ_{x in X} (1 / |x|) Σ_{t=1..|x|} 1[ pθi(xt) ≠ pθj(xt) ]

Here x is a data point in dataset X, pθi(xt) and pθj(xt) are the two models' predictions for xt, xt is the tth item of sequence x, and |x| denotes sequence x's length. Extending this to all models trained on the derived training sets D1, D2, ..., DN, we can average pairwise jitter over all pairs of models and obtain a more general definition of jitter:

Jitter(pθ) = (1 / C(N, 2)) Σ_{i < j} Jitter(pθi, pθj)

where C(N, 2) = N(N − 1)/2 is the number of model pairs.
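With those definitions in hand, churn-based pairwise jitter and the averaged jitter can be computed roughly as follows. This is a sketch that treats each prediction as a sequence of labels (a plain classifier is just the length-one case); the toy label arrays are made up for illustration.

```python
from itertools import combinations
import numpy as np

def pairwise_jitter(Y_i, Y_j):
    """Churn between two models' predictions: the fraction of per-token outputs
    on which they disagree, averaged over the test examples."""
    per_example = [np.mean(np.asarray(a) != np.asarray(b)) for a, b in zip(Y_i, Y_j)]
    return float(np.mean(per_example))

def jitter(Y_all):
    """Average pairwise jitter over all pairs of the N models' predictions."""
    scores = [pairwise_jitter(Y_all[i], Y_all[j])
              for i, j in combinations(range(len(Y_all)), 2)]
    return float(np.mean(scores))

# Toy example: three trained versions of the model, four test sequences each.
Y1 = [[1, 0, 2], [0, 0], [1], [2, 2]]
Y2 = [[1, 0, 2], [0, 1], [1], [2, 2]]
Y3 = [[1, 1, 2], [0, 0], [0], [2, 2]]
print(jitter([Y1, Y2, Y3]))  # average per-token disagreement across all model pairs
```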
It is important to note that jitter is not restricted to neural architectures and can be applied to any model that can be trained on labeled training data. It is also worth noting that jitter is not the same as the commonly cited metric of variance in error rate or accuracy: jitter captures the differences between the models' outputs on each individual test case, whereas variance reflects differences in the models' aggregate accuracy or error-rate metrics.
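To see the distinction with a toy example: two models can have identical accuracy (so zero variance in the aggregate metric) while still disagreeing on many individual test cases (high jitter). The labels below are made up purely for illustration.

```python
import numpy as np

truth   = np.array([1, 0, 1, 0, 1, 0])
model_a = np.array([1, 0, 1, 0, 0, 1])   # 4 of 6 correct
model_b = np.array([0, 1, 1, 0, 1, 0])   # also 4 of 6 correct, so accuracy variance is zero

accuracy_a   = np.mean(model_a == truth)    # ~0.67
accuracy_b   = np.mean(model_b == truth)    # ~0.67
disagreement = np.mean(model_a != model_b)  # ~0.67 -> the models differ on 4 of 6 cases
```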
Approaches for Model Stability with Continuous Data Updates
In the context of continuous data updates, prior models are often abandoned and replaced with new models trained on the refreshed or updated data. Rather than discarding or overwriting past models, we can look at two strategies that build on multiple models (both are sketched in code after this list):
- The first is incremental training, which has previously been shown to produce robust models with low error rates. In incremental training, a model is trained on a preliminary version of the dataset D and subsequently retrained (or fine-tuned) on an updated dataset Di.
- The second is ensemble model training, which has been shown to reliably promote stability in learned models. In ensemble model training, an ensemble is built from five models, one trained on each of the updated datasets Di.
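Here is a minimal sketch of both strategies, using the same kind of synthetic data and small Keras model as in the earlier sketches; the bootstrap-derived dataset versions and the hyper-parameters are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(0)
X_base, y_base = rng.normal(size=(1000, 10)), rng.integers(0, 2, size=1000)
X_test = rng.normal(size=(200, 10))

def build_model():
    model = keras.Sequential([
        keras.Input(shape=(10,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer=keras.optimizers.Adam(1e-3), loss="binary_crossentropy")
    return model

# Derived dataset versions D_1 ... D_N (bootstrap resamples stand in for data updates).
N = 5
versions = [rng.integers(0, len(X_base), size=len(X_base)) for _ in range(N)]

# Incremental training: for each update D_i, start from the base model's weights and fine-tune.
base = build_model()
base.fit(X_base, y_base, epochs=5, verbose=0)

incremental_models = []
for idx in versions:
    m = keras.models.clone_model(base)
    m.set_weights(base.get_weights())
    m.compile(optimizer=keras.optimizers.Adam(1e-4), loss="binary_crossentropy")
    m.fit(X_base[idx], y_base[idx], epochs=2, verbose=0)
    incremental_models.append(m)

# Ensemble: train one model per D_i and average their predictions on the test set.
members = []
for idx in versions:
    m = build_model()
    m.fit(X_base[idx], y_base[idx], epochs=5, verbose=0)
    members.append(m)
ensemble_pred = np.mean([m.predict(X_test, verbose=0) for m in members], axis=0)
```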
Compared to the baseline model, both the ensemble (E) and incremental training (IT) result in lower jitter.
Conclusion
In this blog we looked at a prevalent problem in large, complex systems: ML model "stability" in the face of constant data updates. We specifically investigated the effect of modeling choices on model stability as measured by jitter, and found that jitter is heavily influenced by architecture and input representation.
We also found that model ensembles and incremental training exhibit lower jitter and hence higher stability, with ensemble approaches proving more stable than incremental training.