Retraining Existing Skills To Take Advantage Of Updated Algorithms
Firstly, the advantage of cloud Conversational AI providers is that you do not have to worry about what happens under the hood.
The disadvantage is that you do not know what happens under the hood. 🙂
A while back, IBM Watson Assistant introduced enhanced intent detection. No real detail was given as to what this new version entailed.
However, I performed a few basic tests, and the results were encouraging.
But the advantage here is that you can switch between the two versions of the intent detection model, should you experience a degradation in accuracy. There needs to be some kind of balance, where administrators can dive deeper into detailed fine-tuning should they wish to do so, or remain with the automatic optimization available.
Then there is also the competitive advantage the cloud providers need to protect, and a value proposition which differentiates them from the open-source crowd.
Secondly, there is a basic level of fine-tuning which needs to be available. It is not uncommon for underlying base machine learning models to have expiry dates, which need to be managed in order to ensure the availability of the model.
A typical environment would be one where each model or skill is marked against an underlying baseline machine learning model, which is dated, and where at least one older model is available for rollback.
The way-of-work will then entail:
- Create a skill trained with the production data against the new model.
- Benchmark the new model.
- Upon successful improvement, or at least equivalent results, migrate.
- In the case of a degradation in accuracy and performance in general, don’t migrate, and see if a leap can be made to a more recent model.
- If forced to move away from a model which will be deprecated, with no improvement in later models, contact cloud support.
- Or make adjustments to accommodate the changes in the NLU model responses.
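The benchmark-then-decide steps above can be sketched in a few lines. This is a minimal illustration, not a provider API: the `Classifier` type, the `make_stub` helper, the sample utterances and the `tolerance` parameter are all assumptions made for the example. In practice each classifier would wrap a call to the skill trained on the old or the new model.

```python
from typing import Callable, Dict, Sequence, Tuple

# A classifier maps an utterance to (top intent, confidence).
# Hypothetical type: in practice it would wrap a call to the skill under test.
Classifier = Callable[[str], Tuple[str, float]]
LabeledSet = Sequence[Tuple[str, str]]  # (utterance, expected intent)


def accuracy(classify: Classifier, test_set: LabeledSet) -> float:
    """Fraction of test utterances whose top intent matches the expected label."""
    if not test_set:
        raise ValueError("test_set must not be empty")
    correct = sum(1 for text, expected in test_set if classify(text)[0] == expected)
    return correct / len(test_set)


def should_migrate(current: Classifier, candidate: Classifier,
                   test_set: LabeledSet, tolerance: float = 0.0) -> bool:
    """Migrate on improvement or equivalent results (within an optional tolerance)."""
    return accuracy(candidate, test_set) >= accuracy(current, test_set) - tolerance


def make_stub(answers: Dict[str, Tuple[str, float]]) -> Classifier:
    """Build a stand-in classifier from canned answers (illustration only)."""
    def classify(text: str) -> Tuple[str, float]:
        return answers.get(text, ("unknown", 0.0))
    return classify


# Made-up benchmark data, not real results.
TEST_SET = [
    ("what is my balance", "check_balance"),
    ("block my card", "block_card"),
    ("talk to a human", "agent"),
    ("hi there", "greeting"),
]

current_model = make_stub({
    "what is my balance": ("check_balance", 0.91),
    "block my card": ("block_card", 0.84),
    "talk to a human": ("greeting", 0.52),  # misclassified by the current model
    "hi there": ("greeting", 0.97),
})
candidate_model = make_stub({
    "what is my balance": ("check_balance", 0.88),
    "block my card": ("block_card", 0.90),
    "talk to a human": ("agent", 0.76),
    "hi there": ("greeting", 0.95),
})

print(accuracy(current_model, TEST_SET))    # 0.75
print(accuracy(candidate_model, TEST_SET))  # 1.0
print(should_migrate(current_model, candidate_model, TEST_SET))  # True
```

Top-intent accuracy is only one possible benchmark. Because confidence scores can shift even when the top intent stays the same, a fuller comparison would probably also look at the scores themselves.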
The Approach Of Watson Assistant
On 23 August 2021, IBM introduced automatic retraining of existing skills in Watson Assistant. According to IBM, in most cases, the retraining will be seamless from an end-user point of view.
Hence these updates will not affect general chatbot functionality, but only intent and entity accuracy scores. IBM states the following:
The same inputs will result in the same intents and entities being detected. In some cases, the retraining might cause changes in accuracy.
This is encouraging, but the changes in accuracy can cause changes in the predictability of the chatbot. The dialog nodes within a Dialog Skill can be set to be triggered based on accuracy thresholds, or on differences between scores.
This functionality can be broken with these changes in accuracy scores.
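To make the risk concrete, here is a small sketch of threshold-based routing. The threshold values, the branch names and the scores are hypothetical and chosen only for illustration. They are not taken from Watson Assistant, and a real Dialog Skill expresses such conditions in its own node configuration rather than in Python.

```python
# Hypothetical thresholds, for illustration only.
CONFIDENT = 0.70     # answer directly at or above this score
DISAMBIGUATE = 0.40  # ask a clarifying question at or above this score
MIN_GAP = 0.15       # minimum lead the top intent needs over the runner-up


def route(confidence: float) -> str:
    """Pick a dialog branch from the top intent's score."""
    if confidence >= CONFIDENT:
        return "answer"
    if confidence >= DISAMBIGUATE:
        return "clarify"
    return "fallback"


def is_ambiguous(top: float, runner_up: float, min_gap: float = MIN_GAP) -> bool:
    """True when the top two intents score too close together."""
    return (top - runner_up) < min_gap


# Same utterance and same top intent, but slightly different scores.
before = {"top": 0.72, "runner_up": 0.55}
after = {"top": 0.68, "runner_up": 0.58}

print(route(before["top"]), is_ambiguous(before["top"], before["runner_up"]))
print(route(after["top"]), is_ambiguous(after["top"], after["runner_up"]))
```

With these made-up numbers the detected intent never changes, yet the conversation moves from a direct answer to a clarifying question. That is the kind of behavioural change a retrained model can introduce without any edit to the skill.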
The Watson Assistant service will continually monitor all ML models, and will automatically retrain those models that have not been retrained in the previous 6 months.
Hence skills will automatically be retrained on the latest model. Skills that have been modified during the previous 6 months will not be affected.
Administrator Due Diligence
In order to pre-empt any possible aberrations or degradation in the chatbot application resulting from automatic retraining of existing models, Watson Assistant administrators can follow this cadence:
- Identify skills nearing their six-month expiry date, counted from when they were last created or updated.
- Copy those skills.
- Retrain a new version.
- Run test scenarios.
- If all works as expected, the newly trained skills can be productionized.
- If not, vulnerabilities can be corrected in the pre-production environment prior to being productionized.
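The first step of this cadence, finding skills that are approaching the six-month mark, could be scripted along the following lines. This is a sketch under stated assumptions: each skill record is taken to carry a name and a last-updated timestamp, six months is approximated as 182 days, and the 30-day warning window is an arbitrary choice. The skill names and dates are invented.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Tuple

RETRAIN_AFTER = timedelta(days=182)  # assumption: "six months" approximated as 182 days
WARNING_WINDOW_DAYS = 30             # assumption: start reviewing 30 days ahead


def days_until_auto_retrain(last_updated: datetime, now: datetime) -> int:
    """Whole days left before the six-month mark (negative once it has passed)."""
    return (last_updated + RETRAIN_AFTER - now).days


def skills_to_review(skills: List[Dict], now: datetime) -> List[Tuple[str, int]]:
    """Return (name, days_left) for skills inside the warning window, most urgent first."""
    due = []
    for skill in skills:
        days_left = days_until_auto_retrain(skill["updated"], now)
        if days_left <= WARNING_WINDOW_DAYS:
            due.append((skill["name"], days_left))
    return sorted(due, key=lambda item: item[1])


# Invented skill records; "name" and "updated" are hypothetical field names.
NOW = datetime(2021, 9, 1, tzinfo=timezone.utc)
SKILLS = [
    {"name": "Banking FAQ", "updated": datetime(2021, 3, 15, tzinfo=timezone.utc)},
    {"name": "Card Support", "updated": datetime(2021, 8, 1, tzinfo=timezone.utc)},
    {"name": "Greetings", "updated": datetime(2021, 2, 20, tzinfo=timezone.utc)},
]

for name, days_left in skills_to_review(SKILLS, NOW):
    print(f"{name}: {days_left} days until automatic retraining")
# Greetings: -11 days until automatic retraining
# Banking FAQ: 12 days until automatic retraining
```

A negative number means the skill is already past the six-month mark and may have been retrained automatically, so it would be the first candidate for a copy-and-test run.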
The challenge, of course, is that the administrator or developer has no visibility into model expiry, transitions and the like.
There are a few improvements which should be made to this feature addition in IBM Watson Assistant.
- The ideal situation is one where the machine learning models are dated or named, with at least the last three models available for use.
- The expiry date of the models must be displayed, and developers should be alerted when the model a skill was trained on is nearing expiration.
- Developers must be able to select the newer model, train their data on it, and after benchmarking decide whether they want to migrate to the new model or not.
- If there are any deviations, adjustments can be made to training data, or to dialog thresholds and conditions, allowing for a seamless transition from a user perspective.
- Skills will not be updated automatically within six months of training; hence a timeline displaying the time remaining will be helpful.
A positive is that no skill will stop working or expire in any way.
There is an allure to a fully or largely automated environment which is managed on behalf of the developer. But predictability and management of the solution are also important.
Hence a balance must be struck: complex functionality needs to be surfaced to the administrator in a simple manner.