Abstract |
An instance weighted learning (IWL) machine learning model. In one example embodiment, a method of employing an IWL machine learning model to train a classifier may include determining a quality value that should be associated with each machine learning training instance in a temporal sequence of reinforcement learning machine learning training instances, associating the corresponding determined quality value with each of the machine learning training instances, and training a classifier using each of the machine learning training instances. Each of the machine learning training instances includes a state-action pair and is weighted during the training based on its associated quality value using a weighting factor that weights different quality values differently such that the classifier learns more from a machine learning training instance with a higher quality value than from a machine learning training instance with a lower quality value. |
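
By way of illustration only, the following Python sketch shows one way the three steps of the example embodiment (determining quality values, associating them with the instances, and weighted training) might fit together. The abstract does not specify how quality values are determined or what form the weighting factor takes, so everything here is an assumption: `assign_quality` propagates a final outcome backward through the temporal sequence with a hypothetical discount, `weighting_factor` is a hypothetical monotonic function of the quality value, and a simple logistic-regression classifier stands in for whatever classifier an implementation would actually use. Each feature vector is assumed to encode a state-action pair.

```python
import math


def assign_quality(num_steps, final_reward, discount=0.9):
    """Hypothetical scheme: spread the outcome of a temporal sequence backward,
    so instances closer to the outcome receive values closer to final_reward."""
    return [final_reward * discount ** (num_steps - 1 - t) for t in range(num_steps)]


def weighting_factor(quality):
    """Hypothetical weighting factor: grows with quality, never negative."""
    return max(0.0, quality)


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_weighted(instances, epochs=200, lr=0.5):
    """Train a logistic-regression classifier by stochastic gradient descent.

    instances: list of (features, label, quality), where features is a list
    of floats encoding a state-action pair and label is 0 or 1.
    Each update is scaled by weighting_factor(quality), so the classifier
    learns more from instances with higher quality values.
    """
    dim = len(instances[0][0])
    weights = [0.0] * dim
    bias = 0.0
    for _ in range(epochs):
        for features, label, quality in instances:
            z = sum(w * x for w, x in zip(weights, features)) + bias
            error = label - sigmoid(z)
            step = lr * weighting_factor(quality) * error
            weights = [w + step * x for w, x in zip(weights, features)]
            bias += step
    return weights, bias


def predict(model, features):
    """Return the predicted probability of label 1."""
    weights, bias = model
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


# Two instances with the same state-action encoding but conflicting labels:
# the higher-quality instance should dominate what the classifier learns.
conflicting = [
    ([1.0], 1, 0.9),  # high quality value, label 1
    ([1.0], 0, 0.1),  # low quality value, label 0
]
model = train_weighted(conflicting)
```

In this sketch, `predict(model, [1.0])` ends up favoring the label of the higher-quality instance. A different weighting factor (for example, one that also handles negative quality values) would change how strongly each instance influences the classifier.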