Abstract |
A feedback control system for automatic on-line training of a controller for a plant, the system having a reinforcement learning agent connected in parallel with the controller. The learning agent comprises an actor network and a critic network operatively arranged to carry out at least one sequence of a stability phase followed by a learning phase. During the stability phase, a multi-dimensional boundary of values is determined. During the learning phase, a plurality of updated weight values is generated in connection with the on-line training until one of the updated weight values reaches the boundary, if it ever does; at that time, a next sequence is carried out to determine a next multi-dimensional boundary of values, followed by a next learning phase. Also disclosed is a method for automatic on-line training of a feedback controller within a system comprising the controller and a plant, by employing a reinforcement learning agent comprising a neural network to carry out at least one sequence comprising a stability phase followed by a learning phase. Further included is computer executable program code on a computer readable storage medium for on-line training of a feedback controller within a system comprising the controller and a plant.
|
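The alternation of stability and learning phases described in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: the boundary is modeled as a fixed-width box around the current weights (a placeholder for the actual stability analysis), the weight update is supplied by a hypothetical `update_fn`, and a weight that would cross the boundary is clipped onto it. The actor and critic networks are omitted; only the phase-switching logic is shown. All function and parameter names are illustrative.

```python
import numpy as np


def stability_phase(weights, half_width):
    """Placeholder stability phase (assumption): return a box of fixed
    half-width around the current weights as the multi-dimensional boundary."""
    return weights - half_width, weights + half_width


def reached_boundary(weights, lower, upper):
    """True if any weight sits on the lower or upper face of the box."""
    return bool(np.any(weights <= lower) or np.any(weights >= upper))


def train(weights, update_fn, half_width=0.5, max_steps=100):
    """Alternate stability and learning phases for at most max_steps updates.

    update_fn is a hypothetical stand-in for the learning rule; it maps the
    current weights to a weight increment. Returns the final weights and the
    number of sequences (stability phase plus learning phase) started.
    """
    weights = np.asarray(weights, dtype=float)
    sequences = 0
    steps = 0
    while steps < max_steps:
        # Stability phase: determine the boundary for this sequence.
        lower, upper = stability_phase(weights, half_width)
        sequences += 1
        # Learning phase: update until a weight reaches the boundary.
        while steps < max_steps:
            candidate = weights + update_fn(weights)
            steps += 1
            # Assumption: clip so the weights never leave the box.
            weights = np.clip(candidate, lower, upper)
            if reached_boundary(weights, lower, upper):
                break  # start the next sequence with a new boundary
    return weights, sequences
```

With a constant increment, for example `train([0.0, 0.0], lambda w: np.full_like(w, 0.2), max_steps=10)`, the weights repeatedly hit the upper face of the box, and each hit triggers a new stability phase. If the weights never reach the boundary, the first learning phase simply runs until the step budget is exhausted.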