Title of invention: REINFORCEMENT LEARNING CONTROLLER
Abstract: PROBLEM TO BE SOLVED: To provide a control technique that learns a method of generating an operation signal with which a controlled object can be operated safely even at an early stage of learning. SOLUTION: A controller is provided with functions to generate an operation signal to be applied to the controlled object 100 and to a model 400 that imitates the characteristics of the controlled object, to receive an evaluation value signal calculated on the basis of a measurement signal obtained as a result of applying the operation signal to the controlled object and to the model, and to learn a method of generating the operation signal so that the expected value of the total sum of the evaluation value signals obtained from the current state to a future state becomes minimum or maximum. In the controller, the evaluation value signal 208 for the model is calculated by adding a first evaluation value 206, obtained on the basis of the deviation between a measurement signal 205 from the model and a target value, and a second evaluation value 207, obtained on the basis of the difference in characteristics between the model and the controlled object. COPYRIGHT: (C)2007,JPO&INPIT
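The composition of the evaluation value signal 208 described in the abstract can be illustrated with a minimal Python sketch. The abstract only states that the first evaluation value 206 and the second evaluation value 207 are added; everything else here is an assumption made for illustration: the function names, the use of an absolute deviation for the first value, the representation of the model/controlled-object difference as a single non-negative number `model_error`, the scaling factor `weight`, and the convention that the learner minimizes the sum.

```python
def first_evaluation(model_measurement: float, target: float) -> float:
    """Evaluation value 206: based on the deviation between the model's
    measurement signal and the target value.
    Assumption: absolute deviation; the abstract does not give the formula."""
    return abs(model_measurement - target)


def second_evaluation(model_error: float, weight: float = 1.0) -> float:
    """Evaluation value 207: based on the difference in characteristics
    between the model and the controlled object.
    Assumption: that difference is summarized as one non-negative number
    `model_error`, scaled by a hypothetical `weight`."""
    if model_error < 0:
        raise ValueError("model_error must be non-negative")
    return weight * model_error


def evaluation_signal(model_measurement: float, target: float,
                      model_error: float, weight: float = 1.0) -> float:
    """Evaluation value signal 208: the sum of the two values above.
    Assumption: smaller is better, so the learner minimizes the expected sum."""
    return (first_evaluation(model_measurement, target)
            + second_evaluation(model_error, weight))


if __name__ == "__main__":
    # Same tracking deviation, different model reliability.
    trusted = evaluation_signal(1.5, 1.0, model_error=0.0)
    untrusted = evaluation_signal(1.5, 1.0, model_error=0.25, weight=2.0)
    print(trusted, untrusted)
```

Under these assumptions, an operating region where the model is known to deviate from the controlled object receives a worse evaluation even when the tracking deviation is the same, so a learner that minimizes the sum is steered away from regions in which the model cannot be trusted.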
Publication No.: JP2007233634(A) Publication date: 2007.09.13
Application No.: JP20060053671 Filing date: 2006.02.28
Applicant: HITACHI LTD Inventors: SEKIAI TAKAO;SHIMIZU SATORU;KAMINAGA EIICHI
Classification: G06N3/00;G05B13/02 Main classification: G06N3/00
Agency: Agent:
Main claim:
Address: