Title of invention: A method of reinforcement learning, corresponding computer program product, and data storage device therefor
Abstract: The invention concerns a method of reinforcement learning, the method comprising the steps of perceiving (101) a current state from a fuzzy set of states of an environment; based on the current state and a policy, choosing (102) an action from a fuzzy set of actions, wherein the policy associates each state from the fuzzy set of states with an action from the fuzzy set of actions and, for each state from the fuzzy set of states, is based on a probability distribution on the fuzzy set of actions; receiving (103) from the environment a new state and a reward; and, based on the reward, optimizing (104) the policy. The invention further concerns a computer program product and a device therefor.
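The four steps named in the abstract (101 to 104) can be illustrated with a small Python sketch. This is not the patented method: the record gives no implementation details, so everything here is an assumption made for illustration, including the state and action labels, the triangular membership functions, the toy environment, and the update rule (a linear reward-inaction scheme scaled by the membership degree).

```python
import random

# Hypothetical toy setup: none of these names or numbers come from the patent record.
STATES = ["low", "medium", "high"]            # labels of the fuzzy state set
ACTIONS = ["decrease", "hold", "increase"]    # labels of the fuzzy action set


def triangular(x, left, peak, right):
    """Triangular membership function returning a degree in [0, 1]."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def memberships(x):
    """Step 101: perceive the current state as membership degrees."""
    return {
        "low": triangular(x, -0.5, 0.0, 0.5),
        "medium": triangular(x, 0.0, 0.5, 1.0),
        "high": triangular(x, 0.5, 1.0, 1.5),
    }


def make_policy():
    """Uniform probability distribution over actions for every state."""
    return {s: {a: 1.0 / len(ACTIONS) for a in ACTIONS} for s in STATES}


def choose_action(policy, degrees, rng):
    """Step 102: sample an action from the dominant state's distribution."""
    state = max(degrees, key=degrees.get)
    probs = policy[state]
    action = rng.choices(list(probs), weights=list(probs.values()))[0]
    return state, action


def env_step(x, action):
    """Step 103: toy environment returning a new state and a reward in [0, 1]."""
    delta = {"decrease": -0.1, "hold": 0.0, "increase": 0.1}[action]
    new_x = min(1.0, max(0.0, x + delta))
    reward = 1.0 - 2.0 * abs(new_x - 0.5)  # highest when new_x is at 0.5
    return new_x, reward


def update(policy, state, action, reward, degree, lr=0.1):
    """Step 104: linear reward-inaction update, scaled by membership degree."""
    step = lr * reward * degree  # stays in [0, 1], so probabilities remain valid
    for a in policy[state]:
        if a == action:
            policy[state][a] += step * (1.0 - policy[state][a])
        else:
            policy[state][a] -= step * policy[state][a]


def run(episodes=2000, seed=0):
    """Repeat steps 101 to 104 and return the learned policy."""
    rng = random.Random(seed)
    policy = make_policy()
    x = rng.random()
    for _ in range(episodes):
        degrees = memberships(x)
        state, action = choose_action(policy, degrees, rng)
        x, reward = env_step(x, action)
        update(policy, state, action, reward, degrees[state])
    return policy


if __name__ == "__main__":
    for state, probs in run().items():
        print(state, {a: round(p, 3) for a, p in probs.items()})
```

The update keeps each per-state distribution normalised, because the probability mass added to the chosen action equals the mass removed from the other actions. Acting only on the dominant state is a simplification; a fuller fuzzy treatment would blend the distributions of all states with non-zero membership.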
Publication number: EP2381393(A1)    Publication date: 2011.10.26
Application number: EP20100305410    Filing date: 2010.04.20
Applicant: ALCATEL LUCENT    Inventors: RAZAVI, ROUZBEH; CLAUSSEN, HOLGER; HO, LESTER
Classification: G06N7/02    Main classification: G06N7/02
Agency:    Agent:
Main claim:
Address: