Title of invention ACTION CONTROL DEVICE, ACTION CONTROL METHOD AND ACTION CONTROL PROGRAM
Abstract PROBLEM TO BE SOLVED: To provide an action control technique that, in cases other than a desired action sequence, makes action decisions that follow the statistics of the learning data more closely than conventional techniques do. SOLUTION: The action control device acquires the previous action a_{t-1} from an action storage part. Using the previous action a_{t-1} and the current observation value o_t', it refers to a POMDP probability/reward table storage part and acquires the state transition probability P(s'|s,a) of moving from a state s to a state s' under an action a, and the observation value output probability P(o'|s',a) of observing an observation value o' in the state s' under the action a. It then acquires the probability distribution b_{t-1}(s) over the previous state from a state probability distribution storage part, and calculates the probability distribution over the current state as follows. COPYRIGHT: (C)2012, JPO&INPIT
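The abstract stops before the formula it announces, so the exact expression used in the patent is not available here. As a rough illustration only, the quantities it names (b_{t-1}(s), P(s'|s,a), P(o'|s',a)) are the inputs of the textbook POMDP belief update, b_t(s') ∝ P(o'|s',a) Σ_s P(s'|s,a) b_{t-1}(s). The sketch below assumes that standard form; the function name `update_belief`, the nested-list table layout, and the toy numbers are hypothetical and do not come from the patent.

```python
def update_belief(prev_belief, transition, observation, action, obs):
    """Textbook POMDP belief update (an assumption, not the patent's formula).

    prev_belief[s]         = b_{t-1}(s)
    transition[a][s][s2]   = P(s2 | s, a)
    observation[a][s2][o]  = P(o | s2, a)
    action                 = index of the previous action a_{t-1}
    obs                    = index of the current observation
    """
    n_states = len(prev_belief)
    unnormalized = []
    for s2 in range(n_states):
        # Prediction step: probability of reaching s2 from any previous state.
        predicted = sum(
            transition[action][s][s2] * prev_belief[s] for s in range(n_states)
        )
        # Correction step: weight by how likely the observation is in s2.
        unnormalized.append(observation[action][s2][obs] * predicted)

    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    # Normalize so the new belief sums to 1.
    return [u / total for u in unnormalized]


# Hypothetical toy model: 2 states, 1 action, 2 observations.
transition = [[[0.9, 0.1], [0.2, 0.8]]]
observation = [[[0.7, 0.3], [0.1, 0.9]]]
prev_belief = [0.5, 0.5]

new_belief = update_belief(prev_belief, transition, observation, action=0, obs=1)
```

In this toy model, observation 1 is far more likely in state 1 than in state 0, so the updated belief shifts toward state 1.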
Publication number JP2012123529(A) Publication date 2012.06.28
Application number JP20100272627 Filing date 2010.12.07
Applicant NIPPON TELEGR & TELEPH CORP <NTT> Inventor MINAMI YASUHIRO;HIGASHINAKA RYUICHIRO;DOSAKA KOJI;MEGURO TOYOMI;MAEDA EISAKU
Classification G06N5/04 Main classification G06N5/04
Agency Agent
Main claim
Address