Abstract |
PROBLEM TO BE SOLVED: To efficiently search a vast state space when generating an action sequence for achieving a goal. SOLUTION: A prediction part 131 continually learns to predict the value of the sensor input at time t+1 from the action A_t taken and the sensor input S_t observed by an autonomous agent at time t, using a function approximator based on statistical learning. A goal generation part 132 gives a previously designed goal state matching the task to a planning part 133. The planning part 133 plans an action sequence from the current state to the goal state. When a plan to the goal can be generated, the actions in the sequence are carried out in order. When the plan succeeds, the relation between the observed state and the selected action is learned, with the goal state used as a fixed input. A control part 134 learns to control the action of the autonomous agent, using the plan produced by the planning part 133 and the environment as a teacher, and learns the input-output relation of the action when the action succeeds. In this way, behavior can be controlled; in particular, predictive learning is carried out regardless of the dimensionality, and autonomous action is controlled. When the autonomous action succeeds, the input/output relation at the time of success can be learned further. This invention can be applied to an autonomous action control model for the autonomous agent. COPYRIGHT: (C)2007,JPO&INPIT
|
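The division of labor among the prediction, planning, and control parts can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the method of the invention: the class and function names (`Predictor`, `plan`, `Controller`, `step`, `run_demo`) are hypothetical, a lookup table stands in for the statistical function approximator, breadth-first search stands in for the planner (the abstract does not name a search method), and the environment is a toy line of five positions.

```python
from collections import deque


class Predictor:
    """Forward model: learns (state, action) -> next state from experience.
    A lookup table is an assumed stand-in for a function approximator."""

    def __init__(self):
        self.model = {}

    def learn(self, state, action, next_state):
        self.model[(state, action)] = next_state

    def predict(self, state, action):
        return self.model.get((state, action))  # None if never observed


def plan(predictor, actions, start, goal):
    """Breadth-first search over the learned model (assumed search method).
    Returns a list of actions, or None if no plan can be generated."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        state, path = queue.popleft()
        if state == goal:
            return path
        for a in actions:
            nxt = predictor.predict(state, a)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [a]))
    return None


class Controller:
    """Reactive state -> action mapping learned from successful plans,
    with the goal treated as fixed."""

    def __init__(self):
        self.policy = {}

    def learn_from_success(self, states, actions):
        for s, a in zip(states, actions):
            self.policy[s] = a

    def act(self, state):
        return self.policy.get(state)


def step(state, action):
    """Toy environment (assumption): positions 0..4, actions -1 or +1."""
    return min(4, max(0, state + action))


def run_demo():
    actions = [-1, 1]
    predictor, controller = Predictor(), Controller()
    # Predictive learning: observe every (state, action) outcome once.
    for s in range(5):
        for a in actions:
            predictor.learn(s, a, step(s, a))
    goal = 4  # previously designed goal state
    path = plan(predictor, actions, 0, goal)
    # Carry out the plan step by step, recording the states visited.
    state, visited = 0, []
    for a in path:
        visited.append(state)
        state = step(state, a)
    # Only a successful execution is used to train the controller.
    if state == goal:
        controller.learn_from_success(visited, path)
    return path, state, controller
```

Calling `run_demo()` returns the planned action list, the final state, and the trained controller; afterwards `controller.act(s)` gives the learned action for any state visited along the successful plan, without invoking the planner again.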