Abstract |
PROBLEM TO BE SOLVED: To provide a reinforcement learning system, and a reinforcement learning method, that explicitly have an environmental model while keeping calculation cost low. SOLUTION: The learning system comprises: an inventory database 105 that holds a plurality of inventory lists, each inventory list being the set of state-behavior pairs in a series that leads to the state-behavior pair immediately preceding the receipt of a reward; an inventory list management part 101 that classifies state-behavior pairs into the plurality of inventory lists and stores them; and a learning control unit 107 that updates the reward expectation value of each state-behavior pair that is an element of an inventory list. COPYRIGHT: (C)2010,JPO&INPIT |
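
The three components named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names, the choice to key each inventory list by the state-behavior pair immediately preceding the reward, and the discounted incremental update rule (with hypothetical `learning_rate` and `discount` parameters) are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import defaultdict


class InventoryDatabase:
    """Stand-in for inventory database 105 (structure is an assumption)."""

    def __init__(self):
        # Maps the pair immediately preceding a reward to the set of
        # (state, behavior) pairs observed on series leading to it.
        self.lists = defaultdict(set)


class InventoryListManager:
    """Stand-in for inventory list management part 101."""

    def __init__(self, database):
        self.database = database

    def classify(self, trajectory):
        # trajectory: list of (state, behavior) pairs whose last element
        # is the pair immediately preceding the reward.
        if not trajectory:
            return
        rewarded_pair = trajectory[-1]
        self.database.lists[rewarded_pair].update(trajectory)


class LearningController:
    """Stand-in for learning control unit 107.

    The update rule below is an assumed one, not taken from the source.
    """

    def __init__(self, database, learning_rate=0.1, discount=0.9):
        self.database = database
        self.learning_rate = learning_rate
        self.discount = discount
        # (rewarded_pair, pair) -> reward expectation value
        self.expectation = defaultdict(float)

    def update(self, trajectory, reward):
        if not trajectory:
            return
        rewarded_pair = trajectory[-1]
        members = self.database.lists.get(rewarded_pair, set())
        last = len(trajectory) - 1
        for i, pair in enumerate(trajectory):
            if pair not in members:
                continue
            # Discount the reward by the number of steps to the rewarded pair.
            target = reward * self.discount ** (last - i)
            key = (rewarded_pair, pair)
            self.expectation[key] += self.learning_rate * (
                target - self.expectation[key]
            )
```

In this sketch, a caller would pass each rewarded series first to `InventoryListManager.classify` and then to `LearningController.update`, so that only pairs already stored in an inventory list have their reward expectation values updated.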