Title of invention METHOD AND APPARATUS FOR CONTEXTUAL LINEAR BANDITS
Abstract A method of selection that maximizes an expected reward in a contextual multi-armed bandit setting gathers rewards from randomly selected items in a database of items, where the items correspond to the arms of the bandit. Initially, an item is selected at random and transmitted to a user device, which generates a reward. The items and resulting rewards are recorded. Subsequently, the user device generates a context, which causes a learning and selection engine to calculate an estimate for each arm in that specific context, using the recorded items and resulting rewards. Using the estimates, an item from the database is selected and transferred to the user device. The selected item is chosen to maximize the probability of a reward from the user device.
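The two phases described in the abstract (random selection with recorded rewards, followed by per-arm estimation and selection for a given context) can be illustrated with a minimal sketch. It is not the claimed method: it assumes each arm's expected reward is linear in the context, that the context is recorded alongside each item and reward, and that the per-arm estimate comes from ridge regression. The names `LinearBandit`, `record`, `select_random`, `select_best`, and the `ridge` parameter are hypothetical and do not appear in the source.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


class LinearBandit:
    """Hypothetical per-arm ridge-regression bandit (illustrative assumption)."""

    def __init__(self, n_arms, dim, ridge=1.0):
        self.n_arms = n_arms
        self.dim = dim
        # Per arm: gram = ridge * I + sum(x x^T), moment = sum(reward * x).
        self.gram = [
            [[ridge if i == j else 0.0 for j in range(dim)] for i in range(dim)]
            for _ in range(n_arms)
        ]
        self.moment = [[0.0] * dim for _ in range(n_arms)]

    def record(self, arm, context, reward):
        """Store one (item, context, reward) observation for the given arm."""
        g, v = self.gram[arm], self.moment[arm]
        for i in range(self.dim):
            v[i] += reward * context[i]
            for j in range(self.dim):
                g[i][j] += context[i] * context[j]

    def estimate(self, arm, context):
        """Estimated reward of an arm in this context (ridge regression)."""
        theta = solve(self.gram[arm], self.moment[arm])
        return sum(t * x for t, x in zip(theta, context))

    def select_random(self, rng):
        """Exploration phase: pick an item uniformly at random."""
        return rng.randrange(self.n_arms)

    def select_best(self, context):
        """Selection phase: pick the arm with the highest estimate."""
        return max(range(self.n_arms), key=lambda a: self.estimate(a, context))


# Simulated user device (an assumption): arm k pays off context[k].
rng = random.Random(0)
bandit = LinearBandit(n_arms=2, dim=2)
for _ in range(200):
    context = [rng.random(), rng.random()]
    arm = bandit.select_random(rng)
    bandit.record(arm, context, reward=context[arm])
choice = bandit.select_best([0.9, 0.1])
```

The ridge term keeps each arm's linear system solvable even before that arm has been observed, so an unseen arm simply receives an estimate of zero. How exploration and exploitation are interleaved, and whether a confidence bonus is added to the estimate, are design choices the abstract does not specify.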
Publication number WO2013189261(A1) Publication date 2013.12.27
Application number WO2013CN77267 Filing date 2013.06.14
Applicant TECHNICOLOR (CHINA) TECHNOLOGY CO., LTD.; IOANNIDIS, STRATIS; YAN, JINYUN; PEREIRA, JOSE BENTO AYRES Inventor IOANNIDIS, STRATIS; YAN, JINYUN; PEREIRA, JOSE BENTO AYRES
Classification G06Q30/02; G06F15/18 Main classification G06Q30/02
Agency Agent
Principal claim
Address