Action selection for reinforcement learning using influence diagrams, Application No. US20050169503

Title of invention	Action selection for reinforcement learning using influence diagrams
Abstract	A system and method for online reinforcement learning is provided. In particular, a method for performing the explore-vs.-exploit tradeoff is provided. Although the method is heuristic, it can be applied in a principled manner while simultaneously learning the parameters and/or structure of the model (e.g., Bayesian network model). The system includes a model which receives an input (e.g., from a user) and provides a probability distribution associated with uncertainty regarding parameters of the model to a decision engine. The decision engine can determine whether to exploit the information known to it or to explore to obtain additional information based, at least in part, upon the explore-vs.-exploit tradeoff (e.g., Thompson strategy). A reinforcement learning component can obtain additional information (e.g., feedback from a user) and update parameter(s) and/or the structure of the model. The system can be employed in scenarios in which an influence diagram is used to make repeated decisions and maximization of long-term expected utility is desired.
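Publication No.	US2006224535(A1)	Publication date	2006.10.05
Application No.	US20050169503	Filing date	2005.06.29
Applicant	MICROSOFT CORPORATION	Inventors	CHICKERING DAVID M.; PAEK TIMOTHY S.; HORVITZ ERIC J.
Classification	G06F15/18	Main classification	G06F15/18
Agency		Agent
Independent claim
Address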
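
The abstract names the Thompson strategy as one way to handle the explore-vs.-exploit tradeoff: the model supplies a probability distribution over its uncertain parameters, the decision engine acts on it, and feedback updates the model. The following is a minimal sketch of that loop under strong simplifying assumptions, not the patented method. It replaces the influence diagram with independent Bernoulli outcomes per action, uses Beta(1, 1) priors, and treats the reward as a binary success signal. The names `BetaBernoulliThompson`, `select_action`, `update`, and `run` are hypothetical and chosen only for illustration.

```python
import random


class BetaBernoulliThompson:
    """Thompson sampling with one independent Beta posterior per action.

    Assumption: each action has an unknown, fixed success probability.
    This is a stand-in for the parameter uncertainty of a richer model.
    """

    def __init__(self, n_actions, seed=None):
        # Beta(1, 1) is a uniform prior over each success probability.
        self.alpha = [1.0] * n_actions
        self.beta = [1.0] * n_actions
        self.rng = random.Random(seed)

    def select_action(self):
        # Draw one plausible parameter value per action from its posterior,
        # then act greedily with respect to the sampled values. Uncertain
        # actions sometimes draw high samples, which produces exploration.
        samples = [
            self.rng.betavariate(a, b) for a, b in zip(self.alpha, self.beta)
        ]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, action, reward):
        # Conjugate Beta-Bernoulli update from observed binary feedback.
        if reward:
            self.alpha[action] += 1.0
        else:
            self.beta[action] += 1.0


def run(true_success_probs, n_rounds, seed=0):
    """Simulate repeated decisions against hypothetical success rates."""
    agent = BetaBernoulliThompson(len(true_success_probs), seed=seed)
    env_rng = random.Random(seed + 1)
    counts = [0] * len(true_success_probs)
    for _ in range(n_rounds):
        action = agent.select_action()
        reward = env_rng.random() < true_success_probs[action]
        agent.update(action, reward)
        counts[action] += 1
    return agent, counts


if __name__ == "__main__":
    agent, counts = run([0.1, 0.9], n_rounds=2000)
    print("times each action was chosen:", counts)
```

In this sketch, exploration is never scheduled explicitly. It falls out of sampling from the posterior, and it fades as the posteriors concentrate around the observed success rates.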

Patents you may be interested in

DETERMINING BEST MATCH AMONG A PLURALITY OF PATTERN RULES USING WILDCARDS WITH A TEXT STRING

AIR MOVEMENT SYSTEM AND AIR CLEANING SYSTEM

ENDOSCOPE DEVICE AND METHOD FOR DRIVING ENDOSCOPE DEVICE

Catheter System With Attachable Catheter Hub

CHILDREN'S SPORT STROLLER

METHOD AND SYSTEM FOR ALLOWING ACCESS TO DEVELOPED APPLICATIONS VIA A MULTI-TENANT ON-DEMAND DATABASE SERVICE

ESTABLISHING ELECTRONICALLY AUTHENTICATED INTERNET VOICE CONNECTIONS

MEMBRANE HAVING A CURED COATING LAYER

Scalable NAT Traversal

System and method for quality assured media file storage

FLUID DYNAMIC BEARING DEVICE

Composition and Method of Treating Side Effects From Antibiotic Treatment

DISHWASHER, ESPECIALLY DOMESTIC DISHWASHER

REMOTELY MANAGING ENTERPRISE RESOURCES

TRANSMISSION WITH COLLISION DETECTION AND MITIGATION FOR WIRELESS COMMUNICATION

Industrial Process Control Data Access Server Supporting Multiple Client Data Exchange Protocols

ELECTRONIC PAYMENT TRANSACTION SYSTEM

Method of setting communication path in storage system, and management apparatus therefor

VIRAL INHIBITORY NUCLEOTIDE SEQUENCES AND VACCINES