Title: Learning controller with advantage updating algorithm
Abstract: A new algorithm for reinforcement learning, advantage updating, is proposed. Advantage updating is a direct learning technique; it does not require a model to be given or learned. It is incremental, requiring only a constant amount of calculation per time step, independent of the number of possible actions, the number of possible outcomes of a given action, or the number of states. Analysis and simulation indicate that advantage updating is applicable to reinforcement learning systems working in continuous time (or discrete time with small time steps), for which Q-learning is not applicable. Simulation results are presented indicating that for a simple linear quadratic regulator (LQR) problem with no noise and large time steps, advantage updating learns slightly faster than Q-learning. When there is noise or small time steps, advantage updating learns more quickly than Q-learning by a factor of more than 100,000. Convergence properties and implementation issues are discussed. New convergence results are presented for R-learning and algorithms based upon change in value. It is proved that the learning rule for advantage updating converges to the optimal policy with probability one.
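The abstract describes the algorithm only in prose. The following is a minimal tabular sketch, in Python, of an advantage-updating-style learner on a toy two-state chain. The environment, the state and action names, and the learning rates ALPHA, BETA, and OMEGA are illustrative assumptions, not taken from the patent; only the general shape of the update (a separate advantage function A(x,u) and value function V(x), a Bellman residual scaled by 1/Δt, and advantages normalized so that max over u of A(x,u) tends toward zero) follows the description in the abstract and Baird's published reports on advantage updating.

import random

# Illustrative constants (assumed values, not from the patent).
STATES = [0, 1]
ACTIONS = [0, 1]
DT = 0.1            # time-step size; residuals below are scaled by 1/DT
GAMMA = 0.9         # discount per unit time, applied as GAMMA ** DT
ALPHA, BETA, OMEGA = 0.1, 0.1, 0.1   # learning rates (assumed)

A = {s: {a: 0.0 for a in ACTIONS} for s in STATES}   # advantages A(x,u)
V = {s: 0.0 for s in STATES}                         # state values V(x)

def step(s, a):
    """Toy dynamics: the action chooses the next state; state 1 pays reward."""
    s_next = a
    r = (1.0 if s_next == 1 else 0.0) * DT   # reward accrues per time step
    return r, s_next

def update(s, a, r, s_next):
    """One advantage-updating-style step for transition (s, a, r, s_next)."""
    a_max_old = max(A[s].values())
    # 1. Advantage update: move A(s,a) toward the best advantage plus the
    #    Bellman residual expressed per unit of time (scaled by 1/DT).
    target = a_max_old + (r + (GAMMA ** DT) * V[s_next] - V[s]) / DT
    A[s][a] += ALPHA * (target - A[s][a])
    # 2. Value update: absorb the change in the best advantage into V(s).
    V[s] += BETA * (max(A[s].values()) - a_max_old)
    # 3. Normalization: pull max_u A(s,u) toward zero so that advantages
    #    stay on a time-step-independent scale.
    a_max = max(A[s].values())
    for u in ACTIONS:
        A[s][u] -= OMEGA * a_max

s = 0
for _ in range(20000):
    a = random.choice(ACTIONS)   # exploratory (random) action selection
    r, s_next = step(s, a)
    update(s, a, r, s_next)
    s = s_next

print("V:", V)
print("A:", A)   # greedy action in each state = argmax over u of A[s][u]

Scaling the residual by 1/DT is what keeps the action-dependent part of A(x,u) from vanishing as the time step shrinks; the abstract cites exactly this small-time-step regime as the one where Q-learning breaks down.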
Publication number: US5608843 (A)    Publication date: 1997-03-04
Application number: US19940283729    Filing date: 1994-08-01
Applicant: THE UNITED STATES OF AMERICA AS REPRESENTED BY THE SECRETARY OF THE AIR FORCE    Inventor: BAIRD, III, LEEMON C.
Classification: G05B13/02; G06F15/18; (IPC1-7): G06E1/00; G06E3/00    Main classification: G05B13/02