Title of invention: Reliability-aware application scheduling
Abstract: Reliability-aware scheduling of processing jobs on one or more processing entities is based on reliability scores assigned to processing entities and minimum acceptable reliability scores of processing jobs. The reliability scores of processing entities are based on independently derived statistical reliability models as applied to reliability data already available from modern computing hardware. Reliability scores of processing entities are continually updated based upon real-time reliability data, as well as prior reliability scores, which are weighted in accordance with the statistical reliability models being utilized. Individual processing jobs specify reliability requirements from which the minimum acceptable reliability score is determined. Such jobs are scheduled on processing entities whose reliability score is greater than or equal to the minimum acceptable reliability score for such jobs. Already scheduled jobs can be rescheduled on other processing entities if reliability scores change. Additionally, a hierarchical scheduling approach can be utilized.
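The rescheduling behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The function name `reschedule`, the dictionary layout, and the policy of moving a job to the lowest-scoring entity that still qualifies are all assumptions made for illustration; the abstract only states that jobs can be rescheduled when reliability scores change.

```python
from typing import Dict, Optional


def reschedule(
    assignments: Dict[str, str],      # job name -> entity name (hypothetical layout)
    scores: Dict[str, float],         # entity name -> current reliability score
    min_scores: Dict[str, float],     # job name -> minimum acceptable reliability score
) -> Dict[str, Optional[str]]:
    """Re-check every placement after reliability scores have changed."""
    new_assignments: Dict[str, Optional[str]] = {}
    for job, entity in assignments.items():
        required = min_scores[job]
        # The job stays put while its entity still meets the minimum score.
        if scores[entity] >= required:
            new_assignments[job] = entity
            continue
        # Otherwise look for entities whose score is >= the job's minimum.
        candidates = [name for name, score in scores.items() if score >= required]
        # Assumed policy: take the least reliable qualifying entity, so that
        # highly reliable entities remain free for more demanding jobs.
        new_assignments[job] = (
            min(candidates, key=lambda name: scores[name]) if candidates else None
        )
    return new_assignments
```

A job mapped to `None` could not be placed on any entity at its required score; what a scheduler does in that case is not specified here.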
Publication number: US9436517(B2) Publication date: 2016.09.06
Application number: US201213730715 Filing date: 2012.12.28
Applicant: Microsoft Technology Licensing, LLC Inventors: Baek Woongki; Govindan Sriram; Sankar Sriram; Vaid Kushagra V.; Khessib Badriddine
Classification: G06F9/46; G06F9/50 Main classification: G06F9/46
Agency: Agent: Gabryjelski Henry; Drakos Kate; Minhas Micky
Main claim: 1. A method of scheduling processing jobs on processing entities, the method comprising the steps of: receiving real-time reliability data associated with a processing entity; identifying a statistical reliability model that correlates processing entity failures to factors quantified by the received real-time reliability data; generating, with the identified statistical reliability model, a predicted future reliability of the processing entity based on at least some of the received real-time reliability data; generating a reliability score for the processing entity based on both the predicted future reliability of the processing entity and a prior reliability score for the processing entity; receiving a processing job for scheduling, the processing job having associated with it a minimum acceptable reliability score; and scheduling the received processing job on the processing entity only if the generated reliability score is greater than or equal to the minimum acceptable reliability score associated with the processing job.
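The steps of claim 1 can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed method itself. The toy model in `predict_reliability`, the field names `temp_c` and `corrected_errors`, the coefficients, and the fixed `prior_weight` blend are all hypothetical; the claim requires only that the score depend on both the predicted future reliability and the prior score.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ProcessingEntity:
    name: str
    reliability_score: float  # prior reliability score, assumed to lie in [0, 1]


def predict_reliability(reliability_data: Dict[str, float]) -> float:
    """Hypothetical stand-in for a statistical reliability model."""
    # Assumption: a linear penalty on temperature above 40 C and on the
    # number of corrected errors. A real model would be derived from
    # observed failure statistics.
    penalty = (
        0.005 * max(0.0, reliability_data["temp_c"] - 40.0)
        + 0.01 * reliability_data["corrected_errors"]
    )
    return max(0.0, min(1.0, 1.0 - penalty))


def update_score(
    entity: ProcessingEntity,
    reliability_data: Dict[str, float],
    prior_weight: float = 0.5,  # assumed weight given to the prior score
) -> float:
    """Blend the predicted future reliability with the prior score."""
    predicted = predict_reliability(reliability_data)
    entity.reliability_score = (
        prior_weight * entity.reliability_score
        + (1.0 - prior_weight) * predicted
    )
    return entity.reliability_score


def can_schedule(entity: ProcessingEntity, min_acceptable_score: float) -> bool:
    """Schedule only if the entity's score is >= the job's minimum score."""
    return entity.reliability_score >= min_acceptable_score
```

For example, an entity with a prior score of 0.9 that reports 60 C and 10 corrected errors gets a predicted reliability of about 0.8 under this toy model and a blended score of about 0.85. A job whose minimum acceptable score is 0.8 would be scheduled on it, while a job requiring 0.9 would not.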
Address: Redmond WA US