Title of invention: Implementing enhanced error handling of a shared adapter in a virtualized system
Abstract: A method, system and computer program product are provided for implementing enhanced error handling for a hardware I/O adapter, such as a Single Root Input/Output Virtualization (SRIOV) adapter, in a virtualized system. The hardware I/O adapter is partitioned into multiple endpoints, with each Partitionable Endpoint (PE) corresponding to a function, and there is an adapter PE associated with the entire adapter. The endpoints are managed both independently for actions limited in scope to a single function, and as a group for actions with the scope of the adapter. An error or failure of the adapter PE freezes the adapter PE and propagates to the VF PEs associated with the adapter, causing the VF PEs to be frozen. An adapter driver and VF device drivers are informed of the error, and start recovery. The hypervisor locks out the VF device drivers at key points enabling adapter recovery to successfully complete.
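The two management scopes described in the abstract (a single function versus the whole adapter) can be illustrated with a minimal sketch. It is not taken from the patent: the names `PEState`, `PartitionableEndpoint`, `Adapter`, `freeze_vf`, `freeze_adapter` and `build_adapter` are hypothetical, and the model only tracks a frozen/normal flag per PE.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class PEState(Enum):
    NORMAL = "normal"
    FROZEN = "frozen"


@dataclass
class PartitionableEndpoint:
    # Hypothetical model of one PE: a name plus its freeze state.
    name: str
    state: PEState = PEState.NORMAL


@dataclass
class Adapter:
    # One adapter PE for the whole adapter, plus one PE per virtual function.
    adapter_pe: PartitionableEndpoint
    vf_pes: List[PartitionableEndpoint] = field(default_factory=list)

    def freeze_vf(self, index: int) -> None:
        # Function-scoped action: only the selected VF PE is affected.
        self.vf_pes[index].state = PEState.FROZEN

    def freeze_adapter(self) -> List[str]:
        # Adapter-scoped action: freeze the adapter PE, then propagate
        # the freeze to every VF PE associated with the adapter.
        self.adapter_pe.state = PEState.FROZEN
        for pe in self.vf_pes:
            pe.state = PEState.FROZEN
        # Return the PEs whose drivers would be informed of the error
        # (adapter driver first, then each VF device driver).
        return [self.adapter_pe.name] + [pe.name for pe in self.vf_pes]


def build_adapter(num_vfs: int) -> Adapter:
    # Assumed naming scheme ("adapter", "vf0", "vf1", ...) for illustration only.
    vfs = [PartitionableEndpoint(f"vf{i}") for i in range(num_vfs)]
    return Adapter(PartitionableEndpoint("adapter"), vfs)
```

In this sketch, `freeze_vf` leaves the adapter PE and the other VF PEs untouched, while `freeze_adapter` freezes everything, which mirrors the independent versus group handling the abstract describes.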
Publication number: US9304849(B2) Publication date: 2016.04.05
Application number: US201313915943 Filing date: 2013.06.12
Applicant: International Business Machines Corporation Inventors: Arroyo Jesse P.; Graham Charles S.; Oberly, III John R.; Schimke Timothy J.
Classification: G06F11/07; G06F9/455; G06F11/14; G06F9/445 Main classification: G06F11/07
Agency: Agent: Pennington Joan
Principal claim: 1. A method for implementing enhanced error collection for an input/output (I/O) adapter in a computer system, the I/O adapter being partitioned into multiple Partitionable Endpoints, with each Partitionable Endpoint (PE) corresponding to a function, and including an adapter PE associated with the I/O adapter, and multiple virtual function (VF) PEs, said method comprising: responsive to an error of the I/O adapter, freezing the adapter PE; freezing each of the multiple VF PEs associated with the adapter, responsive to freezing the adapter PE; informing an adapter driver and each of a plurality of VF device drivers of the error, and said adapter driver and each of said plurality of VF device drivers starting recovery; each of said plurality of VF device drivers loops attempting to unfreeze respective VF PEs, and locking out said plurality of VF device drivers, enabling adapter recovery to successfully complete; responsive to completed adapter recovery, said plurality of VF device drivers unfreeze respective VF PEs and said plurality of VF device drivers commences recovery.
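The lockout-and-retry sequence in the claim can be sketched as a small deterministic simulation. This is an illustrative reading of the claim, not the patented implementation: the `Hypervisor` class, its methods, and the tick-based `simulate` loop are all hypothetical, and real drivers would run concurrently rather than in a single loop.

```python
class Hypervisor:
    """Hypothetical stand-in for the hypervisor's view of one adapter."""

    def __init__(self, num_vfs):
        self.adapter_frozen = False
        self.adapter_recovering = False
        self.vf_frozen = [False] * num_vfs

    def adapter_error(self):
        # Freeze the adapter PE, then freeze every VF PE in response.
        self.adapter_frozen = True
        self.adapter_recovering = True
        self.vf_frozen = [True] * len(self.vf_frozen)

    def try_unfreeze_vf(self, vf):
        # Lockout: refuse VF unfreeze requests while adapter recovery runs.
        if self.adapter_recovering:
            return False
        self.vf_frozen[vf] = False
        return True

    def complete_adapter_recovery(self):
        # Adapter driver has finished; VF drivers are no longer locked out.
        self.adapter_frozen = False
        self.adapter_recovering = False


def simulate(num_vfs, adapter_recovery_ticks):
    """Run the flow and return (unfreeze attempts per VF driver, hypervisor)."""
    hv = Hypervisor(num_vfs)
    hv.adapter_error()  # all drivers are informed and start recovery
    attempts = [0] * num_vfs
    recovered = [False] * num_vfs
    tick = 0
    while not all(recovered):
        if tick == adapter_recovery_ticks:
            hv.complete_adapter_recovery()
        # Each VF device driver loops, attempting to unfreeze its own PE.
        for vf in range(num_vfs):
            if recovered[vf]:
                continue
            attempts[vf] += 1
            if hv.try_unfreeze_vf(vf):
                recovered[vf] = True  # the VF driver now commences recovery
        tick += 1
    return attempts, hv
```

Calling `simulate(2, 3)` makes each VF driver fail three unfreeze attempts while adapter recovery is in progress and succeed on the fourth, after which no PE remains frozen.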
Address: Armonk NY US