scsi: megaraid_sas: Add watchdog thread to detect Firmware fault
authorShivasharan S <shivasharan.srikanteshwara@broadcom.com>
Wed, 17 Oct 2018 06:37:39 +0000 (23:37 -0700)
committerMartin K. Petersen <martin.petersen@oracle.com>
Wed, 7 Nov 2018 01:33:56 +0000 (20:33 -0500)
commit3f6194af539464d83b29ed347aceddb336a3625c
treea9e91aeb4437552b78ffa8715332b5420ab70cbb
parent8dbb748d4d1b6731f12dbdb855ffe320cfe2cb2b
scsi: megaraid_sas: Add watchdog thread to detect Firmware fault

Currently driver checks for Firmware state change from ISR context, and
only when there are interrupts tied with no I/O completions.  We have seen
multiple cases where doorbell interrupts sent by firmware to indicate FW
state change are not processed by driver and it takes long time for driver
to trigger OCR. And if there are no IOs running, since we only check the FW
state as part of ISR code, fault goes undetected by driver and OCR will not
be triggered.

This patch introduces a separate workqueue that runs every one second to
detect Firmware FAULT state and trigger reset immediately.  As an
additional gain, removing PCI reads from ISR to check FW state results in
improved performance as well.

Signed-off-by: Sumit Saxena <sumit.saxena@broadcom.com>
Signed-off-by: Shivasharan S <shivasharan.srikanteshwara@broadcom.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/megaraid/megaraid_sas.h
drivers/scsi/megaraid/megaraid_sas_base.c
drivers/scsi/megaraid/megaraid_sas_fusion.c