scsi: qla2xxx: fix invalid DMA access after command aborts in PCI device remove
author Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Mon, 7 Nov 2016 19:53:31 +0000 (17:53 -0200)
committer Martin K. Petersen <martin.petersen@oracle.com>
Wed, 9 Nov 2016 00:13:52 +0000 (19:13 -0500)
If a command is aborted in the kernel but not in the adapter, it might be
considered complete and its DMA memory released, but it is still alive in
the adapter, which will trigger an invalid DMA access upon its completion
(in the DMA operations to deliver the command response to the driver).

On powerpc platforms with IOMMU/EEH capabilities, the problem is observed
during PCI device removal with ongoing IO requests, which very often
triggers an EEH event reporting a 'TCE Request Page Access Error'.

In that path, which is qla2x00_remove_one(), the commands are aborted in
qla2x00_abort_all_cmds(), which does not perform an abort in the adapter
as is done in qla2xxx_eh_abort() for example.

So, change qla2x00_abort_all_cmds() to abort commands in the adapter
too, by calling qla2xxx_eh_abort(), which already implements all the
logic to submit abort requests and handle responses.
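
The locking and reference handling described above can be sketched as a
small single-threaded userspace model. This is only an illustration under
stated assumptions, not the driver code: struct cmd, cmd_done(),
adapter_abort(), abort_all_cmds(), the lock helpers and ABORT_RESULT are
all hypothetical stand-ins, and the "lock" is a flag that asserts instead
of deadlocking.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CMDS     4
#define ABORT_RESULT (-1)   /* hypothetical result used by the abort path */

/* Hypothetical stand-in for the driver's per-command structure. */
struct cmd {
    int ref_count;   /* the command is ended when this drops to zero */
    int aborted_hw;  /* set once the modeled adapter aborted the command */
    int ended;       /* set when the command has been ended */
    int result;      /* result passed by the done() call that ended it */
};

static int lock_held;                      /* models ha->hardware_lock */
static struct cmd *outstanding[MAX_CMDS];  /* models req->outstanding_cmds */

static void hw_lock(void)   { assert(!lock_held); lock_held = 1; }
static void hw_unlock(void) { assert(lock_held);  lock_held = 0; }

/* Drop one reference; only the final drop ends the command. */
static void cmd_done(struct cmd *c, int res)
{
    if (--c->ref_count > 0)
        return;
    c->ended = 1;
    c->result = res;
}

/* Models the abort handler: it takes the lock itself, so the caller
 * must not hold it, and it drops one reference when it finishes. */
static void adapter_abort(struct cmd *c)
{
    hw_lock();
    c->aborted_hw = 1;
    hw_unlock();
    cmd_done(c, ABORT_RESULT);
}

/* Abort every outstanding command in the adapter, then end it with
 * 'res'. Returns the number of commands ended by this function. */
static int abort_all_cmds(int res)
{
    int cnt, ended = 0;

    hw_lock();
    for (cnt = 1; cnt < MAX_CMDS; cnt++) {
        struct cmd *c = outstanding[cnt];

        if (!c)
            continue;
        c->ref_count++;      /* extra reference keeps the command alive */
        hw_unlock();         /* the abort path takes the lock itself */
        adapter_abort(c);
        hw_lock();
        outstanding[cnt] = NULL;
        cmd_done(c, res);    /* final drop: ends the command with 'res' */
        ended += c->ended;
    }
    hw_unlock();
    return ended;
}
```

In this model a command starts with ref_count == 1. The extra reference
taken before the abort means the done call inside adapter_abort() only
drops the count to 1, so the command is ended by the last cmd_done()
call and carries 'res' rather than ABORT_RESULT.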

Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/qla2xxx/qla_os.c

index e5db474436688739f76842a38ab3351dd1e2133d..567fa080e261c693845a2755093f7b5042b68b9d 100644
@@ -1456,6 +1456,15 @@ qla2x00_abort_all_cmds(scsi_qla_host_t *vha, int res)
                for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
                        sp = req->outstanding_cmds[cnt];
                        if (sp) {
+                               /* Get a reference to the sp and drop the lock.
+                                * The reference ensures this sp->done() call
+                                * - and not the call in qla2xxx_eh_abort() -
+                                * ends the SCSI command (with result 'res').
+                                */
+                               sp_get(sp);
+                               spin_unlock_irqrestore(&ha->hardware_lock, flags);
+                               qla2xxx_eh_abort(GET_CMD_SP(sp));
+                               spin_lock_irqsave(&ha->hardware_lock, flags);
                                req->outstanding_cmds[cnt] = NULL;
                                sp->done(vha, sp, res);
                        }