If a command is aborted in the kernel but not in the adapter, it might be
considered complete and its DMA memory released, but it is still alive in
the adapter, which will trigger an invalid DMA access upon its completion
(in the DMA operations to deliver the command response to the driver).
On powerpc platforms with IOMMU/EEH capabilities, the problem is observed
during PCI device removal with ongoing IO requests -- which might trigger
an EEH event very often, pointing to a 'TCE Request Page Access Error'.
In that path, which is qla2x00_remove_one(), the commands are aborted in
qla2x00_abort_all_cmds(), which does not perform an abort in the adapter
as is done in qla2xxx_eh_abort() for example.
So, this patch changes qla2x00_abort_all_cmds() to abort commands in the
adapter too, with a call to qla2xxx_eh_abort(), which already implements
all the logic to submit abort requests and handle responses.
Reported-by: Naresh Bannoth <nbannoth@in.ibm.com>
Signed-off-by: Mauricio Faria de Oliveira <mauricfo@linux.vnet.ibm.com>
Acked-by: Himanshu Madhani <himanshu.madhani@cavium.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
for (cnt = 1; cnt < req->num_outstanding_cmds; cnt++) {
sp = req->outstanding_cmds[cnt];
if (sp) {
+ /* Get a reference to the sp and drop the lock.
+ * The reference ensures this sp->done() call
+ * - and not the call in qla2xxx_eh_abort() -
+ * ends the SCSI command (with result 'res').
+ */
+ sp_get(sp);
+ spin_unlock_irqrestore(&ha->hardware_lock, flags);
+ qla2xxx_eh_abort(GET_CMD_SP(sp));
+ spin_lock_irqsave(&ha->hardware_lock, flags);
req->outstanding_cmds[cnt] = NULL;
sp->done(vha, sp, res);
}