scsi: lpfc: Fix error in remote port address change
author James Smart <jsmart2021@gmail.com>
Wed, 14 Aug 2019 23:56:51 +0000 (16:56 -0700)
committer Martin K. Petersen <martin.petersen@oracle.com>
Tue, 20 Aug 2019 02:41:10 +0000 (22:41 -0400)
In a test with high nvme remote port counts connected via a multi-hop FC
switch config where switches were systematically reset (e.g. fabric
partitioning and re-establishment), the nvme remote ports would switch
addresses based on the switch reconfiguration events. The driver would get
into a situation where the nvme port changed address, PLOGI and PRLI
succeeded, and nvme transport registration occurred, but subsequent LS
requests by the nvme subsystem failed due to a bad ndlp state and
connectivity to the device was lost.

The driver hit a race condition when multiple devices swapped addresses
simultaneously. In cases where the driver noticed that the remote port
structure came back with the same value as before (meaning an nvme_rport
structure was re-enabled and did not go through
devloss_tmo/connect_tmo_failures on all controllers), the driver would
exit unconditionally, assuming the ndlp information was correct. But if the
ndlps had been swapped, the ndlp held stale port state information, which
caused the LS request commands that used it to fail.

Fix by checking whether a node swap has occurred, and exit early only if
no ndlp swap has occurred.
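
The early-exit rule can be illustrated with a minimal standalone sketch.
This is not the driver code: struct sketch_node, struct sketch_rport and
sketch_register() are hypothetical stand-ins for the ndlp, the remoteport
binding and the registration path. It only models the decision "return
early when the same node is registering, otherwise redo the binding".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an ndlp; only the address (DID) is modeled. */
struct sketch_node {
	unsigned int did;
};

/* Hypothetical stand-in for a remoteport and the node it is bound to. */
struct sketch_rport {
	struct sketch_node *ndlp;
};

/*
 * Models a registration that found the same remoteport structure again.
 * Returns 0 when the existing binding is kept (complete rebind) and 1
 * when the node-to-remoteport binding had to be redone after a swap.
 */
static int sketch_register(struct sketch_rport *rport,
			   struct sketch_node *ndlp)
{
	struct sketch_node *prev_ndlp = rport->ndlp;

	assert(ndlp != NULL);

	/* Same node as before: nothing is stale, so exit early. */
	if (prev_ndlp == ndlp)
		return 0;

	/* A different node is registering: rebind so no stale state is used. */
	rport->ndlp = ndlp;
	return 1;
}
```

Before the fix, the equivalent of the pointer comparison was missing, so the
early return was taken even when a different node was registering.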

Signed-off-by: Dick Kennedy <dick.kennedy@broadcom.com>
Signed-off-by: James Smart <jsmart2021@gmail.com>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
drivers/scsi/lpfc/lpfc_nvme.c

index e8924e90c4eb0b1884acaa5d11cf598fcf6019a8..103708503592dc5c4c02a0436a7f4ad8b7bd48b5 100644
@@ -2348,7 +2348,7 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
                                 */
                                lpfc_printf_vlog(ndlp->vport, KERN_INFO,
                                                 LOG_NVME_DISC,
-                                                "6014 Rebinding lport to "
+                                                "6014 Rebind lport to current "
                                                 "remoteport %p wwpn 0x%llx, "
                                                 "Data: x%x x%x %p %p x%x x%06x\n",
                                                 remote_port,
@@ -2359,7 +2359,16 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
                                                 ndlp,
                                                 ndlp->nlp_type,
                                                 ndlp->nlp_DID);
-                               return 0;
+
+                               /* It's a complete rebind only if the driver
+                                * is registering with the same ndlp. Otherwise
+                                * the driver likely executed a node swap
+                                * prior to this registration and the ndlp to
+                                * remoteport binding needs to be redone.
+                                */
+                               if (prev_ndlp == ndlp)
+                                       return 0;
+
                        }
 
                        /* Sever the ndlp<->rport association
@@ -2393,8 +2402,8 @@ lpfc_nvme_register_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp)
                spin_unlock_irq(&vport->phba->hbalock);
                lpfc_printf_vlog(vport, KERN_INFO,
                                 LOG_NVME_DISC | LOG_NODE,
-                                "6022 Binding new rport to "
-                                "lport %p Remoteport %p rport %p WWNN 0x%llx, "
+                                "6022 Bind lport x%px to remoteport x%px "
+                                "rport x%px WWNN 0x%llx, "
                                 "Rport WWPN 0x%llx DID "
                                 "x%06x Role x%x, ndlp %p prev_ndlp %p\n",
                                 lport, remote_port, rport,