xprtrdma: Try to fail quickly if proto=rdma
authorChuck Lever <chuck.lever@oracle.com>
Fri, 4 May 2018 19:34:37 +0000 (15:34 -0400)
committerAnna Schumaker <Anna.Schumaker@Netapp.com>
Mon, 7 May 2018 13:20:03 +0000 (09:20 -0400)
rdma_resolve_addr(3) says:

> This call is used to map a given destination IP address to a
> usable RDMA address. The IP to RDMA address mapping is done
> using the local routing tables, or via ARP.

If this can't be done, there's no local device that can be used
to establish an RDMA-capable network path to the remote. In this
case, the RDMA CM very quickly posts an RDMA_CM_EVENT_ADDR_ERROR
upcall.

Currently rpcrdma_conn_upcall() converts RDMA_CM_EVENT_ADDR_ERROR
to EHOSTUNREACH. mount.nfs seems to want to retry EHOSTUNREACH
forever, thinking that this is a temporary situation. This makes
mount.nfs appear to hang if I try to mount with proto=rdma through,
say, a conventional Ethernet device.

If the admin has specified proto=rdma along with a server IP address
that requires a network path that does not support RDMA, instead
let's fail with a permanent error. -EPROTONOSUPPORT is returned when
NFSv4 or one of its minor versions is not supported.

-EPROTO is not (currently) retried by mount.nfs.

There are potentially other similar cases where -EPROTO is an
appropriate return code.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Tested-by: Olga Kornievskaia <kolga@netapp.com>
Tested-by: Anna Schumaker <Anna.Schumaker@netapp.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
net/sunrpc/xprtrdma/verbs.c

index b2c6f525b02020606b5ca7ccc893efd9dc8263b5..4ee9704ffe19802e928438e6d02c99a818600418 100644 (file)
@@ -232,7 +232,7 @@ rpcrdma_conn_upcall(struct rdma_cm_id *id, struct rdma_cm_event *event)
                complete(&ia->ri_done);
                break;
        case RDMA_CM_EVENT_ADDR_ERROR:
-               ia->ri_async_rc = -EHOSTUNREACH;
+               ia->ri_async_rc = -EPROTO;
                complete(&ia->ri_done);
                break;
        case RDMA_CM_EVENT_ROUTE_ERROR:
@@ -263,7 +263,7 @@ rpcrdma_conn_upcall(struct rdma_cm_id *id, struct rdma_cm_event *event)
                connstate = -ENOTCONN;
                goto connected;
        case RDMA_CM_EVENT_UNREACHABLE:
-               connstate = -ENETDOWN;
+               connstate = -ENETUNREACH;
                goto connected;
        case RDMA_CM_EVENT_REJECTED:
                dprintk("rpcrdma: connection to %s:%s rejected: %s\n",