After commit
b13912bbb4a2 ("IPoIB: Call skb_dst_drop() once skb is
enqueued for sending"), using connected mode and running multithreaded
iperf for long time, ie
iperf -c <IP> -P 16 -t 3600
results in a crash.
After the above-mentioned patch, the driver is calling skb_orphan() and
skb_dst_drop() after calling post_send() in ipoib_cm.c::ipoib_cm_send()
(also in ipoib_ib.c::ipoib_send())
The problem with this is, as is written in a comment in both routines,
"it's entirely possible that the completion handler will run before we
execute anything after the post_send()." This leads to running the
skb cleanup routines simultaneously in two different contexts.
The solution is to always perform the skb_orphan() and skb_dst_drop()
before queueing the send work request. If an error occurs, then it
will be no different than the regular case where dev_free_skb_any() in
the completion path, which is assumed to be after these two routines.
Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com>
Signed-off-by: Roland Dreier <roland@purestorage.com>
tx_req->mapping = addr;
+ skb_orphan(skb);
+ skb_dst_drop(skb);
+
rc = post_send(priv, tx, tx->tx_head & (ipoib_sendq_size - 1),
addr, skb->len);
if (unlikely(rc)) {
dev->trans_start = jiffies;
++tx->tx_head;
- skb_orphan(skb);
- skb_dst_drop(skb);
-
if (++priv->tx_outstanding == ipoib_sendq_size) {
ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
tx->qp->qp_num);
netif_stop_queue(dev);
}
+ skb_orphan(skb);
+ skb_dst_drop(skb);
+
rc = post_send(priv, priv->tx_head & (ipoib_sendq_size - 1),
address->ah, qpn, tx_req, phead, hlen);
if (unlikely(rc)) {
address->last_send = priv->tx_head;
++priv->tx_head;
-
- skb_orphan(skb);
- skb_dst_drop(skb);
}
if (unlikely(priv->tx_outstanding > MAX_SEND_CQE))