tcp: fix TCP_REPAIR xmit queue setup
author Eric Dumazet <edumazet@google.com>
Thu, 18 Oct 2018 16:12:19 +0000 (09:12 -0700)
committer David S. Miller <davem@davemloft.net>
Thu, 18 Oct 2018 23:51:02 +0000 (16:51 -0700)
Andrey reported the following warning triggered while running CRIU tests:

tcp_clean_rtx_queue()
...
last_ackt = tcp_skb_timestamp_us(skb);
WARN_ON_ONCE(last_ackt == 0);

This is caused by 5f6188a8003d ("tcp: do not change tcp_wstamp_ns
in tcp_mstamp_refresh"), as we end up having skbs in the retransmit
queue with a zero skb->skb_mstamp_ns field.

We could fix this bug in different ways, like making sure
tp->tcp_wstamp_ns is not zero at socket creation, but as Neal pointed
out, we also do not want the pacing status of a repaired socket to
push tp->tcp_wstamp_ns far into the future.

So we prefer changing tcp_write_xmit() to not call tcp_update_skb_after_send()
and instead do what is requested by TCP_REPAIR logic.
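
The effect of the change can be illustrated with a small user-space
model. This is only a sketch, not kernel code: the toy_* structures and
helpers are hypothetical stand-ins that keep just the fields named
above, and the old path is assumed to stamp the skb from
tp->tcp_wstamp_ns, which is what leaves it at zero on a freshly
created socket.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for struct sk_buff / struct tcp_sock. */
struct toy_skb {
	uint64_t skb_mstamp_ns;   /* start point for the retransmit timer */
};

struct toy_tp {
	uint64_t tcp_wstamp_ns;   /* assumed to be 0 on a fresh socket */
	uint64_t tcp_clock_cache; /* cached current time, in ns */
};

/* Old repair path (assumption): the stamp is copied from tcp_wstamp_ns. */
static void toy_repair_old(struct toy_tp *tp, struct toy_skb *skb)
{
	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
}

/* New repair path: both fields are set from the cached clock. */
static void toy_repair_new(struct toy_tp *tp, struct toy_skb *skb)
{
	skb->skb_mstamp_ns = tp->tcp_wstamp_ns = tp->tcp_clock_cache;
}

/* Model of the ns -> us conversion applied before the WARN_ON_ONCE check. */
static uint64_t toy_skb_timestamp_us(const struct toy_skb *skb)
{
	return skb->skb_mstamp_ns / 1000;
}

/* Fresh socket: tcp_wstamp_ns is still 0, the clock cache is not. */
static void toy_demo(void)
{
	struct toy_tp tp = { .tcp_wstamp_ns = 0, .tcp_clock_cache = 5000000 };
	struct toy_skb skb = { 0 };

	toy_repair_old(&tp, &skb);
	assert(toy_skb_timestamp_us(&skb) == 0);    /* would trip the warning */

	toy_repair_new(&tp, &skb);
	assert(toy_skb_timestamp_us(&skb) == 5000); /* non-zero start point */
}
```

In this model the new path also leaves tp->tcp_wstamp_ns equal to the
cached clock, so the pacing state of a repaired socket does not move it
ahead of the current time.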

Fixes: 5f6188a8003d ("tcp: do not change tcp_wstamp_ns in tcp_mstamp_refresh")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
net/ipv4/tcp_output.c

index d212e4cbc68902e873afb4a12b43b467ccd6069b..c07990a35ff3bd9438d32c82863ef207c93bdb9e 100644
@@ -2321,18 +2321,19 @@ static bool tcp_write_xmit(struct sock *sk, unsigned int mss_now, int nonagle,
        while ((skb = tcp_send_head(sk))) {
                unsigned int limit;
 
+               if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
+                       /* "skb_mstamp_ns" is used as a start point for the retransmit timer */
+                       skb->skb_mstamp_ns = tp->tcp_wstamp_ns = tp->tcp_clock_cache;
+                       list_move_tail(&skb->tcp_tsorted_anchor, &tp->tsorted_sent_queue);
+                       goto repair; /* Skip network transmission */
+               }
+
                if (tcp_pacing_check(sk))
                        break;
 
                tso_segs = tcp_init_tso_segs(skb, mss_now);
                BUG_ON(!tso_segs);
 
-               if (unlikely(tp->repair) && tp->repair_queue == TCP_SEND_QUEUE) {
-                       /* "skb_mstamp" is used as a start point for the retransmit timer */
-                       tcp_update_skb_after_send(sk, skb, tp->tcp_wstamp_ns);
-                       goto repair; /* Skip network transmission */
-               }
-
                cwnd_quota = tcp_cwnd_test(tp, skb);
                if (!cwnd_quota) {
                        if (push_one == 2)