mlx4: avoid unnecessary dirtying of critical fields
author Eric Dumazet <edumazet@google.com>
Sun, 20 Nov 2016 17:24:36 +0000 (09:24 -0800)
committer David S. Miller <davem@davemloft.net>
Mon, 21 Nov 2016 16:33:31 +0000 (11:33 -0500)
commit dad42c3038a59d27fced28ee4ec1d4a891b28155
tree fa0c1fdd84ef243b364948e662ab9b72b1bb09f0
parent b668534c1d9b80f4cda4d761eb11d3a6c9f4ced8

While stressing a 40Gbit mlx4 NIC with busy polling, I found false
sharing in the mlx4 driver that can be easily avoided.

This patch brings an additional 7% performance improvement on a UDP_RR
workload.

1) If we received no frame during one mlx4_en_process_rx_cq()
   invocation, there is no need to call mlx4_cq_set_ci() and/or dirty
   ring->cons.

2) Do not refill rx buffers if we have plenty of them.
   This avoids false sharing and allows some bulk/batch optimizations.
   Page allocator and its locks will thank us.
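
The two points above can be sketched as a small userspace model. This is
only an illustration, not the driver code: struct rx_ring_model, the
cons_writes counter, both function names and the REFILL_BATCH_MIN
threshold are assumptions made for the example. The idea is that shared
fields are written only when there is progress to publish, and that
refills are deferred until enough slots are empty to form a batch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of an RX ring; not the driver's struct. */
struct rx_ring_model {
    uint32_t prod;        /* buffers posted to hardware (free-running) */
    uint32_t cons;        /* buffers consumed by the driver (free-running) */
    uint32_t actual_size; /* number of buffers the ring should hold */
    unsigned cons_writes; /* counts stores to the shared consumer field */
};

/* Illustrative batching threshold; the value is an assumption. */
#define REFILL_BATCH_MIN 8u

/* Publish consumer progress only if at least one frame was processed. */
static void rx_ring_publish(struct rx_ring_model *ring, uint32_t new_cons,
                            int polled)
{
    if (polled == 0)
        return; /* nothing received: leave the shared cache line clean */
    ring->cons = new_cons;
    ring->cons_writes++;
}

/* Refill only when enough slots are empty; true if buffers were posted. */
static bool rx_ring_refill(struct rx_ring_model *ring)
{
    uint32_t missing = ring->actual_size - (ring->prod - ring->cons);

    if (missing < REFILL_BATCH_MIN)
        return false; /* plenty of buffers left: do not touch prod */
    ring->prod += missing; /* stands in for allocating and posting buffers */
    return true;
}
```

In this model an idle poll (polled == 0) performs no store at all, and a
poll that consumed only a few frames leaves prod untouched until at
least REFILL_BATCH_MIN slots are free.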

Finally, mlx4_en_poll_rx_cq() should not return 0 if it determined that
the CPU handling the NIC IRQ should be changed. It should return budget-1
instead, so as not to fool net_rx_action() and its netdev_budget.
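
The return-value rule can be sketched as follows. This is a hedged model
rather than the actual poll function: poll_return_value() and its
on_affine_cpu parameter are hypothetical names introduced for the
example. It relies on the NAPI convention that a poll routine returning
less than its budget has finished polling, and on net_rx_action()
charging the returned count against its overall budget.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: decide what a NAPI poll routine should return.
 *   done          - frames processed in this poll
 *   budget        - maximum frames the poll was allowed to process
 *   on_affine_cpu - true if the current CPU matches the IRQ affinity
 */
static int poll_return_value(int done, int budget, bool on_affine_cpu)
{
    if (done == budget) {
        if (on_affine_cpu)
            return budget; /* more work likely: keep polling on this CPU */
        /*
         * Wrong CPU: polling must stop so it can restart elsewhere.
         * Report budget - 1 rather than 0 so the work actually done
         * is still charged to the caller's overall budget.
         */
        if (done > 0)
            done--;
    }
    return done; /* less than budget: polling is complete for now */
}
```

With a budget of 64 and a full batch processed on the wrong CPU, this
model returns 63, so the caller still accounts for nearly all of the
work performed.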

v2: keep AVG_PERF_COUNTER(... polled) even if polled is 0

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
drivers/net/ethernet/mellanox/mlx4/en_rx.c