net/mlx4_en: Improve XDP xmit function

Several performance improvements in the XDP TX datapath,
including:
- Ring a single doorbell for the XDP TX ring per NAPI budget,
  instead of ringing it at a lower threshold (was 8).
  This includes removing the immediate doorbell ringing
  in case of a full TX ring.
- Compiler branch predictor hints.
- Calculate values at compile time rather than at run time.
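
The three points above can be illustrated with a minimal user-space
sketch. It is not the driver code: the sketch_* names, the ring layout
and the drop-on-full policy are assumptions made for illustration, and
likely()/unlikely() are redefined locally on top of the GCC/Clang
__builtin_expect builtin.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins for the kernel's branch hint macros (GCC/Clang builtin). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical ring geometry, fixed at compile time so the mask is a constant. */
#define SKETCH_RING_SIZE 256u                     /* must be a power of two */
#define SKETCH_RING_MASK (SKETCH_RING_SIZE - 1u)

/* Hypothetical TX ring model; not the mlx4_en ring structure. */
struct sketch_tx_ring {
	uint32_t prod;             /* descriptors posted so far */
	uint32_t cons;             /* descriptors completed so far */
	uint32_t doorbells;        /* doorbell writes issued */
	uint32_t dropped;          /* frames dropped because the ring was full */
	bool     doorbell_pending; /* something was posted since the last doorbell */
	uint32_t slots[SKETCH_RING_SIZE];
};

/* Post one frame. A full ring is the unlikely path and rings no doorbell. */
static inline bool sketch_xmit(struct sketch_tx_ring *ring, uint32_t pkt)
{
	if (unlikely(ring->prod - ring->cons >= SKETCH_RING_SIZE)) {
		ring->dropped++;
		return false;
	}
	ring->slots[ring->prod & SKETCH_RING_MASK] = pkt;
	ring->prod++;
	ring->doorbell_pending = true;
	return true;
}

/* Ring the doorbell once if anything was posted since the last flush. */
static inline void sketch_flush(struct sketch_tx_ring *ring)
{
	if (likely(ring->doorbell_pending)) {
		ring->doorbells++;
		ring->doorbell_pending = false;
	}
}

/* One poll cycle: try to post up to 'budget' frames, then flush once. */
static inline uint32_t sketch_napi_poll(struct sketch_tx_ring *ring,
					uint32_t budget)
{
	uint32_t sent = 0;

	for (uint32_t i = 0; i < budget; i++)
		if (sketch_xmit(ring, i))
			sent++;
	sketch_flush(ring);
	return sent;
}
```

In this model a poll of 64 frames costs one doorbell write, where a
flush every 8 frames would cost 8.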
Performance tests:
Tested on ConnectX3Pro, Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
Single queue no-RSS optimization ON.
XDP_TX packet rate:
-------------------------------------
     | Before    | After     | Gain |
IPv4 | 10.3 Mpps | 12.0 Mpps | 17%  |
IPv6 | 10.3 Mpps | 12.0 Mpps | 17%  |
-------------------------------------
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Cc: kernel-team@fb.com
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>