bpf, arm: Optimize ALU64 ARSH X using orrpl conditional instruction
This patch optimizes the code generated by emit_a32_arsh_r64, which
handles the BPF_ALU64 BPF_ARSH BPF_X instruction.
The original code uses a conditional B followed by an unconditional ORR.
The optimization saves one instruction by removing the B instruction
and using a conditional ORR (with an inverted condition).
Example of the code generated for BPF_ALU64_REG(BPF_ARSH, BPF_REG_0,
BPF_REG_1), before optimization:
34: rsb ip, r2, #32
38: subs r9, r2, #32
3c: lsr lr, r0, r2
40: orr lr, lr, r1, lsl ip
44: bmi 0x4c
48: orr lr, lr, r1, asr r9
4c: asr ip, r1, r2
50: mov r0, lr
54: mov r1, ip
and after optimization:
34: rsb ip, r2, #32
38: subs r9, r2, #32
3c: lsr lr, r0, r2
40: orr lr, lr, r1, lsl ip
44: orrpl lr, lr, r1, asr r9
48: asr ip, r1, r2
4c: mov r0, lr
50: mov r1, ip
Tested on QEMU using lib/test_bpf and test_verifier.
Co-developed-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Xi Wang <xi.wang@gmail.com>
Signed-off-by: Luke Nelson <luke.r.nels@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20200501020210.32294-2-luke.r.nels@gmail.com