bonding: move dev_mc_sync after master_upper_dev_link in bond_enslave
Beniamino found a crash when adding vlan as slave of bond which is also
the parent link:
ip link add bond1 type bond
ip link set bond1 up
ip link add link bond1 vlan1 type vlan id 80
ip link set vlan1 master bond1
The call trace is as below:
[<
ffffffffa850842a>] queued_spin_lock_slowpath+0xb/0xf
[<
ffffffffa8515680>] _raw_spin_lock+0x20/0x30
[<
ffffffffa83f6f07>] dev_mc_sync+0x37/0x80
[<
ffffffffc08687dc>] vlan_dev_set_rx_mode+0x1c/0x30 [8021q]
[<
ffffffffa83efd2a>] __dev_set_rx_mode+0x5a/0xa0
[<
ffffffffa83f7138>] dev_mc_sync_multiple+0x78/0x80
[<
ffffffffc084127c>] bond_enslave+0x67c/0x1190 [bonding]
[<
ffffffffa8401909>] do_setlink+0x9c9/0xe50
[<
ffffffffa8403bf2>] rtnl_newlink+0x522/0x880
[<
ffffffffa8403ff7>] rtnetlink_rcv_msg+0xa7/0x260
[<
ffffffffa8424ecb>] netlink_rcv_skb+0xab/0xc0
[<
ffffffffa83fe498>] rtnetlink_rcv+0x28/0x30
[<
ffffffffa8424850>] netlink_unicast+0x170/0x210
[<
ffffffffa8424bf8>] netlink_sendmsg+0x308/0x420
[<
ffffffffa83cc396>] sock_sendmsg+0xb6/0xf0
This is actually a dead lock caused by sync slave hwaddr from master when
the master is the slave's 'slave'. This dead loop check is actually done
by netdev_master_upper_dev_link. However, Commit
1f718f0f4f97 ("bonding:
populate neighbour's private on enslave") moved it after dev_mc_sync.
This patch is to fix it by moving dev_mc_sync after master_upper_dev_link,
so that this loop check would be earlier than dev_mc_sync. It also moves
if (mode == BOND_MODE_8023AD) into if (!bond_uses_primary) clause as an
improvement.
Note team driver also has this issue, I will fix it in another patch.
Fixes: 1f718f0f4f97 ("bonding: populate neighbour's private on enslave")
Reported-by: Beniamino Galvani <bgalvani@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>