From: Christian Marangi Date: Sat, 3 Feb 2024 23:58:15 +0000 (+0100) Subject: generic: 6.1: add patch fixing bugs with LED netdev trigger X-Git-Url: http://git.lede-project.org./?a=commitdiff_plain;h=508f2dbfb37e6e23b671f642e5378f9b037fbeb1;p=openwrt%2Fstaging%2Fjow.git generic: 6.1: add patch fixing bugs with LED netdev trigger Backport one patch merged upstream that prevent a deadlock for LED netdev trigger and add a pending patch that fix kernel panic on interface rename trigger notification with invalid dev. Fixes: #14477 Signed-off-by: Christian Marangi --- diff --git a/target/linux/generic/backport-6.1/836-v6.7-leds-trigger-netdev-fix-RTNL-handling-to-prevent-pot.patch b/target/linux/generic/backport-6.1/836-v6.7-leds-trigger-netdev-fix-RTNL-handling-to-prevent-pot.patch new file mode 100644 index 0000000000..cbb9172032 --- /dev/null +++ b/target/linux/generic/backport-6.1/836-v6.7-leds-trigger-netdev-fix-RTNL-handling-to-prevent-pot.patch @@ -0,0 +1,170 @@ +From fe2b1226656afae56702d1d84c6900f6b67df297 Mon Sep 17 00:00:00 2001 +From: Heiner Kallweit +Date: Fri, 1 Dec 2023 11:23:22 +0100 +Subject: [PATCH] leds: trigger: netdev: fix RTNL handling to prevent potential + deadlock + +When working on LED support for r8169 I got the following lockdep +warning. Easiest way to prevent this scenario seems to be to take +the RTNL lock before the trigger_data lock in set_device_name(). + +====================================================== +WARNING: possible circular locking dependency detected +6.7.0-rc2-next-20231124+ #2 Not tainted +------------------------------------------------------ +bash/383 is trying to acquire lock: +ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev] + +but task is already holding lock: +ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20 + +which lock already depends on the new lock. + +the existing dependency chain (in reverse order) is: + +-> #1 (rtnl_mutex){+.+.}-{3:3}: + __mutex_lock+0x9b/0xb50 + mutex_lock_nested+0x16/0x20 + rtnl_lock+0x12/0x20 + set_device_name+0xa9/0x120 [ledtrig_netdev] + netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev] + led_trigger_set+0x172/0x2c0 + led_trigger_write+0xf1/0x140 + sysfs_kf_bin_write+0x5d/0x80 + kernfs_fop_write_iter+0x15d/0x210 + vfs_write+0x1f0/0x510 + ksys_write+0x6c/0xf0 + __x64_sys_write+0x14/0x20 + do_syscall_64+0x3f/0xf0 + entry_SYSCALL_64_after_hwframe+0x6c/0x74 + +-> #0 (&trigger_data->lock){+.+.}-{3:3}: + __lock_acquire+0x1459/0x25a0 + lock_acquire+0xc8/0x2d0 + __mutex_lock+0x9b/0xb50 + mutex_lock_nested+0x16/0x20 + netdev_trig_notify+0xec/0x190 [ledtrig_netdev] + call_netdevice_register_net_notifiers+0x5a/0x100 + register_netdevice_notifier+0x85/0x120 + netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev] + led_trigger_set+0x172/0x2c0 + led_trigger_write+0xf1/0x140 + sysfs_kf_bin_write+0x5d/0x80 + kernfs_fop_write_iter+0x15d/0x210 + vfs_write+0x1f0/0x510 + ksys_write+0x6c/0xf0 + __x64_sys_write+0x14/0x20 + do_syscall_64+0x3f/0xf0 + entry_SYSCALL_64_after_hwframe+0x6c/0x74 + +other info that might help us debug this: + + Possible unsafe locking scenario: + + CPU0 CPU1 + ---- ---- + lock(rtnl_mutex); + lock(&trigger_data->lock); + lock(rtnl_mutex); + lock(&trigger_data->lock); + + *** DEADLOCK *** + +8 locks held by bash/383: + #0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0 + #1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210 + #2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210 + #3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140 + #4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140 + #5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140 + #6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120 + #7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20 + +stack backtrace: +CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2 +Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023 +Call Trace: + + dump_stack_lvl+0x5c/0xd0 + dump_stack+0x10/0x20 + print_circular_bug+0x2dd/0x410 + check_noncircular+0x131/0x150 + __lock_acquire+0x1459/0x25a0 + lock_acquire+0xc8/0x2d0 + ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] + __mutex_lock+0x9b/0xb50 + ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] + ? __this_cpu_preempt_check+0x13/0x20 + ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev] + ? __cancel_work_timer+0x11c/0x1b0 + ? __mutex_lock+0x123/0xb50 + mutex_lock_nested+0x16/0x20 + ? mutex_lock_nested+0x16/0x20 + netdev_trig_notify+0xec/0x190 [ledtrig_netdev] + call_netdevice_register_net_notifiers+0x5a/0x100 + register_netdevice_notifier+0x85/0x120 + netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev] + led_trigger_set+0x172/0x2c0 + ? preempt_count_add+0x49/0xc0 + led_trigger_write+0xf1/0x140 + sysfs_kf_bin_write+0x5d/0x80 + kernfs_fop_write_iter+0x15d/0x210 + vfs_write+0x1f0/0x510 + ksys_write+0x6c/0xf0 + __x64_sys_write+0x14/0x20 + do_syscall_64+0x3f/0xf0 + entry_SYSCALL_64_after_hwframe+0x6c/0x74 +RIP: 0033:0x7f269055d034 +Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48 +RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001 +RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034 +RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001 +RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001 +R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007 +R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000 + + +Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode") +Cc: stable@vger.kernel.org +Signed-off-by: Heiner Kallweit +Reviewed-by: Andrew Lunn +Acked-by: Lee Jones +Link: https://lore.kernel.org/r/fb5c8294-2a10-4bf5-8f10-3d2b77d2757e@gmail.com +Signed-off-by: Jakub Kicinski +--- + drivers/leds/trigger/ledtrig-netdev.c | 11 +++++++---- + 1 file changed, 7 insertions(+), 4 deletions(-) + +--- a/drivers/leds/trigger/ledtrig-netdev.c ++++ b/drivers/leds/trigger/ledtrig-netdev.c +@@ -235,6 +235,11 @@ static int set_device_name(struct led_ne + { + cancel_delayed_work_sync(&trigger_data->work); + ++ /* ++ * Take RTNL lock before trigger_data lock to prevent potential ++ * deadlock with netdev notifier registration. ++ */ ++ rtnl_lock(); + mutex_lock(&trigger_data->lock); + + if (trigger_data->net_dev) { +@@ -254,16 +259,14 @@ static int set_device_name(struct led_ne + trigger_data->carrier_link_up = false; + trigger_data->link_speed = SPEED_UNKNOWN; + trigger_data->duplex = DUPLEX_UNKNOWN; +- if (trigger_data->net_dev != NULL) { +- rtnl_lock(); ++ if (trigger_data->net_dev) + get_device_state(trigger_data); +- rtnl_unlock(); +- } + + trigger_data->last_activity = 0; + + set_baseline_state(trigger_data); + mutex_unlock(&trigger_data->lock); ++ rtnl_unlock(); + + return 0; + } diff --git a/target/linux/generic/pending-6.1/830-leds-trigger-netdev-Fix-kernel-panic-on-interface-re.patch b/target/linux/generic/pending-6.1/830-leds-trigger-netdev-Fix-kernel-panic-on-interface-re.patch new file mode 100644 index 0000000000..8006bdb2b4 --- /dev/null +++ b/target/linux/generic/pending-6.1/830-leds-trigger-netdev-Fix-kernel-panic-on-interface-re.patch @@ -0,0 +1,52 @@ +From 9e2a0a8bd7a392b5af6bf2cfa0b07d96d1fb6719 Mon Sep 17 00:00:00 2001 +From: Christian Marangi +Date: Sun, 4 Feb 2024 00:32:04 +0100 +Subject: [PATCH] leds: trigger: netdev: Fix kernel panic on interface rename + trig notify + +Commit d5e01266e7f5 ("leds: trigger: netdev: add additional specific link +speed mode") in the various changes, reworked the way to set the LINKUP +mode in commit cee4bd16c319 ("leds: trigger: netdev: Recheck +NETDEV_LED_MODE_LINKUP on dev rename") and moved it to a generic function. + +This changed the logic where, in the previous implementation the dev +from the trigger event was used to check if the carrier was ok, but in +the new implementation with the generic function, the dev in +trigger_data is used instead. + +This is problematic and cause a possible kernel panic due to the fact +that the dev in the trigger_data still reference the old one as the +new one (passed from the trigger event) still has to be hold and saved +in the trigger_data struct (done in the NETDEV_REGISTER case). + +On calling of get_device_state(), an invalid net_dev is used and this +cause a kernel panic. + +To handle this correctly, move the call to get_device_state() after the +new net_dev is correctly set in trigger_data (in the NETDEV_REGISTER +case) and correctly parse the new dev. + +Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode") +Cc: stable@vger.kernel.org +Signed-off-by: Christian Marangi +--- + drivers/leds/trigger/ledtrig-netdev.c | 4 ++-- + 1 file changed, 2 insertions(+), 2 deletions(-) + +--- a/drivers/leds/trigger/ledtrig-netdev.c ++++ b/drivers/leds/trigger/ledtrig-netdev.c +@@ -489,12 +489,12 @@ static int netdev_trig_notify(struct not + trigger_data->duplex = DUPLEX_UNKNOWN; + switch (evt) { + case NETDEV_CHANGENAME: +- get_device_state(trigger_data); +- fallthrough; + case NETDEV_REGISTER: + dev_put(trigger_data->net_dev); + dev_hold(dev); + trigger_data->net_dev = dev; ++ if (evt == NETDEV_CHANGENAME) ++ get_device_state(trigger_data); + break; + case NETDEV_UNREGISTER: + dev_put(trigger_data->net_dev);