generic: 6.1: add patch fixing bugs with LED netdev trigger
authorChristian Marangi <ansuelsmth@gmail.com>
Sat, 3 Feb 2024 23:58:15 +0000 (00:58 +0100)
committerChristian Marangi <ansuelsmth@gmail.com>
Sun, 4 Feb 2024 00:02:30 +0000 (01:02 +0100)
Backport one patch merged upstream that prevent a deadlock for LED
netdev trigger and add a pending patch that fix kernel panic on
interface rename trigger notification with invalid dev.

Fixes: #14477
Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
target/linux/generic/backport-6.1/836-v6.7-leds-trigger-netdev-fix-RTNL-handling-to-prevent-pot.patch [new file with mode: 0644]
target/linux/generic/pending-6.1/830-leds-trigger-netdev-Fix-kernel-panic-on-interface-re.patch [new file with mode: 0644]

diff --git a/target/linux/generic/backport-6.1/836-v6.7-leds-trigger-netdev-fix-RTNL-handling-to-prevent-pot.patch b/target/linux/generic/backport-6.1/836-v6.7-leds-trigger-netdev-fix-RTNL-handling-to-prevent-pot.patch
new file mode 100644 (file)
index 0000000..cbb9172
--- /dev/null
@@ -0,0 +1,170 @@
+From fe2b1226656afae56702d1d84c6900f6b67df297 Mon Sep 17 00:00:00 2001
+From: Heiner Kallweit <hkallweit1@gmail.com>
+Date: Fri, 1 Dec 2023 11:23:22 +0100
+Subject: [PATCH] leds: trigger: netdev: fix RTNL handling to prevent potential
+ deadlock
+
+When working on LED support for r8169 I got the following lockdep
+warning. Easiest way to prevent this scenario seems to be to take
+the RTNL lock before the trigger_data lock in set_device_name().
+
+======================================================
+WARNING: possible circular locking dependency detected
+6.7.0-rc2-next-20231124+ #2 Not tainted
+------------------------------------------------------
+bash/383 is trying to acquire lock:
+ffff888103aa1c68 (&trigger_data->lock){+.+.}-{3:3}, at: netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
+
+but task is already holding lock:
+ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
+
+which lock already depends on the new lock.
+
+the existing dependency chain (in reverse order) is:
+
+-> #1 (rtnl_mutex){+.+.}-{3:3}:
+       __mutex_lock+0x9b/0xb50
+       mutex_lock_nested+0x16/0x20
+       rtnl_lock+0x12/0x20
+       set_device_name+0xa9/0x120 [ledtrig_netdev]
+       netdev_trig_activate+0x1a1/0x230 [ledtrig_netdev]
+       led_trigger_set+0x172/0x2c0
+       led_trigger_write+0xf1/0x140
+       sysfs_kf_bin_write+0x5d/0x80
+       kernfs_fop_write_iter+0x15d/0x210
+       vfs_write+0x1f0/0x510
+       ksys_write+0x6c/0xf0
+       __x64_sys_write+0x14/0x20
+       do_syscall_64+0x3f/0xf0
+       entry_SYSCALL_64_after_hwframe+0x6c/0x74
+
+-> #0 (&trigger_data->lock){+.+.}-{3:3}:
+       __lock_acquire+0x1459/0x25a0
+       lock_acquire+0xc8/0x2d0
+       __mutex_lock+0x9b/0xb50
+       mutex_lock_nested+0x16/0x20
+       netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
+       call_netdevice_register_net_notifiers+0x5a/0x100
+       register_netdevice_notifier+0x85/0x120
+       netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
+       led_trigger_set+0x172/0x2c0
+       led_trigger_write+0xf1/0x140
+       sysfs_kf_bin_write+0x5d/0x80
+       kernfs_fop_write_iter+0x15d/0x210
+       vfs_write+0x1f0/0x510
+       ksys_write+0x6c/0xf0
+       __x64_sys_write+0x14/0x20
+       do_syscall_64+0x3f/0xf0
+       entry_SYSCALL_64_after_hwframe+0x6c/0x74
+
+other info that might help us debug this:
+
+ Possible unsafe locking scenario:
+
+       CPU0                    CPU1
+       ----                    ----
+  lock(rtnl_mutex);
+                               lock(&trigger_data->lock);
+                               lock(rtnl_mutex);
+  lock(&trigger_data->lock);
+
+ *** DEADLOCK ***
+
+8 locks held by bash/383:
+ #0: ffff888103ff33f0 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x6c/0xf0
+ #1: ffff888103aa1e88 (&of->mutex){+.+.}-{3:3}, at: kernfs_fop_write_iter+0x114/0x210
+ #2: ffff8881036f1890 (kn->active#82){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x11d/0x210
+ #3: ffff888108e2c358 (&led_cdev->led_access){+.+.}-{3:3}, at: led_trigger_write+0x30/0x140
+ #4: ffffffff8cdd9e10 (triggers_list_lock){++++}-{3:3}, at: led_trigger_write+0x75/0x140
+ #5: ffff888108e2c270 (&led_cdev->trigger_lock){++++}-{3:3}, at: led_trigger_write+0xe3/0x140
+ #6: ffffffff8cdde3d0 (pernet_ops_rwsem){++++}-{3:3}, at: register_netdevice_notifier+0x1c/0x120
+ #7: ffffffff8cddf808 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x12/0x20
+
+stack backtrace:
+CPU: 0 PID: 383 Comm: bash Not tainted 6.7.0-rc2-next-20231124+ #2
+Hardware name: Default string Default string/Default string, BIOS ADLN.M6.SODIMM.ZB.CY.015 08/08/2023
+Call Trace:
+ <TASK>
+ dump_stack_lvl+0x5c/0xd0
+ dump_stack+0x10/0x20
+ print_circular_bug+0x2dd/0x410
+ check_noncircular+0x131/0x150
+ __lock_acquire+0x1459/0x25a0
+ lock_acquire+0xc8/0x2d0
+ ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
+ __mutex_lock+0x9b/0xb50
+ ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
+ ? __this_cpu_preempt_check+0x13/0x20
+ ? netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
+ ? __cancel_work_timer+0x11c/0x1b0
+ ? __mutex_lock+0x123/0xb50
+ mutex_lock_nested+0x16/0x20
+ ? mutex_lock_nested+0x16/0x20
+ netdev_trig_notify+0xec/0x190 [ledtrig_netdev]
+ call_netdevice_register_net_notifiers+0x5a/0x100
+ register_netdevice_notifier+0x85/0x120
+ netdev_trig_activate+0x1d4/0x230 [ledtrig_netdev]
+ led_trigger_set+0x172/0x2c0
+ ? preempt_count_add+0x49/0xc0
+ led_trigger_write+0xf1/0x140
+ sysfs_kf_bin_write+0x5d/0x80
+ kernfs_fop_write_iter+0x15d/0x210
+ vfs_write+0x1f0/0x510
+ ksys_write+0x6c/0xf0
+ __x64_sys_write+0x14/0x20
+ do_syscall_64+0x3f/0xf0
+ entry_SYSCALL_64_after_hwframe+0x6c/0x74
+RIP: 0033:0x7f269055d034
+Code: c7 00 16 00 00 00 b8 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 f3 0f 1e fa 80 3d 35 c3 0d 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
+RSP: 002b:00007ffddb7ef748 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
+RAX: ffffffffffffffda RBX: 0000000000000007 RCX: 00007f269055d034
+RDX: 0000000000000007 RSI: 000055bf5f4af3c0 RDI: 0000000000000001
+RBP: 000055bf5f4af3c0 R08: 0000000000000073 R09: 0000000000000001
+R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000007
+R13: 00007f26906325c0 R14: 00007f269062ff20 R15: 0000000000000000
+ </TASK>
+
+Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode")
+Cc: stable@vger.kernel.org
+Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
+Reviewed-by: Andrew Lunn <andrew@lunn.ch>
+Acked-by: Lee Jones <lee@kernel.org>
+Link: https://lore.kernel.org/r/fb5c8294-2a10-4bf5-8f10-3d2b77d2757e@gmail.com
+Signed-off-by: Jakub Kicinski <kuba@kernel.org>
+---
+ drivers/leds/trigger/ledtrig-netdev.c | 11 +++++++----
+ 1 file changed, 7 insertions(+), 4 deletions(-)
+
+--- a/drivers/leds/trigger/ledtrig-netdev.c
++++ b/drivers/leds/trigger/ledtrig-netdev.c
+@@ -235,6 +235,11 @@ static int set_device_name(struct led_ne
+ {
+       cancel_delayed_work_sync(&trigger_data->work);
++      /*
++       * Take RTNL lock before trigger_data lock to prevent potential
++       * deadlock with netdev notifier registration.
++       */
++      rtnl_lock();
+       mutex_lock(&trigger_data->lock);
+       if (trigger_data->net_dev) {
+@@ -254,16 +259,14 @@ static int set_device_name(struct led_ne
+       trigger_data->carrier_link_up = false;
+       trigger_data->link_speed = SPEED_UNKNOWN;
+       trigger_data->duplex = DUPLEX_UNKNOWN;
+-      if (trigger_data->net_dev != NULL) {
+-              rtnl_lock();
++      if (trigger_data->net_dev)
+               get_device_state(trigger_data);
+-              rtnl_unlock();
+-      }
+       trigger_data->last_activity = 0;
+       set_baseline_state(trigger_data);
+       mutex_unlock(&trigger_data->lock);
++      rtnl_unlock();
+       return 0;
+ }
diff --git a/target/linux/generic/pending-6.1/830-leds-trigger-netdev-Fix-kernel-panic-on-interface-re.patch b/target/linux/generic/pending-6.1/830-leds-trigger-netdev-Fix-kernel-panic-on-interface-re.patch
new file mode 100644 (file)
index 0000000..8006bdb
--- /dev/null
@@ -0,0 +1,52 @@
+From 9e2a0a8bd7a392b5af6bf2cfa0b07d96d1fb6719 Mon Sep 17 00:00:00 2001
+From: Christian Marangi <ansuelsmth@gmail.com>
+Date: Sun, 4 Feb 2024 00:32:04 +0100
+Subject: [PATCH] leds: trigger: netdev: Fix kernel panic on interface rename
+ trig notify
+
+Commit d5e01266e7f5 ("leds: trigger: netdev: add additional specific link
+speed mode") in the various changes, reworked the way to set the LINKUP
+mode in commit cee4bd16c319 ("leds: trigger: netdev: Recheck
+NETDEV_LED_MODE_LINKUP on dev rename") and moved it to a generic function.
+
+This changed the logic where, in the previous implementation the dev
+from the trigger event was used to check if the carrier was ok, but in
+the new implementation with the generic function, the dev in
+trigger_data is used instead.
+
+This is problematic and cause a possible kernel panic due to the fact
+that the dev in the trigger_data still reference the old one as the
+new one (passed from the trigger event) still has to be hold and saved
+in the trigger_data struct (done in the NETDEV_REGISTER case).
+
+On calling of get_device_state(), an invalid net_dev is used and this
+cause a kernel panic.
+
+To handle this correctly, move the call to get_device_state() after the
+new net_dev is correctly set in trigger_data (in the NETDEV_REGISTER
+case) and correctly parse the new dev.
+
+Fixes: d5e01266e7f5 ("leds: trigger: netdev: add additional specific link speed mode")
+Cc: stable@vger.kernel.org
+Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
+---
+ drivers/leds/trigger/ledtrig-netdev.c | 4 ++--
+ 1 file changed, 2 insertions(+), 2 deletions(-)
+
+--- a/drivers/leds/trigger/ledtrig-netdev.c
++++ b/drivers/leds/trigger/ledtrig-netdev.c
+@@ -489,12 +489,12 @@ static int netdev_trig_notify(struct not
+       trigger_data->duplex = DUPLEX_UNKNOWN;
+       switch (evt) {
+       case NETDEV_CHANGENAME:
+-              get_device_state(trigger_data);
+-              fallthrough;
+       case NETDEV_REGISTER:
+               dev_put(trigger_data->net_dev);
+               dev_hold(dev);
+               trigger_data->net_dev = dev;
++              if (evt == NETDEV_CHANGENAME)
++                      get_device_state(trigger_data);
+               break;
+       case NETDEV_UNREGISTER:
+               dev_put(trigger_data->net_dev);