workqueue: always clear WORKER_REBIND in busy_worker_rebind_fn()

busy_worker_rebind_fn() didn't clear WORKER_REBIND if rebinding failed
(CPU is down again).  This used to be okay because the flag wasn't
used for anything else.

However, after 25511a477 "workqueue: reimplement CPU online rebinding
to handle idle workers", WORKER_REBIND is also used to command idle
workers to rebind.  If not cleared, the worker may confuse the next
CPU_UP cycle by having REBIND spuriously set or oops / get stuck by
prematurely calling idle_worker_rebind().

  WARNING: at /work/os/wq/kernel/workqueue.c:1323 worker_thread+0x4cd/0x500()
  Hardware name: Bochs
  Modules linked in: test_wq(O-)
  Pid: 33, comm: kworker/1:1 Tainted: G O 3.6.0-rc1-work+ #3
  Call Trace:
   [<ffffffff8109039f>] warn_slowpath_common+0x7f/0xc0
   [<ffffffff810903fa>] warn_slowpath_null+0x1a/0x20
   [<ffffffff810b3f1d>] worker_thread+0x4cd/0x500
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  ---[ end trace e977cf20f4661968 ]---
  BUG: unable to handle kernel NULL pointer dereference at (null)
  IP: [<ffffffff810b3db0>] worker_thread+0x360/0x500
  PGD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: test_wq(O-)
  CPU 0
  Pid: 33, comm: kworker/1:1 Tainted: G W O 3.6.0-rc1-work+ #3 Bochs Bochs
  RIP: 0010:[<ffffffff810b3db0>] [<ffffffff810b3db0>] worker_thread+0x360/0x500
  RSP: 0018:ffff88001e1c9de0 EFLAGS: 00010086
  RAX: 0000000000000000 RBX: ffff88001e633e00 RCX: 0000000000004140
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000009
  RBP: ffff88001e1c9ea0 R08: 0000000000000000 R09: 0000000000000001
  R10: 0000000000000002 R11: 0000000000000000 R12: ffff88001fc8d580
  R13: ffff88001fc8d590 R14: ffff88001e633e20 R15: ffff88001e1c6900
  FS: 0000000000000000(0000) GS:ffff88001fc00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000000 CR3: 00000000130e8000 CR4: 00000000000006f0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
  Process kworker/1:1 (pid: 33, threadinfo ffff88001e1c8000, task ffff88001e1c6900)
  Stack:
   ffff880000000000 ffff88001e1c9e40 0000000000000001 ffff88001e1c8010
   ffff88001e519c78 ffff88001e1c9e58 ffff88001e1c6900 ffff88001e1c6900
   ffff88001e1c6900 ffff88001e1c6900 ffff88001fc8d340 ffff88001fc8d340
  Call Trace:
   [<ffffffff810bc16e>] kthread+0xbe/0xd0
   [<ffffffff81bd2664>] kernel_thread_helper+0x4/0x10
  Code: b1 00 f6 43 48 02 0f 85 91 01 00 00 48 8b 43 38 48 89 df 48 8b 00 48 89 45 90 e8 ac f0 ff ff 3c 01 0f 85 60 01 00 00 48 8b 53 50 <8b> 02 83 e8 01 85 c0 89 02 0f 84 3b 01 00 00 48 8b 43 38 48 8b
  RIP [<ffffffff810b3db0>] worker_thread+0x360/0x500
   RSP <ffff88001e1c9de0>
  CR2: 0000000000000000

There was no reason to keep WORKER_REBIND on failure in the first
place - WORKER_UNBOUND is guaranteed to be set in such cases, which
prevents concurrency management from being activated incorrectly.
Always clear WORKER_REBIND.
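
The intended flag handling can be illustrated with a small userspace
model.  This is only a sketch, not the kernel code: struct worker, the
flag values, busy_worker_rebind() and its bind_ok parameter are
hypothetical stand-ins, with bind_ok representing whether binding to
the CPU succeeded.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for this sketch; the kernel's differ. */
enum {
	WORKER_UNBOUND = 1 << 0,	/* not bound to its CPU */
	WORKER_REBIND  = 1 << 1,	/* worker was asked to rebind */
};

/* Minimal stand-in for the kernel's worker structure. */
struct worker {
	unsigned int flags;
};

/*
 * Model of the busy-worker rebind step.  bind_ok is false when the
 * CPU went down again before the worker could bind to it.
 */
static void busy_worker_rebind(struct worker *worker, bool bind_ok)
{
	assert(worker != NULL);

	/* Only a successful bind re-enables concurrency management. */
	if (bind_ok)
		worker->flags &= ~WORKER_UNBOUND;

	/* Cleared unconditionally so no stale request survives a failure. */
	worker->flags &= ~WORKER_REBIND;
}
```

In this model a failed rebind leaves only WORKER_UNBOUND set, so the
worker stays outside concurrency management and carries no leftover
REBIND request into the next CPU_UP cycle.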

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Tejun Heo <tj@kernel.org>