workqueue: fix CPU binding of flush_delayed_work[_sync]()
author Tejun Heo <tj@kernel.org>
Wed, 8 Aug 2012 16:38:42 +0000 (09:38 -0700)
committer Tejun Heo <tj@kernel.org>
Mon, 13 Aug 2012 23:27:55 +0000 (16:27 -0700)
commit 1265057fa02c7bed3b6d9ddc8a2048065a370364
tree b10e631ca6157103fcc71188e972b06e18c3570f
parent 41f63c5359d14ca995172b8f6eaffd93f60fec54

While its timer is pending, a delayed_work encodes the workqueue to
use and the last CPU in delayed_work->work.data.  The target CPU is
recorded only implicitly, as the CPU the timer is queued on, and
delayed_work_timer_fn() queues delayed_work->work to the CPU it is
running on.

Unfortunately, this leaves flush_delayed_work[_sync]() no way to find
out which CPU the delayed_work was queued for when they try to
re-queue it after killing the timer.  Currently, they choose the local
CPU the flush is running on.  This can unexpectedly move a delayed_work
queued on a specific CPU to another CPU and lead to subtle errors.
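
The failure mode can be sketched with a small userspace model.  All
names below (model_dwork, model_queue_on, model_flush_old) are
hypothetical and exist only for illustration; this is not kernel code.
It assumes nothing beyond the behaviour described above: the target CPU
lives only in the timer and is lost once the timer is killed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model, not the kernel's struct delayed_work. */
struct model_dwork {
	bool timer_pending;	/* stands in for the pending timer */
	int timer_cpu;		/* CPU the timer is queued on (implicit target) */
	int queued_cpu;		/* CPU the work ended up queued on, -1 if none */
};

/* Model of queueing a delayed work item for a specific CPU. */
static void model_queue_on(struct model_dwork *dw, int cpu)
{
	assert(cpu >= 0);
	dw->timer_pending = true;
	dw->timer_cpu = cpu;
	dw->queued_cpu = -1;
}

/*
 * Model of the pre-patch flush: killing the timer discards the only
 * record of the target CPU, so the work is re-queued on the CPU the
 * flush happens to run on.
 */
static void model_flush_old(struct model_dwork *dw, int local_cpu)
{
	if (dw->timer_pending) {
		dw->timer_pending = false;	/* "kill" the timer */
		dw->timer_cpu = -1;		/* target information is gone */
		dw->queued_cpu = local_cpu;	/* falls back to the flusher's CPU */
	}
}
```

In this model, queueing for CPU 3 and then flushing from CPU 0 leaves
queued_cpu at 0, which is the unexpected migration described above.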

There isn't much point in trying to save several bytes in struct
delayed_work, which is already close to a hundred bytes on 64bit with
all debug options turned off.  This patch adds delayed_work->cpu to
remember the CPU it's queued for.
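
The idea can be sketched in a userspace model as follows.  Only the
notion of a ->cpu member comes from this patch; the struct layout and
the sketch_* helpers are hypothetical and merely illustrate how an
explicitly recorded CPU lets both the timer path and the flush path
queue the work where it was originally intended.  That the timer path
also uses the recorded CPU is an assumption here, inferred from the
CPU down note in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; only the ->cpu idea is from the patch. */
struct sketch_delayed_work {
	bool timer_pending;	/* stands in for the pending timer */
	int cpu;		/* CPU the work was queued for, kept explicitly */
	int queued_cpu;		/* CPU the work ended up queued on, -1 if none */
};

/* Record the target CPU at queueing time instead of relying on the timer. */
static void sketch_queue_delayed_on(struct sketch_delayed_work *dw, int cpu)
{
	assert(cpu >= 0);
	dw->cpu = cpu;
	dw->timer_pending = true;
	dw->queued_cpu = -1;
}

/* Timer expiry: queue to the recorded CPU, not the CPU the timer runs on. */
static void sketch_timer_fn(struct sketch_delayed_work *dw, int running_cpu)
{
	(void)running_cpu;	/* deliberately ignored in this model */
	if (dw->timer_pending) {
		dw->timer_pending = false;
		dw->queued_cpu = dw->cpu;
	}
}

/* Flush: after killing the timer, re-queue on the recorded CPU. */
static void sketch_flush(struct sketch_delayed_work *dw, int local_cpu)
{
	(void)local_cpu;	/* the flusher's CPU no longer matters */
	if (dw->timer_pending) {
		dw->timer_pending = false;
		dw->queued_cpu = dw->cpu;
	}
}
```

With the recorded CPU, queueing for CPU 3 and flushing from CPU 0 keeps
the work on CPU 3 in this model.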

Note that if the timer is migrated during CPU down, the work item
could be queued to the downed global_cwq after this change.  As a
detached global_cwq behaves like an unbound one, this doesn't change
much for the delayed_work.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
include/linux/workqueue.h
kernel/workqueue.c