dm thin: wakeup worker only when deferred bios exist
Single thread fio test (read, bs=4k, ioengine=libaio, iodepth=128,
numjobs=1) over dm-thin device has poor performance versus bare nvme
device.
Further investigation with perf indicates that queue_work_on() consumes
over 20% CPU time when doing IO over dm-thin device. The call stack is
as follows.
- 40.57% thin_map
+ 22.07% queue_work_on
+ 9.95% dm_thin_find_block
+ 2.80% cell_defer_no_holder
1.91% inc_all_io_entry.isra.33.part.34
+ 1.78% bio_detain.isra.35
In cell_defer_no_holder(), wakeup_worker() is always called, no matter
whether the tc->deferred_bio_list list is empty or not. In single thread
IO model, this list is most likely empty. So skip waking up worker thread
if tc->deferred_bio_list list is empty.
Single thread IO performance improves from 448 MiB/s to 646 MiB/s (+44%)
once the needless wake_worker() calls are properly skipped.
Signed-off-by: Jeffle Xu <jefflexu@linux.alibaba.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>