btrfs: Fix NULL pointer exception in find_bio_stripe
On detaching of a disk which is a part of a RAID6 filesystem, the
following kernel OOPS may happen:
[63122.680461] BTRFS error (device sdo): bdev /dev/sdo errs: wr 0, rd 0, flush 1, corrupt 0, gen 0
[63122.719584] BTRFS warning (device sdo): lost page write due to IO error on /dev/sdo
[63122.719587] BTRFS error (device sdo): bdev /dev/sdo errs: wr 1, rd 0, flush 1, corrupt 0, gen 0
[63122.803516] BTRFS warning (device sdo): lost page write due to IO error on /dev/sdo
[63122.803519] BTRFS error (device sdo): bdev /dev/sdo errs: wr 2, rd 0, flush 1, corrupt 0, gen 0
[63122.863902] BTRFS critical (device sdo): fatal error on device /dev/sdo
[63122.935338] BUG: unable to handle kernel NULL pointer dereference at
0000000000000080
[63122.946554] IP: fail_bio_stripe+0x58/0xa0 [btrfs]
[63122.958185] PGD
9ecda067 P4D
9ecda067 PUD
b2b37067 PMD 0
[63122.971202] Oops: 0000 [#1] SMP
[63123.006760] CPU: 0 PID: 3979 Comm: kworker/u8:9 Tainted: G W 4.14.2-16-scst34x+ #8
[63123.007091] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[63123.007402] Workqueue: btrfs-worker btrfs_worker_helper [btrfs]
[63123.007595] task:
ffff880036ea4040 task.stack:
ffffc90006384000
[63123.007796] RIP: 0010:fail_bio_stripe+0x58/0xa0 [btrfs]
[63123.007968] RSP: 0018:
ffffc90006387ad8 EFLAGS:
00010287
[63123.008140] RAX:
0000000000000002 RBX:
ffff88004beaa0b8 RCX:
ffff8800b2bd5690
[63123.008359] RDX:
0000000000000000 RSI:
ffff88007bb43500 RDI:
ffff88004beaa000
[63123.008621] RBP:
ffffc90006387ae8 R08:
0000000099100000 R09:
ffff8800b2bd5600
[63123.008840] R10:
0000000000000004 R11:
0000000000010000 R12:
ffff88007bb43500
[63123.009059] R13:
00000000fffffffb R14:
ffff880036fc5180 R15:
0000000000000004
[63123.009278] FS:
0000000000000000(0000) GS:
ffff8800b7000000(0000) knlGS:
0000000000000000
[63123.009564] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[63123.009748] CR2:
0000000000000080 CR3:
00000000b0866000 CR4:
00000000000406f0
[63123.009969] Call Trace:
[63123.010085] raid_write_end_io+0x7e/0x80 [btrfs]
[63123.010251] bio_endio+0xa1/0x120
[63123.010378] generic_make_request+0x218/0x270
[63123.010921] submit_bio+0x66/0x130
[63123.011073] finish_rmw+0x3fc/0x5b0 [btrfs]
[63123.011245] full_stripe_write+0x96/0xc0 [btrfs]
[63123.011428] raid56_parity_write+0x117/0x170 [btrfs]
[63123.011604] btrfs_map_bio+0x2ec/0x320 [btrfs]
[63123.011759] ? ___cache_free+0x1c5/0x300
[63123.011909] __btrfs_submit_bio_done+0x26/0x50 [btrfs]
[63123.012087] run_one_async_done+0x9c/0xc0 [btrfs]
[63123.012257] normal_work_helper+0x19e/0x300 [btrfs]
[63123.012429] btrfs_worker_helper+0x12/0x20 [btrfs]
[63123.012656] process_one_work+0x14d/0x350
[63123.012888] worker_thread+0x4d/0x3a0
[63123.013026] ? _raw_spin_unlock_irqrestore+0x15/0x20
[63123.013192] kthread+0x109/0x140
[63123.013315] ? process_scheduled_works+0x40/0x40
[63123.013472] ? kthread_stop+0x110/0x110
[63123.013610] ret_from_fork+0x25/0x30
[63123.014469] RIP: fail_bio_stripe+0x58/0xa0 [btrfs] RSP:
ffffc90006387ad8
[63123.014678] CR2:
0000000000000080
[63123.016590] ---[ end trace
a295ea7259c17880 ]—
This is reproducible in a cycle, where a series of writes is followed by
SCSI device delete command. The test may take up to few minutes.
Fixes: 74d46992e0d9 ("block: replace bi_bdev with a gendisk pointer and partitions index")
[ no signed-off-by provided ]
Author: Dmitriy Gorokh <Dmitriy.Gorokh@wdc.com>
Reviewed-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>