Btrfs: fix block group ->space_info null pointer dereference

When we create a block group, we add it to the rbtree of block groups
before setting its ->space_info field, so the field is still NULL at
that point. This is problematic because other tasks can find the block
group in the rbtree and attempt to use its ->space_info before
btrfs_make_block_group() sets it.

This can happen, for example, when a concurrent fitrim ioctl operation
is ongoing, which produces a trace like the following when
CONFIG_DEBUG_PAGEALLOC is set:

[11509.604369] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[11509.606373] IP: [<ffffffff8107d675>] __lock_acquire+0xb4/0xf02
[11509.608179] PGD 2296a8067 PUD 22f4a2067 PMD 0
[11509.608179] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
[11509.608179] Modules linked in: btrfs crc32c_generic xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop fuse acpi_cpufreq processor i2c_piix4 psmou
[11509.608179] CPU: 10 PID: 8538 Comm: fstrim Tainted: G W 4.0.0-rc5-btrfs-next-9+ #2
[11509.608179] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[11509.608179] task: ffff88009f5c46d0 ti: ffff8801b3edc000 task.ti: ffff8801b3edc000
[11509.608179] RIP: 0010:[<ffffffff8107d675>] [<ffffffff8107d675>] __lock_acquire+0xb4/0xf02
[11509.608179] RSP: 0018:ffff8801b3edf9e8 EFLAGS: 00010002
[11509.608179] RAX: 0000000000000046 RBX: 0000000000000000 RCX: 0000000000000000
[11509.608179] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000018
[11509.608179] RBP: ffff8801b3edfaa8 R08: 0000000000000001 R09: 0000000000000000
[11509.608179] R10: 0000000000000000 R11: ffff88009f5c4f98 R12: 0000000000000000
[11509.608179] R13: 0000000000000000 R14: 0000000000000018 R15: ffff88009f5c46d0
[11509.608179] FS: 00007f280a10e840(0000) GS:ffff88023ed40000(0000) knlGS:0000000000000000
[11509.608179] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[11509.608179] CR2: 0000000000000018 CR3: 00000002119bc000 CR4: 00000000000006e0
[11509.608179] Stack:
[11509.608179]  0000000000000000 0000000000000000 0000000000000004 0000000000000000
[11509.608179]  ffff880100000000 ffffffff00000000 0000000000000001 ffffffff00000000
[11509.608179]  0000000000000001 0000000000000000 ffff880100000000 00000000000006c4
[11509.608179] Call Trace:
[11509.608179] [<ffffffff8107dc57>] ? __lock_acquire+0x696/0xf02
[11509.608179] [<ffffffff8107e806>] lock_acquire+0xa5/0x116
[11509.608179] [<ffffffffa04cc876>] ? do_trimming+0x51/0x145 [btrfs]
[11509.608179] [<ffffffff81434f37>] _raw_spin_lock+0x34/0x44
[11509.608179] [<ffffffffa04cc876>] ? do_trimming+0x51/0x145 [btrfs]
[11509.608179] [<ffffffffa04cc876>] do_trimming+0x51/0x145 [btrfs]
[11509.608179] [<ffffffffa04cde7d>] btrfs_trim_block_group+0x201/0x491 [btrfs]
[11509.608179] [<ffffffffa04849e2>] btrfs_trim_fs+0xe0/0x129 [btrfs]
[11509.608179] [<ffffffffa04bb80a>] btrfs_ioctl_fitrim+0x138/0x167 [btrfs]
[11509.608179] [<ffffffffa04c002f>] btrfs_ioctl+0x50d/0x21e8 [btrfs]
[11509.608179] [<ffffffff81123bda>] ? might_fault+0x58/0xb5
[11509.608179] [<ffffffff81123bda>] ? might_fault+0x58/0xb5
[11509.608179] [<ffffffff81123bda>] ? might_fault+0x58/0xb5
[11509.608179] [<ffffffff81158050>] ? cp_new_stat+0x147/0x15e
[11509.608179] [<ffffffff81163041>] do_vfs_ioctl+0x3c6/0x479
[11509.608179] [<ffffffff81158116>] ? SYSC_newfstat+0x25/0x2e
[11509.608179] [<ffffffff81435b54>] ? ret_from_sys_call+0x1d/0x58
[11509.608179] [<ffffffff8116b915>] ? __fget_light+0x2d/0x4f
[11509.608179] [<ffffffff8116314e>] SyS_ioctl+0x5a/0x7f
[11509.608179] [<ffffffff81435b32>] system_call_fastpath+0x12/0x17
[11509.608179] Code: f4 01 00 0f 85 c0 00 00 00 48 c7 c1 f3 1f 7d 81 48 c7 c2 aa cb 7c 81 be fc 0b 00 00 eb 70 83 3d 61 eb 9c 00 00 0f 84 a5 00 00 00 <49> 81 3e 40 a3 2b 82 b8 00 00 00
[11509.608179] RIP [<ffffffff8107d675>] __lock_acquire+0xb4/0xf02
[11509.608179] RSP <ffff8801b3edf9e8>
[11509.608179] CR2: 0000000000000018
[11509.608179] ---[ end trace 570a5c6769f0e49a ]---

This corresponds to the following access in fs/btrfs/free-space-cache.c:

static int do_trimming(struct btrfs_block_group_cache *block_group,
		       u64 *total_trimmed, u64 start, u64 bytes,
		       u64 reserved_start, u64 reserved_bytes,
		       struct btrfs_trim_range *trim_entry)
{
	struct btrfs_space_info *space_info = block_group->space_info;
	(...)
	spin_lock(&space_info->lock);
	           ^^^^^ - block_group->space_info is NULL...

Fix this by ensuring the block group's ->space_info is set before adding
the block group to the rbtree.
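
The ordering rule behind the fix can be sketched in plain userspace C. This
is only an illustration, not the kernel change itself: the struct layouts,
the array that stands in for the rbtree, and the names registry_add(),
registry_lookup(), make_block_group() and reader_touch() are all
hypothetical. It is also single-threaded, so it ignores the locking that
the real rbtree insertion relies on. Unlike the kernel code, the sketch
makes the publish step reject a half-initialized object so that the
invariant is visible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the btrfs structures; not the kernel types. */
struct space_info {
	int lock; /* placeholder for the spinlock taken by readers */
};

struct block_group {
	struct space_info *space_info;
};

/* A fixed array stands in for the rbtree of block groups. */
#define MAX_GROUPS 8
static struct block_group *registry[MAX_GROUPS];
static size_t nr_groups;

/*
 * Publish a block group. Once this returns 0, lookups can return it, so
 * the sketch refuses objects whose ->space_info is still NULL.
 */
static int registry_add(struct block_group *bg)
{
	if (bg->space_info == NULL || nr_groups == MAX_GROUPS)
		return -1;
	registry[nr_groups++] = bg;
	return 0;
}

/* Return the idx-th published block group, or NULL if there is none. */
static struct block_group *registry_lookup(size_t idx)
{
	return idx < nr_groups ? registry[idx] : NULL;
}

/* Fixed ordering: finish initialization first, publish second. */
static int make_block_group(struct block_group *bg, struct space_info *si)
{
	bg->space_info = si;
	return registry_add(bg);
}

/* Reader side, loosely modeled on do_trimming(): uses ->space_info. */
static void reader_touch(struct block_group *bg)
{
	assert(bg->space_info != NULL);
	bg->space_info->lock++;
}
```

With this ordering, any block group a reader obtains from registry_lookup()
already has a usable ->space_info, which is the property the trace above
shows being violated.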

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>