Btrfs: fix btrfs_write_inode vs delayed iput deadlock
authorJosef Bacik <jbacik@fb.com>
Fri, 20 Jul 2018 18:46:10 +0000 (11:46 -0700)
committerDavid Sterba <dsterba@suse.com>
Mon, 6 Aug 2018 11:12:58 +0000 (13:12 +0200)
We recently ran into the following deadlock involving
btrfs_write_inode():

[  +0.005066]  __schedule+0x38e/0x8c0
[  +0.007144]  schedule+0x36/0x80
[  +0.006447]  bit_wait+0x11/0x60
[  +0.006446]  __wait_on_bit+0xbe/0x110
[  +0.007487]  ? bit_wait_io+0x60/0x60
[  +0.007319]  __inode_wait_for_writeback+0x96/0xc0
[  +0.009568]  ? autoremove_wake_function+0x40/0x40
[  +0.009565]  inode_wait_for_writeback+0x21/0x30
[  +0.009224]  evict+0xb0/0x190
[  +0.006099]  iput+0x1a8/0x210
[  +0.006103]  btrfs_run_delayed_iputs+0x73/0xc0
[  +0.009047]  btrfs_commit_transaction+0x799/0x8c0
[  +0.009567]  btrfs_write_inode+0x81/0xb0
[  +0.008008]  __writeback_single_inode+0x267/0x320
[  +0.009569]  writeback_sb_inodes+0x25b/0x4e0
[  +0.008702]  wb_writeback+0x102/0x2d0
[  +0.007487]  wb_workfn+0xa4/0x310
[  +0.006794]  ? wb_workfn+0xa4/0x310
[  +0.007143]  process_one_work+0x150/0x410
[  +0.008179]  worker_thread+0x6d/0x520
[  +0.007490]  kthread+0x12c/0x160
[  +0.006620]  ? put_pwq_unlocked+0x80/0x80
[  +0.008185]  ? kthread_park+0xa0/0xa0
[  +0.007484]  ? do_syscall_64+0x53/0x150
[  +0.007837]  ret_from_fork+0x29/0x40

Writeback calls:

btrfs_write_inode
  btrfs_commit_transaction
    btrfs_run_delayed_iputs

If iput() is called on that same inode, evict() will wait for writeback
forever.

btrfs_write_inode() was originally added way back in 4730a4bc5bf3
("btrfs_dirty_inode") to support O_SYNC writes. However, ->write_inode()
hasn't been used for O_SYNC since 148f948ba877 ("vfs: Introduce new
helpers for syncing after writing to O_SYNC file or IS_SYNC inode"), so
btrfs_write_inode() is actually unnecessary (and leads to a bunch of
unnecessary commits). Get rid of it, which also gets rid of the
deadlock.

CC: stable@vger.kernel.org # 3.2+
Signed-off-by: Josef Bacik <jbacik@fb.com>
[Omar: new commit message]
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: David Sterba <dsterba@suse.com>
fs/btrfs/inode.c
fs/btrfs/super.c

index 4955e04da4c85ff48157114376a3ae044563915b..4724577954864fbafbb2bbdd429ac6d4f3f9591a 100644 (file)
@@ -6021,32 +6021,6 @@ err:
        return ret;
 }
 
-int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc)
-{
-       struct btrfs_root *root = BTRFS_I(inode)->root;
-       struct btrfs_trans_handle *trans;
-       int ret = 0;
-       bool nolock = false;
-
-       if (test_bit(BTRFS_INODE_DUMMY, &BTRFS_I(inode)->runtime_flags))
-               return 0;
-
-       if (btrfs_fs_closing(root->fs_info) &&
-                       btrfs_is_free_space_inode(BTRFS_I(inode)))
-               nolock = true;
-
-       if (wbc->sync_mode == WB_SYNC_ALL) {
-               if (nolock)
-                       trans = btrfs_join_transaction_nolock(root);
-               else
-                       trans = btrfs_join_transaction(root);
-               if (IS_ERR(trans))
-                       return PTR_ERR(trans);
-               ret = btrfs_commit_transaction(trans);
-       }
-       return ret;
-}
-
 /*
  * This is somewhat expensive, updating the tree every time the
  * inode changes.  But, it is most likely to find the inode in cache.
index efe8b03ce380ee43e8045e836e349b097f2aed83..67de3c0fc85beb79053005dfb5d4757bd0bce483 100644 (file)
@@ -2344,7 +2344,6 @@ static const struct super_operations btrfs_super_ops = {
        .sync_fs        = btrfs_sync_fs,
        .show_options   = btrfs_show_options,
        .show_devname   = btrfs_show_devname,
-       .write_inode    = btrfs_write_inode,
        .alloc_inode    = btrfs_alloc_inode,
        .destroy_inode  = btrfs_destroy_inode,
        .statfs         = btrfs_statfs,