ocfs2: fix deadlock on mmapped page in ocfs2_write_begin_nolock()
authorEric Ren <zren@suse.com>
Fri, 30 Sep 2016 22:11:32 +0000 (15:11 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Fri, 30 Sep 2016 22:26:52 +0000 (15:26 -0700)
The testcase "mmaptruncate" of ocfs2-test deadlocks occasionally.

In this testcase, we create a 2*CLUSTER_SIZE file and mmap() on it;
there are 2 process repeatedly performing the following operations
respectively: one is doing memset(mmaped_addr + 2*CLUSTER_SIZE - 1, 'a',
1), while the another is playing ftruncate(fd, 2*CLUSTER_SIZE) and then
ftruncate(fd, CLUSTER_SIZE) again and again.

This is the backtrace when the deadlock happens:

   __wait_on_bit_lock+0x50/0xa0
   __lock_page+0xb7/0xc0
   ocfs2_write_begin_nolock+0x163f/0x1790 [ocfs2]
   ocfs2_page_mkwrite+0x1c7/0x2a0 [ocfs2]
   do_page_mkwrite+0x66/0xc0
   handle_mm_fault+0x685/0x1350
   __do_page_fault+0x1d8/0x4d0
   trace_do_page_fault+0x37/0xf0
   do_async_page_fault+0x19/0x70
   async_page_fault+0x28/0x30

In ocfs2_write_begin_nolock(), we first grab the pages and then allocate
disk space for this write; ocfs2_try_to_free_truncate_log() will be
called if -ENOSPC is returned; if we're lucky to get enough clusters,
which is usually the case, we start over again.

But in ocfs2_free_write_ctxt() the target page isn't unlocked, so we
will deadlock when trying to grab the target page again.

Also, -ENOMEM might be returned in ocfs2_grab_pages_for_write().
Another deadlock will happen in __do_page_mkwrite() if
ocfs2_page_mkwrite() returns non-VM_FAULT_LOCKED, and along with a
locked target page.

These two errors fail on the same path, so fix them by unlocking the
target page manually before ocfs2_free_write_ctxt().

Jan Kara helps me clear out the JBD2 part, and suggest the hint for root
cause.

Changes since v1:
1. Also put ENOMEM error case into consideration.

Link: http://lkml.kernel.org/r/1474173902-32075-1-git-send-email-zren@suse.com
Signed-off-by: Eric Ren <zren@suse.com>
Reviewed-by: He Gang <ghe@suse.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
fs/ocfs2/aops.c

index 98d36548153dc3696619b2923e2eecce35bd6bbc..bbb4b3e5b4ffc5bbf52fc5d4206f52995a988ea1 100644 (file)
@@ -1842,6 +1842,16 @@ out_commit:
        ocfs2_commit_trans(osb, handle);
 
 out:
+       /*
+        * The mmapped page won't be unlocked in ocfs2_free_write_ctxt(),
+        * even in case of error here like ENOSPC and ENOMEM. So, we need
+        * to unlock the target page manually to prevent deadlocks when
+        * retrying again on ENOSPC, or when returning non-VM_FAULT_LOCKED
+        * to VM code.
+        */
+       if (wc->w_target_locked)
+               unlock_page(mmap_page);
+
        ocfs2_free_write_ctxt(inode, wc);
 
        if (data_ac) {