dm: avoid destroying table in dm_any_congested
author Chandra Seetharaman <sekharan@us.ibm.com>
Thu, 13 Nov 2008 23:39:14 +0000 (23:39 +0000)
committer Alasdair G Kergon <agk@redhat.com>
Thu, 13 Nov 2008 23:39:14 +0000 (23:39 +0000)
dm_any_congested() only checks the DMF_BLOCK_IO flag; nothing
makes suspend wait for a running dm_any_congested() call to
complete.  This patch adds such a check.

Without it, a race can occur in which dm_table_put() attempts to
destroy the table in the wrong thread: the one running
dm_any_congested(), which is meant to be quick and return
immediately.

Two examples of problems:
1. Sleeping functions called from congested code, the caller
   of which holds a spin lock.
2. An ABBA deadlock between pdflush and multipathd. The two locks
   in contention are inode lock and kernel lock.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Alasdair G Kergon <agk@redhat.com>
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index dc25d8a07bc77da014e86fc385544dcc69c0c085..c99e4728ff4162ed16c4a7ec99fdc0c27c8cb35f 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -937,16 +937,24 @@ static void dm_unplug_all(struct request_queue *q)
 
 static int dm_any_congested(void *congested_data, int bdi_bits)
 {
-       int r;
-       struct mapped_device *md = (struct mapped_device *) congested_data;
-       struct dm_table *map = dm_get_table(md);
+       int r = bdi_bits;
+       struct mapped_device *md = congested_data;
+       struct dm_table *map;
 
-       if (!map || test_bit(DMF_BLOCK_IO, &md->flags))
-               r = bdi_bits;
-       else
-               r = dm_table_any_congested(map, bdi_bits);
+       atomic_inc(&md->pending);
+
+       if (!test_bit(DMF_BLOCK_IO, &md->flags)) {
+               map = dm_get_table(md);
+               if (map) {
+                       r = dm_table_any_congested(map, bdi_bits);
+                       dm_table_put(map);
+               }
+       }
+
+       if (!atomic_dec_return(&md->pending))
+               /* nudge anyone waiting on suspend queue */
+               wake_up(&md->wait);
 
-       dm_table_put(map);
        return r;
 }
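
The pending-count handshake added by this patch can be modelled in user space. The following is only a sketch under stated assumptions, not kernel code: `struct toy_device`, `toy_device_init()`, `toy_any_congested()` and `toy_suspend()` are hypothetical names, a pthread condition variable stands in for the `md->wait` wait queue, an atomic boolean stands in for `DMF_BLOCK_IO`, and a stored integer stands in for the result of `dm_table_any_congested()`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for struct mapped_device. */
struct toy_device {
	atomic_int pending;        /* in-flight callers, like md->pending */
	atomic_bool block_io;      /* stands in for DMF_BLOCK_IO */
	pthread_mutex_t lock;      /* pairs with the condition variable */
	pthread_cond_t wait;       /* stands in for md->wait */
	int table_congested_bits;  /* pretend per-table congestion state */
};

void toy_device_init(struct toy_device *d, int bits)
{
	atomic_init(&d->pending, 0);
	atomic_init(&d->block_io, false);
	pthread_mutex_init(&d->lock, NULL);
	pthread_cond_init(&d->wait, NULL);
	d->table_congested_bits = bits;
}

/* Mirrors the shape of the patched dm_any_congested(). */
int toy_any_congested(struct toy_device *d, int bdi_bits)
{
	int r = bdi_bits;  /* default answer: report congested */
	int old;

	/* Announce ourselves before looking at the flag. */
	atomic_fetch_add(&d->pending, 1);

	/* Only consult the "table" when I/O is not blocked. */
	if (!atomic_load(&d->block_io))
		r = d->table_congested_bits & bdi_bits;

	/* atomic_fetch_sub() returns the value before the decrement. */
	old = atomic_fetch_sub(&d->pending, 1);
	assert(old >= 1);  /* we incremented above */
	if (old == 1) {
		/* Last user out: nudge anyone waiting in toy_suspend(). */
		pthread_mutex_lock(&d->lock);
		pthread_cond_broadcast(&d->wait);
		pthread_mutex_unlock(&d->lock);
	}

	return r;
}

/* Suspend side: block new table users, then wait for current ones. */
void toy_suspend(struct toy_device *d)
{
	atomic_store(&d->block_io, true);

	pthread_mutex_lock(&d->lock);
	while (atomic_load(&d->pending) != 0)
		pthread_cond_wait(&d->wait, &d->lock);
	pthread_mutex_unlock(&d->lock);
	/* From here on, no caller that saw block_io == false is still running. */
}
```

Because a caller increments the counter before testing the flag, and the suspend side sets the flag before reading the counter, either the caller sees the flag and skips the table, or the suspend side sees a non-zero count and waits. With a device initialised with bits `0x1`, `toy_any_congested(&d, 0x3)` returns `0x1` before `toy_suspend(&d)` and `0x3` afterwards.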