btrfs: Add overview of device replace
author Qu Wenruo <wqu@suse.com>
Thu, 23 Jan 2020 07:44:50 +0000 (15:44 +0800)
committer David Sterba <dsterba@suse.com>
Mon, 23 Mar 2020 16:01:23 +0000 (17:01 +0100)
Add an overview of btrfs dev-replace.  It mentions some corner cases caused
by write duplication and the scrub-based data copy.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ adjust wording ]
Signed-off-by: David Sterba <dsterba@suse.com>
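
As a rough illustration of the two methods named in the overview comment (write duplication and copying existing extents), here is a toy user-space model in C. It is not kernel code and does not use any btrfs API: every name in it (toy_dev, toy_replace, toy_write, toy_copy_existing, toy_replace_demo) is hypothetical. It also assumes that new writes never land on blocks already recorded at the last commit, which is a simplification standing in for COW and is one plausible reading of why the comment says NOCOW writes must be avoided during replace.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_BLOCKS 8

/* Hypothetical stand-in for a device: one byte of payload per block. */
struct toy_dev {
	char blocks[TOY_NR_BLOCKS];
};

/* Hypothetical replace state, not the kernel's struct btrfs_dev_replace. */
struct toy_replace {
	struct toy_dev *src;
	struct toy_dev *tgt;
	unsigned int committed;	/* bitmask of blocks present at last commit */
	bool running;
};

/*
 * Write duplication: while replace is running, each new write goes to both
 * the source and the target. The assert models the COW assumption: a new
 * write never overwrites a block that belongs to the committed set.
 */
static void toy_write(struct toy_replace *r, int block, char data)
{
	assert(block >= 0 && block < TOY_NR_BLOCKS);
	assert(!(r->committed & (1u << block)));
	r->src->blocks[block] = data;
	if (r->running)
		r->tgt->blocks[block] = data;
}

/*
 * Copy existing extents: walk only the blocks recorded at the last commit
 * and copy them from source to target, similar in spirit to a scrub pass.
 */
static void toy_copy_existing(struct toy_replace *r)
{
	for (int i = 0; i < TOY_NR_BLOCKS; i++) {
		if (r->committed & (1u << i))
			r->tgt->blocks[i] = r->src->blocks[i];
	}
}

/* Returns true if the target matches the source once both methods ran. */
static bool toy_replace_demo(void)
{
	struct toy_dev src, tgt;
	struct toy_replace r = {
		.src = &src, .tgt = &tgt, .committed = 0x3, .running = true,
	};

	memset(&src, 0, sizeof(src));
	memset(&tgt, 0, sizeof(tgt));
	src.blocks[0] = 'A';		/* existing, committed data */
	src.blocks[1] = 'B';

	toy_write(&r, 2, 'C');		/* new write during replace */
	toy_copy_existing(&r);		/* copy committed blocks */

	return memcmp(src.blocks, tgt.blocks, TOY_NR_BLOCKS) == 0;
}
```

In this model, clearing `running` before the copy completes corresponds to a canceled replace: the source has received every write, so it is still up to date, while the target is simply discarded.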
fs/btrfs/dev-replace.c

index 2ca2a09d0e238e5e006441974820e0b2fa7b0060..2aad07cdaea884a03e5c16e943aa898d33b421a7 100644
 #include "dev-replace.h"
 #include "sysfs.h"
 
+/*
+ * Device replace overview
+ *
+ * [Objective]
+ * To copy all extents (both new and on-disk) from source device to target
+ * device, while still keeping the filesystem read-write.
+ *
+ * [Method]
+ * There are two main methods involved:
+ *
+ * - Write duplication
+ *
+ *   All new writes will be written to both target and source devices, so even
+ *   if replace gets canceled, source device still contains up-to-date data.
+ *
+ *   Location:         handle_ops_on_dev_replace() from __btrfs_map_block()
+ *   Start:            btrfs_dev_replace_start()
+ *   End:              btrfs_dev_replace_finishing()
+ *   Content:          Latest data/metadata
+ *
+ * - Copy existing extents
+ *
+ *   This is done by reusing the scrub facility, as scrub also iterates through
+ *   existing extents from the commit root.
+ *
+ *   Location:         scrub_write_block_to_dev_replace() from
+ *                     scrub_block_complete()
+ *   Content:          Data/meta from commit root.
+ *
+ * Due to the content difference, we need to avoid NOCOW writes while
+ * dev-replace is running.  This is done by marking the block group read-only
+ * and waiting for NOCOW writes to finish.
+ *
+ * After replace is done, the finishing part is done by swapping the target and
+ * source devices.
+ *
+ *   Location:         btrfs_dev_replace_update_device_in_mapping_tree() from
+ *                     btrfs_dev_replace_finishing()
+ */
+
 static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
                                       int scrub_ret);
 static void btrfs_dev_replace_update_device_in_mapping_tree(