PM/hibernate: touch NMI watchdog when creating snapshot
authorChen Yu <yu.c.chen@intel.com>
Fri, 25 Aug 2017 22:55:30 +0000 (15:55 -0700)
committerLinus Torvalds <torvalds@linux-foundation.org>
Fri, 25 Aug 2017 23:12:46 +0000 (16:12 -0700)
There is a problem that when counting the pages for creating the
hibernation snapshot will take significant amount of time, especially on
system with large memory.  Since the counting job is performed with irq
disabled, this might lead to NMI lockup.  The following warning were
found on a system with 1.5TB DRAM:

  Freezing user space processes ... (elapsed 0.002 seconds) done.
  OOM killer disabled.
  PM: Preallocating image memory...
  NMI watchdog: Watchdog detected hard LOCKUP on cpu 27
  CPU: 27 PID: 3128 Comm: systemd-sleep Not tainted 4.13.0-0.rc2.git0.1.fc27.x86_64 #1
  task: ffff9f01971ac000 task.stack: ffffb1a3f325c000
  RIP: 0010:memory_bm_find_bit+0xf4/0x100
  Call Trace:
   swsusp_set_page_free+0x2b/0x30
   mark_free_pages+0x147/0x1c0
   count_data_pages+0x41/0xa0
   hibernate_preallocate_memory+0x80/0x450
   hibernation_snapshot+0x58/0x410
   hibernate+0x17c/0x310
   state_store+0xdf/0xf0
   kobj_attr_store+0xf/0x20
   sysfs_kf_write+0x37/0x40
   kernfs_fop_write+0x11c/0x1a0
   __vfs_write+0x37/0x170
   vfs_write+0xb1/0x1a0
   SyS_write+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xa5
  ...
  done (allocated 6590003 pages)
  PM: Allocated 26360012 kbytes in 19.89 seconds (1325.28 MB/s)

It has taken nearly 20 seconds(2.10GHz CPU) thus the NMI lockup was
triggered.  In case the timeout of the NMI watch dog has been set to 1
second, a safe interval should be 6590003/20 = 320k pages in theory.
However there might also be some platforms running at a lower frequency,
so feed the watchdog every 100k pages.

[yu.c.chen@intel.com: simplification]
Link: http://lkml.kernel.org/r/1503460079-29721-1-git-send-email-yu.c.chen@intel.com
[yu.c.chen@intel.com: use interval of 128k instead of 100k to avoid modulus]
Link: http://lkml.kernel.org/r/1503328098-5120-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Reported-by: Jan Filipcewicz <jan.filipcewicz@intel.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Michal Hocko <mhocko@suse.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Len Brown <lenb@kernel.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
mm/page_alloc.c

index 1bad301820c7a2e2729fc2f7c04e4b3694131576..7a58eb5757e3bd61d9dadfd8d851b9641454552d 100644 (file)
@@ -66,6 +66,7 @@
 #include <linux/kthread.h>
 #include <linux/memcontrol.h>
 #include <linux/ftrace.h>
+#include <linux/nmi.h>
 
 #include <asm/sections.h>
 #include <asm/tlbflush.h>
@@ -2535,9 +2536,14 @@ void drain_all_pages(struct zone *zone)
 
 #ifdef CONFIG_HIBERNATION
 
+/*
+ * Touch the watchdog for every WD_PAGE_COUNT pages.
+ */
+#define WD_PAGE_COUNT  (128*1024)
+
 void mark_free_pages(struct zone *zone)
 {
-       unsigned long pfn, max_zone_pfn;
+       unsigned long pfn, max_zone_pfn, page_count = WD_PAGE_COUNT;
        unsigned long flags;
        unsigned int order, t;
        struct page *page;
@@ -2552,6 +2558,11 @@ void mark_free_pages(struct zone *zone)
                if (pfn_valid(pfn)) {
                        page = pfn_to_page(pfn);
 
+                       if (!--page_count) {
+                               touch_nmi_watchdog();
+                               page_count = WD_PAGE_COUNT;
+                       }
+
                        if (page_zone(page) != zone)
                                continue;
 
@@ -2565,8 +2576,13 @@ void mark_free_pages(struct zone *zone)
                        unsigned long i;
 
                        pfn = page_to_pfn(page);
-                       for (i = 0; i < (1UL << order); i++)
+                       for (i = 0; i < (1UL << order); i++) {
+                               if (!--page_count) {
+                                       touch_nmi_watchdog();
+                                       page_count = WD_PAGE_COUNT;
+                               }
                                swsusp_set_page_free(pfn_to_page(pfn + i));
+                       }
                }
        }
        spin_unlock_irqrestore(&zone->lock, flags);