x86, pat: Update the page flags for memtype atomically instead of using memtype_lock
author: Robin Holt <holt@sgi.com>
Fri, 23 Apr 2010 15:36:22 +0000 (10:36 -0500)
committer: H. Peter Anvin <hpa@zytor.com>
Fri, 23 Apr 2010 22:57:23 +0000 (15:57 -0700)
commit: 1f9cc3cb6a27521edfe0a21abf97d2bb11c4d237
tree: c9af6a71398aed690c1fa813498a0aed8abf2d7b
parent: 4daa2a8093ecd1148270a1fc64e99f072b8c2901

While testing an application using the xpmem (out of kernel) driver, we
noticed that the page fault rate on x86_64 was significantly lower than
on ia64.  For one test running with 32 cpus, one thread per cpu, it
took 01:08 for each of the threads to vm_insert_pfn() 2 GB worth of pages.
For the same test running on 256 cpus, one thread per cpu, it took 14:48
to vm_insert_pfn() 2 GB worth of pages.

The slowdown was tracked to lookup_memtype(), which acquires the
memtype_lock spinlock.  This heavily contended lock was slowing down
vm_insert_pfn().

With the cmpxchg on page->flags method, both the 32 cpu and 256 cpu
cases take approximately 1.3 seconds to complete.
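
The general idea of replacing a global lock with a compare-and-exchange
loop on a per-page flags word can be sketched in user-space C11 as
follows.  This is only an illustration, not the code from this patch:
struct page_sketch, MEMTYPE_SHIFT, MEMTYPE_MASK, get_memtype() and
set_memtype() are hypothetical names, and the bit layout is an assumption
chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical layout: two bits of the flags word encode the memory type. */
#define MEMTYPE_SHIFT 4UL
#define MEMTYPE_MASK  (3UL << MEMTYPE_SHIFT)

/* Stand-in for struct page; only the flags word matters here. */
struct page_sketch {
	_Atomic unsigned long flags;
};

/* Lock-free read: a single atomic load, no global spinlock is taken. */
static unsigned long get_memtype(struct page_sketch *pg)
{
	return (atomic_load(&pg->flags) & MEMTYPE_MASK) >> MEMTYPE_SHIFT;
}

/*
 * Lock-free update: recompute the new flags value from the latest
 * observed value and retry until no other thread changed the word
 * in between.  All bits outside MEMTYPE_MASK are preserved.
 */
static void set_memtype(struct page_sketch *pg, unsigned long memtype)
{
	unsigned long old_flags, new_flags;

	/* The value must fit in the bits reserved for it. */
	assert(memtype <= (MEMTYPE_MASK >> MEMTYPE_SHIFT));

	old_flags = atomic_load(&pg->flags);
	do {
		new_flags = (old_flags & ~MEMTYPE_MASK) |
			    (memtype << MEMTYPE_SHIFT);
		/* On failure, old_flags is refreshed with the current value. */
	} while (!atomic_compare_exchange_weak(&pg->flags, &old_flags,
					       new_flags));
}
```

Because each page carries its own flags word, threads working on
different pages never touch the same memory location in this sketch,
which is why contention no longer grows with the cpu count.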

Signed-off-by: Robin Holt <holt@sgi.com>
LKML-Reference: <20100423153627.751194346@gulag1.americas.sgi.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@gmail.com>
Cc: Rafael Wysocki <rjw@novell.com>
Reviewed-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
arch/x86/include/asm/cacheflush.h
arch/x86/mm/pat.c