Martin Schwidefsky [Fri, 1 Apr 2016 13:42:15 +0000 (15:42 +0200)]
s390/fpu: allocate 'struct fpu' with the task_struct
Analog to git commit
0c8c0f03e3a292e031596484275c14cf39c0ab7a
"x86/fpu, sched: Dynamically allocate 'struct fpu'"
move the struct fpu to the end of the struct thread_struct,
set CONFIG_ARCH_WANTS_DYNAMIC_TASK_STRUCT and add the
setup_task_size() function to calculate the correct size
fo the task struct.
For the performance_defconfig this increases the size of
struct task_struct from 7424 bytes to 7936 bytes (MACHINE_HAS_VX==1)
or 7552 bytes (MACHINE_HAS_VX==0). The dynamic allocation of the
struct fpu is removed. The slab cache uses an 8KB block for the
task struct in all cases, there is enough room for the struct fpu.
For MACHINE_HAS_VX==1 each task now needs 512 bytes less memory.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Martin Schwidefsky [Thu, 17 Mar 2016 14:22:12 +0000 (15:22 +0100)]
s390/crypto: cleanup and move the header with the cpacf definitions
The CPACF instructions are going be used in KVM as well, move the
defines and the inline functions from arch/s390/crypt/crypt_s390.h
to arch/s390/include/asm. Rename the header to cpacf.h and replace
the crypt_s390_xxx names with cpacf_xxx.
While we are at it, cleanup the header as well. The encoding for
the CPACF operations is odd, there is an enum for each of the CPACF
instructions with the hardware function code in the lower 8 bits of
each entry and a software defined number for the CPACF instruction
in the upper 8 bits. Remove the superfluous software number and
replace the enums with simple defines.
The crypt_s390_func_available() function tests for the presence
of a specific CPACF operations. The new name of the function is
cpacf_query and it works slightly different than before. It gets
passed an opcode of an CPACF instruction and a function code for
this instruction. The facility_mask parameter is gone, the opcode
is used to find the correct MSA facility bit to check if the CPACF
instruction itself is available. If it is the query function of the
given instruction is used to test if the requested CPACF operation
is present.
Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Mon, 29 Jun 2015 16:47:28 +0000 (18:47 +0200)]
s390/pci: fmb enhancements
Implement the function type specific function measurement block used
in new machines.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Jan Höppner [Wed, 19 Aug 2015 11:41:20 +0000 (13:41 +0200)]
s390/dasd: Add new ioctl BIODASDCHECKFMT
Implement new DASD IOCTL BIODASDCHECKFMT to check a range of tracks on a
DASD volume for correct formatting. The following characteristics are
checked:
- Block size
- ECKD key length
- ECKD record ID
- Number of records per track
Signed-off-by: Jan Höppner <hoeppner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 27 Nov 2015 11:33:18 +0000 (12:33 +0100)]
s390/sclp: event type macro cleanup
Sort the sclp event type defines and use a macro to create the
corresponding event type masks. In addition to that one unused
type/mask pair is removed and another previously unused define
is used now (it was probably unused/unknown because it didn't
follow the EVTYP_X EVTYP_X_MASK convention).
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 27 Nov 2015 10:22:57 +0000 (11:22 +0100)]
s390/pci: add report_error attribute
Provide an report_error attribute to send an adapter-error
notification associated with a PCI function.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 27 Nov 2015 10:18:02 +0000 (11:18 +0100)]
s390/sclp: add error notification command
Add SCLP event 24 "Adapter-error notification".
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Fri, 27 Nov 2015 10:15:36 +0000 (11:15 +0100)]
s390/sclp: move pci related commands to separate file
sclp commands only used by the PCI code shouldn't be build for
!CONFIG_PCI. Instead of adding more ifdefs to sclp_cmd.c just
move them to a new file.
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sascha Silbe [Tue, 5 Apr 2016 10:53:38 +0000 (12:53 +0200)]
Documentation: document the nosmt and smt s390 kernel parameters
Commit
6c2ff84e979a ("s390: add SMT support") added the nosmt and smt
early kernel parameters for restricting the use of symmetric
multithreading (SMT). They are quite useful for debugging and testing,
so document them.
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Peter Zijlstra [Tue, 22 Mar 2016 20:42:53 +0000 (21:42 +0100)]
s390: Clarify pagefault interrupt
While looking at set_task_state() users I stumbled over the s390 pfault
interrupt code. Since Heiko provided a great explanation on how it
worked, I figured we ought to preserve this.
Also make a few little tweaks to the code to aid in readability and
explicitly comment the unusual blocking scheme.
Based-on-text-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Stefan Haberland [Fri, 18 Mar 2016 08:42:13 +0000 (09:42 +0100)]
s390/dasd: add query host access to volume support
With this feature, applications can query if a DASD volume is online
to another operating system instances by checking the online status of
all attached hosts from the storage server.
Reviewed-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Stefan Haberland <stefan.haberland@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Thu, 14 Apr 2016 07:00:27 +0000 (09:00 +0200)]
s390: add CPU_BIG_ENDIAN config option
Make sure that s390 appears to be a big endian machine by defining
this config option.
Without this s390 appears to be little endian as seen by e.g. the
recordmount script: "perl ./scripts/recordmcount.pl "s390" "little"
"64""
This has no practical impact within the script since the endian
variable is only evaluated for mips. However there are already a
couple of common code places which evaluate this config option. None
of them is relevant for s390 currently though.
To avoid any issues in the future (and fix the recordmcount oddity)
add the new config option.
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Heiko Carstens [Wed, 13 Apr 2016 09:05:20 +0000 (11:05 +0200)]
s390/spinlock: avoid yield to non existent cpu
arch_spin_lock_wait_flags() checks if a spinlock is not held before
trying a compare and swap instruction. If the lock is unlocked it
tries the compare and swap instruction, however if a different cpu
grabbed the lock in the meantime the instruction will fail as
expected.
Subsequently the arch_spin_lock_wait_flags() incorrectly tries to
figure out if the cpu that holds the lock is running. However it is
using the wrong cpu number for this (-1) and then will also yield the
current cpu to the wrong cpu.
Fix this by adding a missing continue statement.
Fixes: 470ada6b1a1d ("s390/spinlock: refactor arch_spin_lock_wait[_flags]")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Gerald Schaefer [Fri, 8 Apr 2016 11:23:52 +0000 (13:23 +0200)]
s390/dcssblk: fix possible deadlock in remove vs. per-device attributes
dcssblk_remove_store() holds the dcssblk_devices_sem semaphore while
calling device_unregister(), which in turn tries to acquire the kernfs
kn->dev_map rwsem for the device sysfs subtree. The same rwsem is also
acquired when using the per-device sysfs attributes in the device sub-tree,
and the attribute handlers then also acquire the dcssblk_devices_sem.
This can lead to a deadlock when removing a DCSS while concurrently
reading from / writing to one of its sysfs attributes. The following
lockdep warning hinted towards the issue (CPU0 = dcssblk_remove_store,
CPU1 = dcssblk_shared_store):
[ 76.496047] Possible unsafe locking scenario:
[ 76.496054] CPU0 CPU1
[ 76.496059] ---- ----
[ 76.496087] lock(&dcssblk_devices_sem);
[ 76.496090] lock(s_active#175);
[ 76.496106] lock(&dcssblk_devices_sem);
[ 76.496110] lock(s_active#175);
[ 76.496115]
*** DEADLOCK ***
Fix this by releasing the dcssblk_devices_sem semaphore, which only
protects internal DCSS data, before calling device_unregister().
Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sudip Mukherjee [Fri, 1 Apr 2016 12:35:16 +0000 (13:35 +0100)]
s390/seccomp: include generic seccomp header file
Fixes this build error on linux-next:
kernel/seccomp.c: In function '__secure_computing_strict':
kernel/seccomp.c:526:3: error: implicit declaration of function
'get_compat_mode1_syscalls'
The retrieval of compat syscall numbers were moved into inline function
defined in asm-generic header but the asm-generic header is not being
used by s390.
[heiko.carstens@de.ibm.com]: even though the build error will trigger
only in the next merge window it makes sense to include the generic
header file already now.
Fixes: ("seccomp: Get compat syscalls from asm-generic header")
Cc: Matt Redfearn <matt.redfearn@imgtec.com>
Signed-off-by: Sudip Mukherjee <sudip.mukherjee@codethink.co.uk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Thu, 31 Mar 2016 09:48:31 +0000 (11:48 +0200)]
s390/pci: add extra padding to function measurement block
Newer machines might use a different (larger) format for function
measurement blocks. To ensure that we comply with the alignment
requirement on these machines and prevent memory corruption (when
firmware writes more data than we expect) add 16 padding bytes
at the end of the fmb.
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Sebastian Ott [Tue, 22 Mar 2016 18:00:55 +0000 (19:00 +0100)]
s390/scm_blk: fix deadlock for requests != REQ_TYPE_FS
When we refuse a non REQ_TYPE_FS request in the build request function
we already hold the queue lock. Thus we must not call blk_end_request_all
but __blk_end_request_all.
Reported-by: Peter Oberparleiter <oberpar@linux.vnet.ibm.com>
Fixes: de9587a ('s390/scm_blk: fix endless loop for requests != REQ_TYPE_FS')
Signed-off-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Linus Torvalds [Fri, 1 Apr 2016 12:21:18 +0000 (07:21 -0500)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Pull drm fixes from Dave Airlie:
"Nothing too crazy in here: a bunch of AMD fixes/quirks, two msm fixes,
some rockchip fixes, and a udl warning fix, along with one locking fix
for displayport that seems to fix some dodgy monitors"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/udl: Use unlocked gem unreferencing
drm/dp: move hw_mutex up the call stack
drm/amdgpu: Don't move pinned BOs
drm/radeon: Don't move pinned BOs
drm/radeon: add a dpm quirk for all R7 370 parts
drm/radeon: add another R7 370 quirk
drm/rockchip: dw_hdmi: Don't call platform_set_drvdata()
drm/rockchip: vop: Fix vop crtc cleanup
drm/rockchip: dw_hdmi: Call drm_encoder_cleanup() in error path
drm/rockchip: vop: Disable planes when disabling CRTC
drm/rockchip: vop: Don't reject empty modesets
drm/rockchip: cancel pending vblanks on close
drm/rockchip: vop: fix crtc size in plane check
drm/radeon: add a dpm quirk for sapphire Dual-X R7 370 2G D5
drm/amd: Beef up ACP Kconfig menu text
drm/msm: fix typo in the !COMMON_CLK case
drm/msm: fix bug after preclose removal
Linus Torvalds [Fri, 1 Apr 2016 12:18:27 +0000 (07:18 -0500)]
Merge tag 'powerpc-4.6-2' of git://git./linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Fixup preempt underflow with huge pages from Sebastian Siewior
- Fix altivec SPR not being saved from Oliver O'Halloran
- Correct used_vsr comment from Simon Guo
* tag 'powerpc-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Correct used_vsr comment
powerpc/process: Fix altivec SPR not being saved
powerpc/mm: Fixup preempt underflow with huge pages
Linus Torvalds [Fri, 1 Apr 2016 12:15:54 +0000 (07:15 -0500)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
- A proper fix for the locking issue in the dasd driver
- Wire up the new preadv2 nad pwritev2 system calls
- Add the mark_rodata_ro function and set DEBUG_RODATA=y
- A few more bug fixes.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: wire up preadv2/pwritev2 syscalls
s390/pci: PCI function group 0 is valid for clp_query_pci_fn
s390/crypto: provide correct file mode at device register.
s390/mm: handle PTE-mapped tail pages in fast gup
s390: add DEBUG_RODATA support
s390: disable postinit-readonly for now
s390/dasd: reorder lcu and device lock
s390/cpum_sf: Fix cpu hotplug notifier transitions
s390/cpum_cf: Fix missing cpu hotplug notifier transition
Heiko Carstens [Mon, 21 Mar 2016 12:48:43 +0000 (13:48 +0100)]
s390: wire up preadv2/pwritev2 syscalls
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Pierre Morel [Wed, 16 Mar 2016 12:56:35 +0000 (13:56 +0100)]
s390/pci: PCI function group 0 is valid for clp_query_pci_fn
The PCI function group 0 is a valid function group,
it is wrong to reject it.
Let's accept PCI function group 0.
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Acked-by: Sebastian Ott <sebott@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Daniel Vetter [Wed, 30 Mar 2016 09:40:43 +0000 (11:40 +0200)]
drm/udl: Use unlocked gem unreferencing
For drm_gem_object_unreference callers are required to hold
dev->struct_mutex, which these paths don't. Enforcing this requirement
has become a bit more strict with
commit
ef4c6270bf2867e2f8032e9614d1a8cfc6c71663
Author: Daniel Vetter <daniel.vetter@ffwll.ch>
Date: Thu Oct 15 09:36:25 2015 +0200
drm/gem: Check locking in drm_gem_object_unreference
Cc: stable@vger.kernel.org
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rob Clark [Thu, 25 Feb 2016 21:15:05 +0000 (16:15 -0500)]
drm/dp: move hw_mutex up the call stack
1) don't let other threads trying to bang on aux channel interrupt the
defer timeout/logic
2) don't let other threads interrupt the i2c over aux logic
Technically, according to people who actually have the DP spec, this
should not be required. In practice, it makes some troublesome Dell
monitor (and perhaps others) work, so probably a case of "It's compliant
if it works with windows" on the hw vendor's part..
v2: rebased to come before DPCD/AUX logging patch for easier backport
to stable branches.
Reported-by: Dave Wysochanski <dwysocha@redhat.com>
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=
1274157
Cc: stable@vger.kernel.org
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 1 Apr 2016 03:14:24 +0000 (13:14 +1000)]
Merge branch 'drm-rockchip-next-fixes-2016-03-28' of https://github.com/markyzq/kernel-drm-rockchip into drm-fixes
bunch of rockchip fixes.
* 'drm-rockchip-next-fixes-2016-03-28' of https://github.com/markyzq/kernel-drm-rockchip:
drm/rockchip: dw_hdmi: Don't call platform_set_drvdata()
drm/rockchip: vop: Fix vop crtc cleanup
drm/rockchip: dw_hdmi: Call drm_encoder_cleanup() in error path
drm/rockchip: vop: Disable planes when disabling CRTC
drm/rockchip: vop: Don't reject empty modesets
drm/rockchip: cancel pending vblanks on close
drm/rockchip: vop: fix crtc size in plane check
Dave Airlie [Fri, 1 Apr 2016 03:14:10 +0000 (13:14 +1000)]
Merge branch 'msm-fixes-4.6-rc1' of git://people.freedesktop.org/~robclark/linux into drm-fixes
two minor msm fixes.
* 'msm-fixes-4.6-rc1' of git://people.freedesktop.org/~robclark/linux:
drm/msm: fix typo in the !COMMON_CLK case
drm/msm: fix bug after preclose removal
Dave Airlie [Fri, 1 Apr 2016 03:13:34 +0000 (13:13 +1000)]
Merge branch 'drm-next-4.6' of git://people.freedesktop.org/~agd5f/linux into drm-fixes
Just a few fixes for 4.6 this week:
- Add some SI DPM quirks
- Improve the ACP Kconfig text
- Additional BO pinning checks
* 'drm-next-4.6' of git://people.freedesktop.org/~agd5f/linux:
drm/amdgpu: Don't move pinned BOs
drm/radeon: Don't move pinned BOs
drm/radeon: add a dpm quirk for all R7 370 parts
drm/radeon: add another R7 370 quirk
drm/radeon: add a dpm quirk for sapphire Dual-X R7 370 2G D5
drm/amd: Beef up ACP Kconfig menu text
Linus Torvalds [Thu, 31 Mar 2016 12:55:14 +0000 (07:55 -0500)]
Merge branch 'parisc-4.6-2' of git://git./linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
"Fix seccomp filter support and SIGSYS signals on compat kernel.
Both patches are tagged for v4.5 stable kernel"
* 'parisc-4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: Fix and enable seccomp filter support
parisc: Fix SIGSYS signals in compat case
Linus Torvalds [Thu, 31 Mar 2016 12:19:39 +0000 (07:19 -0500)]
Merge branch 'for-linus' of git://git./linux/kernel/git/viro/vfs
Pull vfs fix from Al Viro.
Automount handling was broken by commit
e3c13928086f ("namei: massage
lookup_slow() to be usable by lookup_one_len_unlocked()") moving the
test for negative dentry too early.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
fix the braino in "namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()"
Linus Torvalds [Thu, 31 Mar 2016 12:13:56 +0000 (07:13 -0500)]
Merge tag 'gpio-v4.6-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"Here are two GPIO fixes for the v4.6 series, both in drivers:
- Prevent NULL dereference in the Xgene driver
- Fix an uninitialized spinlock in the menz127 driver"
* tag 'gpio-v4.6-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: xgene: Prevent NULL pointer dereference
gpio: menz127: Drop lock field from struct men_z127_gpio
Linus Torvalds [Thu, 31 Mar 2016 11:56:50 +0000 (06:56 -0500)]
Merge branch 'libnvdimm-for-next' of git://git./linux/kernel/git/nvdimm/nvdimm
Pull nvdimm mcsafe_memcpy use from Dan Williams:
"Now that mcsafe_memcpy() has landed, and the return value was been
clarified in commit
cbf8b5a2b649 ("x86/mm, x86/mce: Fix return
type/value for memcpy_mcsafe()"), let's hook up its primary usage in
the pmem driver.
The compilation problems from the initial posting have been fixed,
this has appeared in a -next release with no reported issues, and it
picked up an ack from Ingo. There is no pressing need to merge this
in 4.6- rc2. However, if we wait until 4.7 the new memcpy_mcsafe()
capability will ship without a user in 4.6-final"
* 'libnvdimm-for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
x86, pmem: use memcpy_mcsafe() for memcpy_from_pmem()
Helge Deller [Wed, 30 Mar 2016 12:14:31 +0000 (14:14 +0200)]
parisc: Fix and enable seccomp filter support
The seccomp filter support requires careful handling of task registers. This
includes reloading of the return value (%r28) and proper syscall exit if
secure_computing() returned -1.
Additionally we need to sign-extend the syscall number from signed 32bit to
signed 64bit in do_syscall_trace_enter() since the ptrace interface only allows
storing 32bit values in compat mode.
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v4.5
Helge Deller [Wed, 30 Mar 2016 12:11:50 +0000 (14:11 +0200)]
parisc: Fix SIGSYS signals in compat case
Signed-off-by: Helge Deller <deller@gmx.de>
Cc: stable@vger.kernel.org # v4.5
Al Viro [Thu, 31 Mar 2016 04:23:05 +0000 (00:23 -0400)]
fix the braino in "namei: massage lookup_slow() to be usable by lookup_one_len_unlocked()"
We should try to trigger automount *before* bailing out on negative dentry.
Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Reported-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Reported-by: Arend van Spriel <arend@broadcom.com>
Tested-by: Arend van Spriel <arend@broadcom.com>
Tested-by: Jun'ichi Nomura <j-nomura@ce.jp.nec.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Linus Torvalds [Thu, 31 Mar 2016 01:40:42 +0000 (20:40 -0500)]
Merge tag 'nios2-v4.6-rc2' of git://git./linux/kernel/git/lftan/nios2
Pull nios2 fix from Ley Foon Tan:
"Replace fdt_translate_address with of_flat_dt_translate_address"
Fixes a build failure.
* tag 'nios2-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/lftan/nios2:
nios2: Replace fdt_translate_address with of_flat_dt_translate_address
Guenter Roeck [Tue, 29 Mar 2016 09:02:13 +0000 (17:02 +0800)]
nios2: Replace fdt_translate_address with of_flat_dt_translate_address
nios2 builds fail with the following build error.
arch/nios2/kernel/prom.c: In function 'early_init_dt_scan_serial':
arch/nios2/kernel/prom.c:100:2: error:
implicit declaration of function 'fdt_translate_address'
Commit
c90fe9c0394b ("of: earlycon: Move address translation to
of_setup_earlycon()") replaced fdt_translate_address() with
of_flat_dt_translate_address() but missed updating the nios2 code.
Fixes: c90fe9c0394b ("of: earlycon: Move address translation to of_setup_earlycon()")
Cc: Peter Hurley <peter@hurleysoftware.com>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Ley Foon Tan <lftan@altera.com>
Linus Torvalds [Wed, 30 Mar 2016 18:28:34 +0000 (13:28 -0500)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fix from Herbert Xu:
"This fixes a bug in pkcs7_validate_trust and its users where the
output value may in fact be taken from uninitialised memory"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
PKCS#7: pkcs7_validate_trust(): initialize the _trusted output argument
Linus Torvalds [Wed, 30 Mar 2016 18:24:28 +0000 (13:24 -0500)]
Merge tag 'dlm-4.6-fixes' of git://git./linux/kernel/git/teigland/linux-dlm
Pull dlm fix from David Teigland:
"This fixes a bug from the configfs cleanup"
* tag 'dlm-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm:
dlm: config: Fix ENOMEM failures in make_cluster()
Linus Torvalds [Wed, 30 Mar 2016 18:22:47 +0000 (13:22 -0500)]
Merge tag 'hwmon-for-linus-v4.6-rc2' of git://git./linux/kernel/git/groeck/linux-staging
Pull hwmon fix from Guenter Roeck:
"Fix crash due to NULL pointer access in max1111 driver"
* tag 'hwmon-for-linus-v4.6-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated
Axel Lin [Mon, 21 Mar 2016 12:03:50 +0000 (20:03 +0800)]
gpio: xgene: Prevent NULL pointer dereference
platform_get_resource() can return NULL, thus add NULL test to prevent NULL
pointer dereference.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Axel Lin [Wed, 9 Mar 2016 12:38:55 +0000 (20:38 +0800)]
gpio: menz127: Drop lock field from struct men_z127_gpio
Current code uses a uninitialized spin lock.
bgpio_init() already initialized a spin lock, so let's switch to use
&gc->bgpio_lock instead and remove the lock from struct men_z127_gpio.
Fixes: f436bc2726c6 "gpio: add driver for MEN 16Z127 GPIO controller"
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Andrew Price [Tue, 22 Mar 2016 17:36:34 +0000 (17:36 +0000)]
dlm: config: Fix ENOMEM failures in make_cluster()
Commit
1ae1602de0 "configfs: switch ->default groups to a linked list"
left the NULL gps pointer behind after removing the kcalloc() call which
made it non-NULL. It also left the !gps check in place so make_cluster()
now fails with ENOMEM. Remove the remaining uses of the gps variable to
fix that.
Reviewed-by: Bob Peterson <rpeterso@redhat.com>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Andrew Price <anprice@redhat.com>
Signed-off-by: David Teigland <teigland@redhat.com>
Harald Freudenberger [Thu, 17 Mar 2016 13:52:17 +0000 (14:52 +0100)]
s390/crypto: provide correct file mode at device register.
When the prng device driver calls misc_register() there is the possibility
to also provide the recommented file permissions. This fix now gives
useful values (0644) where previously just the default was used (resulting
in 0600 for the device file).
Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Simon Guo [Thu, 24 Mar 2016 17:12:21 +0000 (01:12 +0800)]
powerpc: Correct used_vsr comment
The used_vsr flag is set if process has used VSX registers, not Altivec
registers. But the comment says otherwise, correct the comment.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Oliver O'Halloran [Mon, 7 Mar 2016 22:08:47 +0000 (09:08 +1100)]
powerpc/process: Fix altivec SPR not being saved
In save_sprs() in process.c contains the following test:
if (cpu_has_feature(cpu_has_feature(CPU_FTR_ALTIVEC)))
t->vrsave = mfspr(SPRN_VRSAVE);
CPU feature with the mask 0x1 is CPU_FTR_COHERENT_ICACHE so the test
is equivilent to:
if (cpu_has_feature(CPU_FTR_ALTIVEC) &&
cpu_has_feature(CPU_FTR_COHERENT_ICACHE))
On CPUs without support for both (i.e G5) this results in vrsave not
being saved between context switches. The vector register save/restore
code doesn't use VRSAVE to determine which registers to save/restore,
but the value of VRSAVE is used to determine if altivec is being used
in several code paths.
Fixes: 152d523e6307 ("powerpc: Create context switch helpers save_sprs() and restore_sprs()")
Cc: stable@vger.kernel.org
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Sebastian Siewior [Tue, 8 Mar 2016 09:03:56 +0000 (10:03 +0100)]
powerpc/mm: Fixup preempt underflow with huge pages
hugepd_free() used __get_cpu_var() once. Nothing ensured that the code
accessing the variable did not migrate from one CPU to another and soon
this was noticed by Tiejun Chen in
94b09d755462 ("powerpc/hugetlb:
Replace __get_cpu_var with get_cpu_var"). So we had it fixed.
Christoph Lameter was doing his __get_cpu_var() replaces and forgot
PowerPC. Then he noticed this and sent his fixed up batch again which
got applied as
69111bac42f5 ("powerpc: Replace __get_cpu_var uses").
The careful reader will noticed one little detail: get_cpu_var() got
replaced with this_cpu_ptr(). So now we have a put_cpu_var() which does
a preempt_enable() and nothing that does preempt_disable() so we
underflow the preempt counter.
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: stable@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Dan Williams [Tue, 8 Mar 2016 18:30:19 +0000 (10:30 -0800)]
x86, pmem: use memcpy_mcsafe() for memcpy_from_pmem()
Update the definition of memcpy_from_pmem() to return 0 or a negative
error code. Implement x86/arch_memcpy_from_pmem() with memcpy_mcsafe().
Cc: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Linus Torvalds [Mon, 28 Mar 2016 20:17:02 +0000 (15:17 -0500)]
Merge git://git./linux/kernel/git/davem/ide
Pull IDE fixes from David Miller:
"Just two small changes:
1) Remove bogus init annotation in icside, from Arnd Bergmann.
2) Don't use zero clock rates in palm_bk3710 driver, from Wolfram
Sang"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/ide:
ide: palm_bk3710: test clock rate to avoid division by 0
ide: icside: remove incorrect initconst annotation
Linus Torvalds [Mon, 28 Mar 2016 20:14:11 +0000 (15:14 -0500)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"Minor typing cleanup from Joe Perches, and some comment typo fixes
from Adam Buchbinder"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc: Convert naked unsigned uses to unsigned int
sparc: Fix misspellings in comments.
Linus Torvalds [Mon, 28 Mar 2016 20:04:55 +0000 (15:04 -0500)]
Merge git://git./linux/kernel/git/cmetcalf/linux-tile
Pull arch/tile bugfixes from Chris Metcalf:
"These include updates to MAINTAINERS, some comment spelling fixes, and
a bugfix to the tile kgdb.c support"
* git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
tile: Fix misspellings in comments.
MAINTAINERS: update web link for tile architecture
MAINTAINERS: update arch/tile maintainer email domain
tile kgdb: fix bug in copy to gdb regs, and optimize memset
Michel Dänzer [Mon, 28 Mar 2016 03:53:02 +0000 (12:53 +0900)]
drm/amdgpu: Don't move pinned BOs
The purpose of pinning is to prevent a buffer from moving.
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Rex Zhu <Rex.Zhu@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Mon, 28 Mar 2016 07:39:14 +0000 (16:39 +0900)]
drm/radeon: Don't move pinned BOs
The purpose of pinning is to prevent a buffer from moving.
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Alex Deucher [Mon, 28 Mar 2016 14:21:20 +0000 (10:21 -0400)]
drm/radeon: add a dpm quirk for all R7 370 parts
Higher mclk values are not stable due to a bug somewhere.
Limit them for now.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Alex Deucher [Mon, 28 Mar 2016 14:16:40 +0000 (10:16 -0400)]
drm/radeon: add another R7 370 quirk
bug:
https://bugzilla.kernel.org/show_bug.cgi?id=115291
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Douglas Anderson [Mon, 7 Mar 2016 22:00:53 +0000 (14:00 -0800)]
drm/rockchip: dw_hdmi: Don't call platform_set_drvdata()
The Rockchip dw_hdmi driver just called platform_set_drvdata() to get
your hopes up that maybe, somehow, you'd be able to retrieve the 'struct
rockchip_hdmi' from a pointer to the 'struct device'. You can't. When
we call dw_hdmi_bind() the main driver calls dev_set_drvdata(), which
clobbers our setting.
Let's just remove the platform_set_drvdata() to avoid dashing people's
hopes.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Douglas Anderson [Mon, 7 Mar 2016 22:00:52 +0000 (14:00 -0800)]
drm/rockchip: vop: Fix vop crtc cleanup
This fixes a few problems in the vop crtc cleanup (handling error
paths and cleanup upon exit):
* The vop_create_crtc() error path had an unsafe version of the
iterator used for iterating over all planes (though it was
destroying planes in the iterator so should have used the safe
version)
* vop_destroy_crtc() - wasn't calling vop_plane_destroy(), which made
slub_debug unhappy, at least if we ended up running this due to a
deferred probe.
* In vop_create_crtc() if we were missing the "port" device tree node
we would fail but not return an error (found by code inspection).
Fix these problems.
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Douglas Anderson [Mon, 7 Mar 2016 22:00:50 +0000 (14:00 -0800)]
drm/rockchip: dw_hdmi: Call drm_encoder_cleanup() in error path
The drm_encoder_cleanup() was missing both from the error path of
dw_hdmi_rockchip_bind(). This caused a crash when slub_debug was
enabled and we ended up deferring probe of HDMI at boot.
This call isn't needed from unbind() because if dw_hdmi_bind() returns
no error then it takes over the job of freeing the encoder (in
dw_hdmi_unbind).
Signed-off-by: Douglas Anderson <dianders@chromium.org>
Tomeu Vizoso [Tue, 22 Mar 2016 15:08:04 +0000 (16:08 +0100)]
drm/rockchip: vop: Disable planes when disabling CRTC
When a VOP is re-enabled, it will start scanning right away the
framebuffers that were configured from the last time, even if those have
been destroyed already.
To prevent the VOP from trying to access freed memory, disable all its
windows when the CRTC is being disabled, then each window will get a
valid framebuffer address before it's enabled again.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Link: http://lkml.kernel.org/g/CAAObsKAv+05ih5U+=4kic_NsjGMhfxYheHR8xXXmacZs+p5SHw@mail.gmail.com
Tomeu Vizoso [Fri, 18 Mar 2016 11:22:02 +0000 (12:22 +0100)]
drm/rockchip: vop: Don't reject empty modesets
So that when DRM_IOCTL_MODE_SETCRTC is called without a FB nor mode, the
CRTC gets disabled.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Link: http://lkml.kernel.org/g/CAAObsKAv+05ih5U+=4kic_NsjGMhfxYheHR8xXXmacZs+p5SHw@mail.gmail.com
John Keeping [Fri, 11 Mar 2016 17:21:17 +0000 (17:21 +0000)]
drm/rockchip: cancel pending vblanks on close
When closing the DRM device while a vblank is pending, we access
file_priv after it has been free'd, which gives:
Unable to handle kernel NULL pointer dereference at virtual address
00000000
...
PC is at __list_add+0x5c/0xe8
LR is at send_vblank_event+0x54/0x1f0
...
[<
c02952e8>] (__list_add) from [<
c031a7b4>] (send_vblank_event+0x54/0x1f0)
[<
c031a760>] (send_vblank_event) from [<
c031a9c0>] (drm_send_vblank_event+0x70/0x78)
[<
c031a950>] (drm_send_vblank_event) from [<
c031a9f8>] (drm_crtc_send_vblank_event+0x30/0x34)
[<
c031a9c8>] (drm_crtc_send_vblank_event) from [<
c0339ad8>] (vop_isr+0x224/0x28c)
[<
c03398b4>] (vop_isr) from [<
c0081780>] (handle_irq_event_percpu+0x12c/0x3e4)
This can be triggered somewhat reliably with:
modetest -M rockchip -v -s ...
Add a preclose hook to the driver so that we can discard any pending
vblank events when the device is closed.
Signed-off-by: John Keeping <john@metanate.com>
John Keeping [Fri, 4 Mar 2016 11:04:03 +0000 (11:04 +0000)]
drm/rockchip: vop: fix crtc size in plane check
If the geometry of a crtc is changing in an atomic update then we must
validate the plane size against the new state of the crtc and not the
current size, otherwise if the crtc size is increasing the plane will be
cropped at the previous size and will not fill the screen.
Signed-off-by: John Keeping <john@metanate.com>
Guenter Roeck [Sat, 26 Mar 2016 19:28:05 +0000 (12:28 -0700)]
hwmon: (max1111) Return -ENODEV from max1111_read_channel if not instantiated
arm:pxa_defconfig can result in the following crash if the max1111 driver
is not instantiated.
Unhandled fault: page domain fault (0x01b) at 0x00000000
pgd =
c0004000
[
00000000] *pgd=
00000000
Internal error: : 1b [#1] PREEMPT ARM
Modules linked in:
CPU: 0 PID: 300 Comm: kworker/0:1 Not tainted
4.5.0-01301-g1701f680407c #10
Hardware name: SHARP Akita
Workqueue: events sharpsl_charge_toggle
task:
c390a000 ti:
c391e000 task.ti:
c391e000
PC is at max1111_read_channel+0x20/0x30
LR is at sharpsl_pm_pxa_read_max1111+0x2c/0x3c
pc : [<
c03aaab0>] lr : [<
c0024b50>] psr:
20000013
...
[<
c03aaab0>] (max1111_read_channel) from [<
c0024b50>]
(sharpsl_pm_pxa_read_max1111+0x2c/0x3c)
[<
c0024b50>] (sharpsl_pm_pxa_read_max1111) from [<
c00262e0>]
(spitzpm_read_devdata+0x5c/0xc4)
[<
c00262e0>] (spitzpm_read_devdata) from [<
c0024094>]
(sharpsl_check_battery_temp+0x78/0x110)
[<
c0024094>] (sharpsl_check_battery_temp) from [<
c0024f9c>]
(sharpsl_charge_toggle+0x48/0x110)
[<
c0024f9c>] (sharpsl_charge_toggle) from [<
c004429c>]
(process_one_work+0x14c/0x48c)
[<
c004429c>] (process_one_work) from [<
c0044618>] (worker_thread+0x3c/0x5d4)
[<
c0044618>] (worker_thread) from [<
c004a238>] (kthread+0xd0/0xec)
[<
c004a238>] (kthread) from [<
c000a670>] (ret_from_fork+0x14/0x24)
This can occur because the SPI controller driver (SPI_PXA2XX) is built as
module and thus not necessarily loaded. While building SPI_PXA2XX into the
kernel would make the problem disappear, it appears prudent to ensure that
the driver is instantiated before accessing its data structures.
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: stable@vger.kernel.org
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Linus Torvalds [Sat, 26 Mar 2016 23:03:24 +0000 (16:03 -0700)]
Linux 4.6-rc1
Linus Torvalds [Sat, 26 Mar 2016 22:53:16 +0000 (15:53 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph updates from Sage Weil:
"There is quite a bit here, including some overdue refactoring and
cleanup on the mon_client and osd_client code from Ilya, scattered
writeback support for CephFS and a pile of bug fixes from Zheng, and a
few random cleanups and fixes from others"
[ I already decided not to pull this because of it having been rebased
recently, but ended up changing my mind after all. Next time I'll
really hold people to it. Oh well. - Linus ]
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: (34 commits)
libceph: use KMEM_CACHE macro
ceph: use kmem_cache_zalloc
rbd: use KMEM_CACHE macro
ceph: use lookup request to revalidate dentry
ceph: kill ceph_get_dentry_parent_inode()
ceph: fix security xattr deadlock
ceph: don't request vxattrs from MDS
ceph: fix mounting same fs multiple times
ceph: remove unnecessary NULL check
ceph: avoid updating directory inode's i_size accidentally
ceph: fix race during filling readdir cache
libceph: use sizeof_footer() more
ceph: kill ceph_empty_snapc
ceph: fix a wrong comparison
ceph: replace CURRENT_TIME by current_fs_time()
ceph: scattered page writeback
libceph: add helper that duplicates last extent operation
libceph: enable large, variable-sized OSD requests
libceph: osdc->req_mempool should be backed by a slab pool
libceph: make r_request msg_size calculation clearer
...
Linus Torvalds [Sat, 26 Mar 2016 19:59:04 +0000 (12:59 -0700)]
Merge tag 'ofs-pull-tag-1' of git://git./linux/kernel/git/hubcap/linux
Pull orangefs filesystem from Mike Marshall.
This finally merges the long-pending orangefs filesystem, which has been
much cleaned up with input from Al Viro over the last six months. From
the documentation file:
"OrangeFS is an LGPL userspace scale-out parallel storage system. It
is ideal for large storage problems faced by HPC, BigData, Streaming
Video, Genomics, Bioinformatics.
Orangefs, originally called PVFS, was first developed in 1993 by Walt
Ligon and Eric Blumer as a parallel file system for Parallel Virtual
Machine (PVM) as part of a NASA grant to study the I/O patterns of
parallel programs.
Orangefs features include:
- Distributes file data among multiple file servers
- Supports simultaneous access by multiple clients
- Stores file data and metadata on servers using local file system
and access methods
- Userspace implementation is easy to install and maintain
- Direct MPI support
- Stateless"
see Documentation/filesystems/orangefs.txt for more in-depth details.
* tag 'ofs-pull-tag-1' of git://git.kernel.org/pub/scm/linux/kernel/git/hubcap/linux: (174 commits)
orangefs: fix orangefs_superblock locking
orangefs: fix do_readv_writev() handling of error halfway through
orangefs: have ->kill_sb() evict the VFS side of things first
orangefs: sanitize ->llseek()
orangefs-bufmap.h: trim unused junk
orangefs: saner calling conventions for getting a slot
orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer
orangefs: get rid of readdir_handle_s
ornagefs: ensure that truncate has an up to date inode size
orangefs: move code which sets i_link to orangefs_inode_getattr
orangefs: remove needless wrapper around GFP_KERNEL
orangefs: remove wrapper around mutex_lock(&inode->i_mutex)
orangefs: refactor inode type or link_target change detection
orangefs: use new getattr for revalidate and remove old getattr
orangefs: use new getattr in inode getattr and permission
orangefs: use new orangefs_inode_getattr to get size in write and llseek
orangefs: use new orangefs_inode_getattr to create new inodes
orangefs: rename orangefs_inode_getattr to orangefs_inode_old_getattr
orangefs: remove inode->i_lock wrapper
orangefs: put register_chrdev immediately before register_filesystem
...
Linus Torvalds [Sat, 26 Mar 2016 18:37:42 +0000 (11:37 -0700)]
Merge tag 'ntb-4.6' of git://github.com/jonmason/ntb
Pull NTB bug fixes from Jon Mason:
"NTB bug fixes for tasklet from spinning forever, link errors,
translation window setup, NULL ptr dereference, and ntb-perf errors.
Also, a modification to the driver API that makes _addr functions
optional"
* tag 'ntb-4.6' of git://github.com/jonmason/ntb:
NTB: Remove _addr functions from ntb_hw_amd
NTB: Make _addr functions optional in the API
NTB: Fix incorrect clean up routine in ntb_perf
NTB: Fix incorrect return check in ntb_perf
ntb: fix possible NULL dereference
ntb: add missing setup of translation window
ntb: stop link work when we do not have memory
ntb: stop tasklet from spinning forever during shutdown.
ntb: perf test: fix address space confusion
Linus Torvalds [Sat, 26 Mar 2016 18:31:01 +0000 (11:31 -0700)]
Merge tag 'scsi-misc' of git://git./linux/kernel/git/jejb/scsi
Pull more SCSI updates from James Bottomley:
"The only new stuff which missed the first pull request is an update to
the UFS driver.
The rest is an assortment of bug fixes and minor tweaks which appeared
recently (some are fixes for recent code and some are stuff spotted
recently by the checkers or the new gcc-6 compiler [most of Arnd's
stuff])"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (32 commits)
scsi_common: do not clobber fixed sense information
scsi: ufs: select CONFIG_NLS
scsi: fc: use get/put_unaligned64 for wwn access
fnic: move printk()s outside of the critical code section.
qla2xxx: avoid maybe_uninitialized warning
megaraid_sas: add missing curly braces in ioctl handler
lpfc: fix misleading indentation
scsi_transport_sas: add 'scsi_target_id' sysfs attribute
scsi_dh_alua: uninitialized variable in alua_check_vpd()
scsi: ufs-qcom: add printouts of testbus debug registers
scsi: ufs-qcom: enable/disable the device ref clock
scsi: ufs-qcom: set PA_Local_TX_LCC_Enable before link startup
scsi: ufs: add device quirk delay before putting UFS rails in LPM
scsi: ufs: fix leakage during link off state
scsi: ufs: tune UniPro parameters to optimize hibern8 exit time
scsi: ufs: handle non spec compliant bkops behaviour by device
scsi: ufs: add retry for query descriptors
scsi: ufs: add error recovery after DL NAC error
scsi: ufs: make error handling bit faster
scsi: ufs: disable vccq if it's not needed by UFS device
...
Linus Torvalds [Sat, 26 Mar 2016 17:13:05 +0000 (10:13 -0700)]
f2fs/crypto: fix xts_tweak initialization
Commit
0b81d07790726 ("fs crypto: move per-file encryption from f2fs
tree to fs/crypto") moved the f2fs crypto files to fs/crypto/ and
renamed the symbol prefixes from "f2fs_" to "fscrypt_" (and from "F2FS_"
to just "FS" for preprocessor symbols).
Because of the symbol renaming, it's a bit hard to see it as a file
move: use
git show -M30
0b81d07790726
to lower the rename detection to just 30% similarity and make git show
the files as renamed (the header file won't be shown as a rename even
then - since all it contains is symbol definitions, it looks almost
completely different).
Even with the renames showing as renames, the diffs are not all that
easy to read, since so much is just the renames. But Eric Biggers
noticed that it's not just all renames: the initialization of the
xts_tweak had been broken too, using the inode number rather than the
page offset.
That's not right - it makes the xfs_tweak the same for all pages of each
inode. It _might_ make sense to make the xfs_tweak contain both the
offset _and_ the inode number, but not just the inode number.
Reported-by: Eric Biggers <ebiggers3@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Allen Hubbe [Mon, 21 Mar 2016 08:53:14 +0000 (04:53 -0400)]
NTB: Remove _addr functions from ntb_hw_amd
Kernel zero day testing warned about address space confusion. A virtual
iomem address was used where a physical address is expected. The
offending functions implement an optional part of the api, so they are
removed. They can be added later, after testing.
Fixes: a1b3695820aa490e58915d720a1438069813008b
Signed-off-by: Allen Hubbe <Allen.Hubbe@emc.com>
Acked-by: Xiangliang Yu <Xiangliang.Yu@amd.com>
Signed-off-by: Jon Mason <jdmason@kudzu.us>
Al Viro [Fri, 25 Mar 2016 23:56:34 +0000 (19:56 -0400)]
orangefs: fix orangefs_superblock locking
* switch orangefs_remount() to taking ORANGEFS_SB(sb) instead of sb
* remove from the list _before_ orangefs_unmount() - request_mutex
in the latter will make sure that nothing observed in the loop in
ORANGEFS_DEV_REMOUNT_ALL handling will get freed until the end
of loop
* on removal, keep the forward pointer and zero the back one. That
way we can drop and regain the spinlock in the loop body (again,
ORANGEFS_DEV_REMOUNT_ALL one) and still be able to get to the
rest of the list.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 01:15:43 +0000 (20:15 -0500)]
orangefs: fix do_readv_writev() handling of error halfway through
Error should only be returned if nothing had been read/written.
Otherwise we need to report a short read/write instead.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 02:08:29 +0000 (21:08 -0500)]
orangefs: have ->kill_sb() evict the VFS side of things first
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 01:25:19 +0000 (20:25 -0500)]
orangefs: sanitize ->llseek()
a) open files can't have NULL inodes
b) it's SEEK_END, not ORANGEFS_SEEK_END; no need to get cute.
c) make_bad_inode() on lseek()?
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 01:12:04 +0000 (20:12 -0500)]
orangefs-bufmap.h: trim unused junk
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 01:10:26 +0000 (20:10 -0500)]
orangefs: saner calling conventions for getting a slot
just have it return the slot number or -E... - the caller checks
the sign anyway
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 01:06:19 +0000 (20:06 -0500)]
orangefs_copy_{to,from}_bufmap(): don't pass bufmap pointer
it's always __orangefs_bufmap
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Al Viro [Wed, 17 Feb 2016 00:54:13 +0000 (19:54 -0500)]
orangefs: get rid of readdir_handle_s
no point, really - we couldn't keep those across the calls of
getdents(); it would be too easy to DoS, having all slots exhausted.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Mike Marshall <hubcap@omnibond.com>
Linus Torvalds [Fri, 25 Mar 2016 23:59:11 +0000 (16:59 -0700)]
Merge branch 'akpm' (patches from Andrew)
Merge fourth patch-bomb from Andrew Morton:
"A lot more stuff than expected, sorry. A bunch of ocfs2 reviewing was
finished off.
- mhocko's oom-reaper out-of-memory-handler changes
- ocfs2 fixes and features
- KASAN feature work
- various fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (42 commits)
thp: fix typo in khugepaged_scan_pmd()
MAINTAINERS: fill entries for KASAN
mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
mm, kasan: add GFP flags to KASAN API
mm, kasan: SLAB support
kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill()
mm/page_alloc: prevent merging between isolated and other pageblocks
drivers/memstick/host/r592.c: avoid gcc-6 warning
ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et
ocfs2/dlm: move lock to the tail of grant queue while doing in-place convert
ocfs2: solve a problem of crossing the boundary in updating backups
ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local
ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
ocfs2/dlm: fix race between convert and recovery
ocfs2: fix a deadlock issue in ocfs2_dio_end_io_write()
...
Linus Torvalds [Fri, 25 Mar 2016 23:55:37 +0000 (16:55 -0700)]
Merge tag 'pm+acpi-4.6-rc1-3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management fixlet from Rafael Wysocki:
"One of commits in my previous pull request changed the permissions of
drivers/power/avs/rockchip-io-domain.c to executable by mistake"
* tag 'pm+acpi-4.6-rc1-3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Fix permissions of drivers/power/avs/rockchip-io-domain.c
Linus Torvalds [Fri, 25 Mar 2016 23:48:45 +0000 (16:48 -0700)]
Merge tag 'please-pull-preadv2' of git://git./linux/kernel/git/aegl/linux
Pull ia64 update from Tony Luck:
"Wire up new system calls p{read,write}v2 for ia64"
* tag 'please-pull-preadv2' of git://git.kernel.org/pub/scm/linux/kernel/git/aegl/linux:
[IA64] Enable preadv2 and pwritev2 syscalls for ia64
Linus Torvalds [Fri, 25 Mar 2016 23:39:05 +0000 (16:39 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input
Pull more input updates from Dmitry Torokhov:
"Second round of updates for the input subsystem.
The BYD PS/2 protocol driver now uses absolute reporting mode and
should behave more like other touchpads; Synaptics driver needed to
extend one of its quirks to a newer firmware version, and a few USB
drivers got tightened up checks for the contents of their descriptors"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: sur40 - fix DMA on stack
Input: ati_remote2 - fix crashes on detecting device with invalid descriptor
Input: synaptics - handle spurious release of trackstick buttons, again
Input: synaptics-rmi4 - remove check of Non-NULL array
Input: byd - enable absolute mode
Input: ims-pcu - sanity check against missing interfaces
Input: melfas_mip4 - add hw_version sysfs attribute
Kirill A. Shutemov [Fri, 25 Mar 2016 21:22:20 +0000 (14:22 -0700)]
thp: fix typo in khugepaged_scan_pmd()
!PageLRU should lead to SCAN_PAGE_LRU, not SCAN_SCAN_ABORT result.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Ebru Akagunduz <ebru.akagunduz@gmail.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrey Ryabinin [Fri, 25 Mar 2016 21:22:17 +0000 (14:22 -0700)]
MAINTAINERS: fill entries for KASAN
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Alexander Potapenko <glider@google.com>
Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicolai Stange [Fri, 25 Mar 2016 21:22:14 +0000 (14:22 -0700)]
mm/filemap: generic_file_read_iter(): check for zero reads unconditionally
If
- generic_file_read_iter() gets called with a zero read length,
- the read offset is at a page boundary,
- IOCB_DIRECT is not set
- and the page in question hasn't made it into the page cache yet,
then do_generic_file_read() will trigger a readahead with a req_size hint
of zero.
Since roundup_pow_of_two(0) is undefined, UBSAN reports
UBSAN: Undefined behaviour in include/linux/log2.h:63:13
shift exponent 64 is too large for 64-bit type 'long unsigned int'
CPU: 3 PID: 1017 Comm: sa1 Tainted: G L 4.5.0-next-
20160318+ #14
[...]
Call Trace:
[...]
[<
ffffffff813ef61a>] ondemand_readahead+0x3aa/0x3d0
[<
ffffffff813ef61a>] ? ondemand_readahead+0x3aa/0x3d0
[<
ffffffff813c73bd>] ? find_get_entry+0x2d/0x210
[<
ffffffff813ef9c3>] page_cache_sync_readahead+0x63/0xa0
[<
ffffffff813cc04d>] do_generic_file_read+0x80d/0xf90
[<
ffffffff813cc955>] generic_file_read_iter+0x185/0x420
[...]
[<
ffffffff81510b06>] __vfs_read+0x256/0x3d0
[...]
when get_init_ra_size() gets called from ondemand_readahead().
The net effect is that the initial readahead size is arch dependent for
requested read lengths of zero: for example, since
1UL << (sizeof(unsigned long) * 8)
evaluates to 1 on x86 while its result is 0 on ARMv7, the initial readahead
size becomes 4 on the former and 0 on the latter.
What's more, whether or not the file access timestamp is updated for zero
length reads is decided differently for the two cases of IOCB_DIRECT
being set or cleared: in the first case, generic_file_read_iter()
explicitly skips updating that timestamp while in the latter case, it is
always updated through the call to do_generic_file_read().
According to POSIX, zero length reads "do not modify the last data access
timestamp" and thus, the IOCB_DIRECT behaviour is POSIXly correct.
Let generic_file_read_iter() unconditionally check the requested read
length at its entry and return immediately with success if it is zero.
Signed-off-by: Nicolai Stange <nicstange@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 25 Mar 2016 21:22:11 +0000 (14:22 -0700)]
kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 25 Mar 2016 21:22:08 +0000 (14:22 -0700)]
mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
Implement the stack depot and provide CONFIG_STACKDEPOT. Stack depot
will allow KASAN store allocation/deallocation stack traces for memory
chunks. The stack traces are stored in a hash table and referenced by
handles which reside in the kasan_alloc_meta and kasan_free_meta
structures in the allocated memory chunks.
IRQ stack traces are cut below the IRQ entry point to avoid unnecessary
duplication.
Right now stackdepot support is only enabled in SLAB allocator. Once
KASAN features in SLAB are on par with those in SLUB we can switch SLUB
to stackdepot as well, thus removing the dependency on SLUB stack
bookkeeping, which wastes a lot of memory.
This patch is based on the "mm: kasan: stack depots" patch originally
prepared by Dmitry Chernenkov.
Joonsoo has said that he plans to reuse the stackdepot code for the
mm/page_owner.c debugging facility.
[akpm@linux-foundation.org: s/depot_stack_handle/depot_stack_handle_t]
[aryabinin@virtuozzo.com: comment style fixes]
Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 25 Mar 2016 21:22:05 +0000 (14:22 -0700)]
arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
KASAN needs to know whether the allocation happens in an IRQ handler.
This lets us strip everything below the IRQ entry point to reduce the
number of unique stack traces needed to be stored.
Move the definition of __irq_entry to <linux/interrupt.h> so that the
users don't need to pull in <linux/ftrace.h>. Also introduce the
__softirq_entry macro which is similar to __irq_entry, but puts the
corresponding functions to the .softirqentry.text section.
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 25 Mar 2016 21:22:02 +0000 (14:22 -0700)]
mm, kasan: add GFP flags to KASAN API
Add GFP flags to KASAN hooks for future patches to use.
This patch is based on the "mm: kasan: unified support for SLUB and SLAB
allocators" patch originally prepared by Dmitry Chernenkov.
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 25 Mar 2016 21:21:59 +0000 (14:21 -0700)]
mm, kasan: SLAB support
Add KASAN hooks to SLAB allocator.
This patch is based on the "mm: kasan: unified support for SLUB and SLAB
allocators" patch originally prepared by Dmitry Chernenkov.
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Alexander Potapenko [Fri, 25 Mar 2016 21:21:56 +0000 (14:21 -0700)]
kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
This patchset implements SLAB support for KASAN
Unlike SLUB, SLAB doesn't store allocation/deallocation stacks for heap
objects, therefore we reimplement this feature in mm/kasan/stackdepot.c.
The intention is to ultimately switch SLUB to use this implementation as
well, which will save a lot of memory (right now SLUB bloats each object
by 256 bytes to store the allocation/deallocation stacks).
Also neither SLUB nor SLAB delay the reuse of freed memory chunks, which
is necessary for better detection of use-after-free errors. We
introduce memory quarantine (mm/kasan/quarantine.c), which allows
delayed reuse of deallocated memory.
This patch (of 7):
Rename kmalloc_large_oob_right() to kmalloc_pagealloc_oob_right(), as
the test only checks the page allocator functionality. Also reimplement
kmalloc_large_oob_right() so that the test allocates a large enough
chunk of memory that still does not trigger the page allocator fallback.
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrey Konovalov <adech.fo@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Konstantin Serebryany <kcc@google.com>
Cc: Dmitry Chernenkov <dmitryc@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Tetsuo Handa [Fri, 25 Mar 2016 21:21:53 +0000 (14:21 -0700)]
include/linux/oom.h: remove undefined oom_kills_count()/note_oom_kill()
A leftover from commit
c32b3cbe0d06 ("oom, PM: make OOM detection in the
freezer path raceless").
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Vlastimil Babka [Fri, 25 Mar 2016 21:21:50 +0000 (14:21 -0700)]
mm/page_alloc: prevent merging between isolated and other pageblocks
Hanjun Guo has reported that a CMA stress test causes broken accounting of
CMA and free pages:
> Before the test, I got:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal: 204800 kB
> CmaFree: 195044 kB
>
>
> After running the test:
> -bash-4.3# cat /proc/meminfo | grep Cma
> CmaTotal: 204800 kB
> CmaFree:
6602584 kB
>
> So the freed CMA memory is more than total..
>
> Also the the MemFree is more than mem total:
>
> -bash-4.3# cat /proc/meminfo
> MemTotal:
16342016 kB
> MemFree:
22367268 kB
> MemAvailable:
22370528 kB
Laura Abbott has confirmed the issue and suspected the freepage accounting
rewrite around 3.18/4.0 by Joonsoo Kim. Joonsoo had a theory that this is
caused by unexpected merging between MIGRATE_ISOLATE and MIGRATE_CMA
pageblocks:
> CMA isolates MAX_ORDER aligned blocks, but, during the process,
> partialy isolated block exists. If MAX_ORDER is 11 and
> pageblock_order is 9, two pageblocks make up MAX_ORDER
> aligned block and I can think following scenario because pageblock
> (un)isolation would be done one by one.
>
> (each character means one pageblock. 'C', 'I' means MIGRATE_CMA,
> MIGRATE_ISOLATE, respectively.
>
> CC -> IC -> II (Isolation)
> II -> CI -> CC (Un-isolation)
>
> If some pages are freed at this intermediate state such as IC or CI,
> that page could be merged to the other page that is resident on
> different type of pageblock and it will cause wrong freepage count.
This was supposed to be prevented by CMA operating on MAX_ORDER blocks,
but since it doesn't hold the zone->lock between pageblocks, a race
window does exist.
It's also likely that unexpected merging can occur between
MIGRATE_ISOLATE and non-CMA pageblocks. This should be prevented in
__free_one_page() since commit
3c605096d315 ("mm/page_alloc: restrict
max order of merging on isolated pageblock"). However, we only check
the migratetype of the pageblock where buddy merging has been initiated,
not the migratetype of the buddy pageblock (or group of pageblocks)
which can be MIGRATE_ISOLATE.
Joonsoo has suggested checking for buddy migratetype as part of
page_is_buddy(), but that would add extra checks in allocator hotpath
and bloat-o-meter has shown significant code bloat (the function is
inline).
This patch reduces the bloat at some expense of more complicated code.
The buddy-merging while-loop in __free_one_page() is initially bounded
to pageblock_border and without any migratetype checks. The checks are
placed outside, bumping the max_order if merging is allowed, and
returning to the while-loop with a statement which can't be possibly
considered harmful.
This fixes the accounting bug and also removes the arguably weird state
in the original commit
3c605096d315 where buddies could be left
unmerged.
Fixes: 3c605096d315 ("mm/page_alloc: restrict max order of merging on isolated pageblock")
Link: https://lkml.org/lkml/2016/3/2/280
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Hanjun Guo <guohanjun@huawei.com>
Tested-by: Hanjun Guo <guohanjun@huawei.com>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Debugged-by: Laura Abbott <labbott@redhat.com>
Debugged-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: <stable@vger.kernel.org> [3.18+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Fri, 25 Mar 2016 21:21:47 +0000 (14:21 -0700)]
drivers/memstick/host/r592.c: avoid gcc-6 warning
The r592 driver relies on behavior of the DMA mapping API that is
normally observed but not guaranteed by the API. Instead it uses a
runtime check to fail transfers if the API ever behaves
When CONFIG_NEED_SG_DMA_LENGTH is not set, one of the checks turns into a
comparison of a variable with itself, which gcc-6.0 now warns about:
drivers/memstick/host/r592.c: In function 'r592_transfer_fifo_dma':
drivers/memstick/host/r592.c:302:31: error: self-comparison always evaluates to false [-Werror=tautological-compare]
(sg_dma_len(&dev->req->sg) < dev->req->sg.length)) {
^
The check itself is not a problem, so this patch just rephrases the
condition in a way that gcc does not consider an indication of a mistake.
We already know that dev->req->sg.length was initially R592_LFIFO_SIZE, so
we can compare it to that constant again.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Maxim Levitsky <maximlevitsky@gmail.com>
Cc: Quentin Lambert <lambert.quentin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Fri, 25 Mar 2016 21:21:44 +0000 (14:21 -0700)]
ocfs2: extend enough credits for freeing one truncate record while replaying truncate records
Now function ocfs2_replay_truncate_records() first modifies tl_used,
then calls ocfs2_extend_trans() to extend transactions for gd and alloc
inode used for freeing clusters. jbd2_journal_restart() may be called
and it may happen that tl_used in truncate log is decreased but the
clusters are not freed, which means these clusters are lost. So we
should avoid extending transactions in these two operations.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Reviewed-by: Mark Fasheh <mfasheh@suse.de>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Xue jiufei [Fri, 25 Mar 2016 21:21:41 +0000 (14:21 -0700)]
ocfs2: extend transaction for ocfs2_remove_rightmost_path() and ocfs2_update_edge_lengths() before to avoid inconsistency between inode and et
I found that jbd2_journal_restart() is called in some places without
keeping things consistently before. However, jbd2_journal_restart() may
commit the handle's transaction and restart another one. If the first
transaction is committed successfully while another not, it may cause
filesystem inconsistency or read only. This is an effort to fix this
kind of problems.
This patch (of 3):
The following functions will be called while truncating an extent:
ocfs2_remove_btree_range
-> ocfs2_start_trans
-> ocfs2_remove_extent
-> ocfs2_truncate_rec
-> ocfs2_extend_rotate_transaction
-> jbd2_journal_restart if jbd2_journal_extend fail
-> ocfs2_rotate_tree_left
-> ocfs2_remove_rightmost_path
-> ocfs2_extend_rotate_transaction
-> ocfs2_unlink_subtree
-> ocfs2_update_edge_lengths
-> ocfs2_extend_trans
-> jbd2_journal_restart if jbd2_journal_extend fail
-> ocfs2_et_update_clusters
-> ocfs2_commit_trans
jbd2_journal_restart() may be called and it may happened that the buffers
dirtied in ocfs2_truncate_rec() are committed while buffers dirtied in
ocfs2_et_update_clusters() are not, the total clusters on extent tree and
i_clusters in ocfs2_dinode is inconsistency. So the clusters got from
ocfs2_dinode is incorrect, and it also cause read-only problem when call
ocfs2_commit_truncate() with the error message: "Inode %llu has empty
extent block at %llu".
We should extend enough credits for function ocfs2_remove_rightmost_path
and ocfs2_update_edge_lengths to avoid this inconsistency.
Signed-off-by: joyce.xue <xuejiufei@huawei.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
xuejiufei [Fri, 25 Mar 2016 21:21:38 +0000 (14:21 -0700)]
ocfs2/dlm: move lock to the tail of grant queue while doing in-place convert
We have found a bug when two nodes doing umount one after another.
1) Node 1 migrate a lockres that has 3 locks in grant queue such as
N2(PR)<->N3(NL)<->N4(PR) to N2. After migration, lvb of the lock
N3(NL) and N4(PR) are empty on node 2 because migration target do not
copy lvb to these two lock.
2) Node 3 want to convert to PR, it can be granted in
__dlmconvert_master(), and the order of these locks is unchanged. The
lvb of the lock N3(PR) on node 2 is copyed from lockres in function
dlm_update_lvb() while the lvb of lock N4(PR) is still empty.
3) Node 2 want to leave domain, it will migrate this lockres to node 3.
Then node 2 will trigger the BUG in dlm_prepare_lvb_for_migration()
when adding the lock N4(PR) to mres with the following message because
the lvb of mres is already copied from lock N3(PR), but the lvb of lock
N4(PR) is empty.
"Mismatched lvb in lock cookie=%u:%llu, name=%.*s, node=%u"
[akpm@linux-foundation.org: tweak comment]
Signed-off-by: xuejiufei <xuejiufei@huawei.com>
Acked-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jiangyiwen [Fri, 25 Mar 2016 21:21:35 +0000 (14:21 -0700)]
ocfs2: solve a problem of crossing the boundary in updating backups
In update_backups() there exists a problem of crossing the boundary as
follows:
we assume that lun will be resized to 1TB(cluster_size is 32kb), it will
include 0~
33554431 cluster, in update_backups func, it will backup super
block in location of 1TB which is the 33554432th cluster, so the
phenomenon of crossing the boundary happens.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Xue jiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
jiangyiwen [Fri, 25 Mar 2016 21:21:32 +0000 (14:21 -0700)]
ocfs2: fix occurring deadlock by changing ocfs2_wq from global to local
This patch fixes a deadlock, as follows:
Node 1 Node 2 Node 3
1)volume a and b are only mount vol a only mount vol b
mounted
2) start to mount b start to mount a
3) check hb of Node 3 check hb of Node 2
in vol a, qs_holds++ in vol b, qs_holds++
4) -------------------- all nodes' network down --------------------
5) progress of mount b the same situation as
failed, and then call Node 2
ocfs2_dismount_volume.
but the process is hung,
since there is a work
in ocfs2_wq cannot beo
completed. This work is
about vol a, because
ocfs2_wq is global wq.
BTW, this work which is
scheduled in ocfs2_wq is
ocfs2_orphan_scan_work,
and the context in this work
needs to take inode lock
of orphan_dir, because
lockres owner are Node 1 and
all nodes' nework has been down
at the same time, so it can't
get the inode lock.
6) Why can't this node be fenced
when network disconnected?
Because the process of
mount is hung what caused qs_holds
is not equal 0.
Because all works in the ocfs2_wq are relative to the super block.
The solution is to change the ocfs2_wq from global to local. In other
words, move it into struct ocfs2_super.
Signed-off-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Joseph Qi <joseph.qi@huawei.com>
Cc: Xue jiufei <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Fri, 25 Mar 2016 21:21:29 +0000 (14:21 -0700)]
ocfs2/dlm: fix BUG in dlm_move_lockres_to_recovery_list
When master handles convert request, it queues ast first and then
returns status. This may happen that the ast is sent before the request
status because the above two messages are sent by two threads. And
right after the ast is sent, if master down, it may trigger BUG in
dlm_move_lockres_to_recovery_list in the requested node because ast
handler moves it to grant list without clear lock->convert_pending. So
remove BUG_ON statement and check if the ast is processed in
dlmconvert_remote.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Joseph Qi [Fri, 25 Mar 2016 21:21:26 +0000 (14:21 -0700)]
ocfs2/dlm: fix race between convert and recovery
There is a race window between dlmconvert_remote and
dlm_move_lockres_to_recovery_list, which will cause a lock with
OCFS2_LOCK_BUSY in grant list, thus system hangs.
dlmconvert_remote
{
spin_lock(&res->spinlock);
list_move_tail(&lock->list, &res->converting);
lock->convert_pending = 1;
spin_unlock(&res->spinlock);
status = dlm_send_remote_convert_request();
>>>>>> race window, master has queued ast and return DLM_NORMAL,
and then down before sending ast.
this node detects master down and calls
dlm_move_lockres_to_recovery_list, which will revert the
lock to grant list.
Then OCFS2_LOCK_BUSY won't be cleared as new master won't
send ast any more because it thinks already be authorized.
spin_lock(&res->spinlock);
lock->convert_pending = 0;
if (status != DLM_NORMAL)
dlm_revert_pending_convert(res, lock);
spin_unlock(&res->spinlock);
}
In this case, check if res->state has DLM_LOCK_RES_RECOVERING bit set
(res is still in recovering) or res master changed (new master has
finished recovery), reset the status to DLM_RECOVERING, then it will
retry convert.
Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reported-by: Yiwen Jiang <jiangyiwen@huawei.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>