openwrt/staging/blogic.git
7 years agoInput: wdt87xx_i2c - use managed devm_device_add_group
Andi Shyti [Fri, 29 Sep 2017 23:40:25 +0000 (16:40 -0700)]
Input: wdt87xx_i2c - use managed devm_device_add_group

Commit 57b8ff070f98 ("driver core: add devm_device_add_group() and
friends") has added the managed version for creating sysfs group files.

Use devm_device_add_group instead of sysfs_create_group and remove the
relative sysfs_remove_group.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: rohm_bu21023 - use managed devm_device_add_group
Andi Shyti [Fri, 29 Sep 2017 23:39:34 +0000 (16:39 -0700)]
Input: rohm_bu21023 - use managed devm_device_add_group

Commit 57b8ff070f98 ("driver core: add devm_device_add_group() and
friends") has added the managed version for creating sysfs group files.

Use devm_device_add_group instead of sysfs_create_group and remove the
action that cleans the sysfs file when exiting the driver.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: raydium_i2c_ts - use managed devm_device_add_group
Andi Shyti [Fri, 29 Sep 2017 23:38:57 +0000 (16:38 -0700)]
Input: raydium_i2c_ts - use managed devm_device_add_group

Commit 57b8ff070f98 ("driver core: add devm_device_add_group() and
friends") has added the managed version for creating sysfs group files.

Use devm_device_add_group instead of sysfs_create_group and remove the
action that cleans the sysfs file when exiting the driver.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: melfas_mip4 - use managed devm_device_add_group
Andi Shyti [Fri, 29 Sep 2017 23:38:23 +0000 (16:38 -0700)]
Input: melfas_mip4 - use managed devm_device_add_group

Commit 57b8ff070f98 ("driver core: add devm_device_add_group() and
friends") has added the managed version for creating sysfs group files.

Use devm_device_add_group instead of sysfs_create_group and remove the
action that cleans the sysfs file when exiting the driver.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: elants_i2c - use managed devm_device_add_group
Andi Shyti [Fri, 29 Sep 2017 23:37:33 +0000 (16:37 -0700)]
Input: elants_i2c - use managed devm_device_add_group

Commit 57b8ff070f98 ("driver core: add devm_device_add_group() and
friends") has added the managed version for creating sysfs group files.

Use devm_device_add_group instead of sysfs_create_group and remove the
action that cleans the sysfs file when exiting the driver.

Signed-off-by: Andi Shyti <andi@etezian.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: elan_i2c - do not clobber interrupt trigger on x86
Dmitry Torokhov [Thu, 28 Sep 2017 16:57:34 +0000 (09:57 -0700)]
Input: elan_i2c - do not clobber interrupt trigger on x86

On x86 we historically used falling edge interrupts in the driver
because that's how first Chrome devices were configured. They also
did not use ACPI to enumerate I2C devices (because back then there
was no kernel support for that), so trigger was hard-coded in the
driver. However the controller behavior is much more reliable if
we use level triggers, and that is how we configured ARM devices,
and how want to configure newer x86 devices as well. All newer
x86 boxes have their I2C devices enumerated in ACPI.

Let's see if platform code (ACPI, DT) described interrupt and
specified particular trigger type, and if so, let's use it instead
of always clobbering trigger with IRQF_TRIGGER_FALLING. We will
still use this trigger type as a fallback if platform code left
interrupt trigger unconfigured.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=196761
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: usbtouchscreen - use EXPERT instead of EMBEDDED for EasyTouch
Randy Dunlap [Tue, 26 Sep 2017 18:09:34 +0000 (11:09 -0700)]
Input: usbtouchscreen - use EXPERT instead of EMBEDDED for EasyTouch

Change control of TOUCHSCREEN_USB_EASYTOUCH prompt string from
EMBEDDED to EXPERT to match the rest of this Kconfig file.

Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: sa1111ps2 - extend test delay
Russell King [Tue, 26 Sep 2017 16:57:12 +0000 (09:57 -0700)]
Input: sa1111ps2 - extend test delay

A 2us delay is too small for the bus to settle after writing to the
register.  Extend to 10us which gives more reliable results.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: sa1111ps2 - remove special sa1111 mmio accessors
Russell King [Tue, 26 Sep 2017 16:56:56 +0000 (09:56 -0700)]
Input: sa1111ps2 - remove special sa1111 mmio accessors

Remove the special SA1111 MMIO accessors from the SA1111 PS/2 driver
as their definition will be removed shortly.  The SA1111 accessors are
barrierless, so use the _relaxed variants.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: sa1111ps2 - use sa1111_get_irq() to obtain IRQ resources
Russell King [Tue, 26 Sep 2017 16:56:23 +0000 (09:56 -0700)]
Input: sa1111ps2 - use sa1111_get_irq() to obtain IRQ resources

Use the provided sa1111_get_irq() to fetch the IRQ resources for the
SA1111 PS/2 driver.

Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: stmfts - use devm_device_add_group
Andi Shyti [Fri, 22 Sep 2017 16:57:53 +0000 (09:57 -0700)]
Input: stmfts - use devm_device_add_group

instead of sysfs_create_group.

Signed-off-by: Andi Shyti <andi.shyti@samsung.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: elan_i2c - remove duplicate ELAN0605 id
Nik Nyby [Thu, 21 Sep 2017 23:34:20 +0000 (16:34 -0700)]
Input: elan_i2c - remove duplicate ELAN0605 id

ELAN0605 appears twice here.

Signed-off-by: Nik Nyby <nikolas@gnu.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoMerge tag 'ib-mfd-input-rtc-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git...
Dmitry Torokhov [Thu, 21 Sep 2017 23:41:15 +0000 (16:41 -0700)]
Merge tag 'ib-mfd-input-rtc-v4.14' of git://git./linux/kernel/git/lee/mfd into next

Merge "Immutable branch between MFD, Input and RTC due for the v3.14
merge window" to have dm355evm_msp.h header moved into right place.

7 years agoMerge tag 'ib-mfd-many-v4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/lee...
Dmitry Torokhov [Thu, 21 Sep 2017 23:38:09 +0000 (16:38 -0700)]
Merge tag 'ib-mfd-many-v4.14' of git://git./linux/kernel/git/lee/mfd into next

Merge "Immutable branch between MFD and many other subsystems due for
the v4.14 merge window" to get the TWL headers moved to the right place.

7 years agoInput: adxl34x - do not treat FIFO_MODE() as boolean
Arnd Bergmann [Wed, 20 Sep 2017 19:04:04 +0000 (12:04 -0700)]
Input: adxl34x - do not treat FIFO_MODE() as boolean

FIFO_MODE() is a macro expression with a '<<' operator, which gcc points
out could be misread as a '<':

drivers/input/misc/adxl34x.c: In function 'adxl34x_probe':
drivers/input/misc/adxl34x.c:799:36: error: '<<' in boolean context, did you mean '<' ? [-Werror=int-in-bool-context]

While utility of this warning is being disputed (Chief Penguin: "This
warning is clearly pure garbage.") FIFO_MODE() extracts range of values,
with 0 being FIFO_BYPASS, and not something that is logically boolean.

This converts the test to an explicit comparison with FIFO_BYPASS,
making it clearer to gcc and the reader what is intended.

Fixes: e27c729219ad ("Input: add driver for ADXL345/346 Digital Accelerometers")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: i8042 - add Gigabyte P57 to the keyboard reset table
Kai-Heng Feng [Fri, 15 Sep 2017 16:36:16 +0000 (09:36 -0700)]
Input: i8042 - add Gigabyte P57 to the keyboard reset table

Similar to other Gigabyte laptops, the touchpad on P57 requires a
keyboard reset to detect Elantech touchpad correctly.

BugLink: https://bugs.launchpad.net/bugs/1594214
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: xpad - validate USB endpoint type during probe
Cameron Gutman [Tue, 12 Sep 2017 18:27:44 +0000 (11:27 -0700)]
Input: xpad - validate USB endpoint type during probe

We should only see devices with interrupt endpoints. Ignore any other
endpoints that we find, so we don't send try to send them interrupt URBs
and trigger a WARN down in the USB stack.

Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Cc: <stable@vger.kernel.org> # c01b5e7464f0 Input: xpad - don't depend on endpoint order
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: ucb1400_ts - fix suspend and resume handling
Dmitry Torokhov [Fri, 1 Sep 2017 01:23:11 +0000 (18:23 -0700)]
Input: ucb1400_ts - fix suspend and resume handling

Instead of stopping the touchscreen we were starting it in suspend, and
disabling it in resume.

Fixes: c899afedf168 ("Input: ucb1400_ts - convert to threaded IRQ")
Reported-by: Anton Volkov <avolkov@ispras.ru>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: edt-ft5x06 - fix access to non-existing register
Luca Ceresoli [Thu, 7 Sep 2017 21:28:28 +0000 (14:28 -0700)]
Input: edt-ft5x06 - fix access to non-existing register

reg_addr->reg_report_rate is supposed to exist in M06, not M09.

The driver is written to skip avoids access to non-existing registers
when the register address is NO_REGISTER (0xff). But
reg_addr->reg_report_rate is initialized to 0x00 by devm_kzalloc() (in
edt_ft5x06_ts_probe()) and not changed thereafter. So the checks do
not work and an access to register 0x00 is done.

Fix by setting reg_addr->reg_report_rate to NO_REGISTER.

Also fix the only place where reg_report_rate is checked against zero
instead of NO_REGISTER.

Signed-off-by: Luca Ceresoli <luca@lucaceresoli.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: elantech - make arrays debounce_packet static, reduces object code size
Colin Ian King [Thu, 7 Sep 2017 21:27:26 +0000 (14:27 -0700)]
Input: elantech - make arrays debounce_packet static, reduces object code size

Don't populate the arrays debounce_packet on the stack, instead make
them static.  Makes the object code smaller by over 870 bytes:

Before:
   text    data     bss     dec     hex filename
  30553    9152       0   39705    9b19 drivers/input/mouse/elantech.o

After:
   text    data     bss     dec     hex filename
  29521    9312       0   38833    97b1 drivers/input/mouse/elantech.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: surface3_spi - make const array header static, reduces object code size
Colin Ian King [Thu, 7 Sep 2017 21:27:12 +0000 (14:27 -0700)]
Input: surface3_spi - make const array header static, reduces object code size

Don't populate the const array header on the stack, instead make it
static.  Makes the object code smaller by over 180 bytes:

Before:
   text    data     bss     dec     hex filename
   6003    1536       0    7539    1d73 surface3_spi.o

After:
   text    data     bss     dec     hex filename
   5726    1632       0    7358    1cbe surface3_spi.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: goodix - add support for capacitive home button
Sergei A. Trusov [Thu, 7 Sep 2017 00:29:24 +0000 (17:29 -0700)]
Input: goodix - add support for capacitive home button

On some x86 tablets with a Goodix touchscreen, the Windows logo on the
front is a capacitive home button. Touching this button results in a touch
with bit 4 of the first byte set, while only the lower 4 bits (0-3) are
used to indicate the number of touches.

Report a KEY_LEFTMETA press when this happens.

Note that the hardware might support more than one button, in which
case the "id" byte of coor_data would identify the button in question.
This is not implemented as we don't have access to hardware with
multiple buttons.

Signed-off-by: Sergei A. Trusov <sergei.a.trusov@ya.ru>
Acked-by: Bastien Nocera <hadess@hadess.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: add a driver for PWM controllable vibrators
Sebastian Reichel [Mon, 4 Sep 2017 16:30:03 +0000 (09:30 -0700)]
Input: add a driver for PWM controllable vibrators

Provide a simple driver for PWM controllable vibrators.
It will be used by Motorola Droid 4.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: adi - make array seq static, reduces object code size
Colin Ian King [Mon, 4 Sep 2017 16:17:39 +0000 (09:17 -0700)]
Input: adi - make array seq static, reduces object code size

Don't populate the array seq on the stack, instead make it static.
Makes the object code smaller by over 170 bytes:

Before:
   text    data     bss     dec     hex filename
  13227    3232       0   16459    404b drivers/input/joystick/adi.o

After:
   text    data     bss     dec     hex filename
  12957    3328       0   16285    3f9d drivers/input/joystick/adi.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agomfd: twl: Move header file out of I2C realm
Wolfram Sang [Mon, 14 Aug 2017 16:34:24 +0000 (18:34 +0200)]
mfd: twl: Move header file out of I2C realm

include/linux/i2c is not for client devices. Move the header file to a
more appropriate location.

Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Alexandre Belloni <alexandre.belloni@free-electrons.com>
Acked-by: Mark Brown <broonie@kernel.org>
Acked-by: Sebastian Reichel <sebastian.reichel@collabora.co.uk>
Acked-by: Jonathan Cameron <jic23@kernel.org>
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Acked-by: Kishon Vijay Abraham I <kishon@ti.com>
Acked-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Acked-by: Thierry Reding <thierry.reding@gmail.com>
Acked-by: Tony Lindgren <tony@atomide.com>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Linus Walleij <linus.walleij@linaro.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Lee Jones <lee.jones@linaro.org>
7 years agoLinux 4.13
Linus Torvalds [Sun, 3 Sep 2017 20:56:17 +0000 (13:56 -0700)]
Linux 4.13

7 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sun, 3 Sep 2017 16:50:26 +0000 (09:50 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS fixes from Ralf Baechle:
 "The two indirect syscall fixes have sat in linux-next for a few days.
  I did check back with a hardware designer to ensure a SYNC is really
  what's required for the GIC fix and so the GIC fix didn't make it into
  to linux-next in time for this final pull request.

  It builds in local build tests and passes Imagination's test system"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  irqchip: mips-gic: SYNC after enabling GIC region
  MIPS: Remove pt_regs adjustments in indirect syscall handler
  MIPS: seccomp: Fix indirect syscall args

7 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 3 Sep 2017 16:35:21 +0000 (09:35 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:

 - Expand the space for uncompressing as the LZ4 worst case does not fit
   into the currently reserved space

 - Validate boot parameters more strictly to prevent out of bound access
   in the decompressor/boot code

 - Fix off by one errors in get_segment_base()

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/boot: Prevent faulty bootparams.screeninfo from causing harm
  x86/boot: Provide more slack space during decompression
  x86/ldt: Fix off by one in get_segment_base()

7 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 3 Sep 2017 16:30:40 +0000 (09:30 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Thomas Gleixner:
 "A single fix for a thinko in the raw timekeeper update which causes
  clock MONOTONIC_RAW to run with erratically increased frequency"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  time: Fix ktime_get_raw() incorrect base accumulation

7 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 3 Sep 2017 16:23:23 +0000 (09:23 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Thomas Gleixner:

 - Prevent a potential inconistency in the perf user space access which
   might lead to evading sanity checks.

 - Prevent perf recording function trace entries twice

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/ftrace: Fix double traces of perf on ftrace:function
  perf/core: Fix potential double-fetch bug

7 years agoMerge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 2 Sep 2017 03:57:27 +0000 (20:57 -0700)]
Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs version warning fix from Steve French:
 "As requested, additional kernel warning messages to clarify the
  default dialect changes"

[ There is still some discussion about exactly which version should be
  the new default.  Longer-term we have auto-negotiation coming, but
  that's not there yet..  - Linus ]

* 'for-next' of git://git.samba.org/sfrench/cifs-2.6:
  Fix warning messages when mounting to older servers

7 years agoMerge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc
Linus Torvalds [Sat, 2 Sep 2017 00:16:40 +0000 (17:16 -0700)]
Merge tag 'armsoc-fixes' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "A couple of late-arriving fixes before final 4.13:

   - A few reverts of DT bindings on Allwinner for their ethernet
     driver. Discussion didn't converge, and since bindings are
     considered ABI it makes sense to revert instead of having to
     support two bindings long-term.

   - A fix to enumerate GPIOs properly on Marvell Armada AP806"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  arm64: dts: marvell: fix number of GPIOs in Armada AP806 description
  arm: dts: sunxi: Revert EMAC changes
  arm64: dts: allwinner: Revert EMAC changes
  dt-bindings: net: Revert sun8i dwmac binding

7 years agoMerge tag 'mvebu-fixes-4.13-3' of git://git.infradead.org/linux-mvebu into fixes
Olof Johansson [Fri, 1 Sep 2017 23:37:02 +0000 (16:37 -0700)]
Merge tag 'mvebu-fixes-4.13-3' of git://git.infradead.org/linux-mvebu into fixes

mvebu fixes for 4.13 (part 3)

Fix number of GPIOs in AP806 description for Armada 7K/8K

* tag 'mvebu-fixes-4.13-3' of git://git.infradead.org/linux-mvebu:
  arm64: dts: marvell: fix number of GPIOs in Armada AP806 description

Signed-off-by: Olof Johansson <olof@lixom.net>
7 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Fri, 1 Sep 2017 22:03:13 +0000 (15:03 -0700)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "The ismt driver had a problem with a rarely used transaction type and
  the designware driver was made even more robust against non standard
  ACPI tables"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: designware: Round down ACPI provided clk to nearest supported clk
  i2c: ismt: Return EMSGSIZE for block reads with bogus length
  i2c: ismt: Don't duplicate the receive length for block reads

7 years agoepoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove()
Oleg Nesterov [Fri, 1 Sep 2017 16:55:33 +0000 (18:55 +0200)]
epoll: fix race between ep_poll_callback(POLLFREE) and ep_free()/ep_remove()

The race was introduced by me in commit 971316f0503a ("epoll:
ep_unregister_pollwait() can use the freed pwq->whead").  I did not
realize that nothing can protect eventpoll after ep_poll_callback() sets
->whead = NULL, only whead->lock can save us from the race with
ep_free() or ep_remove().

Move ->whead = NULL to the end of ep_poll_callback() and add the
necessary barriers.

TODO: cleanup the ewake/EPOLLEXCLUSIVE logic, it was confusing even
before this patch.

Hopefully this explains use-after-free reported by syzcaller:

BUG: KASAN: use-after-free in debug_spin_lock_before
...
 _raw_spin_lock_irqsave+0x4a/0x60 kernel/locking/spinlock.c:159
 ep_poll_callback+0x29f/0xff0 fs/eventpoll.c:1148

this is spin_lock(eventpoll->lock),

...
Freed by task 17774:
...
 kfree+0xe8/0x2c0 mm/slub.c:3883
 ep_free+0x22c/0x2a0 fs/eventpoll.c:865

Fixes: 971316f0503a ("epoll: ep_unregister_pollwait() can use the freed pwq->whead")
Reported-by: 范龙飞 <long7573@126.com>
Cc: stable@vger.kernel.org
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Fri, 1 Sep 2017 19:49:03 +0000 (12:49 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix handling of pinned BPF map nodes in hash of maps, from Daniel
    Borkmann.

 2) IPSEC ESP error paths leak memory, from Steffen Klassert.

 3) We need an RCU grace period before freeing fib6_node objects, from
    Wei Wang.

 4) Must check skb_put_padto() return value in HSR driver, from FLorian
    Fainelli.

 5) Fix oops on PHY probe failure in ftgmac100 driver, from Andrew
    Jeffery.

 6) Fix infinite loop in UDP queue when using SO_PEEK_OFF, from Eric
    Dumazet.

 7) Use after free when tcf_chain_destroy() called multiple times, from
    Jiri Pirko.

 8) Fix KSZ DSA tag layer multiple free of SKBS, from Florian Fainelli.

 9) Fix leak of uninitialized memory in sctp_get_sctp_info(),
    inet_diag_msg_sctpladdrs_fill() and inet_diag_msg_sctpaddrs_fill().
    From Stefano Brivio.

10) L2TP tunnel refcount fixes from Guillaume Nault.

11) Don't leak UDP secpath in udp_set_dev_scratch(), from Yossi
    Kauperman.

12) Revert a PHY layer change wrt. handling of PHY_HALTED state in
    phy_stop_machine(), it causes regressions for multiple people. From
    Florian Fainelli.

13) When packets are sent out of br0 we have to clear the
    offload_fwdq_mark value.

14) Several NULL pointer deref fixes in packet schedulers when their
    ->init() routine fails. From Nikolay Aleksandrov.

15) Aquantium devices cannot checksum offload correctly when the packet
    is <= 60 bytes. From Pavel Belous.

16) Fix vnet header access past end of buffer in AF_PACKET, from
    Benjamin Poirier.

17) Double free in probe error paths of nfp driver, from Dan Carpenter.

18) QOS capability not checked properly in DCB init paths of mlx5
    driver, from Huy Nguyen.

19) Fix conflicts between firmware load failure and health_care timer in
    mlx5, also from Huy Nguyen.

20) Fix dangling page pointer when DMA mapping errors occur in mlx5,
    from Eran Ben ELisha.

21) ->ndo_setup_tc() in bnxt_en driver doesn't count rings properly,
    from Michael Chan.

22) Missing MSIX vector free in bnxt_en, also from Michael Chan.

23) Refcount leak in xfrm layer when using sk_policy, from Lorenzo
    Colitti.

24) Fix copy of uninitialized data in qlge driver, from Arnd Bergmann.

25) bpf_setsockopts() erroneously always returns -EINVAL even on
    success. Fix from Yuchung Cheng.

26) tipc_rcv() needs to linearize the SKB before parsing the inner
    headers, from Parthasarathy Bhuvaragan.

27) Fix deadlock between link status updates and link removal in netvsc
    driver, from Stephen Hemminger.

28) Missed locking of page fragment handling in ESP output, from Steffen
    Klassert.

29) Fix refcnt leak in ebpf congestion control code, from Sabrina
    Dubroca.

30) sxgbe_probe_config_dt() doesn't check devm_kzalloc()'s return value,
    from Christophe Jaillet.

31) Fix missing ipv6 rx_dst_cookie update when rx_dst is updated during
    early demux, from Paolo Abeni.

32) Several info leaks in xfrm_user layer, from Mathias Krause.

33) Fix out of bounds read in cxgb4 driver, from Stefano Brivio.

34) Properly propagate obsolete state of route upwards in ipv6 so that
    upper holders like xfrm can see it. From Xin Long.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (118 commits)
  udp: fix secpath leak
  bridge: switchdev: Clear forward mark when transmitting packet
  mlxsw: spectrum: Forbid linking to devices that have uppers
  wl1251: add a missing spin_lock_init()
  Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"
  net: dsa: bcm_sf2: Fix number of CFP entries for BCM7278
  kcm: do not attach PF_KCM sockets to avoid deadlock
  sch_tbf: fix two null pointer dereferences on init failure
  sch_sfq: fix null pointer dereference on init failure
  sch_netem: avoid null pointer deref on init failure
  sch_fq_codel: avoid double free on init failure
  sch_cbq: fix null pointer dereferences on init failure
  sch_hfsc: fix null pointer deref and double free on init failure
  sch_hhf: fix null pointer dereference on init failure
  sch_multiq: fix double free on init failure
  sch_htb: fix crash on init failure
  net/mlx5e: Fix CQ moderation mode not set properly
  net/mlx5e: Fix inline header size for small packets
  net/mlx5: E-Switch, Unload the representors in the correct order
  net/mlx5e: Properly resolve TC offloaded ipv6 vxlan tunnel source address
  ...

7 years agoMerge tag 'ceph-for-4.13-rc8' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 1 Sep 2017 19:46:30 +0000 (12:46 -0700)]
Merge tag 'ceph-for-4.13-rc8' of git://github.com/ceph/ceph-client

Pull ceph fix from Ilya Dryomov:
 "ceph fscache page locking fix from Zheng, marked for stable"

* tag 'ceph-for-4.13-rc8' of git://github.com/ceph/ceph-client:
  ceph: fix readpage from fscache

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Fri, 1 Sep 2017 17:43:37 +0000 (10:43 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input fixes from Dmitry Torokhov:
 "Just a couple drivers fixes (Synaptics PS/2, Xpad)"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: xpad - fix PowerA init quirk for some gamepad models
  Input: synaptics - fix device info appearing different on reconnect

7 years agoMerge tag 'mmc-v4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Fri, 1 Sep 2017 17:41:02 +0000 (10:41 -0700)]
Merge tag 'mmc-v4.13-rc7' of git://git./linux/kernel/git/ulfh/mmc

Pull two more MMC fixes from Ulf Hansson:
 "MMC core:
   - Fix block status codes

  MMC host:
   - sdhci-xenon: Fix SD bus voltage select"

* tag 'mmc-v4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: sdhci-xenon: add set_power callback
  mmc: block: Fix block status codes

7 years agoMerge tag 'sound-4.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai...
Linus Torvalds [Fri, 1 Sep 2017 17:38:00 +0000 (10:38 -0700)]
Merge tag 'sound-4.13-rc8' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "Three regression fixes that should be addressed before the final
  release: a missing mutex call in OSS PCM emulation ioctl, ASoC rt5670
  headset detection breakage, and a regression in simple-card parser
  code"

* tag 'sound-4.13-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ASoC: simple_card_utils: fix fallback when "label" property isn't present
  ALSA: pcm: Fix power lock unbalance via OSS emulation
  ASoC: rt5670: Fix GPIO headset detection regression

7 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Fri, 1 Sep 2017 17:36:22 +0000 (10:36 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Martin Schwidefsky:
 "Three more bug fixes for v4.13.

  The two memory management related fixes are quite new, they fix kernel
  crashes that can be triggered by user space.

  The third commit fixes a bug in the vfio ccw translation code"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/mm: fix BUG_ON in crst_table_upgrade
  s390/mm: fork vs. 5 level page tabel
  vfio: ccw: fix bad ptr math for TIC cda translation

7 years agoMerge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Fri, 1 Sep 2017 17:30:03 +0000 (10:30 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - Regression in chacha20 handling of chunked input

   - Crash in algif_skcipher when used with async io

   - Potential bogus pointer dereference in lib/mpi"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: algif_skcipher - only call put_page on referenced and used pages
  crypto: testmgr - add chunked test cases for chacha20
  crypto: chacha20 - fix handling of chunked input
  lib/mpi: kunmap after finishing accessing buffer

7 years agoudp: fix secpath leak
Yossi Kuperman [Fri, 1 Sep 2017 12:42:30 +0000 (14:42 +0200)]
udp: fix secpath leak

After commit dce4551cb2ad ("udp: preserve head state for IP_CMSG_PASSSEC")
we preserve the secpath for the whole skb lifecycle, but we also
end up leaking a reference to it.

We must clear the head state on skb reception, if secpath is
present.

Fixes: dce4551cb2ad ("udp: preserve head state for IP_CMSG_PASSSEC")
Signed-off-by: Yossi Kuperman <yossiku@mellanox.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agobridge: switchdev: Clear forward mark when transmitting packet
Ido Schimmel [Fri, 1 Sep 2017 09:22:25 +0000 (12:22 +0300)]
bridge: switchdev: Clear forward mark when transmitting packet

Commit 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for
stacked devices") added the 'offload_fwd_mark' bit to the skb in order
to allow drivers to indicate to the bridge driver that they already
forwarded the packet in L2.

In case the bit is set, before transmitting the packet from each port,
the port's mark is compared with the mark stored in the skb's control
block. If both marks are equal, we know the packet arrived from a switch
device that already forwarded the packet and it's not re-transmitted.

However, if the packet is transmitted from the bridge device itself
(e.g., br0), we should clear the 'offload_fwd_mark' bit as the mark
stored in the skb's control block isn't valid.

This scenario can happen in rare cases where a packet was trapped during
L3 forwarding and forwarded by the kernel to a bridge device.

Fixes: 6bc506b4fb06 ("bridge: switchdev: Add forward mark support for stacked devices")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Yotam Gigi <yotamg@mellanox.com>
Tested-by: Yotam Gigi <yotamg@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agomlxsw: spectrum: Forbid linking to devices that have uppers
Ido Schimmel [Fri, 1 Sep 2017 08:52:31 +0000 (10:52 +0200)]
mlxsw: spectrum: Forbid linking to devices that have uppers

The mlxsw driver relies on NETDEV_CHANGEUPPER events to configure the
device in case a port is enslaved to a master netdev such as bridge or
bond.

Since the driver ignores events unrelated to its ports and their
uppers, it's possible to engineer situations in which the device's data
path differs from the kernel's.

One example to such a situation is when a port is enslaved to a bond
that is already enslaved to a bridge. When the bond was enslaved the
driver ignored the event - as the bond wasn't one of its uppers - and
therefore a bridge port instance isn't created in the device.

Until such configurations are supported forbid them by checking that the
upper device doesn't have uppers of its own.

Fixes: 0d65fc13042f ("mlxsw: spectrum: Implement LAG port join/leave")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Nogah Frankel <nogahf@mellanox.com>
Tested-by: Nogah Frankel <nogahf@mellanox.com>
Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoFix warning messages when mounting to older servers
Steve French [Fri, 1 Sep 2017 02:34:24 +0000 (21:34 -0500)]
Fix warning messages when mounting to older servers

When mounting to older servers, such as Windows XP (or even Windows 7),
the limited error messages that can be passed back to user space can
get confusing since the default dialect has changed from SMB1 (CIFS) to
more secure SMB3 dialect. Log additional information when the user chooses
to use the default dialects and when the server does not support the
dialect requested.

Signed-off-by: Steve French <smfrench@gmail.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
7 years agoMerge tag 'cifs-fixes-for-4.13-rc7-and-stable' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Fri, 1 Sep 2017 01:45:04 +0000 (18:45 -0700)]
Merge tag 'cifs-fixes-for-4.13-rc7-and-stable' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Two cifs bug fixes for stable"

* tag 'cifs-fixes-for-4.13-rc7-and-stable' of git://git.samba.org/sfrench/cifs-2.6:
  CIFS: remove endian related sparse warning
  CIFS: Fix maximum SMB2 header size

7 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Fri, 1 Sep 2017 01:42:21 +0000 (18:42 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Unfortunately a few issues that warrant sending another pull request,
  even if I had hoped to avoid it. This contains:

   - A fix for multiqueue xen-blkback, on tear down / disconnect.

   - A few fixups for NVMe, including a wrong bit definition, fix for
     host memory buffers, and an nvme rdma page size fix"

* 'for-linus' of git://git.kernel.dk/linux-block:
  nvme: fix the definition of the doorbell buffer config support bit
  nvme-pci: use dma memory for the host memory buffer descriptors
  nvme-rdma: default MR page size to 4k
  xen-blkback: stop blkback thread of every queue in xen_blkif_disconnect

7 years agoMerge tag 'for-4.13/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 1 Sep 2017 01:39:19 +0000 (18:39 -0700)]
Merge tag 'for-4.13/dm-fixes-2' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - A couple fixes for bugs introduced as part of the blk_status_t block
   layer changes during the 4.13 merge window

 - A printk throttling fix to use discrete rate limiting state for each
   DM log level

 - A stable@ fix for DM multipath that delays request requeueing to
   avoid CPU lockup if/when the request queue is "dying"

* tag 'for-4.13/dm-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm mpath: do not lock up a CPU with requeuing activity
  dm: fix printk() rate limiting code
  dm mpath: retry BLK_STS_RESOURCE errors
  dm: fix the second dec_pending() argument in __split_and_process_bio()

7 years agoInput: byd - make array seq static, reduces object code size
Colin Ian King [Thu, 31 Aug 2017 16:30:44 +0000 (09:30 -0700)]
Input: byd - make array seq static, reduces object code size

Don't populate the array seq on the stack, instead make it static.
Makes the object code smaller by over 1100 bytes:

Before:
   text    data     bss     dec     hex filename
   6152    1216      64    7432    1d08 drivers/input/mouse/byd.o

After:
   text    data     bss     dec     hex filename
   4974    1280      64    6318    18ae drivers/input/mouse/byd.o

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 1 Sep 2017 00:56:56 +0000 (17:56 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge more fixes from Andrew Morton:
 "6 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  scripts/dtc: fix '%zx' warning
  include/linux/compiler.h: don't perform compiletime_assert with -O0
  mm, madvise: ensure poisoned pages are removed from per-cpu lists
  mm, uprobes: fix multiple free of ->uprobes_state.xol_area
  kernel/kthread.c: kthread_worker: don't hog the cpu
  mm,page_alloc: don't call __node_reclaim() with oom_lock held.

7 years agoMerge branch 'mmu_notifier_fixes'
Linus Torvalds [Fri, 1 Sep 2017 00:30:01 +0000 (17:30 -0700)]
Merge branch 'mmu_notifier_fixes'

Merge mmu_notifier fixes from Jérôme Glisse:
 "The invalidate_page callback suffered from 2 pitfalls. First it used
  to happen after page table lock was release and thus a new page might
  have been setup for the virtual address before the call to
  invalidate_page().

  This is in a weird way fixed by commit c7ab0d2fdc84 ("mm: convert
  try_to_unmap_one() to use page_vma_mapped_walk()") which moved the
  callback under the page table lock. Which also broke several existing
  user of the mmu_notifier API that assumed they could sleep inside this
  callback.

  The second pitfall was invalidate_page being the only callback not
  taking a range of address in respect to invalidation but was giving an
  address and a page. Lot of the callback implementer assumed this could
  never be THP and thus failed to invalidate the appropriate range for
  THP pages.

  By killing this callback we unify the mmu_notifier callback API to
  always take a virtual address range as input.

  There is now two clear API (I am not mentioning the youngess API which
  is seldomly used):

   - invalidate_range_start()/end() callback (which allow you to sleep)

   - invalidate_range() where you can not sleep but happen right after
     page table update under page table lock

  Note that a lot of existing user feels broken in respect to
  range_start/ range_end. Many user only have range_start() callback but
  there is nothing preventing them to undo what was invalidated in their
  range_start() callback after it returns but before any CPU page table
  update take place.

  The code pattern use in kvm or umem odp is an example on how to
  properly avoid such race. In a nutshell use some kind of sequence
  number and active range invalidation counter to block anything that
  might undo what the range_start() callback did.

  If you do not care about keeping fully in sync with CPU page table (ie
  you can live with CPU page table pointing to new different page for a
  given virtual address) then you can take a reference on the pages
  inside the range_start callback and drop it in range_end or when your
  driver is done with those pages.

  Last alternative is to use invalidate_range() if you can do
  invalidation without sleeping as invalidate_range() callback happens
  under the CPU page table spinlock right after the page table is
  updated.

  The first two patches convert existing mmu_notifier_invalidate_page()
  calls to mmu_notifier_invalidate_range() and bracket those call with
  call to mmu_notifier_invalidate_range_start()/end().

  The next ten patches remove existing invalidate_page() callback as it
  can no longer happen.

  Finally the last page remove the invalidate_page() callback completely
  so it can RIP.

  Changes since v1:
   - remove more dead code in kvm (no testing impact)
   - more accurate end address computation (patch 2) in page_mkclean_one
     and try_to_unmap_one
   - added tested-by/reviewed-by gotten so far"

* emailed patches from Jérôme Glisse <jglisse@redhat.com>:
  mm/mmu_notifier: kill invalidate_page
  KVM: update to new mmu_notifier semantic v2
  xen/gntdev: update to new mmu_notifier semantic
  sgi-gru: update to new mmu_notifier semantic
  misc/mic/scif: update to new mmu_notifier semantic
  iommu/intel: update to new mmu_notifier semantic
  iommu/amd: update to new mmu_notifier semantic
  IB/hfi1: update to new mmu_notifier semantic
  IB/umem: update to new mmu_notifier semantic
  drm/amdgpu: update to new mmu_notifier semantic
  powerpc/powernv: update to new mmu_notifier semantic
  mm/rmap: update to new mmu_notifier semantic v2
  dax: update to new mmu_notifier semantic

7 years agojfs should use MAX_LFS_FILESIZE when calculating s_maxbytes
Dave Kleikamp [Thu, 31 Aug 2017 21:46:59 +0000 (16:46 -0500)]
jfs should use MAX_LFS_FILESIZE when calculating s_maxbytes

jfs had previously avoided the use of MAX_LFS_FILESIZE because it hadn't
accounted for the whole 32-bit index range on 32-bit systems.  That has
been fixed by commit 0cc3b0ec23ce ("Clarify (and fix) MAX_LFS_FILESIZE
macros"), so we can simplify the code now.

Suggested by Andreas Dilger.

Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Cc: jfs-discussion@lists.sourceforge.net
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoscripts/dtc: fix '%zx' warning
Russell King [Thu, 31 Aug 2017 23:15:36 +0000 (16:15 -0700)]
scripts/dtc: fix '%zx' warning

dtc uses an incorrect format specifier for printing a uint64_t value.
uint64_t may be either 'unsigned long' or 'unsigned long long' depending
on the host architecture.

Fix this by using %llx and casting to unsigned long long, which ensures
that we always have a wide enough variable to print 64 bits of hex.

    HOSTCC  scripts/dtc/checks.o
  scripts/dtc/checks.c: In function 'check_simple_bus_reg':
  scripts/dtc/checks.c:876:2: warning: format '%zx' expects argument of type 'size_t', but argument 4 has type 'uint64_t' [-Wformat=]
    snprintf(unit_addr, sizeof(unit_addr), "%zx", reg);
    ^
  scripts/dtc/checks.c:876:2: warning: format '%zx' expects argument of type 'size_t', but argument 4 has type 'uint64_t' [-Wformat=]

Link: http://lkml.kernel.org/r/20170829222034.GJ20805@n2100.armlinux.org.uk
Fixes: 828d4cdd012c ("dtc: check.c fix compile error")
Signed-off-by: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Cc: Shuah Khan <shuahkh@osg.samsung.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoinclude/linux/compiler.h: don't perform compiletime_assert with -O0
Joe Stringer [Thu, 31 Aug 2017 23:15:33 +0000 (16:15 -0700)]
include/linux/compiler.h: don't perform compiletime_assert with -O0

Commit c7acec713d14 ("kernel.h: handle pointers to arrays better in
container_of()") made use of __compiletime_assert() from container_of()
thus increasing the usage of this macro, allowing developers to notice
type conflicts in usage of container_of() at compile time.

However, the implementation of __compiletime_assert relies on compiler
optimizations to report an error.  This means that if a developer uses
"-O0" with any code that performs container_of(), the compiler will always
report an error regardless of whether there is an actual problem in the
code.

This patch disables compile_time_assert when optimizations are disabled to
allow such code to compile with CFLAGS="-O0".

Example compilation failure:

./include/linux/compiler.h:547:38: error: call to `__compiletime_assert_94' declared with attribute error: pointer type mismatch in container_of()
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
                                      ^
./include/linux/compiler.h:530:4: note: in definition of macro `__compiletime_assert'
    prefix ## suffix();    \
    ^~~~~~
./include/linux/compiler.h:547:2: note: in expansion of macro `_compiletime_assert'
  _compiletime_assert(condition, msg, __compiletime_assert_, __LINE__)
  ^~~~~~~~~~~~~~~~~~~
./include/linux/build_bug.h:46:37: note: in expansion of macro `compiletime_assert'
 #define BUILD_BUG_ON_MSG(cond, msg) compiletime_assert(!(cond), msg)
                                     ^~~~~~~~~~~~~~~~~~
./include/linux/kernel.h:860:2: note: in expansion of macro `BUILD_BUG_ON_MSG'
  BUILD_BUG_ON_MSG(!__same_type(*(ptr), ((type *)0)->member) && \
  ^~~~~~~~~~~~~~~~

[akpm@linux-foundation.org: use do{}while(0), per Michal]
Link: http://lkml.kernel.org/r/20170829230114.11662-1-joe@ovn.org
Fixes: c7acec713d14c6c ("kernel.h: handle pointers to arrays better in container_of()")
Signed-off-by: Joe Stringer <joe@ovn.org>
Cc: Ian Abbott <abbotti@mev.co.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm, madvise: ensure poisoned pages are removed from per-cpu lists
Mel Gorman [Thu, 31 Aug 2017 23:15:30 +0000 (16:15 -0700)]
mm, madvise: ensure poisoned pages are removed from per-cpu lists

Wendy Wang reported off-list that a RAS HWPOISON-SOFT test case failed
and bisected it to the commit 479f854a207c ("mm, page_alloc: defer
debugging checks of pages allocated from the PCP").

The problem is that a page that was poisoned with madvise() is reused.
The commit removed a check that would trigger if DEBUG_VM was enabled
but re-enabling the check only fixes the problem as a side-effect by
printing a bad_page warning and recovering.

The root of the problem is that an madvise() can leave a poisoned page
on the per-cpu list.  This patch drains all per-cpu lists after pages
are poisoned so that they will not be reused.  Wendy reports that the
test case in question passes with this patch applied.  While this could
be done in a targeted fashion, it is over-complicated for such a rare
operation.

Link: http://lkml.kernel.org/r/20170828133414.7qro57jbepdcyz5x@techsingularity.net
Fixes: 479f854a207c ("mm, page_alloc: defer debugging checks of pages allocated from the PCP")
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
Reported-by: Wang, Wendy <wendy.wang@intel.com>
Tested-by: Wang, Wendy <wendy.wang@intel.com>
Acked-by: David Rientjes <rientjes@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: "Hansen, Dave" <dave.hansen@intel.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm, uprobes: fix multiple free of ->uprobes_state.xol_area
Eric Biggers [Thu, 31 Aug 2017 23:15:26 +0000 (16:15 -0700)]
mm, uprobes: fix multiple free of ->uprobes_state.xol_area

Commit 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for
write killable") made it possible to kill a forking task while it is
waiting to acquire its ->mmap_sem for write, in dup_mmap().

However, it was overlooked that this introduced an new error path before
the new mm_struct's ->uprobes_state.xol_area has been set to NULL after
being copied from the old mm_struct by the memcpy in dup_mm().  For a
task that has previously hit a uprobe tracepoint, this resulted in the
'struct xol_area' being freed multiple times if the task was killed at
just the right time while forking.

Fix it by setting ->uprobes_state.xol_area to NULL in mm_init() rather
than in uprobe_dup_mmap().

With CONFIG_UPROBE_EVENTS=y, the bug can be reproduced by the same C
program given by commit 2b7e8665b4ff ("fork: fix incorrect fput of
->exe_file causing use-after-free"), provided that a uprobe tracepoint
has been set on the fork_thread() function.  For example:

    $ gcc reproducer.c -o reproducer -lpthread
    $ nm reproducer | grep fork_thread
    0000000000400719 t fork_thread
    $ echo "p $PWD/reproducer:0x719" > /sys/kernel/debug/tracing/uprobe_events
    $ echo 1 > /sys/kernel/debug/tracing/events/uprobes/enable
    $ ./reproducer

Here is the use-after-free reported by KASAN:

    BUG: KASAN: use-after-free in uprobe_clear_state+0x1c4/0x200
    Read of size 8 at addr ffff8800320a8b88 by task reproducer/198

    CPU: 1 PID: 198 Comm: reproducer Not tainted 4.13.0-rc7-00015-g36fde05f3fb5 #255
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-20170228_101828-anatol 04/01/2014
    Call Trace:
     dump_stack+0xdb/0x185
     print_address_description+0x7e/0x290
     kasan_report+0x23b/0x350
     __asan_report_load8_noabort+0x19/0x20
     uprobe_clear_state+0x1c4/0x200
     mmput+0xd6/0x360
     do_exit+0x740/0x1670
     do_group_exit+0x13f/0x380
     get_signal+0x597/0x17d0
     do_signal+0x99/0x1df0
     exit_to_usermode_loop+0x166/0x1e0
     syscall_return_slowpath+0x258/0x2c0
     entry_SYSCALL_64_fastpath+0xbc/0xbe

    ...

    Allocated by task 199:
     save_stack_trace+0x1b/0x20
     kasan_kmalloc+0xfc/0x180
     kmem_cache_alloc_trace+0xf3/0x330
     __create_xol_area+0x10f/0x780
     uprobe_notify_resume+0x1674/0x2210
     exit_to_usermode_loop+0x150/0x1e0
     prepare_exit_to_usermode+0x14b/0x180
     retint_user+0x8/0x20

    Freed by task 199:
     save_stack_trace+0x1b/0x20
     kasan_slab_free+0xa8/0x1a0
     kfree+0xba/0x210
     uprobe_clear_state+0x151/0x200
     mmput+0xd6/0x360
     copy_process.part.8+0x605f/0x65d0
     _do_fork+0x1a5/0xbd0
     SyS_clone+0x19/0x20
     do_syscall_64+0x22f/0x660
     return_from_SYSCALL_64+0x0/0x7a

Note: without KASAN, you may instead see a "Bad page state" message, or
simply a general protection fault.

Link: http://lkml.kernel.org/r/20170830033303.17927-1-ebiggers3@gmail.com
Fixes: 7c051267931a ("mm, fork: make dup_mmap wait for mmap_sem for write killable")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org> [4.7+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agokernel/kthread.c: kthread_worker: don't hog the cpu
Shaohua Li [Thu, 31 Aug 2017 23:15:23 +0000 (16:15 -0700)]
kernel/kthread.c: kthread_worker: don't hog the cpu

If the worker thread continues getting work, it will hog the cpu and rcu
stall complains.  Make it a good citizen.  This is triggered in a loop
block device test.

Link: http://lkml.kernel.org/r/5de0a179b3184e1a2183fc503448b0269f24d75b.1503697127.git.shli@fb.com
Signed-off-by: Shaohua Li <shli@fb.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm,page_alloc: don't call __node_reclaim() with oom_lock held.
Tetsuo Handa [Thu, 31 Aug 2017 23:15:20 +0000 (16:15 -0700)]
mm,page_alloc: don't call __node_reclaim() with oom_lock held.

We are doing a last second memory allocation attempt before calling
out_of_memory().  But since slab shrinker functions might indirectly
wait for other thread's __GFP_DIRECT_RECLAIM && !__GFP_NORETRY memory
allocations via sleeping locks, calling slab shrinker functions from
node_reclaim() from get_page_from_freelist() with oom_lock held has
possibility of deadlock.  Therefore, make sure that last second memory
allocation attempt does not call slab shrinker functions.

Link: http://lkml.kernel.org/r/1503577106-9196-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm/mmu_notifier: kill invalidate_page
Jérôme Glisse [Thu, 31 Aug 2017 21:17:38 +0000 (17:17 -0400)]
mm/mmu_notifier: kill invalidate_page

The invalidate_page callback suffered from two pitfalls.  First it used
to happen after the page table lock was release and thus a new page
might have setup before the call to invalidate_page() happened.

This is in a weird way fixed by commit c7ab0d2fdc84 ("mm: convert
try_to_unmap_one() to use page_vma_mapped_walk()") that moved the
callback under the page table lock but this also broke several existing
users of the mmu_notifier API that assumed they could sleep inside this
callback.

The second pitfall was invalidate_page() being the only callback not
taking a range of address in respect to invalidation but was giving an
address and a page.  Lots of the callback implementers assumed this
could never be THP and thus failed to invalidate the appropriate range
for THP.

By killing this callback we unify the mmu_notifier callback API to
always take a virtual address range as input.

Finally this also simplifies the end user life as there is now two clear
choices:
  - invalidate_range_start()/end() callback (which allow you to sleep)
  - invalidate_range() where you can not sleep but happen right after
    page table update under page table lock

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoKVM: update to new mmu_notifier semantic v2
Jérôme Glisse [Thu, 31 Aug 2017 21:17:37 +0000 (17:17 -0400)]
KVM: update to new mmu_notifier semantic v2

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Changed since v1 (Linus Torvalds)
    - remove now useless kvm_arch_mmu_notifier_invalidate_page()

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Tested-by: Mike Galbraith <efault@gmx.de>
Tested-by: Adam Borowski <kilobyte@angband.pl>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: kvm@vger.kernel.org
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoxen/gntdev: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:36 +0000 (17:17 -0400)]
xen/gntdev: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Roger Pau Monné <roger.pau@citrix.com>
Cc: xen-devel@lists.xenproject.org (moderated for non-subscribers)
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agosgi-gru: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:35 +0000 (17:17 -0400)]
sgi-gru: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Cc: Jack Steiner <steiner@sgi.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomisc/mic/scif: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:34 +0000 (17:17 -0400)]
misc/mic/scif: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Sudeep Dutt <sudeep.dutt@intel.com>
Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoiommu/intel: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:33 +0000 (17:17 -0400)]
iommu/intel: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: iommu@lists.linux-foundation.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoiommu/amd: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:32 +0000 (17:17 -0400)]
iommu/amd: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: iommu@lists.linux-foundation.org
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoIB/hfi1: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:31 +0000 (17:17 -0400)]
IB/hfi1: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: linux-rdma@vger.kernel.org
Cc: Dean Luick <dean.luick@intel.com>
Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoIB/umem: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:30 +0000 (17:17 -0400)]
IB/umem: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Tested-by: Leon Romanovsky <leonro@mellanox.com>
Cc: linux-rdma@vger.kernel.org
Cc: Artemy Kovalyov <artemyko@mellanox.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agodrm/amdgpu: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:29 +0000 (17:17 -0400)]
drm/amdgpu: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and are now bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: Felix Kuehling <Felix.Kuehling@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agopowerpc/powernv: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:28 +0000 (17:17 -0400)]
powerpc/powernv: update to new mmu_notifier semantic

Calls to mmu_notifier_invalidate_page() were replaced by calls to
mmu_notifier_invalidate_range() and now are bracketed by calls to
mmu_notifier_invalidate_range_start()/end()

Remove now useless invalidate_page callback.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Alistair Popple <alistair@popple.id.au>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agomm/rmap: update to new mmu_notifier semantic v2
Jérôme Glisse [Thu, 31 Aug 2017 21:17:27 +0000 (17:17 -0400)]
mm/rmap: update to new mmu_notifier semantic v2

Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range()
and make sure it is bracketed by calls to *_invalidate_range_start()/end().

Note that because we can not presume the pmd value or pte value we have
to assume the worst and unconditionaly report an invalidation as
happening.

Changed since v2:
  - try_to_unmap_one() only one call to mmu_notifier_invalidate_range()
  - compute end with PAGE_SIZE << compound_order(page)
  - fix PageHuge() case in try_to_unmap_one()

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agodax: update to new mmu_notifier semantic
Jérôme Glisse [Thu, 31 Aug 2017 21:17:26 +0000 (17:17 -0400)]
dax: update to new mmu_notifier semantic

Replace all mmu_notifier_invalidate_page() calls by *_invalidate_range()
and make sure it is bracketed by calls to *_invalidate_range_start()/end().

Note that because we can not presume the pmd value or pte value we have
to assume the worst and unconditionaly report an invalidation as
happening.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
7 years agoceph: fix readpage from fscache
Yan, Zheng [Fri, 4 Aug 2017 03:22:31 +0000 (11:22 +0800)]
ceph: fix readpage from fscache

ceph_readpage() unlocks page prematurely prematurely in the case
that page is reading from fscache. Caller of readpage expects that
page is uptodate when it get unlocked. So page shoule get locked
by completion callback of fscache_read_or_alloc_pages()

Cc: stable@vger.kernel.org # 4.1+, needs backporting for < 4.7
Signed-off-by: "Yan, Zheng" <zyan@redhat.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
7 years agoInput: xilinx_ps2 - fix multiline comment style
Michal Simek [Thu, 31 Aug 2017 21:55:45 +0000 (14:55 -0700)]
Input: xilinx_ps2 - fix multiline comment style

Fix multiline comments style not to be reported by checkpatch.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agowl1251: add a missing spin_lock_init()
Cong Wang [Thu, 31 Aug 2017 14:47:43 +0000 (16:47 +0200)]
wl1251: add a missing spin_lock_init()

wl1251: add a missing spin_lock_init()

This fixes the following kernel warning:

 [ 5668.771453] BUG: spinlock bad magic on CPU#0, kworker/u2:3/9745
 [ 5668.771850]  lock: 0xce63ef20, .magic: 00000000, .owner: <none>/-1,
 .owner_cpu: 0
 [ 5668.772277] CPU: 0 PID: 9745 Comm: kworker/u2:3 Tainted: G        W
 4.12.0-03002-gec979a4-dirty #40
 [ 5668.772796] Hardware name: Nokia RX-51 board
 [ 5668.773071] Workqueue: phy1 wl1251_irq_work
 [ 5668.773345] [<c010c9e4>] (unwind_backtrace) from [<c010a274>]
 (show_stack+0x10/0x14)
 [ 5668.773803] [<c010a274>] (show_stack) from [<c01545a4>]
 (do_raw_spin_lock+0x6c/0xa0)
 [ 5668.774230] [<c01545a4>] (do_raw_spin_lock) from [<c06ca578>]
 (_raw_spin_lock_irqsave+0x10/0x18)
 [ 5668.774658] [<c06ca578>] (_raw_spin_lock_irqsave) from [<c048c010>]
 (wl1251_op_tx+0x38/0x5c)
 [ 5668.775115] [<c048c010>] (wl1251_op_tx) from [<c06a12e8>]
 (ieee80211_tx_frags+0x188/0x1c0)
 [ 5668.775543] [<c06a12e8>] (ieee80211_tx_frags) from [<c06a138c>]
 (__ieee80211_tx+0x6c/0x130)
 [ 5668.775970] [<c06a138c>] (__ieee80211_tx) from [<c06a3dbc>]
 (ieee80211_tx+0xdc/0x104)
 [ 5668.776367] [<c06a3dbc>] (ieee80211_tx) from [<c06a4af0>]
 (__ieee80211_subif_start_xmit+0x454/0x8c8)
 [ 5668.776824] [<c06a4af0>] (__ieee80211_subif_start_xmit) from
 [<c06a4f94>] (ieee80211_subif_start_xmit+0x30/0x2fc)
 [ 5668.777343] [<c06a4f94>] (ieee80211_subif_start_xmit) from
 [<c0578848>] (dev_hard_start_xmit+0x80/0x118)
...

    by adding the missing spin_lock_init().

Reported-by: Pavel Machek <pavel@ucw.cz>
Cc: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Cc: stable@kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoInput: pxa27x_keypad - handle return value of clk_prepare_enable
Arvind Yadav [Thu, 31 Aug 2017 18:39:13 +0000 (11:39 -0700)]
Input: pxa27x_keypad - handle return value of clk_prepare_enable

clk_prepare_enable() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: tegra-kbc - handle return value of clk_prepare_enable
Arvind Yadav [Thu, 31 Aug 2017 18:35:29 +0000 (11:35 -0700)]
Input: tegra-kbc - handle return value of clk_prepare_enable

clk_prepare_enable() can fail here and we must check its return value.

Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoInput: xpad - fix PowerA init quirk for some gamepad models
Cameron Gutman [Thu, 31 Aug 2017 18:52:20 +0000 (11:52 -0700)]
Input: xpad - fix PowerA init quirk for some gamepad models

The PowerA gamepad initialization quirk worked with the PowerA
wired gamepad I had around (0x24c6:0x543a), but a user reported [0]
that it didn't work for him, even though our gamepads shared the
same vendor and product IDs.

When I initially implemented the PowerA quirk, I wanted to avoid
actually triggering the rumble action during init. My tests showed
that my gamepad would work correctly even if it received a rumble
of 0 intensity, so that's what I went with.

Unfortunately, this apparently isn't true for all models (perhaps
a firmware difference?). This non-working gamepad seems to require
the real magic rumble packet that the Microsoft driver sends, which
actually vibrates the gamepad. To counteract this effect, I still
send the old zero-rumble PowerA quirk packet which cancels the
rumble effect before the motors can spin up enough to vibrate.

[0]: https://github.com/paroj/xpad/issues/48#issuecomment-313904867

Reported-by: Kyle Beauchamp <kyleabeauchamp@gmail.com>
Tested-by: Kyle Beauchamp <kyleabeauchamp@gmail.com>
Fixes: 81093c9848a7 ("Input: xpad - support some quirky Xbox One pads")
Cc: stable@vger.kernel.org # v4.12
Signed-off-by: Cameron Gutman <aicommander@gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
7 years agoi2c: designware: Round down ACPI provided clk to nearest supported clk
Hans de Goede [Tue, 29 Aug 2017 12:08:35 +0000 (14:08 +0200)]
i2c: designware: Round down ACPI provided clk to nearest supported clk

The Lenovo Miix2 8 DSDT contains an i2c clk / bus speed of 1700000 Hz
for one if its devices, which is not supported.

This is the second DSDT to show up with an unsupported clk in a short
time, remove the hardcoded fix for DSDTs with a 1 MiHz clock and simply
always round down the clk to the nearest supported value.

Reported-by: russianneuromancer@ya.ru
Fixes: 682c6c2188 ("i2c: designware: Some broken DSTDs use 1MiHz ...")
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
7 years agoMerge tag 'asoc-fix-v4.13-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/broon...
Takashi Iwai [Thu, 31 Aug 2017 12:08:26 +0000 (14:08 +0200)]
Merge tag 'asoc-fix-v4.13-rc7' of git://git./linux/kernel/git/broonie/sound into for-linus

ASoC: Fixes for v4.13

A couple of fixes, one for a regression in simple-card introduced during
the merge window that was only reported this week and another for a
regression in registration of ACPI GPIOs.

7 years agoMerge tag 'vfio-ccw-20170724' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms39...
Martin Schwidefsky [Thu, 31 Aug 2017 12:05:20 +0000 (14:05 +0200)]
Merge tag 'vfio-ccw-20170724' of git://git./linux/kernel/git/kvms390/vfio-ccw into fixes

Pull vfio-ccw fix from Cornelia Huck:
"A bugfix in the ccw translation code."

7 years agos390/mm: fix BUG_ON in crst_table_upgrade
Martin Schwidefsky [Thu, 31 Aug 2017 11:18:22 +0000 (13:18 +0200)]
s390/mm: fix BUG_ON in crst_table_upgrade

A 31-bit compat process can force a BUG_ON in crst_table_upgrade
with specific, invalid mmap calls, e.g.

   mmap((void*) 0x7fff8000, 0x10000, 3, 32, -1, 0)

The arch_get_unmapped_area[_topdown] functions miss an if condition
in the decision to do a page table upgrade.

Fixes: 9b11c7912d00 ("s390/mm: simplify arch_get_unmapped_area[_topdown]")
Cc: <stable@vger.kernel.org> # v4.12+
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agos390/mm: fork vs. 5 level page tabel
Martin Schwidefsky [Thu, 31 Aug 2017 10:30:54 +0000 (12:30 +0200)]
s390/mm: fork vs. 5 level page tabel

The mm->context.asce field of a new process is not set up correctly
in case of a fork with a 5 level page table.
Add the missing case to init_new_context().

Fixes: 1aea9b3f9210 ("s390/mm: implement 5 level pages tables")
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
7 years agoMerge remote-tracking branch 'asoc/fix/rt5670' into asoc-fixes
Mark Brown [Thu, 31 Aug 2017 11:47:58 +0000 (12:47 +0100)]
Merge remote-tracking branch 'asoc/fix/rt5670' into asoc-fixes

7 years agoRevert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"
Florian Fainelli [Thu, 31 Aug 2017 00:49:29 +0000 (17:49 -0700)]
Revert "net: phy: Correctly process PHY_HALTED in phy_stop_machine()"

This reverts commit 7ad813f208533cebfcc32d3d7474dc1677d1b09a ("net: phy:
Correctly process PHY_HALTED in phy_stop_machine()") because it is
creating the possibility for a NULL pointer dereference.

David Daney provide the following call trace and diagram of events:

When ndo_stop() is called we call:

 phy_disconnect()
    +---> phy_stop_interrupts() implies: phydev->irq = PHY_POLL;
    +---> phy_stop_machine()
    |      +---> phy_state_machine()
    |              +----> queue_delayed_work(): Work queued.
    +--->phy_detach() implies: phydev->attached_dev = NULL;

Now at a later time the queued work does:

 phy_state_machine()
    +---->netif_carrier_off(phydev->attached_dev): Oh no! It is NULL:

 CPU 12 Unable to handle kernel paging request at virtual address
0000000000000048, epc == ffffffff80de37ec, ra == ffffffff80c7c
Oops[#1]:
CPU: 12 PID: 1502 Comm: kworker/12:1 Not tainted 4.9.43-Cavium-Octeon+ #1
Workqueue: events_power_efficient phy_state_machine
task: 80000004021ed100 task.stack: 8000000409d70000
$ 0   : 0000000000000000 ffffffff84720060 0000000000000048 0000000000000004
$ 4   : 0000000000000000 0000000000000001 0000000000000004 0000000000000000
$ 8   : 0000000000000000 0000000000000000 00000000ffff98f3 0000000000000000
$12   : 8000000409d73fe0 0000000000009c00 ffffffff846547c8 000000000000af3b
$16   : 80000004096bab68 80000004096babd0 0000000000000000 80000004096ba800
$20   : 0000000000000000 0000000000000000 ffffffff81090000 0000000000000008
$24   : 0000000000000061 ffffffff808637b0
$28   : 8000000409d70000 8000000409d73cf0 80000000271bd300 ffffffff80c7804c
Hi    : 000000000000002a
Lo    : 000000000000003f
epc   : ffffffff80de37ec netif_carrier_off+0xc/0x58
ra    : ffffffff80c7804c phy_state_machine+0x48c/0x4f8
Status: 14009ce3        KX SX UX KERNEL EXL IE
Cause : 00800008 (ExcCode 02)
BadVA : 0000000000000048
PrId  : 000d9501 (Cavium Octeon III)
Modules linked in:
Process kworker/12:1 (pid: 1502, threadinfo=8000000409d70000,
task=80000004021ed100, tls=0000000000000000)
Stack : 8000000409a54000 80000004096bab68 80000000271bd300 80000000271c1e00
        0000000000000000 ffffffff808a1708 8000000409a54000 80000000271bd300
        80000000271bd320 8000000409a54030 ffffffff80ff0f00 0000000000000001
        ffffffff81090000 ffffffff808a1ac0 8000000402182080 ffffffff84650000
        8000000402182080 ffffffff84650000 ffffffff80ff0000 8000000409a54000
        ffffffff808a1970 0000000000000000 80000004099e8000 8000000402099240
        0000000000000000 ffffffff808a8598 0000000000000000 8000000408eeeb00
        8000000409a54000 00000000810a1d00 0000000000000000 8000000409d73de8
        8000000409d73de8 0000000000000088 000000000c009c00 8000000409d73e08
        8000000409d73e08 8000000402182080 ffffffff808a84d0 8000000402182080
        ...
Call Trace:
[<ffffffff80de37ec>] netif_carrier_off+0xc/0x58
[<ffffffff80c7804c>] phy_state_machine+0x48c/0x4f8
[<ffffffff808a1708>] process_one_work+0x158/0x368
[<ffffffff808a1ac0>] worker_thread+0x150/0x4c0
[<ffffffff808a8598>] kthread+0xc8/0xe0
[<ffffffff808617f0>] ret_from_kernel_thread+0x14/0x1c

The original motivation for this change originated from Marc Gonzales
indicating that his network driver did not have its adjust_link callback
executing with phydev->link = 0 while he was expecting it.

PHYLIB has never made any such guarantees ever because phy_stop() merely just
tells the workqueue to move into PHY_HALTED state which will happen
asynchronously.

Reported-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reported-by: David Daney <ddaney.cavm@gmail.com>
Fixes: 7ad813f20853 ("net: phy: Correctly process PHY_HALTED in phy_stop_machine()")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'mlx5-fixes-2017-08-30' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Wed, 30 Aug 2017 23:39:01 +0000 (16:39 -0700)]
Merge tag 'mlx5-fixes-2017-08-30' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2017-08-30

This series contains some misc fixes to the mlx5 driver.

Please pull and let me know if there's any problem.

For -stable:

Kernels >= 4.12
net/mlx5e: Fix CQ moderation mode not set properly
net/mlx5e: Don't override user RSS upon set channels

Kernels >= 4.11
net/mlx5e: Properly resolve TC offloaded ipv6 vxlan tunnel source address

Kernels >= 4.10
net/mlx5e: Fix DCB_CAP_ATTR_DCBX capability for DCBNL getcap
net/mlx5e: Check for qos capability in dcbnl_initialize

Kernels >= 4.9
net/mlx5e: Fix dangling page pointer on DMA mapping error

Kernels >= 4.8
net/mlx5e: Fix inline header size for small packets
net/mlx5: E-Switch, Unload the representors in the correct order
     net/mlx5: Fix arm SRQ command for ISSI version 0
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agonet: dsa: bcm_sf2: Fix number of CFP entries for BCM7278
Florian Fainelli [Wed, 30 Aug 2017 19:39:33 +0000 (12:39 -0700)]
net: dsa: bcm_sf2: Fix number of CFP entries for BCM7278

BCM7278 has only 128 entries while BCM7445 has the full 256 entries set,
fix that.

Fixes: 7318166cacad ("net: dsa: bcm_sf2: Add support for ethtool::rxnfc")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agokcm: do not attach PF_KCM sockets to avoid deadlock
Eric Dumazet [Wed, 30 Aug 2017 16:29:31 +0000 (09:29 -0700)]
kcm: do not attach PF_KCM sockets to avoid deadlock

syzkaller had no problem to trigger a deadlock, attaching a KCM socket
to another one (or itself). (original syzkaller report was a very
confusing lockdep splat during a sendmsg())

It seems KCM claims to only support TCP, but no enforcement is done,
so we might need to add additional checks.

Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Tom Herbert <tom@quantonium.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge branch 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdim...
Linus Torvalds [Wed, 30 Aug 2017 22:28:47 +0000 (15:28 -0700)]
Merge branch 'libnvdimm-fixes' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fix from Dan Williams:
 "A single patch removing some structure definitions from a uapi header
  file. These payloads are never processed directly by the kernel they
  are simply passed through an ioctl as opaque blobs to the ACPI _DSM
  (Device Specific Method) interface.

  Userspace should not be depending on the kernel to define these
  payloads. We will instead provide these definitions via the existing
  libndctl (https://github.com/pmem/ndctl) project that has NVDIMM
  command helpers and other definitions"

* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm: clean up command definitions

7 years agoMerge branch 'net-sched-init-failure-fixes'
David S. Miller [Wed, 30 Aug 2017 22:26:12 +0000 (15:26 -0700)]
Merge branch 'net-sched-init-failure-fixes'

Nikolay Aleksandrov says:

====================
net/sched: init failure fixes

I went over all qdiscs' init, destroy and reset callbacks and found the
issues fixed in each patch. Mostly they are null pointer dereferences due
to uninitialized timer (qdisc watchdog) or double frees due to ->destroy
cleaning up a second time. There's more information in each patch.
I've tested these by either sending wrong attributes from user-spaces, no
attributes or by simulating memory alloc failure where applicable. Also
tried all of the qdiscs as a default qdisc.

Most of these bugs were present before commit 87b60cfacf9f, I've tried to
include proper fixes tags in each patch.

I haven't included individual patch acks in the set, I'd appreciate it if
you take another look and resend them.
====================

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_tbf: fix two null pointer dereferences on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:05 +0000 (12:49 +0300)]
sch_tbf: fix two null pointer dereferences on init failure

sch_tbf calls qdisc_watchdog_cancel() in both its ->reset and ->destroy
callbacks but it may fail before the timer is initialized due to missing
options (either not supplied by user-space or set as a default qdisc),
also q->qdisc is used by ->reset and ->destroy so we need it initialized.

Reproduce:
$ sysctl net.core.default_qdisc=tbf
$ ip l set ethX up

Crash log:
[  959.160172] BUG: unable to handle kernel NULL pointer dereference at 0000000000000018
[  959.160323] IP: qdisc_reset+0xa/0x5c
[  959.160400] PGD 59cdb067
[  959.160401] P4D 59cdb067
[  959.160466] PUD 59ccb067
[  959.160532] PMD 0
[  959.160597]
[  959.160706] Oops: 0000 [#1] SMP
[  959.160778] Modules linked in: sch_tbf sch_sfb sch_prio sch_netem
[  959.160891] CPU: 2 PID: 1562 Comm: ip Not tainted 4.13.0-rc6+ #62
[  959.160998] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  959.161157] task: ffff880059c9a700 task.stack: ffff8800376d0000
[  959.161263] RIP: 0010:qdisc_reset+0xa/0x5c
[  959.161347] RSP: 0018:ffff8800376d3610 EFLAGS: 00010286
[  959.161531] RAX: ffffffffa001b1dd RBX: ffff8800373a2800 RCX: 0000000000000000
[  959.161733] RDX: ffffffff8215f160 RSI: ffffffff8215f160 RDI: 0000000000000000
[  959.161939] RBP: ffff8800376d3618 R08: 00000000014080c0 R09: 00000000ffffffff
[  959.162141] R10: ffff8800376d3578 R11: 0000000000000020 R12: ffffffffa001d2c0
[  959.162343] R13: ffff880037538000 R14: 00000000ffffffff R15: 0000000000000001
[  959.162546] FS:  00007fcc5126b740(0000) GS:ffff88005d900000(0000) knlGS:0000000000000000
[  959.162844] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  959.163030] CR2: 0000000000000018 CR3: 000000005abc4000 CR4: 00000000000406e0
[  959.163233] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  959.163436] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  959.163638] Call Trace:
[  959.163788]  tbf_reset+0x19/0x64 [sch_tbf]
[  959.163957]  qdisc_destroy+0x8b/0xe5
[  959.164119]  qdisc_create_dflt+0x86/0x94
[  959.164284]  ? dev_activate+0x129/0x129
[  959.164449]  attach_one_default_qdisc+0x36/0x63
[  959.164623]  netdev_for_each_tx_queue+0x3d/0x48
[  959.164795]  dev_activate+0x4b/0x129
[  959.164957]  __dev_open+0xe7/0x104
[  959.165118]  __dev_change_flags+0xc6/0x15c
[  959.165287]  dev_change_flags+0x25/0x59
[  959.165451]  do_setlink+0x30c/0xb3f
[  959.165613]  ? check_chain_key+0xb0/0xfd
[  959.165782]  rtnl_newlink+0x3a4/0x729
[  959.165947]  ? rtnl_newlink+0x117/0x729
[  959.166121]  ? ns_capable_common+0xd/0xb1
[  959.166288]  ? ns_capable+0x13/0x15
[  959.166450]  rtnetlink_rcv_msg+0x188/0x197
[  959.166617]  ? rcu_read_unlock+0x3e/0x5f
[  959.166783]  ? rtnl_newlink+0x729/0x729
[  959.166948]  netlink_rcv_skb+0x6c/0xce
[  959.167113]  rtnetlink_rcv+0x23/0x2a
[  959.167273]  netlink_unicast+0x103/0x181
[  959.167439]  netlink_sendmsg+0x326/0x337
[  959.167607]  sock_sendmsg_nosec+0x14/0x3f
[  959.167772]  sock_sendmsg+0x29/0x2e
[  959.167932]  ___sys_sendmsg+0x209/0x28b
[  959.168098]  ? do_raw_spin_unlock+0xcd/0xf8
[  959.168267]  ? _raw_spin_unlock+0x27/0x31
[  959.168432]  ? __handle_mm_fault+0x651/0xdb1
[  959.168602]  ? check_chain_key+0xb0/0xfd
[  959.168773]  __sys_sendmsg+0x45/0x63
[  959.168934]  ? __sys_sendmsg+0x45/0x63
[  959.169100]  SyS_sendmsg+0x19/0x1b
[  959.169260]  entry_SYSCALL_64_fastpath+0x23/0xc2
[  959.169432] RIP: 0033:0x7fcc5097e690
[  959.169592] RSP: 002b:00007ffd0d5c7b48 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  959.169887] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007fcc5097e690
[  959.170089] RDX: 0000000000000000 RSI: 00007ffd0d5c7b90 RDI: 0000000000000003
[  959.170292] RBP: ffff8800376d3f98 R08: 0000000000000001 R09: 0000000000000003
[  959.170494] R10: 00007ffd0d5c7910 R11: 0000000000000246 R12: 0000000000000006
[  959.170697] R13: 000000000066f1a0 R14: 00007ffd0d5cfc40 R15: 0000000000000000
[  959.170900]  ? trace_hardirqs_off_caller+0xa7/0xcf
[  959.171076] Code: 00 41 c7 84 24 14 01 00 00 00 00 00 00 41 c7 84 24
98 00 00 00 00 00 00 00 41 5c 41 5d 41 5e 5d c3 66 66 66 66 90 55 48 89
e5 53 <48> 8b 47 18 48 89 fb 48 8b 40 48 48 85 c0 74 02 ff d0 48 8b bb
[  959.171637] RIP: qdisc_reset+0xa/0x5c RSP: ffff8800376d3610
[  959.171821] CR2: 0000000000000018

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_sfq: fix null pointer dereference on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:04 +0000 (12:49 +0300)]
sch_sfq: fix null pointer dereference on init failure

Currently only a memory allocation failure can lead to this, so let's
initialize the timer first.

Fixes: 6529eaba33f0 ("net: sched: introduce tcf block infractructure")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_netem: avoid null pointer deref on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:03 +0000 (12:49 +0300)]
sch_netem: avoid null pointer deref on init failure

netem can fail in ->init due to missing options (either not supplied by
user-space or used as a default qdisc) causing a timer->base null
pointer deref in its ->destroy() and ->reset() callbacks.

Reproduce:
$ sysctl net.core.default_qdisc=netem
$ ip l set ethX up

Crash log:
[ 1814.846943] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1814.847181] IP: hrtimer_active+0x17/0x8a
[ 1814.847270] PGD 59c34067
[ 1814.847271] P4D 59c34067
[ 1814.847337] PUD 37374067
[ 1814.847403] PMD 0
[ 1814.847468]
[ 1814.847582] Oops: 0000 [#1] SMP
[ 1814.847655] Modules linked in: sch_netem(O) sch_fq_codel(O)
[ 1814.847761] CPU: 3 PID: 1573 Comm: ip Tainted: G           O 4.13.0-rc6+ #62
[ 1814.847884] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1814.848043] task: ffff88003723a700 task.stack: ffff88005adc8000
[ 1814.848235] RIP: 0010:hrtimer_active+0x17/0x8a
[ 1814.848407] RSP: 0018:ffff88005adcb590 EFLAGS: 00010246
[ 1814.848590] RAX: 0000000000000000 RBX: ffff880058e359d8 RCX: 0000000000000000
[ 1814.848793] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff880058e359d8
[ 1814.848998] RBP: ffff88005adcb5b0 R08: 00000000014080c0 R09: 00000000ffffffff
[ 1814.849204] R10: ffff88005adcb660 R11: 0000000000000020 R12: 0000000000000000
[ 1814.849410] R13: ffff880058e359d8 R14: 00000000ffffffff R15: 0000000000000001
[ 1814.849616] FS:  00007f733bbca740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
[ 1814.849919] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1814.850107] CR2: 0000000000000000 CR3: 0000000059f0d000 CR4: 00000000000406e0
[ 1814.850313] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1814.850518] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 1814.850723] Call Trace:
[ 1814.850875]  hrtimer_try_to_cancel+0x1a/0x93
[ 1814.851047]  hrtimer_cancel+0x15/0x20
[ 1814.851211]  qdisc_watchdog_cancel+0x12/0x14
[ 1814.851383]  netem_reset+0xe6/0xed [sch_netem]
[ 1814.851561]  qdisc_destroy+0x8b/0xe5
[ 1814.851723]  qdisc_create_dflt+0x86/0x94
[ 1814.851890]  ? dev_activate+0x129/0x129
[ 1814.852057]  attach_one_default_qdisc+0x36/0x63
[ 1814.852232]  netdev_for_each_tx_queue+0x3d/0x48
[ 1814.852406]  dev_activate+0x4b/0x129
[ 1814.852569]  __dev_open+0xe7/0x104
[ 1814.852730]  __dev_change_flags+0xc6/0x15c
[ 1814.852899]  dev_change_flags+0x25/0x59
[ 1814.853064]  do_setlink+0x30c/0xb3f
[ 1814.853228]  ? check_chain_key+0xb0/0xfd
[ 1814.853396]  ? check_chain_key+0xb0/0xfd
[ 1814.853565]  rtnl_newlink+0x3a4/0x729
[ 1814.853728]  ? rtnl_newlink+0x117/0x729
[ 1814.853905]  ? ns_capable_common+0xd/0xb1
[ 1814.854072]  ? ns_capable+0x13/0x15
[ 1814.854234]  rtnetlink_rcv_msg+0x188/0x197
[ 1814.854404]  ? rcu_read_unlock+0x3e/0x5f
[ 1814.854572]  ? rtnl_newlink+0x729/0x729
[ 1814.854737]  netlink_rcv_skb+0x6c/0xce
[ 1814.854902]  rtnetlink_rcv+0x23/0x2a
[ 1814.855064]  netlink_unicast+0x103/0x181
[ 1814.855230]  netlink_sendmsg+0x326/0x337
[ 1814.855398]  sock_sendmsg_nosec+0x14/0x3f
[ 1814.855584]  sock_sendmsg+0x29/0x2e
[ 1814.855747]  ___sys_sendmsg+0x209/0x28b
[ 1814.855912]  ? do_raw_spin_unlock+0xcd/0xf8
[ 1814.856082]  ? _raw_spin_unlock+0x27/0x31
[ 1814.856251]  ? __handle_mm_fault+0x651/0xdb1
[ 1814.856421]  ? check_chain_key+0xb0/0xfd
[ 1814.856592]  __sys_sendmsg+0x45/0x63
[ 1814.856755]  ? __sys_sendmsg+0x45/0x63
[ 1814.856923]  SyS_sendmsg+0x19/0x1b
[ 1814.857083]  entry_SYSCALL_64_fastpath+0x23/0xc2
[ 1814.857256] RIP: 0033:0x7f733b2dd690
[ 1814.857419] RSP: 002b:00007ffe1d3387d8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 1814.858238] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f733b2dd690
[ 1814.858445] RDX: 0000000000000000 RSI: 00007ffe1d338820 RDI: 0000000000000003
[ 1814.858651] RBP: ffff88005adcbf98 R08: 0000000000000001 R09: 0000000000000003
[ 1814.858856] R10: 00007ffe1d3385a0 R11: 0000000000000246 R12: 0000000000000002
[ 1814.859060] R13: 000000000066f1a0 R14: 00007ffe1d3408d0 R15: 0000000000000000
[ 1814.859267]  ? trace_hardirqs_off_caller+0xa7/0xcf
[ 1814.859446] Code: 10 55 48 89 c7 48 89 e5 e8 45 a1 fb ff 31 c0 5d c3
31 c0 c3 66 66 66 66 90 55 48 89 e5 41 56 41 55 41 54 53 49 89 fd 49 8b
45 30 <4c> 8b 20 41 8b 5c 24 38 31 c9 31 d2 48 c7 c7 50 8e 1d 82 41 89
[ 1814.860022] RIP: hrtimer_active+0x17/0x8a RSP: ffff88005adcb590
[ 1814.860214] CR2: 0000000000000000

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_fq_codel: avoid double free on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:02 +0000 (12:49 +0300)]
sch_fq_codel: avoid double free on init failure

It is very unlikely to happen but the backlogs memory allocation
could fail and will free q->flows, but then ->destroy() will free
q->flows too. For correctness remove the first free and let ->destroy
clean up.

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_cbq: fix null pointer dereferences on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:01 +0000 (12:49 +0300)]
sch_cbq: fix null pointer dereferences on init failure

CBQ can fail on ->init by wrong nl attributes or simply for missing any,
f.e. if it's set as a default qdisc then TCA_OPTIONS (opt) will be NULL
when it is activated. The first thing init does is parse opt but it will
dereference a null pointer if used as a default qdisc, also since init
failure at default qdisc invokes ->reset() which cancels all timers then
we'll also dereference two more null pointers (timer->base) as they were
never initialized.

To reproduce:
$ sysctl net.core.default_qdisc=cbq
$ ip l set ethX up

Crash log of the first null ptr deref:
[44727.907454] BUG: unable to handle kernel NULL pointer dereference at (null)
[44727.907600] IP: cbq_init+0x27/0x205
[44727.907676] PGD 59ff4067
[44727.907677] P4D 59ff4067
[44727.907742] PUD 59c70067
[44727.907807] PMD 0
[44727.907873]
[44727.907982] Oops: 0000 [#1] SMP
[44727.908054] Modules linked in:
[44727.908126] CPU: 1 PID: 21312 Comm: ip Not tainted 4.13.0-rc6+ #60
[44727.908235] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[44727.908477] task: ffff88005ad42700 task.stack: ffff880037214000
[44727.908672] RIP: 0010:cbq_init+0x27/0x205
[44727.908838] RSP: 0018:ffff8800372175f0 EFLAGS: 00010286
[44727.909018] RAX: ffffffff816c3852 RBX: ffff880058c53800 RCX: 0000000000000000
[44727.909222] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffff8800372175f8
[44727.909427] RBP: ffff880037217650 R08: ffffffff81b0f380 R09: 0000000000000000
[44727.909631] R10: ffff880037217660 R11: 0000000000000020 R12: ffffffff822a44c0
[44727.909835] R13: ffff880058b92000 R14: 00000000ffffffff R15: 0000000000000001
[44727.910040] FS:  00007ff8bc583740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
[44727.910339] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[44727.910525] CR2: 0000000000000000 CR3: 00000000371e5000 CR4: 00000000000406e0
[44727.910731] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[44727.910936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[44727.911141] Call Trace:
[44727.911291]  ? lockdep_init_map+0xb6/0x1ba
[44727.911461]  ? qdisc_alloc+0x14e/0x187
[44727.911626]  qdisc_create_dflt+0x7a/0x94
[44727.911794]  ? dev_activate+0x129/0x129
[44727.911959]  attach_one_default_qdisc+0x36/0x63
[44727.912132]  netdev_for_each_tx_queue+0x3d/0x48
[44727.912305]  dev_activate+0x4b/0x129
[44727.912468]  __dev_open+0xe7/0x104
[44727.912631]  __dev_change_flags+0xc6/0x15c
[44727.912799]  dev_change_flags+0x25/0x59
[44727.912966]  do_setlink+0x30c/0xb3f
[44727.913129]  ? check_chain_key+0xb0/0xfd
[44727.913294]  ? check_chain_key+0xb0/0xfd
[44727.913463]  rtnl_newlink+0x3a4/0x729
[44727.913626]  ? rtnl_newlink+0x117/0x729
[44727.913801]  ? ns_capable_common+0xd/0xb1
[44727.913968]  ? ns_capable+0x13/0x15
[44727.914131]  rtnetlink_rcv_msg+0x188/0x197
[44727.914300]  ? rcu_read_unlock+0x3e/0x5f
[44727.914465]  ? rtnl_newlink+0x729/0x729
[44727.914630]  netlink_rcv_skb+0x6c/0xce
[44727.914796]  rtnetlink_rcv+0x23/0x2a
[44727.914956]  netlink_unicast+0x103/0x181
[44727.915122]  netlink_sendmsg+0x326/0x337
[44727.915291]  sock_sendmsg_nosec+0x14/0x3f
[44727.915459]  sock_sendmsg+0x29/0x2e
[44727.915619]  ___sys_sendmsg+0x209/0x28b
[44727.915784]  ? do_raw_spin_unlock+0xcd/0xf8
[44727.915954]  ? _raw_spin_unlock+0x27/0x31
[44727.916121]  ? __handle_mm_fault+0x651/0xdb1
[44727.916290]  ? check_chain_key+0xb0/0xfd
[44727.916461]  __sys_sendmsg+0x45/0x63
[44727.916626]  ? __sys_sendmsg+0x45/0x63
[44727.916792]  SyS_sendmsg+0x19/0x1b
[44727.916950]  entry_SYSCALL_64_fastpath+0x23/0xc2
[44727.917125] RIP: 0033:0x7ff8bbc96690
[44727.917286] RSP: 002b:00007ffc360991e8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[44727.917579] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007ff8bbc96690
[44727.917783] RDX: 0000000000000000 RSI: 00007ffc36099230 RDI: 0000000000000003
[44727.917987] RBP: ffff880037217f98 R08: 0000000000000001 R09: 0000000000000003
[44727.918190] R10: 00007ffc36098fb0 R11: 0000000000000246 R12: 0000000000000006
[44727.918393] R13: 000000000066f1a0 R14: 00007ffc360a12e0 R15: 0000000000000000
[44727.918597]  ? trace_hardirqs_off_caller+0xa7/0xcf
[44727.918774] Code: 41 5f 5d c3 66 66 66 66 90 55 48 8d 56 04 45 31 c9
49 c7 c0 80 f3 b0 81 48 89 e5 41 55 41 54 53 48 89 fb 48 8d 7d a8 48 83
ec 48 <0f> b7 0e be 07 00 00 00 83 e9 04 e8 e6 f7 d8 ff 85 c0 0f 88 bb
[44727.919332] RIP: cbq_init+0x27/0x205 RSP: ffff8800372175f0
[44727.919516] CR2: 0000000000000000

Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_hfsc: fix null pointer deref and double free on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:49:00 +0000 (12:49 +0300)]
sch_hfsc: fix null pointer deref and double free on init failure

Depending on where ->init fails we can get a null pointer deref due to
uninitialized hires timer (watchdog) or a double free of the qdisc hash
because it is already freed by ->destroy().

Fixes: 8d5537387505 ("net/sched/hfsc: allocate tcf block for hfsc root class")
Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_hhf: fix null pointer dereference on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:48:59 +0000 (12:48 +0300)]
sch_hhf: fix null pointer dereference on init failure

If sch_hhf fails in its ->init() function (either due to wrong
user-space arguments as below or memory alloc failure of hh_flows) it
will do a null pointer deref of q->hh_flows in its ->destroy() function.

To reproduce the crash:
$ tc qdisc add dev eth0 root hhf quantum 2000000 non_hh_weight 10000000

Crash log:
[  690.654882] BUG: unable to handle kernel NULL pointer dereference at (null)
[  690.655565] IP: hhf_destroy+0x48/0xbc
[  690.655944] PGD 37345067
[  690.655948] P4D 37345067
[  690.656252] PUD 58402067
[  690.656554] PMD 0
[  690.656857]
[  690.657362] Oops: 0000 [#1] SMP
[  690.657696] Modules linked in:
[  690.658032] CPU: 3 PID: 920 Comm: tc Not tainted 4.13.0-rc6+ #57
[  690.658525] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[  690.659255] task: ffff880058578000 task.stack: ffff88005acbc000
[  690.659747] RIP: 0010:hhf_destroy+0x48/0xbc
[  690.660146] RSP: 0018:ffff88005acbf9e0 EFLAGS: 00010246
[  690.660601] RAX: 0000000000000000 RBX: 0000000000000020 RCX: 0000000000000000
[  690.661155] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff821f63f0
[  690.661710] RBP: ffff88005acbfa08 R08: ffffffff81b10a90 R09: 0000000000000000
[  690.662267] R10: 00000000f42b7019 R11: ffff880058578000 R12: 00000000ffffffea
[  690.662820] R13: ffff8800372f6400 R14: 0000000000000000 R15: 0000000000000000
[  690.663769] FS:  00007f8ae5e8b740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
[  690.667069] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  690.667965] CR2: 0000000000000000 CR3: 0000000058523000 CR4: 00000000000406e0
[  690.668918] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  690.669945] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  690.671003] Call Trace:
[  690.671743]  qdisc_create+0x377/0x3fd
[  690.672534]  tc_modify_qdisc+0x4d2/0x4fd
[  690.673324]  rtnetlink_rcv_msg+0x188/0x197
[  690.674204]  ? rcu_read_unlock+0x3e/0x5f
[  690.675091]  ? rtnl_newlink+0x729/0x729
[  690.675877]  netlink_rcv_skb+0x6c/0xce
[  690.676648]  rtnetlink_rcv+0x23/0x2a
[  690.677405]  netlink_unicast+0x103/0x181
[  690.678179]  netlink_sendmsg+0x326/0x337
[  690.678958]  sock_sendmsg_nosec+0x14/0x3f
[  690.679743]  sock_sendmsg+0x29/0x2e
[  690.680506]  ___sys_sendmsg+0x209/0x28b
[  690.681283]  ? __handle_mm_fault+0xc7d/0xdb1
[  690.681915]  ? check_chain_key+0xb0/0xfd
[  690.682449]  __sys_sendmsg+0x45/0x63
[  690.682954]  ? __sys_sendmsg+0x45/0x63
[  690.683471]  SyS_sendmsg+0x19/0x1b
[  690.683974]  entry_SYSCALL_64_fastpath+0x23/0xc2
[  690.684516] RIP: 0033:0x7f8ae529d690
[  690.685016] RSP: 002b:00007fff26d2d6b8 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[  690.685931] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f8ae529d690
[  690.686573] RDX: 0000000000000000 RSI: 00007fff26d2d700 RDI: 0000000000000003
[  690.687047] RBP: ffff88005acbff98 R08: 0000000000000001 R09: 0000000000000000
[  690.687519] R10: 00007fff26d2d480 R11: 0000000000000246 R12: 0000000000000002
[  690.687996] R13: 0000000001258070 R14: 0000000000000001 R15: 0000000000000000
[  690.688475]  ? trace_hardirqs_off_caller+0xa7/0xcf
[  690.688887] Code: 00 00 e8 2a 02 ae ff 49 8b bc 1d 60 02 00 00 48 83
c3 08 e8 19 02 ae ff 48 83 fb 20 75 dc 45 31 f6 4d 89 f7 4d 03 bd 20 02
00 00 <49> 8b 07 49 39 c7 75 24 49 83 c6 10 49 81 fe 00 40 00 00 75 e1
[  690.690200] RIP: hhf_destroy+0x48/0xbc RSP: ffff88005acbf9e0
[  690.690636] CR2: 0000000000000000

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 10239edf86f1 ("net-qdisc-hhf: Heavy-Hitter Filter (HHF) qdisc")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_multiq: fix double free on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:48:58 +0000 (12:48 +0300)]
sch_multiq: fix double free on init failure

The below commit added a call to ->destroy() on init failure, but multiq
still frees ->queues on error in init, but ->queues is also freed by
->destroy() thus we get double free and corrupted memory.

Very easy to reproduce (eth0 not multiqueue):
$ tc qdisc add dev eth0 root multiq
RTNETLINK answers: Operation not supported
$ ip l add dumdum type dummy
(crash)

Trace log:
[ 3929.467747] general protection fault: 0000 [#1] SMP
[ 3929.468083] Modules linked in:
[ 3929.468302] CPU: 3 PID: 967 Comm: ip Not tainted 4.13.0-rc6+ #56
[ 3929.468625] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 3929.469124] task: ffff88003716a700 task.stack: ffff88005872c000
[ 3929.469449] RIP: 0010:__kmalloc_track_caller+0x117/0x1be
[ 3929.469746] RSP: 0018:ffff88005872f6a0 EFLAGS: 00010246
[ 3929.470042] RAX: 00000000000002de RBX: 0000000058a59000 RCX: 00000000000002df
[ 3929.470406] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffff821f7020
[ 3929.470770] RBP: ffff88005872f6e8 R08: 000000000001f010 R09: 0000000000000000
[ 3929.471133] R10: ffff88005872f730 R11: 0000000000008cdd R12: ff006d75646d7564
[ 3929.471496] R13: 00000000014000c0 R14: ffff88005b403c00 R15: ffff88005b403c00
[ 3929.471869] FS:  00007f0b70480740(0000) GS:ffff88005d980000(0000) knlGS:0000000000000000
[ 3929.472286] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3929.472677] CR2: 00007ffcee4f3000 CR3: 0000000059d45000 CR4: 00000000000406e0
[ 3929.473209] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3929.474109] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3929.474873] Call Trace:
[ 3929.475337]  ? kstrdup_const+0x23/0x25
[ 3929.475863]  kstrdup+0x2e/0x4b
[ 3929.476338]  kstrdup_const+0x23/0x25
[ 3929.478084]  __kernfs_new_node+0x28/0xbc
[ 3929.478478]  kernfs_new_node+0x35/0x55
[ 3929.478929]  kernfs_create_link+0x23/0x76
[ 3929.479478]  sysfs_do_create_link_sd.isra.2+0x85/0xd7
[ 3929.480096]  sysfs_create_link+0x33/0x35
[ 3929.480649]  device_add+0x200/0x589
[ 3929.481184]  netdev_register_kobject+0x7c/0x12f
[ 3929.481711]  register_netdevice+0x373/0x471
[ 3929.482174]  rtnl_newlink+0x614/0x729
[ 3929.482610]  ? rtnl_newlink+0x17f/0x729
[ 3929.483080]  rtnetlink_rcv_msg+0x188/0x197
[ 3929.483533]  ? rcu_read_unlock+0x3e/0x5f
[ 3929.483984]  ? rtnl_newlink+0x729/0x729
[ 3929.484420]  netlink_rcv_skb+0x6c/0xce
[ 3929.484858]  rtnetlink_rcv+0x23/0x2a
[ 3929.485291]  netlink_unicast+0x103/0x181
[ 3929.485735]  netlink_sendmsg+0x326/0x337
[ 3929.486181]  sock_sendmsg_nosec+0x14/0x3f
[ 3929.486614]  sock_sendmsg+0x29/0x2e
[ 3929.486973]  ___sys_sendmsg+0x209/0x28b
[ 3929.487340]  ? do_raw_spin_unlock+0xcd/0xf8
[ 3929.487719]  ? _raw_spin_unlock+0x27/0x31
[ 3929.488092]  ? __handle_mm_fault+0x651/0xdb1
[ 3929.488471]  ? check_chain_key+0xb0/0xfd
[ 3929.488847]  __sys_sendmsg+0x45/0x63
[ 3929.489206]  ? __sys_sendmsg+0x45/0x63
[ 3929.489576]  SyS_sendmsg+0x19/0x1b
[ 3929.489901]  entry_SYSCALL_64_fastpath+0x23/0xc2
[ 3929.490172] RIP: 0033:0x7f0b6fb93690
[ 3929.490423] RSP: 002b:00007ffcee4ed588 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
[ 3929.490881] RAX: ffffffffffffffda RBX: ffffffff810d278c RCX: 00007f0b6fb93690
[ 3929.491198] RDX: 0000000000000000 RSI: 00007ffcee4ed5d0 RDI: 0000000000000003
[ 3929.491521] RBP: ffff88005872ff98 R08: 0000000000000001 R09: 0000000000000000
[ 3929.491801] R10: 00007ffcee4ed350 R11: 0000000000000246 R12: 0000000000000002
[ 3929.492075] R13: 000000000066f1a0 R14: 00007ffcee4f5680 R15: 0000000000000000
[ 3929.492352]  ? trace_hardirqs_off_caller+0xa7/0xcf
[ 3929.492590] Code: 8b 45 c0 48 8b 45 b8 74 17 48 8b 4d c8 83 ca ff 44
89 ee 4c 89 f7 e8 83 ca ff ff 49 89 c4 eb 49 49 63 56 20 48 8d 48 01 4d
8b 06 <49> 8b 1c 14 48 89 c2 4c 89 e0 65 49 0f c7 08 0f 94 c0 83 f0 01
[ 3929.493335] RIP: __kmalloc_track_caller+0x117/0x1be RSP: ffff88005872f6a0

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: f07d1501292b ("multiq: Further multiqueue cleanup")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agosch_htb: fix crash on init failure
Nikolay Aleksandrov [Wed, 30 Aug 2017 09:48:57 +0000 (12:48 +0300)]
sch_htb: fix crash on init failure

The commit below added a call to the ->destroy() callback for all qdiscs
which failed in their ->init(), but some were not prepared for such
change and can't handle partially initialized qdisc. HTB is one of them
and if any error occurs before the qdisc watchdog timer and qdisc work are
initialized then we can hit either a null ptr deref (timer->base) when
canceling in ->destroy or lockdep error info about trying to register
a non-static key and a stack dump. So to fix these two move the watchdog
timer and workqueue init before anything that can err out.
To reproduce userspace needs to send broken htb qdisc create request,
tested with a modified tc (q_htb.c).

Trace log:
[ 2710.897602] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 2710.897977] IP: hrtimer_active+0x17/0x8a
[ 2710.898174] PGD 58fab067
[ 2710.898175] P4D 58fab067
[ 2710.898353] PUD 586c0067
[ 2710.898531] PMD 0
[ 2710.898710]
[ 2710.899045] Oops: 0000 [#1] SMP
[ 2710.899232] Modules linked in:
[ 2710.899419] CPU: 1 PID: 950 Comm: tc Not tainted 4.13.0-rc6+ #54
[ 2710.899646] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 2710.900035] task: ffff880059ed2700 task.stack: ffff88005ad4c000
[ 2710.900262] RIP: 0010:hrtimer_active+0x17/0x8a
[ 2710.900467] RSP: 0018:ffff88005ad4f960 EFLAGS: 00010246
[ 2710.900684] RAX: 0000000000000000 RBX: ffff88003701e298 RCX: 0000000000000000
[ 2710.900933] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88003701e298
[ 2710.901177] RBP: ffff88005ad4f980 R08: 0000000000000001 R09: 0000000000000001
[ 2710.901419] R10: ffff88005ad4f800 R11: 0000000000000400 R12: 0000000000000000
[ 2710.901663] R13: ffff88003701e298 R14: ffffffff822a4540 R15: ffff88005ad4fac0
[ 2710.901907] FS:  00007f2f5e90f740(0000) GS:ffff88005d880000(0000) knlGS:0000000000000000
[ 2710.902277] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2710.902500] CR2: 0000000000000000 CR3: 0000000058ca3000 CR4: 00000000000406e0
[ 2710.902744] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2710.902977] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2710.903180] Call Trace:
[ 2710.903332]  hrtimer_try_to_cancel+0x1a/0x93
[ 2710.903504]  hrtimer_cancel+0x15/0x20
[ 2710.903667]  qdisc_watchdog_cancel+0x12/0x14
[ 2710.903866]  htb_destroy+0x2e/0xf7
[ 2710.904097]  qdisc_create+0x377/0x3fd
[ 2710.904330]  tc_modify_qdisc+0x4d2/0x4fd
[ 2710.904511]  rtnetlink_rcv_msg+0x188/0x197
[ 2710.904682]  ? rcu_read_unlock+0x3e/0x5f
[ 2710.904849]  ? rtnl_newlink+0x729/0x729
[ 2710.905017]  netlink_rcv_skb+0x6c/0xce
[ 2710.905183]  rtnetlink_rcv+0x23/0x2a
[ 2710.905345]  netlink_unicast+0x103/0x181
[ 2710.905511]  netlink_sendmsg+0x326/0x337
[ 2710.905679]  sock_sendmsg_nosec+0x14/0x3f
[ 2710.905847]  sock_sendmsg+0x29/0x2e
[ 2710.906010]  ___sys_sendmsg+0x209/0x28b
[ 2710.906176]  ? do_raw_spin_unlock+0xcd/0xf8
[ 2710.906346]  ? _raw_spin_unlock+0x27/0x31
[ 2710.906514]  ? __handle_mm_fault+0x651/0xdb1
[ 2710.906685]  ? check_chain_key+0xb0/0xfd
[ 2710.906855]  __sys_sendmsg+0x45/0x63
[ 2710.907018]  ? __sys_sendmsg+0x45/0x63
[ 2710.907185]  SyS_sendmsg+0x19/0x1b
[ 2710.907344]  entry_SYSCALL_64_fastpath+0x23/0xc2

Note that probably this bug goes further back because the default qdisc
handling always calls ->destroy on init failure too.

Fixes: 87b60cfacf9f ("net_sched: fix error recovery at qdisc creation")
Fixes: 0fbbeb1ba43b ("[PKT_SCHED]: Fix missing qdisc_destroy() in qdisc_create_dflt()")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
7 years agoMerge tag 'drm-fixes-for-v4.13-rc8' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Wed, 30 Aug 2017 22:10:56 +0000 (15:10 -0700)]
Merge tag 'drm-fixes-for-v4.13-rc8' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Two fixes (a vmwgfx and core drm fix) in the queue for 4.13 final,
  hopefully that is it"

* tag 'drm-fixes-for-v4.13-rc8' of git://people.freedesktop.org/~airlied/linux:
  drm/vmwgfx: Fix F26 Wayland screen update issue
  drm/bridge/sii8620: Fix memory corruption