Andrew Lunn [Sat, 4 Jun 2016 19:17:04 +0000 (21:17 +0200)]
net: dsa: Make mdio bus optional
The switch may want to instantiate its own MDIO bus. Only do it
centrally if the switch has not already created one, and the read op
is implemented.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:17:03 +0000 (21:17 +0200)]
net: dsa: Refactor selection of tag ops into a function
Replace the two switch statements with an array lookup, and store the
result in the dsa tree structure. The drivers no longer need to know
the selected tag protocol, so remove it from the dsa switch structure.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:17:02 +0000 (21:17 +0200)]
net: dsa: mv88e6xxx: Only support EDSA tagging
The merged driver no longer offers the option to use DSA tagging. So
remove the code to setup the switch to do DSA tagging and hard code
the use of EDSA.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>y
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:17:01 +0000 (21:17 +0200)]
net: dsa: Split up creating/destroying of DSA and CPU ports
Refactor the code to setup a single DSA/CPU port into a function of
its own, and export it, so it can be used by the new binding.
Similarly, refactor the destroy code into a function. When destroying
the ports, don't put the of node. They should be released at the end
along with the normal ports.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:17:00 +0000 (21:17 +0200)]
net: dsa: Copy the routing table into the switch structure
The new binding will not have a chip data structure, it will place the
routing directly into the switch structure. To enable backwards
compatibility, copy the routing from the chip data into the switch
structure.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:16:59 +0000 (21:16 +0200)]
net: dsa: Remove dynamic allocate of routing table
With a maximum of four switches, the size of the routing table is the
same as the pointer to it. Removing it makes the code simpler.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:16:58 +0000 (21:16 +0200)]
net: dsa: Move port device node into port structure
Move the port device node structure into the port structure, from the
chip data. This information is needed in the next step of implementing
the new binding.
The chip data structure is used while parsing the whole old binding,
before the individual switch structures exist. With the new bindings,
this is reversed, the switches exist first, and the interconnections
between the switches is derived from the individual switch
bindings. Thus this chip data structure becomes unneeded.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
eviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:16:57 +0000 (21:16 +0200)]
net: dsa: Add a ports structure and use it in the switch structure
There are going to be more per-port members added to the switch
structure. So add a port structure and move the netdev into it.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:16:56 +0000 (21:16 +0200)]
net: dsa: tag_{e}dsa.c: Remove dependency on platform data
The platform data nr_chips is used when validating a received packet,
to ensure it comes from a know switch chip. The number of possible
switches is limited to DSA_MAX_SWITCHES, so use this as the first
validation step. The new binding allows holes in the dst->ds[] array,
so also ensure ensure there is a valid dsa_switch for this packet.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:16:55 +0000 (21:16 +0200)]
net: dsa: slave: Remove MDIO address from switch MDIO bus name
The DSA layer should no longer assume the switch is connected to an
MDIO bus. As a result, we cannot use the address on the MDIO bus when
forming the name of the switches internal MDIO bus for its builtin and
possibly external PHYs. The switch index is sufficient to make the
name unique, so drop the MDIO address.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vivien Didelot [Sat, 4 Jun 2016 19:16:54 +0000 (21:16 +0200)]
net: dsa: mv88e6xxx: fix circular lock in PPU work
Lock debugging shows that there is a possible circular lock in the PPU
work code. Switch the lock order of smi_mutex and ppu_mutex to fix this.
Here's the full trace:
[ 4.341325] ======================================================
[ 4.347519] [ INFO: possible circular locking dependency detected ]
[ 4.353800] 4.6.0 #4 Not tainted
[ 4.357039] -------------------------------------------------------
[ 4.363315] kworker/0:1/328 is trying to acquire lock:
[ 4.368463] (&ps->smi_mutex){+.+.+.}, at: [<
8049c758>] mv88e6xxx_reg_read+0x30/0x54
[ 4.376313]
[ 4.376313] but task is already holding lock:
[ 4.382160] (&ps->ppu_mutex){+.+...}, at: [<
8049cac0>] mv88e6xxx_ppu_reenable_work+0x28/0xd4
[ 4.390772]
[ 4.390772] which lock already depends on the new lock.
[ 4.390772]
[ 4.398963]
[ 4.398963] the existing dependency chain (in reverse order) is:
[ 4.406461]
[ 4.406461] -> #1 (&ps->ppu_mutex){+.+...}:
[ 4.410897] [<
806d86bc>] mutex_lock_nested+0x54/0x360
[ 4.416606] [<
8049a800>] mv88e6xxx_ppu_access_get+0x28/0x100
[ 4.422906] [<
8049b778>] mv88e6xxx_phy_read+0x90/0xdc
[ 4.428599] [<
806a4534>] dsa_slave_phy_read+0x3c/0x40
[ 4.434300] [<
804943ec>] mdiobus_read+0x68/0x80
[ 4.439481] [<
804939d4>] get_phy_device+0x58/0x1d8
[ 4.444914] [<
80493ed0>] mdiobus_scan+0x24/0xf4
[ 4.450078] [<
8049409c>] __mdiobus_register+0xfc/0x1ac
[ 4.455857] [<
806a40b0>] dsa_probe+0x860/0xca8
[ 4.460934] [<
8043246c>] platform_drv_probe+0x5c/0xc0
[ 4.466627] [<
804305a0>] driver_probe_device+0x118/0x450
[ 4.472589] [<
80430b00>] __device_attach_driver+0xac/0x128
[ 4.478724] [<
8042e350>] bus_for_each_drv+0x74/0xa8
[ 4.484235] [<
804302d8>] __device_attach+0xc4/0x154
[ 4.489755] [<
80430cec>] device_initial_probe+0x1c/0x20
[ 4.495612] [<
8042f620>] bus_probe_device+0x98/0xa0
[ 4.501123] [<
8042fbd0>] deferred_probe_work_func+0x4c/0xd4
[ 4.507328] [<
8013a794>] process_one_work+0x1a8/0x604
[ 4.513030] [<
8013ac54>] worker_thread+0x64/0x528
[ 4.518367] [<
801409e8>] kthread+0xec/0x100
[ 4.523201] [<
80108f30>] ret_from_fork+0x14/0x24
[ 4.528462]
[ 4.528462] -> #0 (&ps->smi_mutex){+.+.+.}:
[ 4.532895] [<
8015ad5c>] lock_acquire+0xb4/0x1dc
[ 4.538154] [<
806d86bc>] mutex_lock_nested+0x54/0x360
[ 4.543856] [<
8049c758>] mv88e6xxx_reg_read+0x30/0x54
[ 4.549549] [<
8049cad8>] mv88e6xxx_ppu_reenable_work+0x40/0xd4
[ 4.556022] [<
8013a794>] process_one_work+0x1a8/0x604
[ 4.561707] [<
8013ac54>] worker_thread+0x64/0x528
[ 4.567053] [<
801409e8>] kthread+0xec/0x100
[ 4.571878] [<
80108f30>] ret_from_fork+0x14/0x24
[ 4.577139]
[ 4.577139] other info that might help us debug this:
[ 4.577139]
[ 4.585159] Possible unsafe locking scenario:
[ 4.585159]
[ 4.591093] CPU0 CPU1
[ 4.595631] ---- ----
[ 4.600169] lock(&ps->ppu_mutex);
[ 4.603693] lock(&ps->smi_mutex);
[ 4.609742] lock(&ps->ppu_mutex);
[ 4.615790] lock(&ps->smi_mutex);
[ 4.619314]
[ 4.619314] *** DEADLOCK ***
[ 4.619314]
[ 4.625256] 3 locks held by kworker/0:1/328:
[ 4.629537] #0: ("events"){.+.+..}, at: [<
8013a704>] process_one_work+0x118/0x604
[ 4.637288] #1: ((&ps->ppu_work)){+.+...}, at: [<
8013a704>] process_one_work+0x118/0x604
[ 4.645653] #2: (&ps->ppu_mutex){+.+...}, at: [<
8049cac0>] mv88e6xxx_ppu_reenable_work+0x28/0xd4
[ 4.654714]
[ 4.654714] stack backtrace:
[ 4.659098] CPU: 0 PID: 328 Comm: kworker/0:1 Not tainted 4.6.0 #4
[ 4.665286] Hardware name: Freescale Vybrid VF5xx/VF6xx (Device Tree)
[ 4.671748] Workqueue: events mv88e6xxx_ppu_reenable_work
[ 4.677174] Backtrace:
[ 4.679674] [<
8010d354>] (dump_backtrace) from [<
8010d5a0>] (show_stack+0x20/0x24)
[ 4.687252] r6:
80fb3c88 r5:
80fb3c88 r4:
80fb4728 r3:
00000002
[ 4.693003] [<
8010d580>] (show_stack) from [<
803b45e8>] (dump_stack+0x24/0x28)
[ 4.700246] [<
803b45c4>] (dump_stack) from [<
80157398>] (print_circular_bug+0x208/0x32c)
[ 4.708361] [<
80157190>] (print_circular_bug) from [<
8015a630>] (__lock_acquire+0x185c/0x1b80)
[ 4.716982] r10:
9ec22a00 r9:
00000060 r8:
8164b6bc r7:
00000040 r6:
00000003 r5:
8163a5b4
[ 4.724905] r4:
00000003 r3:
9ec22de8
[ 4.728537] [<
80158dd4>] (__lock_acquire) from [<
8015ad5c>] (lock_acquire+0xb4/0x1dc)
[ 4.736378] r10:
60000013 r9:
00000000 r8:
00000000 r7:
00000000 r6:
9e5e9c50 r5:
80e618e0
[ 4.744301] r4:
00000000
[ 4.746879] [<
8015aca8>] (lock_acquire) from [<
806d86bc>] (mutex_lock_nested+0x54/0x360)
[ 4.754976] r10:
9e5e9c1c r9:
80e616c4 r8:
9f685ea0 r7:
0000001b r6:
9ec22a00 r5:
8163a5b4
[ 4.762899] r4:
9e5e9c1c
[ 4.765477] [<
806d8668>] (mutex_lock_nested) from [<
8049c758>] (mv88e6xxx_reg_read+0x30/0x54)
[ 4.774008] r10:
80e60c5b r9:
80e616c4 r8:
9f685ea0 r7:
0000001b r6:
00000004 r5:
9e5e9c10
[ 4.781930] r4:
9e5e9c1c
[ 4.784507] [<
8049c728>] (mv88e6xxx_reg_read) from [<
8049cad8>] (mv88e6xxx_ppu_reenable_work+0x40/0xd4)
[ 4.793907] r7:
9ffd5400 r6:
9e5e9c68 r5:
9e5e9cb0 r4:
9e5e9c10
[ 4.799659] [<
8049ca98>] (mv88e6xxx_ppu_reenable_work) from [<
8013a794>] (process_one_work+0x1a8/0x604)
[ 4.809059] r9:
80e616c4 r8:
9f685ea0 r7:
9ffd5400 r6:
80e0a1c8 r5:
9f5f2e80 r4:
9e5e9cb0
[ 4.816910] [<
8013a5ec>] (process_one_work) from [<
8013ac54>] (worker_thread+0x64/0x528)
[ 4.825010] r10:
9f5f2e80 r9:
00000008 r8:
80e0dc80 r7:
80e0a1fc r6:
80e0a1c8 r5:
9f5f2e98
[ 4.832933] r4:
80e0a1c8
[ 4.835510] [<
8013abf0>] (worker_thread) from [<
801409e8>] (kthread+0xec/0x100)
[ 4.842827] r10:
00000000 r9:
00000000 r8:
00000000 r7:
8013abf0 r6:
9f5f2e80 r5:
9ec15740
[ 4.850749] r4:
00000000
[ 4.853327] [<
801408fc>] (kthread) from [<
80108f30>] (ret_from_fork+0x14/0x24)
[ 4.860557] r7:
00000000 r6:
00000000 r5:
801408fc r4:
9ec15740
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andrew Lunn [Sat, 4 Jun 2016 19:16:52 +0000 (21:16 +0200)]
net: dsa: slave: chip data is optional, don't dereference NULL
The new binding does not make use of dsa_chip_data, a.k.a cd. When
retrieving the size of the EEPROM attached to a switch, don't assume
there is a cd attached to the switch structure.
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 4 Jun 2016 05:56:28 +0000 (22:56 -0700)]
net: Add docbook description for 'mtu' arg to skb_gso_validate_mtu()
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 4 Jun 2016 05:53:26 +0000 (22:53 -0700)]
sctp: Fix warning in sctp_packet_transmit_chunk()
size_t objects should be printed with %Z printf format.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Sat, 4 Jun 2016 05:20:16 +0000 (08:20 +0300)]
qed: Fix next-ptr chains for BE / 32-bit
Commit
a91eb52abb50 ("qed: Revisit chain implementation") contains an
incorrect implementation for BE platforms, as device's regpairs containing
addresses are LE and they're not converted correctly when read back.
In addition, it raises a compilation warning for 32-bit platforms where
dma_addr_t is a 32-bit variable.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sat, 4 Jun 2016 00:08:45 +0000 (20:08 -0400)]
Merge branch 'qed-roce-iscsi'
Yuval Mintz says:
====================
qed: RocE & iSCSI infrastructure
We plan on sending 2 new protocol drivers in the imminent future -
both our RoCE [qedr] and iSCSI [qedi] drivers. As both submissions
would be rather massive and in order to avoid collisions between them,
the common infrastructure on the qed side was prepared as an independent
patch-series to be sent ahead of those 2 submissions.
This patch series introduces in QED 2 new 'ids' - one for iscsi and
one for roce. It then goes and adds logic required for configuring
said protocols in HW. Notice it *doesn't* actually add any client using
said ids, but rather only the infrastructure to allow their later usage.
What this patch doesn't contain is the slowpath protocol-configuration
toward the firmware. I.e., it contains register-setting logic, memory
allocations, etc., but not actual flow-related configuration specific
to the protocl. Those would be sent as part of the protocol driver
submissions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Fri, 3 Jun 2016 11:35:35 +0000 (14:35 +0300)]
qed: Initialize hardware for new protocols
RoCE and iSCSI would require some added/changed hw configuration in order
to properly run; The biggest single change being the requirement of
allocating and mapping host memory for several HW blocks that aren't being
used by qede [SRC, QM, TM, etc.].
In addition, whereas qede is only using context memory for HW blocks, the
new protocol would also require task memories to be added.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Fri, 3 Jun 2016 11:35:34 +0000 (14:35 +0300)]
qed: Add iscsi/rdma personalities
This patch adds in the ecore 2 new personalities in addition to
QED_PCI_ETH - QED_PCI_ISCSI and QED_PCI_ETH_ROCE.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Fri, 3 Jun 2016 11:35:33 +0000 (14:35 +0300)]
qed: Add common HSI for new protocols
This adds the qed portion of the RoCE & iSCSI firmware HSI,
as well as adding several new common HSI files which would be required
by both qed and qed* protocols.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Fri, 3 Jun 2016 11:35:32 +0000 (14:35 +0300)]
qed: Revisit chain implementation
RoCE driver is going to need a 32-bit chain [current chain implementation
for qed* currently supports only 16-bit producer/consumer chains].
This patch adds said support, as well as doing other slight tweaks and
modifications to qed's chain API.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ivan Khoronzhuk [Thu, 2 Jun 2016 22:37:08 +0000 (01:37 +0300)]
net: ethernet: ti: cpsw: remove unused priv lock
There is no reason in this lock. At least for now.
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Joe Perches [Thu, 2 Jun 2016 19:08:52 +0000 (12:08 -0700)]
rxrpc: Use pr_<level> and pr_fmt, reduce object size a few KB
Use the more common kernel logging style and reduce object size.
The logging message prefix changes from a mixture of
"RxRPC:" and "RXRPC:" to "af_rxrpc: ".
$ size net/rxrpc/built-in.o*
text data bss dec hex filename
64172 1972 8304 74448 122d0 net/rxrpc/built-in.o.new
67512 1972 8304 77788 12fdc net/rxrpc/built-in.o.old
Miscellanea:
o Consolidate the ASSERT macros to use a single pr_err call with
decimal and hexadecimal output and a stringified #OP argument
Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Haiyang Zhang [Thu, 2 Jun 2016 19:02:04 +0000 (12:02 -0700)]
hv_netvsc: Fix VF register on vlan devices
Added a condition to avoid vlan devices with same MAC registering
as VF.
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: K. Y. Srinivasan <kys@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Fri, 3 Jun 2016 23:37:26 +0000 (19:37 -0400)]
Merge branch 'sctp-gso'
Marcelo Ricardo Leitner says:
====================
sctp: Add GSO support
This patchset adds sctp GSO support.
Performance tests indicates that increases throughput by 10% if using
bigger chunk sizes, specially if bigger than MTU. For small chunks, it
doesn't help much if not using heavy firewall rules.
For small chunks it will probably be of more use once we get something
like MSG_MORE as David Laight had suggested.
overall changes:
v1->v2:
Added support for receiving GSO frames on SCTP stack, as requested by
Dave Miller.
v2->v3:
Consider sctphdr size in skb_gso_transport_seglen()
rebased due to
5c7cdf339af5 ("gso: Remove arbitrary checks for
unsupported GSO")
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:44 +0000 (15:05 -0300)]
sctp: improve debug message to also log curr pkt and new chunk size
This is useful for debugging packet sizes.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:43 +0000 (15:05 -0300)]
sctp: Add GSO support
SCTP has this pecualiarity that its packets cannot be just segmented to
(P)MTU. Its chunks must be contained in IP segments, padding respected.
So we can't just generate a big skb, set gso_size to the fragmentation
point and deliver it to IP layer.
This patch takes a different approach. SCTP will now build a skb as it
would be if it was received using GRO. That is, there will be a cover
skb with protocol headers and children ones containing the actual
segments, already segmented to a way that respects SCTP RFCs.
With that, we can tell skb_segment() to just split based on frag_list,
trusting its sizes are already in accordance.
This way SCTP can benefit from GSO and instead of passing several
packets through the stack, it can pass a single large packet.
v2:
- Added support for receiving GSO frames, as requested by Dave Miller.
- Clear skb->cb if packet is GSO (otherwise it's not used by SCTP)
- Added heuristics similar to what we have in TCP for not generating
single GSO packets that fills cwnd.
v3:
- consider sctphdr size in skb_gso_transport_seglen()
- rebased due to
5c7cdf339af5 ("gso: Remove arbitrary checks for
unsupported GSO")
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:42 +0000 (15:05 -0300)]
sctp: delay as much as possible skb_linearize
This patch is a preparation for the GSO one. In order to successfully
handle GSO packets on rx path we must not call skb_linearize, otherwise
it defeats any gain GSO may have had.
This patch thus delays as much as possible the call to skb_linearize,
leaving it to sctp_inq_pop() moment. For that the sanity checks
performed now know how to deal with fragments.
One positive side-effect of this is that if the socket is backlogged it
will have the chance of doing it on backlog processing instead of
during softirq.
With this move, it's evident that a check for non-linearity in
sctp_inq_pop was ineffective and is now removed. Note that a similar
check is performed a bit below this one.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:41 +0000 (15:05 -0300)]
skbuff: introduce skb_gso_validate_mtu
skb_gso_network_seglen is not enough for checking fragment sizes if
skb is using GSO_BY_FRAGS as we have to check frag per frag.
This patch introduces skb_gso_validate_mtu, based on the former, which
will wrap the use case inside it as all calls to skb_gso_network_seglen
were to validate if it fits on a given TMU, and improve the check.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:40 +0000 (15:05 -0300)]
sk_buff: allow segmenting based on frag sizes
This patch allows segmenting a skb based on its frags sizes instead of
based on a fixed value.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:39 +0000 (15:05 -0300)]
skbuff: export skb_gro_receive
sctp GSO requires it and sctp can be compiled as a module, so we need to
export this function.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Marcelo Ricardo Leitner [Thu, 2 Jun 2016 18:05:38 +0000 (15:05 -0300)]
loopback: make use of NETIF_F_GSO_SOFTWARE
NETIF_F_GSO_SOFTWARE was defined to list all GSO software types, so lets
make use of it in loopback code. Note that veth/vxlan/others already
uses it.
Within this patch series, this patch causes lo to pick up SCTP GSO feature
automatically (as it's added to NETIF_F_GSO_SOFTWARE) and thus avoiding
segmentation if possible.
Signed-off-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhaktipriya Shridhar [Thu, 2 Jun 2016 09:30:57 +0000 (15:00 +0530)]
net: fjes: fjes_main: Remove create_workqueue
alloc_workqueue replaces deprecated create_workqueue().
The workqueue adapter->txrx_wq has workitem
&adapter->raise_intr_rxdata_task per adapter. Extended Socket Network
Device is shared memory based, so someone's transmission denotes other's
reception. raise_intr_rxdata_task raises interruption of receivers from
the sender in order to notify receivers.
The workqueue adapter->control_wq has workitem
&adapter->interrupt_watch_task per adapter. interrupt_watch_task is used
to prevent delay of interrupts.
Dedicated workqueues have been used in both cases since the workitems
on the workqueues are involved in normal device operation and require
forward progress under memory pressure.
max_active has been set to 0 since there is no need for throttling
the number of active work items.
Since network devices may be used for memory reclaim,
WQ_MEM_RECLAIM has been set to guarantee forward progress.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yuval Mintz [Thu, 2 Jun 2016 07:23:29 +0000 (10:23 +0300)]
qed: Utilize FW 8.10.3.0
The New QED firmware contains several fixes, including:
- Wrong classification of packets in 4-port devices.
- Anti-spoof interoperability with encapsulated packets.
- Tx-switching of encapsulated packets.
It also slightly improves Tx performance of the device.
In addition, this firmware contains the necessary logic for
supporting iscsi & rdma, for which we plan on pushing protocol
drivers in the imminent future.
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David Ahern [Thu, 2 Jun 2016 04:16:39 +0000 (21:16 -0700)]
net: vrf: set operstate and mtu at link create
The VRF device exists to define L3 domains and guide FIB lookups. As
such its operstate is not relevant. Seeing 'state UNKNOWN' in the
output of 'ip link show' can be confusing, so set operstate at link
create.
Similarly, the MTU for a VRF device is not used; any fragmentation
of the payload is done on the output path based on the real egress
device. An MTU of 1500 on the VRF device while enslaved devices
have a higher MTU can lead to confusion. Since the VRF MTU is not
relevant set to 64k similar to what is done for loopback.
Signed-off-by: David Ahern <dsa@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Zhang Shengju [Tue, 31 May 2016 13:41:02 +0000 (13:41 +0000)]
ovs: set name assign type of internal port
Set name_assign_type of internal port to NET_NAME_USER.
Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bhaktipriya Shridhar [Wed, 1 Jun 2016 17:59:15 +0000 (23:29 +0530)]
net: ethernet: wiznet: Remove create_workqueue
alloc_workqueue replaces deprecated create_workqueue().
A dedicated workqueue has been used since the workitems are involved
in normal device operation. Workitems &priv->rx_work and &priv->tx_work,
map to w5100_rx_work and w5100_tx_work respectively and are involved in
receiving and transmitting packets. Forward progress under
memory pressure is a requirement here.
create_workqueue has been replaced with alloc_workqueue with max_active
as 0 since there is no need for throttling the number of active work
items.
Since the driver may be used in memory reclaim path,
WQ_MEM_RECLAIM has been set to guarantee forward progress.
flush_workqueue is unnecessary since destroy_workqueue() itself calls
drain_workqueue() which flushes repeatedly till the workqueue
becomes empty. Hence the call to flush_workqueue() has been dropped.
Signed-off-by: Bhaktipriya Shridhar <bhaktipriya96@gmail.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Masaru Nagai [Wed, 1 Jun 2016 16:41:05 +0000 (01:41 +0900)]
ravb: Remove manual pause frame transmit
Writing a non-zero value to the manual PAUSE frame register (MPR) starts
the transmission of a PAUSE frame.
A PAUSE frame is sent in ravb_emac_init(), but it is not expected behavior.
Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com>
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Peter Robinson [Wed, 1 Jun 2016 12:28:58 +0000 (13:28 +0100)]
stmmac: make platform drivers depend on their associated SoC
There's not much point, except compile test, enabling the stmmac
platform drivers unless their actual SoC is enabled. They're not
useful without it.
Signed-off-by: Peter Robinson <pbrobinson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Thu, 2 Jun 2016 00:48:46 +0000 (17:48 -0700)]
Merge branch 'macvlan-less-mcast-cloning'
Herbert Xu says:
====================
macvlan: Avoid unnecessary multicast cloning
This patch tries to improve macvlan multicast performance by
maintaining a filter hash at the macvlan_port level so that we
can quickly determine whether a given packet is needed or not.
It is preceded by a patch that fixes a potential use-after-free
bug that I discovered while looking over this.
v2 fixed a bug where promiscuous/allmulti settings weren't handled
correctly.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 1 Jun 2016 03:45:44 +0000 (11:45 +0800)]
macvlan: Avoid unnecessary multicast cloning
Currently we always queue a multicast packet for further processing,
even if none of the macvlan devices are subscribed to the address.
This patch optimises this by adding a global multicast filter for
a macvlan_port.
Note that this patch doesn't handle the broadcast addresses of the
individual macvlan devices correctly, if they are not all identical
to vlan->lowerdev. However, this is already broken because there
is no mechanism in place to update the individual multicast filters
when you change the broadcast address.
If someone cares enough they should fix this by collecting all
broadcast addresses for a macvlan as we do for multicast and unicast.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Wed, 1 Jun 2016 03:43:00 +0000 (11:43 +0800)]
macvlan: Fix potential use-after free for broadcasts
When we postpone a broadcast packet we save the source port in
the skb if it is local. However, the source port can disappear
before we get a chance to process the packet.
This patch fixes this by holding a ref count on the netdev.
It also delays the skb->cb modification until after we allocate
the new skb as you should not modify shared skbs.
Fixes: 412ca1550cbe ("macvlan: Move broadcasts into a work queue")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Dumazet [Tue, 31 May 2016 22:22:41 +0000 (15:22 -0700)]
udp: avoid csum_partial() for validated skb
In commit
e6afc8ace6dd5 ("udp: remove headers from UDP packets before
queueing"), udp_csum_pull_header() helper was added but missed fact
that CHECKSUM_UNNECESSARY packets were now converted to CHECKSUM_NONE
and skb->csum_valid was set to 1 for them.
Since csum_partial() is quite expensive, even for 8-byte area, it is
worth adding a test.
We also can use skb->data instead of udp_hdr() as we are pulling
UDP headers, as it is sightly faster.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Kazuya Mizuguchi [Sun, 29 May 2016 20:25:43 +0000 (05:25 +0900)]
ravb: Add SET_RUNTIME_PM_OPS macro
Use SET_RUNTIME_PM_OPS macro instead of assigning a member of
dev_pm_ops directly.
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Masaru Nagai [Tue, 31 May 2016 18:01:28 +0000 (03:01 +0900)]
ravb: Add ESF in RCR for enabling separation filter
The H/W manual recommends B'10 or B'11 in a value of the separation
filtering select bits in the receive configuration register (RCR.ESF).
When B'10 is set, frames from non-matching streams are discarded.
When B'11 is set, frames from non-matching streams are processed in
reception queue 0 (best effort).
This patch sets B'11 in ESF.
[ykaneko0929@gmail.com: revised this commit message]
Signed-off-by: Masaru Nagai <masaru.nagai.vx@renesas.com>
Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
Signed-off-by: Yoshihiro Kaneko <ykaneko0929@gmail.com>
Acked-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Wed, 1 Jun 2016 05:28:28 +0000 (22:28 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Fix negative error code usage in ATM layer, from Stefan Hajnoczi.
2) If CONFIG_SYSCTL is disabled, the default TTL is not initialized
properly. From Ezequiel Garcia.
3) Missing spinlock init in mvneta driver, from Gregory CLEMENT.
4) Missing unlocks in hwmb error paths, also from Gregory CLEMENT.
5) Fix deadlock on team->lock when propagating features, from Ivan
Vecera.
6) Work around buffer offset hw bug in alx chips, from Feng Tang.
7) Fix double listing of SCTP entries in sctp_diag dumps, from Xin
Long.
8) Various statistics bug fixes in mlx4 from Eric Dumazet.
9) Fix some randconfig build errors wrt fou ipv6 from Arnd Bergmann.
10) All of l2tp was namespace aware, but the ipv6 support code was not
doing so. From Shmulik Ladkani.
11) Handle on-stack hrtimers properly in pktgen, from Guenter Roeck.
12) Propagate MAC changes properly through VLAN devices, from Mike
Manning.
13) Fix memory leak in bnx2x_init_one(), from Vitaly Kuznetsov.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (62 commits)
sfc: Track RPS flow IDs per channel instead of per function
usbnet: smsc95xx: fix link detection for disabled autonegotiation
virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv
bnx2x: avoid leaking memory on bnx2x_init_one() failures
fou: fix IPv6 Kconfig options
openvswitch: update checksum in {push,pop}_mpls
sctp: sctp_diag should dump sctp socket type
net: fec: update dirty_tx even if no skb
vlan: Propagate MAC address to VLANs
atm: iphase: off by one in rx_pkt()
atm: firestream: add more reserved strings
vxlan: Accept user specified MTU value when create new vxlan link
net: pktgen: Call destroy_hrtimer_on_stack()
timer: Export destroy_hrtimer_on_stack()
net: l2tp: Make l2tp_ip6 namespace aware
Documentation: ip-sysctl.txt: clarify secure_redirects
sfc: use flow dissector helpers for aRFS
ieee802154: fix logic error in ieee802154_llsec_parse_dev_addr
net: nps_enet: Disable interrupts before napi reschedule
net/lapb: tuse %*ph to dump buffers
...
Linus Torvalds [Wed, 1 Jun 2016 05:20:56 +0000 (22:20 -0700)]
Merge git://git./linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:
"sparc64 mmu context allocation and trap return bug fixes"
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc64: Fix return from trap window fill crashes.
sparc: Harden signal return frame checks.
sparc64: Take ctx_alloc_lock properly in hugetlb_setup().
Jon Cooper [Tue, 31 May 2016 18:12:32 +0000 (19:12 +0100)]
sfc: Track RPS flow IDs per channel instead of per function
Otherwise we get confused when two flows on different channels get the
same flow ID.
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Christoph Fritz [Thu, 26 May 2016 02:06:47 +0000 (04:06 +0200)]
usbnet: smsc95xx: fix link detection for disabled autonegotiation
To detect link status up/down for connections where autonegotiation is
explicitly disabled, we don't get an irq but need to poll the status
register for link up/down detection.
This patch adds a workqueue to poll for link status.
Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wangyunjian [Tue, 31 May 2016 03:52:43 +0000 (11:52 +0800)]
virtio_net: fix virtnet_open and virtnet_probe competing for try_fill_recv
In function virtnet_open() and virtnet_probe(), func try_fill_recv() may
be executed at the same time. VQ in virtqueue_add() has not been protected
well and BUG_ON will be triggered when virito_net.ko being removed.
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Vitaly Kuznetsov [Mon, 30 May 2016 13:00:54 +0000 (15:00 +0200)]
bnx2x: avoid leaking memory on bnx2x_init_one() failures
bnx2x_init_bp() allocates memory with bnx2x_alloc_mem_bp() so if we
fail later in bnx2x_init_one() we need to free this memory
with bnx2x_free_mem_bp() to avoid leakages. E.g. I'm observing memory
leaks reported by kmemleak when a failure (unrelated) happens in
bnx2x_vfpf_acquire().
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Acked-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Tue, 31 May 2016 20:42:11 +0000 (22:42 +0200)]
fou: fix IPv6 Kconfig options
The Kconfig options I added to work around broken compilation ended
up screwing up things more, as I used the wrong symbol to control
compilation of the file, resulting in IPv6 fou support to never be built
into the kernel.
Changing CONFIG_NET_FOU_IPV6_TUNNELS to CONFIG_IPV6_FOU fixes that
problem, I had renamed the symbol in one location but not the other,
and as the file is never being used by other kernel code, this did not
lead to a build failure that I would have caught.
After that fix, another issue with the same patch becomes obvious, as we
'select INET6_TUNNEL', which is related to IPV6_TUNNEL, but not the same,
and this can still cause the original build failure when IPV6_TUNNEL is
not built-in but IPV6_FOU is. The fix is equally trivial, we just need
to select the right symbol.
I have successfully build 350 randconfig kernels with this patch
and verified that the driver is now being built.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reported-by: Valentin Rothberg <valentinrothberg@gmail.com>
Fixes: fabb13db448e ("fou: add Kconfig options for IPv6 support")
Signed-off-by: David S. Miller <davem@davemloft.net>
Simon Horman [Mon, 30 May 2016 05:04:25 +0000 (14:04 +0900)]
openvswitch: update checksum in {push,pop}_mpls
In the case of CHECKSUM_COMPLETE the skb checksum should be updated in
{push,pop}_mpls() as they the type in the ethernet header.
As suggested by Pravin Shelar.
Cc: Pravin Shelar <pshelar@nicira.com>
Fixes: 25cd9ba0abc0 ("openvswitch: Add basic MPLS support to kernel")
Signed-off-by: Simon Horman <simon.horman@netronome.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long [Sun, 29 May 2016 09:42:13 +0000 (17:42 +0800)]
sctp: sctp_diag should dump sctp socket type
Now we cannot distinguish that one sk is a udp or sctp style when
we use ss to dump sctp_info. it's necessary to dump it as well.
For sctp_diag, ss support is not officially available, thus there
are no official users of this yet, so we can add this field in the
middle of sctp_info without breaking user API.
v1->v2:
- move 'sctpi_s_type' field to the end of struct sctp_info, so
that it won't cause incompatibility with applications already
built.
- add __reserved3 in sctp_info to make sure sctp_info is 8-byte
alignment.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Troy Kisky [Fri, 27 May 2016 20:30:40 +0000 (13:30 -0700)]
net: fec: update dirty_tx even if no skb
If dirty_tx isn't updated, then dma_unmap_single
can be called twice.
This fixes a
[ 58.420980] ------------[ cut here ]------------
[ 58.425667] WARNING: CPU: 0 PID: 377 at /home/schurig/d/mkarm/linux-4.5/lib/dma-debug.c:1096 check_unmap+0x9d0/0xab8()
[ 58.436405] fec
2188000.ethernet: DMA-API: device driver tries to free DMA memory it has not allocated [device address=0x0000000000000000] [size=66 bytes]
encountered by Holger
Signed-off-by: Troy Kisky <troy.kisky@boundarydevices.com>
Tested-by: <holgerschurig@gmail.com>
Acked-by: Fugang Duan <fugang.duan@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Mike Manning [Fri, 27 May 2016 16:45:07 +0000 (17:45 +0100)]
vlan: Propagate MAC address to VLANs
The MAC address of the physical interface is only copied to the VLAN
when it is first created, resulting in an inconsistency after MAC
address changes of only newly created VLANs having an up-to-date MAC.
The VLANs should continue inheriting the MAC address of the physical
interface until the VLAN MAC address is explicitly set to any value.
This allows IPv6 EUI64 addresses for the VLAN to reflect any changes
to the MAC of the physical interface and thus for DAD to behave as
expected.
Signed-off-by: Mike Manning <mmanning@brocade.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 27 May 2016 10:34:35 +0000 (13:34 +0300)]
atm: iphase: off by one in rx_pkt()
The iadev->rx_open[] array holds "iadev->num_vc" pointers (this code
assumes that pointers are 32 bits). So the > here should be >= or else
we could end up reading a garbage pointer from one element beyond the
end of the array.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Fri, 27 May 2016 10:33:50 +0000 (13:33 +0300)]
atm: firestream: add more reserved strings
This bug was there when the driver was first added in back in year 2000.
It causes a Smatch warning:
drivers/atm/firestream.c:849 process_incoming()
error: buffer overflow 'res_strings' 60 <= 63
There are supposed to be 64 entries in this array and the missing
strings are clearly in the 30 40 range. I added them as reserved 37 to
reserved 40. It's possible that strings are really supposed to be added
in the middle instead of at the end, but this approach is safe, in that
it fixes the bug and doesn't break anything that wasn't already broken.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Chen Haiquan [Fri, 27 May 2016 02:49:11 +0000 (10:49 +0800)]
vxlan: Accept user specified MTU value when create new vxlan link
When create a new vxlan link, example:
ip link add vtap mtu 1440 type vxlan vni 1 dev eth0
The argument "mtu" has no effect, because it is not set to conf->mtu. The
default value is used in vxlan_dev_configure function.
This problem was introduced by commit
0dfbdf4102b9 (vxlan: Factor out device
configuration).
Fixes: 0dfbdf4102b9 (vxlan: Factor out device configuration)
Signed-off-by: Chen Haiquan <oc@yunify.com>
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Fri, 27 May 2016 00:21:06 +0000 (17:21 -0700)]
net: pktgen: Call destroy_hrtimer_on_stack()
If CONFIG_DEBUG_OBJECTS_TIMERS=y, hrtimer_init_on_stack() requires
a matching call to destroy_hrtimer_on_stack() to clean up timer
debug objects.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Guenter Roeck [Fri, 27 May 2016 00:21:05 +0000 (17:21 -0700)]
timer: Export destroy_hrtimer_on_stack()
hrtimer_init_on_stack() needs a matching call to
destroy_hrtimer_on_stack(), so both need to be exported.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Tue, 31 May 2016 16:43:24 +0000 (09:43 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
"Three bugs fixes and an update for the default configuration"
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: fix info leak in do_sigsegv
s390/config: update default configuration
s390/bpf: fix recache skb->data/hlen for skb_vlan_push/pop
s390/bpf: reduce maximum program size to 64 KB
Linus Torvalds [Tue, 31 May 2016 16:27:00 +0000 (09:27 -0700)]
Merge tag 'gpio-v4.7-2' of git://git./linux/kernel/git/linusw/linux-gpio
Pull GPIO fixes from Linus Walleij:
"A bunch of GPIO fixes for the v4.7 series:
- Drop the lock before reading out the GPIO direction setting in
drivers supporting the .get_direction() callback: some of them may
be slowpath.
- Flush GPIO direction setting before locking a GPIO as an IRQ: some
electronics or other poking around in the registers behind our back
may have happened, so flush the direction status before trying to
lock the line for use by IRQs.
- Bail out silently when asked to perform operations on NULL GPIO
descriptors. That is what all the get_*_optional() is about: we
get optional GPIO handles, if they are not there, we get NULL.
- Handle compatible ioctl() correctly: we need to convert the ioctl()
pointer using compat_ptr() here like everyone else.
- Disable the broken .to_irq() on the LPC32xx platform. The whole
irqchip infrastructure was replaced in the last merge window, and a
new implementation will be needed"
* tag 'gpio-v4.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
gpio: drop lock before reading GPIO direction
gpio: bail out silently on NULL descriptors
gpio: handle compatible ioctl() pointers
gpio: flush direction status in gpiochip_lock_as_irq()
gpio: lpc32xx: disable broken to_irq support
Linus Torvalds [Mon, 30 May 2016 22:27:07 +0000 (15:27 -0700)]
Merge branch 'uuid' (lib/uuid fixes from Andy)
Merge lib/uuid fixes from Andy Shevchenko.
* emailed patches from Andy Shevchenko <andriy.shevchenko@linux.intel.com>:
lib/uuid.c: use correct offset in uuid parser
lib/uuid: add a test module
Bjørn Mork [Mon, 30 May 2016 14:40:42 +0000 (17:40 +0300)]
lib/uuid.c: use correct offset in uuid parser
Use '+ 0' and '+ 1' as offsets, like they were intended, instead of
adding to the result.
Fixes: 2b1b0d66704a ("lib/uuid.c: introduce a few more generic helpers")
Signed-off-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andy Shevchenko [Mon, 30 May 2016 14:40:41 +0000 (17:40 +0300)]
lib/uuid: add a test module
It appears that somehow I missed a test of the latest UUID rework which
landed in the kernel. Present a small test module to avoid such cases
in the future.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Mon, 30 May 2016 22:20:18 +0000 (15:20 -0700)]
Merge branch 'linus' of git://git./linux/kernel/git/herbert/crypto-2.6
Pull crypto fixes from Herbert Xu:
"This fixes the following issues:
- missing selection in public_key that may result in a build failure
- Potential crash in error path in omap-sham
- ccp AES XTS bug that affects requests larger than 4096"
* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: ccp - Fix AES XTS error for request sizes above 4096
crypto: public_key: select CRYPTO_AKCIPHER
crypto: omap-sham - potential Oops on error in probe
Linus Walleij [Mon, 30 May 2016 15:11:59 +0000 (17:11 +0200)]
gpio: drop lock before reading GPIO direction
When adding the gpiochip, the GPIO HW drivers' callback get_direction()
could get called in atomic context. Some of the GPIO HW drivers may
sleep when accessing the register.
Move the lock before initializing the descriptors.
Reported-by: Laxman Dewangan <ldewangan@nvidia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Mon, 30 May 2016 14:48:39 +0000 (16:48 +0200)]
gpio: bail out silently on NULL descriptors
In
fdeb8e1547cb9dd39d5d7223b33f3565cf86c28e
("gpio: reflect base and ngpio into gpio_device")
assumed that GPIO descriptors are either valid or error
pointers, but gpiod_get_[index_]optional() actually return
NULL descriptors and then all subsequent calls should just
bail out.
Cc: stable@vger.kernel.org
Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Andrew Lunn <andrew@lunn.ch>
Fixes: fdeb8e1547cb ("gpio: reflect base and ngpio into gpio_device")
Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Fri, 27 May 2016 12:24:04 +0000 (14:24 +0200)]
gpio: handle compatible ioctl() pointers
If we're using the compatible ioctl() we need to handle the
argument pointer in a special way or there will be trouble.
Fixes: 3c702e9987e2 ("gpio: add a userspace chardev ABI for GPIOs")
Reported-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Reviewed-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Linus Walleij [Wed, 25 May 2016 08:56:03 +0000 (10:56 +0200)]
gpio: flush direction status in gpiochip_lock_as_irq()
As irqchip and gpiochip functions are orthogonal, the IRQ
set-up or something else can have changed the direction of
the GPIO line from what the GPIO descriptor knows when we
get into gpiochip_lock_as_irq(). Make sure to re-read the
direction setting if we have the .get_direction() callback
enabled for the chip.
Else we get problems like this:
iio iio:device2: interrupts on the rising edge
gpio gpiochip2: (
8012e080.gpio): gpiochip_lock_as_irq:
tried to flag a GPIO set as output for IRQ
gpio gpiochip2: (
8012e080.gpio): unable to lock HW IRQ 0 for IRQ
genirq: Failed to request resources for l3g4200d-trigger
(irq 111) on irqchip nmk1-32-63
iio iio:device2: failed to request trigger IRQ.
st-gyro-i2c: probe of 2-0068 failed with error -22
Fixes: 72d320006177 ("gpio: set up initial state from .get_direction()")
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Sylvain Lemieux [Wed, 11 May 2016 17:40:00 +0000 (13:40 -0400)]
gpio: lpc32xx: disable broken to_irq support
The "to_irq" functionality is broken inside this driver since commit
76ba59f8366f ("genirq: Add irq_domain-aware core IRQ handler").
The addition of the new lpc32xx irqchip driver in 4.7, fixed the
lpc32xx platform interrupt issue.
When switching to the new lpc32xx irqchip driver, a warning appear
in the lpc32xx gpio driver: warning: "NR_IRQS" redefined.
To remove this warning (temporary solution), this patch
disables the broken "to_irq" mapping functionality support.
Signed-off-by: Sylvain Lemieux <slemieux@tycoint.com>
Acked-by: Vladimir Zapolskiy <vz@mleia.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Shmulik Ladkani [Thu, 26 May 2016 17:16:36 +0000 (20:16 +0300)]
net: l2tp: Make l2tp_ip6 namespace aware
l2tp_ip6 tunnel and session lookups were still using init_net, although
the l2tp core infrastructure already supports lookups keyed by 'net'.
As a result, l2tp_ip6_recv discarded packets for tunnels/sessions
created in namespaces other than the init_net.
Fix, by using dev_net(skb->dev) or sock_net(sk) where appropriate.
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Eric Garver [Thu, 26 May 2016 16:28:05 +0000 (12:28 -0400)]
Documentation: ip-sysctl.txt: clarify secure_redirects
Clarify how secure_redirects works. Mention that RFC1122 always applies.
Signed-off-by: Eric Garver <e@erig.me>
Signed-off-by: David S. Miller <davem@davemloft.net>
Edward Cree [Thu, 26 May 2016 20:46:05 +0000 (21:46 +0100)]
sfc: use flow dissector helpers for aRFS
Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Baozeng Ding [Thu, 26 May 2016 13:07:42 +0000 (21:07 +0800)]
ieee802154: fix logic error in ieee802154_llsec_parse_dev_addr
Fix a logic error to avoid potential null pointer dereference.
Signed-off-by: Baozeng Ding <sploving1@gmail.com>
Reviewed-by: Stefan Schmidt<stefan@osg.samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Elad Kanfi [Thu, 26 May 2016 12:00:06 +0000 (15:00 +0300)]
net: nps_enet: Disable interrupts before napi reschedule
Since NAPI works by shutting down event interrupts when theres
work and turning them on when theres none, the net driver must
make sure that interrupts are disabled when it reschedules polling.
By calling napi_reschedule, the driver switches to polling mode,
therefor there should be no interrupt interference.
Any received packets will be handled in nps_enet_poll by polling the HW
indication of received packet until all packets are handled.
Signed-off-by: Elad Kanfi <eladkan@mellanox.com>
Acked-by: Noam Camus <noamca@mellanox.com>
Tested-by: Alexey Brodkin <abrodkin@synopsys.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Andy Shevchenko [Thu, 26 May 2016 11:43:52 +0000 (14:43 +0300)]
net/lapb: tuse %*ph to dump buffers
Use %*ph specifier to dump small buffers in hex format instead doing this
byte-by-byte.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Carpenter [Thu, 26 May 2016 06:46:22 +0000 (09:46 +0300)]
ptp: oops in ptp_ioctl()
If we pass ERR_PTR(-EFAULT) to kfree() then it's going to oops.
Fixes: 2ece068e1b1d ('ptp: use memdup_user().')
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 25 May 2016 14:50:46 +0000 (16:50 +0200)]
fou: add Kconfig options for IPv6 support
A previous patch added the fou6.ko module, but that failed to link
in a couple of configurations:
net/built-in.o: In function `ip6_tnl_encap_add_fou_ops':
net/ipv6/fou6.c:88: undefined reference to `ip6_tnl_encap_add_ops'
net/ipv6/fou6.c:94: undefined reference to `ip6_tnl_encap_add_ops'
net/ipv6/fou6.c:97: undefined reference to `ip6_tnl_encap_del_ops'
net/built-in.o: In function `ip6_tnl_encap_del_fou_ops':
net/ipv6/fou6.c:106: undefined reference to `ip6_tnl_encap_del_ops'
net/ipv6/fou6.c:107: undefined reference to `ip6_tnl_encap_del_ops'
If CONFIG_IPV6=m, ip6_tnl_encap_add_ops/ip6_tnl_encap_del_ops
are in a module, but fou6.c can still be built-in, and that
obviously fails to link.
Also, if CONFIG_IPV6=y, but CONFIG_IPV6_TUNNEL=m or
CONFIG_IPV6_TUNNEL=n, the same problem happens for a different
reason.
This adds two new silent Kconfig symbols to work around both
problems:
- CONFIG_IPV6_FOU is now always set to 'm' if either CONFIG_NET_FOU=m
or CONFIG_IPV6=m
- CONFIG_IPV6_FOU_TUNNEL is set implicitly when IPV6_FOU is enabled
and NET_FOU_IP_TUNNELS is also turned out, and it will ensure
that CONFIG_IPV6_TUNNEL is also available.
The options could be made user-visible as well, to give additional
room for configuration, but it seems easier not to bother users
with more choice here.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: aa3463d65e7b ("fou: Add encap ops for IPv6 tunnels")
Signed-off-by: David S. Miller <davem@davemloft.net>
Arnd Bergmann [Wed, 25 May 2016 14:50:45 +0000 (16:50 +0200)]
ipv6: hide ip6_encap_hlen/ip6_tnl_encap definitions
A recent cleanup moved MAX_IPTUN_ENCAP_OPS along with some other
definitions, but it is now invisible when CONFIG_INET is
not defined, but still referenced from ip6_tunnel.h:
In file included from net/xfrm/xfrm_input.c:17:0:
include/net/ip6_tunnel.h:67:17: error: 'MAX_IPTUN_ENCAP_OPS' undeclared here (not in a function)
ip6tun_encaps[MAX_IPTUN_ENCAP_OPS];
^~~~~~~~~~~~~~~~~~~
This hides the ip6_encap_hlen and ip6_tnl_encap functions inside
of CONFIG_INET so we don't run into the the problem.
Alternatively we could move the macro out of the #ifdef again to
restore the previous behavior
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Fixes: 55c2bc143224 ("net: Cleanup encap items in ip_tunnels.h")
Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller [Sun, 29 May 2016 03:41:12 +0000 (20:41 -0700)]
sparc64: Fix return from trap window fill crashes.
We must handle data access exception as well as memory address unaligned
exceptions from return from trap window fill faults, not just normal
TLB misses.
Otherwise we can get an OOPS that looks like this:
ld-linux.so.2(36808): Kernel bad sw trap 5 [#1]
CPU: 1 PID: 36808 Comm: ld-linux.so.2 Not tainted 4.6.0 #34
task:
fff8000303be5c60 ti:
fff8000301344000 task.ti:
fff8000301344000
TSTATE:
0000004410001601 TPC:
0000000000a1a784 TNPC:
0000000000a1a788 Y:
00000002 Not tainted
TPC: <do_sparc64_fault+0x5c4/0x700>
g0:
fff8000024fc8248 g1:
0000000000db04dc g2:
0000000000000000 g3:
0000000000000001
g4:
fff8000303be5c60 g5:
fff800030e672000 g6:
fff8000301344000 g7:
0000000000000001
o0:
0000000000b95ee8 o1:
000000000000012b o2:
0000000000000000 o3:
0000000200b9b358
o4:
0000000000000000 o5:
fff8000301344040 sp:
fff80003013475c1 ret_pc:
0000000000a1a77c
RPC: <do_sparc64_fault+0x5bc/0x700>
l0:
00000000000007ff l1:
0000000000000000 l2:
000000000000005f l3:
0000000000000000
l4:
fff8000301347e98 l5:
fff8000024ff3060 l6:
0000000000000000 l7:
0000000000000000
i0:
fff8000301347f60 i1:
0000000000102400 i2:
0000000000000000 i3:
0000000000000000
i4:
0000000000000000 i5:
0000000000000000 i6:
fff80003013476a1 i7:
0000000000404d4c
I7: <user_rtt_fill_fixup+0x6c/0x7c>
Call Trace:
[
0000000000404d4c] user_rtt_fill_fixup+0x6c/0x7c
The window trap handlers are slightly clever, the trap table entries for them are
composed of two pieces of code. First comes the code that actually performs
the window fill or spill trap handling, and then there are three instructions at
the end which are for exception processing.
The userland register window fill handler is:
add %sp, STACK_BIAS + 0x00, %g1; \
ldxa [%g1 + %g0] ASI, %l0; \
mov 0x08, %g2; \
mov 0x10, %g3; \
ldxa [%g1 + %g2] ASI, %l1; \
mov 0x18, %g5; \
ldxa [%g1 + %g3] ASI, %l2; \
ldxa [%g1 + %g5] ASI, %l3; \
add %g1, 0x20, %g1; \
ldxa [%g1 + %g0] ASI, %l4; \
ldxa [%g1 + %g2] ASI, %l5; \
ldxa [%g1 + %g3] ASI, %l6; \
ldxa [%g1 + %g5] ASI, %l7; \
add %g1, 0x20, %g1; \
ldxa [%g1 + %g0] ASI, %i0; \
ldxa [%g1 + %g2] ASI, %i1; \
ldxa [%g1 + %g3] ASI, %i2; \
ldxa [%g1 + %g5] ASI, %i3; \
add %g1, 0x20, %g1; \
ldxa [%g1 + %g0] ASI, %i4; \
ldxa [%g1 + %g2] ASI, %i5; \
ldxa [%g1 + %g3] ASI, %i6; \
ldxa [%g1 + %g5] ASI, %i7; \
restored; \
retry; nop; nop; nop; nop; \
b,a,pt %xcc, fill_fixup_dax; \
b,a,pt %xcc, fill_fixup_mna; \
b,a,pt %xcc, fill_fixup;
And the way this works is that if any of those memory accesses
generate an exception, the exception handler can revector to one of
those final three branch instructions depending upon which kind of
exception the memory access took. In this way, the fault handler
doesn't have to know if it was a spill or a fill that it's handling
the fault for. It just always branches to the last instruction in
the parent trap's handler.
For example, for a regular fault, the code goes:
winfix_trampoline:
rdpr %tpc, %g3
or %g3, 0x7c, %g3
wrpr %g3, %tnpc
done
All window trap handlers are 0x80 aligned, so if we "or" 0x7c into the
trap time program counter, we'll get that final instruction in the
trap handler.
On return from trap, we have to pull the register window in but we do
this by hand instead of just executing a "restore" instruction for
several reasons. The largest being that from Niagara and onward we
simply don't have enough levels in the trap stack to fully resolve all
possible exception cases of a window fault when we are already at
trap level 1 (which we enter to get ready to return from the original
trap).
This is executed inline via the FILL_*_RTRAP handlers. rtrap_64.S's
code branches directly to these to do the window fill by hand if
necessary. Now if you look at them, we'll see at the end:
ba,a,pt %xcc, user_rtt_fill_fixup;
ba,a,pt %xcc, user_rtt_fill_fixup;
ba,a,pt %xcc, user_rtt_fill_fixup;
And oops, all three cases are handled like a fault.
This doesn't work because each of these trap types (data access
exception, memory address unaligned, and faults) store their auxiliary
info in different registers to pass on to the C handler which does the
real work.
So in the case where the stack was unaligned, the unaligned trap
handler sets up the arg registers one way, and then we branched to
the fault handler which expects them setup another way.
So the FAULT_TYPE_* value ends up basically being garbage, and
randomly would generate the backtrace seen above.
Reported-by: Nick Alcock <nix@esperi.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 29 May 2016 20:28:39 +0000 (13:28 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi
Pull SCSI fixes from James Bottomley:
"This is a set of four fixes noticed in the merge window. The aacraid
one is an optimisation, the mp3sas one fixes a spurious printk, the
sd_check_events one fixes a theoretical race and the failed zero
length commands fixes a bug in our completion/retry routines that has
been causing problems in the field"
* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
aacraid: do not activate events on non-SRC adapters
mpt3sas: add missing curly braces
sd: get disk reference in sd_check_events()
scsi_lib: correctly retry failed zero length REQ_TYPE_FS commands
David S. Miller [Sun, 29 May 2016 04:21:31 +0000 (21:21 -0700)]
sparc: Harden signal return frame checks.
All signal frames must be at least 16-byte aligned, because that is
the alignment we explicitly create when we build signal return stack
frames.
All stack pointers must be at least 8-byte aligned.
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Sun, 29 May 2016 16:29:24 +0000 (09:29 -0700)]
Linux 4.7-rc1
George Spelvin [Sun, 29 May 2016 12:05:56 +0000 (08:05 -0400)]
hash_string: Fix zero-length case for !DCACHE_WORD_ACCESS
The self-test was updated to cover zero-length strings; the function
needs to be updated, too.
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Fixes: fcfd2fbf22d2 ("fs/namei.c: Add hashlen_string() function")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
George Spelvin [Sun, 29 May 2016 05:26:41 +0000 (01:26 -0400)]
Rename other copy of hash_string to hashlen_string
The original name was simply hash_string(), but that conflicted with a
function with that name in drivers/base/power/trace.c, and I decided
that calling it "hashlen_" was better anyway.
But you have to do it in two places.
[ This caused build errors for architectures that don't define
CONFIG_DCACHE_WORD_ACCESS - Linus ]
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: fcfd2fbf22d2 ("fs/namei.c: Add hashlen_string() function")
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Tue, 24 May 2016 20:49:18 +0000 (22:49 +0200)]
hpfs: implement the show_options method
The HPFS filesystem used generic_show_options to produce string that is
displayed in /proc/mounts. However, there is a problem that the options
may disappear after remount. If we mount the filesystem with option1
and then remount it with option2, /proc/mounts should show both option1
and option2, however it only shows option2 because the whole option
string is replaced with replace_mount_options in hpfs_remount_fs.
To fix this bug, implement the hpfs_show_options function that prints
options that are currently selected.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Tue, 24 May 2016 20:48:33 +0000 (22:48 +0200)]
affs: fix remount failure when there are no options changed
Commit
c8f33d0bec99 ("affs: kstrdup() memory handling") checks if the
kstrdup function returns NULL due to out-of-memory condition.
However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL. In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists. The mount syscall then fails with
ENOMEM.
This patch fixes the bug. We fail with ENOMEM only if data is non-NULL.
The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).
Fixes: c8f33d0bec99 ("affs: kstrdup() memory handling")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Tue, 24 May 2016 20:47:00 +0000 (22:47 +0200)]
hpfs: fix remount failure when there are no options changed
Commit
ce657611baf9 ("hpfs: kstrdup() out of memory handling") checks if
the kstrdup function returns NULL due to out-of-memory condition.
However, if we are remounting a filesystem with no change to
filesystem-specific options, the parameter data is NULL. In this case,
kstrdup returns NULL (because it was passed NULL parameter), although no
out of memory condition exists. The mount syscall then fails with
ENOMEM.
This patch fixes the bug. We fail with ENOMEM only if data is non-NULL.
The patch also changes the call to replace_mount_options - if we didn't
pass any filesystem-specific options, we don't call
replace_mount_options (thus we don't erase existing reported options).
Fixes: ce657611baf9 ("hpfs: kstrdup() out of memory handling")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 28 May 2016 23:41:39 +0000 (16:41 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus
Pull more MIPS updates from Ralf Baechle:
"This is the secondnd batch of MIPS patches for 4.7. Summary:
CPS:
- Copy EVA configuration when starting secondary VPs.
EIC:
- Clear Status IPL.
Lasat:
- Fix a few off by one bugs.
lib:
- Mark intrinsics notrace. Not only are the intrinsics
uninteresting, it would cause infinite recursion.
MAINTAINERS:
- Add file patterns for MIPS BRCM device tree bindings.
- Add file patterns for mips device tree bindings.
MT7628:
- Fix MT7628 pinmux typos.
- wled_an pinmux gpio.
- EPHY LEDs pinmux support.
Pistachio:
- Enable KASLR
VDSO:
- Build microMIPS VDSO for microMIPS kernels.
- Fix aliasing warning by building with `-fno-strict-aliasing' for
debugging but also tracing them might result in recursion.
Misc:
- Add missing FROZEN hotplug notifier transitions.
- Fix clk binding example for varioius PIC32 devices.
- Fix cpu interrupt controller node-names in the DT files.
- Fix XPA CPU feature separation.
- Fix write_gc0_* macros when writing zero.
- Add inline asm encoding helpers.
- Add missing VZ accessor microMIPS encodings.
- Fix little endian microMIPS MSA encodings.
- Add 64-bit HTW fields and fix its configuration.
- Fix sigreturn via VDSO on microMIPS kernel.
- Lots of typo fixes.
- Add definitions of SegCtl registers and use them"
* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus: (49 commits)
MIPS: Add missing FROZEN hotplug notifier transitions
MIPS: Build microMIPS VDSO for microMIPS kernels
MIPS: Fix sigreturn via VDSO on microMIPS kernel
MIPS: devicetree: fix cpu interrupt controller node-names
MIPS: VDSO: Build with `-fno-strict-aliasing'
MIPS: Pistachio: Enable KASLR
MIPS: lib: Mark intrinsics notrace
MIPS: Fix 64-bit HTW configuration
MIPS: Add 64-bit HTW fields
MAINTAINERS: Add file patterns for mips device tree bindings
MAINTAINERS: Add file patterns for mips brcm device tree bindings
MIPS: Simplify DSP instruction encoding macros
MIPS: Add missing tlbinvf/XPA microMIPS encodings
MIPS: Fix little endian microMIPS MSA encodings
MIPS: Add missing VZ accessor microMIPS encodings
MIPS: Add inline asm encoding helpers
MIPS: Spelling fix lets -> let's
MIPS: VR41xx: Fix typo
MIPS: oprofile: Fix typo
MIPS: math-emu: Fix typo
...
Guenter Roeck [Sat, 28 May 2016 22:26:02 +0000 (15:26 -0700)]
fs: fix binfmt_aout.c build error
Various builds (such as i386:allmodconfig) fail with
fs/binfmt_aout.c:133:2: error: expected identifier or '(' before 'return'
fs/binfmt_aout.c:134:1: error: expected identifier or '(' before '}' token
[ Oops. My bad, I had stupidly thought that "allmodconfig" covered this
on x86-64 too, but it obviously doesn't. Egg on my face. - Linus ]
Fixes: 5d22fc25d4fc ("mm: remove more IS_ERR_VALUE abuses")
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Sat, 28 May 2016 23:15:25 +0000 (16:15 -0700)]
Merge branch 'hash' of git://ftp.sciencehorizons.net/linux
Pull string hash improvements from George Spelvin:
"This series does several related things:
- Makes the dcache hash (fs/namei.c) useful for general kernel use.
(Thanks to Bruce for noticing the zero-length corner case)
- Converts the string hashes in <linux/sunrpc/svcauth.h> to use the
above.
- Avoids 64-bit multiplies in hash_64() on 32-bit platforms. Two
32-bit multiplies will do well enough.
- Rids the world of the bad hash multipliers in hash_32.
This finishes the job started in commit
689de1d6ca95 ("Minimal
fix-up of bad hashing behavior of hash_64()")
The vast majority of Linux architectures have hardware support for
32x32-bit multiply and so derive no benefit from "simplified"
multipliers.
The few processors that do not (68000, h8/300 and some models of
Microblaze) have arch-specific implementations added. Those
patches are last in the series.
- Overhauls the dcache hash mixing.
The patch in commit
0fed3ac866ea ("namei: Improve hash mixing if
CONFIG_DCACHE_WORD_ACCESS") was an off-the-cuff suggestion.
Replaced with a much more careful design that's simultaneously
faster and better. (My own invention, as there was noting suitable
in the literature I could find. Comments welcome!)
- Modify the hash_name() loop to skip the initial HASH_MIX(). This
would let us salt the hash if we ever wanted to.
- Sort out partial_name_hash().
The hash function is declared as using a long state, even though
it's truncated to 32 bits at the end and the extra internal state
contributes nothing to the result. And some callers do odd things:
- fs/hfs/string.c only allocates 32 bits of state
- fs/hfsplus/unicode.c uses it to hash 16-bit unicode symbols not bytes
- Modify bytemask_from_count to handle inputs of 1..sizeof(long)
rather than 0..sizeof(long)-1. This would simplify users other
than full_name_hash"
Special thanks to Bruce Fields for testing and finding bugs in v1. (I
learned some humbling lessons about "obviously correct" code.)
On the arch-specific front, the m68k assembly has been tested in a
standalone test harness, I've been in contact with the Microblaze
maintainers who mostly don't care, as the hardware multiplier is never
omitted in real-world applications, and I haven't heard anything from
the H8/300 world"
* 'hash' of git://ftp.sciencehorizons.net/linux:
h8300: Add <asm/hash.h>
microblaze: Add <asm/hash.h>
m68k: Add <asm/hash.h>
<linux/hash.h>: Add support for architecture-specific functions
fs/namei.c: Improve dcache hash function
Eliminate bad hash multipliers from hash_32() and hash_64()
Change hash_64() return value to 32 bits
<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
fs/namei.c: Add hashlen_string() function
Pull out string hash to <linux/stringhash.h>
George Spelvin [Wed, 25 May 2016 18:19:49 +0000 (14:19 -0400)]
h8300: Add <asm/hash.h>
This will improve the performance of hash_32() and hash_64(), but due
to complete lack of multi-bit shift instructions on H8, performance will
still be bad in surrounding code.
Designing H8-specific hash algorithms to work around that is a separate
project. (But if the maintainers would like to get in touch...)
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
George Spelvin [Wed, 25 May 2016 15:06:09 +0000 (11:06 -0400)]
microblaze: Add <asm/hash.h>
Microblaze is an FPGA soft core that can be configured various ways.
If it is configured without a multiplier, the standard __hash_32()
will require a call to __mulsi3, which is a slow software loop.
Instead, use a shift-and-add sequence for the constant multiply.
GCC knows how to do this, but it's not as clever as some.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
George Spelvin [Thu, 26 May 2016 15:36:19 +0000 (11:36 -0400)]
m68k: Add <asm/hash.h>
This provides a multiply by constant GOLDEN_RATIO_32 = 0x61C88647
for the original mc68000, which lacks a 32x32-bit multiply instruction.
Yes, the amount of optimization effort put in is excessive. :-)
Shift-add chain found by Yevgen Voronenko's Hcub algorithm at
http://spiral.ece.cmu.edu/mcm/gen.html
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
George Spelvin [Fri, 27 May 2016 02:11:51 +0000 (22:11 -0400)]
<linux/hash.h>: Add support for architecture-specific functions
This is just the infrastructure; there are no users yet.
This is modelled on CONFIG_ARCH_RANDOM; a CONFIG_ symbol declares
the existence of <asm/hash.h>.
That file may define its own versions of various functions, and define
HAVE_* symbols (no CONFIG_ prefix!) to suppress the generic ones.
Included is a self-test (in lib/test_hash.c) that verifies the basics.
It is NOT in general required that the arch-specific functions compute
the same thing as the generic, but if a HAVE_* symbol is defined with
the value 1, then equality is tested.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Philippe De Muyter <phdm@macq.eu>
Cc: linux-m68k@lists.linux-m68k.org
Cc: Alistair Francis <alistai@xilinx.com>
Cc: Michal Simek <michal.simek@xilinx.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: uclinux-h8-devel@lists.sourceforge.jp
George Spelvin [Mon, 23 May 2016 11:43:58 +0000 (07:43 -0400)]
fs/namei.c: Improve dcache hash function
Patch
0fed3ac866 improved the hash mixing, but the function is slower
than necessary; there's a 7-instruction dependency chain (10 on x86)
each loop iteration.
Word-at-a-time access is a very tight loop (which is good, because
link_path_walk() is one of the hottest code paths in the entire kernel),
and the hash mixing function must not have a longer latency to avoid
slowing it down.
There do not appear to be any published fast hash functions that:
1) Operate on the input a word at a time, and
2) Don't need to know the length of the input beforehand, and
3) Have a single iterated mixing function, not needing conditional
branches or unrolling to distinguish different loop iterations.
One of the algorithms which comes closest is Yann Collet's xxHash, but
that's two dependent multiplies per word, which is too much.
The key insights in this design are:
1) Barring expensive ops like multiplies, to diffuse one input bit
across 64 bits of hash state takes at least log2(64) = 6 sequentially
dependent instructions. That is more cycles than we'd like.
2) An operation like "hash ^= hash << 13" requires a second temporary
register anyway, and on a 2-operand machine like x86, it's three
instructions.
3) A better use of a second register is to hold a two-word hash state.
With careful design, no temporaries are needed at all, so it doesn't
increase register pressure. And this gets rid of register copying
on 2-operand machines, so the code is smaller and faster.
4) Using two words of state weakens the requirement for one-round mixing;
we now have two rounds of mixing before cancellation is possible.
5) A two-word hash state also allows operations on both halves to be
done in parallel, so on a superscalar processor we get more mixing
in fewer cycles.
I ended up using a mixing function inspired by the ChaCha and Speck
round functions. It is 6 simple instructions and 3 cycles per iteration
(assuming multiply by 9 can be done by an "lea" instruction):
x ^= *input++;
y ^= x; x = ROL(x, K1);
x += y; y = ROL(y, K2);
y *= 9;
Not only is this reversible, two consecutive rounds are reversible:
if you are given the initial and final states, but not the intermediate
state, it is possible to compute both input words. This means that at
least 3 words of input are required to create a collision.
(It also has the property, used by hash_name() to avoid a branch, that
it hashes all-zero to all-zero.)
The rotate constants K1 and K2 were found by experiment. The search took
a sample of random initial states (I used 1023) and considered the effect
of flipping each of the 64 input bits on each of the 128 output bits two
rounds later. Each of the 8192 pairs can be considered a biased coin, and
adding up the Shannon entropy of all of them produces a score.
The best-scoring shifts also did well in other tests (flipping bits in y,
trying 3 or 4 rounds of mixing, flipping all 64*63/2 pairs of input bits),
so the choice was made with the additional constraint that the sum of the
shifts is odd and not too close to the word size.
The final state is then folded into a 32-bit hash value by a less carefully
optimized multiply-based scheme. This also has to be fast, as pathname
components tend to be short (the most common case is one iteration!), but
there's some room for latency, as there is a fair bit of intervening logic
before the hash value is used for anything.
(Performance verified with "bonnie++ -s 0 -n 1536:-2" on tmpfs. I need
a better benchmark; the numbers seem to show a slight dip in performance
between 4.6.0 and this patch, but they're too noisy to quote.)
Special thanks to Bruce fields for diligent testing which uncovered a
nasty fencepost error in an earlier version of this patch.
[checkpatch.pl formatting complaints noted and respectfully disagreed with.]
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Tested-by: J. Bruce Fields <bfields@redhat.com>
George Spelvin [Fri, 27 May 2016 03:00:23 +0000 (23:00 -0400)]
Eliminate bad hash multipliers from hash_32() and hash_64()
The "simplified" prime multipliers made very bad hash functions, so get rid
of them. This completes the work of
689de1d6ca.
To avoid the inefficiency which was the motivation for the "simplified"
multipliers, hash_64() on 32-bit systems is changed to use a different
algorithm. It makes two calls to hash_32() instead.
drivers/media/usb/dvb-usb-v2/af9015.c uses the old GOLDEN_RATIO_PRIME_32
for some horrible reason, so it inherits a copy of the old definition.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Cc: Antti Palosaari <crope@iki.fi>
Cc: Mauro Carvalho Chehab <m.chehab@samsung.com>
George Spelvin [Fri, 27 May 2016 02:22:01 +0000 (22:22 -0400)]
Change hash_64() return value to 32 bits
That's all that's ever asked for, and it makes the return
type of hash_long() consistent.
It also allows (upcoming patch) an optimized implementation
of hash_64 on 32-bit machines.
I tried adding a BUILD_BUG_ON to ensure the number of bits requested
was never more than 32 (most callers use a compile-time constant), but
adding <linux/bug.h> to <linux/hash.h> breaks the tools/perf compiler
unless tools/perf/MANIFEST is updated, and understanding that code base
well enough to update it is too much trouble. I did the rest of an
allyesconfig build with such a check, and nothing tripped.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
George Spelvin [Fri, 20 May 2016 17:31:33 +0000 (13:31 -0400)]
<linux/sunrpc/svcauth.h>: Define hash_str() in terms of hashlen_string()
Finally, the first use of previous two patches: eliminate the
separate ad-hoc string hash functions in the sunrpc code.
Now hash_str() is a wrapper around hash_string(), and hash_mem() is
likewise a wrapper around full_name_hash().
Note that sunrpc code *does* call hash_mem() with a zero length, which
is why the previous patch needed to handle that in full_name_hash().
(Thanks, Bruce, for finding that!)
This also eliminates the only caller of hash_long which asks for
more than 32 bits of output.
The comment about the quality of hashlen_string() and full_name_hash()
is jumping the gun by a few patches; they aren't very impressive now,
but will be improved greatly later in the series.
Signed-off-by: George Spelvin <linux@sciencehorizons.net>
Tested-by: J. Bruce Fields <bfields@redhat.com>
Acked-by: J. Bruce Fields <bfields@redhat.com>
Cc: Jeff Layton <jlayton@poochiereds.net>
Cc: linux-nfs@vger.kernel.org