Paul Mackerras [Sun, 4 Nov 2012 18:15:43 +0000 (18:15 +0000)]
KVM: PPC: Book3S PR: Emulate PURR, SPURR and DSCR registers
This adds basic emulation of the PURR and SPURR registers. We assume
we are emulating a single-threaded core, so these advance at the same
rate as the timebase. A Linux kernel running on a POWER7 expects to
be able to access these registers and is not prepared to handle a
program interrupt on accessing them.
This also adds a very minimal emulation of the DSCR (data stream
control register). Writes are ignored and reads return zero.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Wed, 21 Nov 2012 23:28:41 +0000 (23:28 +0000)]
KVM: PPC: Book3S HV: Don't give the guest RW access to RO pages
Currently, if the guest does an H_PROTECT hcall requesting that the
permissions on a HPT entry be changed to allow writing, we make the
requested change even if the page is marked read-only in the host
Linux page tables. This is a problem since it would for instance
allow a guest to modify a page that KSM has decided can be shared
between multiple guests.
To fix this, if the new permissions for the page allow writing, we need
to look up the memslot for the page, work out the host virtual address,
and look up the Linux page tables to get the PTE for the page. If that
PTE is read-only, we reduce the HPTE permissions to read-only.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Wed, 21 Nov 2012 23:29:12 +0000 (23:29 +0000)]
KVM: PPC: Book3S HV: Report correct HPT entry index when reading HPT
This fixes a bug in the code which allows userspace to read out the
contents of the guest's hashed page table (HPT). On the second and
subsequent passes through the HPT, when we are reporting only those
entries that have changed, we were incorrectly initializing the index
field of the header with the index of the first entry we skipped
rather than the first changed entry. This fixes it.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Wed, 21 Nov 2012 23:27:19 +0000 (23:27 +0000)]
KVM: PPC: Book3S HV: Reset reverse-map chains when resetting the HPT
With HV-style KVM, we maintain reverse-mapping lists that enable us to
find all the HPT (hashed page table) entries that reference each guest
physical page, with the heads of the lists in the memslot->arch.rmap
arrays. When we reset the HPT (i.e. when we reboot the VM), we clear
out all the HPT entries but we were not clearing out the reverse
mapping lists. The result is that as we create new HPT entries, the
lists get corrupted, which can easily lead to loops, resulting in the
host kernel hanging when it tries to traverse those lists.
This fixes the problem by zeroing out all the reverse mapping lists
when we zero out the HPT. This incidentally means that we are also
zeroing our record of the referenced and changed bits (not the bits
in the Linux PTEs, used by the Linux MM subsystem, but the bits used
by the KVM_GET_DIRTY_LOG ioctl, and those used by kvm_age_hva() and
kvm_test_age_hva()).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 19 Nov 2012 22:57:20 +0000 (22:57 +0000)]
KVM: PPC: Book3S HV: Provide a method for userspace to read and write the HPT
A new ioctl, KVM_PPC_GET_HTAB_FD, returns a file descriptor. Reads on
this fd return the contents of the HPT (hashed page table), writes
create and/or remove entries in the HPT. There is a new capability,
KVM_CAP_PPC_HTAB_FD, to indicate the presence of the ioctl. The ioctl
takes an argument structure with the index of the first HPT entry to
read out and a set of flags. The flags indicate whether the user is
intending to read or write the HPT, and whether to return all entries
or only the "bolted" entries (those with the bolted bit, 0x10, set in
the first doubleword).
This is intended for use in implementing qemu's savevm/loadvm and for
live migration. Therefore, on reads, the first pass returns information
about all HPTEs (or all bolted HPTEs). When the first pass reaches the
end of the HPT, it returns from the read. Subsequent reads only return
information about HPTEs that have changed since they were last read.
A read that finds no changed HPTEs in the HPT following where the last
read finished will return 0 bytes.
The format of the data provides a simple run-length compression of the
invalid entries. Each block of data starts with a header that indicates
the index (position in the HPT, which is just an array), the number of
valid entries starting at that index (may be zero), and the number of
invalid entries following those valid entries. The valid entries, 16
bytes each, follow the header. The invalid entries are not explicitly
represented.
Signed-off-by: Paul Mackerras <paulus@samba.org>
[agraf: fix documentation]
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 19 Nov 2012 22:55:44 +0000 (22:55 +0000)]
KVM: PPC: Book3S HV: Make a HPTE removal function available
This makes a HPTE removal function, kvmppc_do_h_remove(), available
outside book3s_hv_rm_mmu.c. This will be used by the HPT writing
code.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 19 Nov 2012 22:52:49 +0000 (22:52 +0000)]
KVM: PPC: Book3S HV: Add a mechanism for recording modified HPTEs
This uses a bit in our record of the guest view of the HPTE to record
when the HPTE gets modified. We use a reserved bit for this, and ensure
that this bit is always cleared in HPTE values returned to the guest.
The recording of modified HPTEs is only done if other code indicates
its interest by setting kvm->arch.hpte_mod_interest to a non-zero value.
The reason for this is that when later commits add facilities for
userspace to read the HPT, the first pass of reading the HPT will be
quicker if there are no (or very few) HPTEs marked as modified,
rather than having most HPTEs marked as modified.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 19 Nov 2012 23:01:34 +0000 (23:01 +0000)]
KVM: PPC: Book3S HV: Fix bug causing loss of page dirty state
This fixes a bug where adding a new guest HPT entry via the H_ENTER
hcall would lose the "changed" bit in the reverse map information
for the guest physical page being mapped. The result was that the
KVM_GET_DIRTY_LOG could return a zero bit for the page even though
the page had been modified by the guest.
This fixes it by only modifying the index and present bits in the
reverse map entry, thus preserving the reference and change bits.
We were also unnecessarily setting the reference bit, and this
fixes that too.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Tue, 13 Nov 2012 18:31:32 +0000 (18:31 +0000)]
KVM: PPC: Book3S HV: Restructure HPT entry creation code
This restructures the code that creates HPT (hashed page table)
entries so that it can be called in situations where we don't have a
struct vcpu pointer, only a struct kvm pointer. It also fixes a bug
where kvmppc_map_vrma() would corrupt the guest R4 value.
Most of the work of kvmppc_virtmode_h_enter is now done by a new
function, kvmppc_virtmode_do_h_enter, which itself calls another new
function, kvmppc_do_h_enter, which contains most of the old
kvmppc_h_enter. The new kvmppc_do_h_enter takes explicit arguments
for the place to return the HPTE index, the Linux page tables to use,
and whether it is being called in real mode, thus removing the need
for it to have the vcpu as an argument.
Currently kvmppc_map_vrma creates the VRMA (virtual real mode area)
HPTEs by calling kvmppc_virtmode_h_enter, which is designed primarily
to handle H_ENTER hcalls from the guest that need to pin a page of
memory. Since H_ENTER returns the index of the created HPTE in R4,
kvmppc_virtmode_h_enter updates the guest R4, corrupting the guest R4
in the case when it gets called from kvmppc_map_vrma on the first
VCPU_RUN ioctl. With this, kvmppc_map_vrma instead calls
kvmppc_virtmode_do_h_enter with the address of a dummy word as the
place to store the HPTE index, thus avoiding corrupting the guest R4.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Mon, 8 Oct 2012 22:06:20 +0000 (00:06 +0200)]
KVM: PPC: Support eventfd
In order to support the generic eventfd infrastructure on PPC, we need
to call into the generic KVM in-kernel device mmio code.
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Mon, 8 Oct 2012 22:22:59 +0000 (00:22 +0200)]
KVM: Distangle eventfd code from irqchip
The current eventfd code assumes that when we have eventfd, we also have
irqfd for in-kernel interrupt delivery. This is not necessarily true. On
PPC we don't have an in-kernel irqchip yet, but we can still support easily
support eventfd.
Signed-off-by: Alexander Graf <agraf@suse.de>
Jan Kiszka [Sun, 2 Dec 2012 10:04:14 +0000 (11:04 +0100)]
KVM: x86: Fix uninitialized return code
This is a regression caused by
18595411a7.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Will Auld [Thu, 29 Nov 2012 20:42:50 +0000 (12:42 -0800)]
KVM: x86: Emulate IA32_TSC_ADJUST MSR
CPUID.7.0.EBX[1]=1 indicates IA32_TSC_ADJUST MSR 0x3b is supported
Basic design is to emulate the MSR by allowing reads and writes to a guest
vcpu specific location to store the value of the emulated MSR while adding
the value to the vmcs tsc_offset. In this way the IA32_TSC_ADJUST value will
be included in all reads to the TSC MSR whether through rdmsr or rdtsc. This
is of course as long as the "use TSC counter offsetting" VM-execution control
is enabled as well as the IA32_TSC_ADJUST control.
However, because hardware will only return the TSC + IA32_TSC_ADJUST +
vmsc tsc_offset for a guest process when it does and rdtsc (with the correct
settings) the value of our virtualized IA32_TSC_ADJUST must be stored in one
of these three locations. The argument against storing it in the actual MSR
is performance. This is likely to be seldom used while the save/restore is
required on every transition. IA32_TSC_ADJUST was created as a way to solve
some issues with writing TSC itself so that is not an option either.
The remaining option, defined above as our solution has the problem of
returning incorrect vmcs tsc_offset values (unless we intercept and fix, not
done here) as mentioned above. However, more problematic is that storing the
data in vmcs tsc_offset will have a different semantic effect on the system
than does using the actual MSR. This is illustrated in the following example:
The hypervisor set the IA32_TSC_ADJUST, then the guest sets it and a guest
process performs a rdtsc. In this case the guest process will get
TSC + IA32_TSC_ADJUST_hyperviser + vmsc tsc_offset including
IA32_TSC_ADJUST_guest. While the total system semantics changed the semantics
as seen by the guest do not and hence this will not cause a problem.
Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Will Auld [Thu, 29 Nov 2012 20:42:12 +0000 (12:42 -0800)]
KVM: x86: Add code to track call origin for msr assignment
In order to track who initiated the call (host or guest) to modify an msr
value I have changed function call parameters along the call path. The
specific change is to add a struct pointer parameter that points to (index,
data, caller) information rather than having this information passed as
individual parameters.
The initial use for this capability is for updating the IA32_TSC_ADJUST msr
while setting the tsc value. It is anticipated that this capability is
useful for other tasks.
Signed-off-by: Will Auld <will.auld@intel.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Alex Williamson [Thu, 29 Nov 2012 21:07:59 +0000 (14:07 -0700)]
KVM: Fix user memslot overlap check
Prior to memory slot sorting this loop compared all of the user memory
slots for overlap with new entries. With memory slot sorting, we're
just checking some number of entries in the array that may or may not
be user slots. Instead, walk all the slots with kvm_for_each_memslot,
which has the added benefit of terminating early when we hit the first
empty slot, and skip comparison to private slots.
Cc: stable@vger.kernel.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Wed, 28 Nov 2012 12:54:14 +0000 (20:54 +0800)]
KVM: VMX: fix memory order between loading vmcs and clearing vmcs
vmcs->cpu indicates whether it exists on the target cpu, -1 means the vmcs
does not exist on any vcpu
If vcpu load vmcs with vmcs.cpu = -1, it can be directly added to cpu's percpu
list. The list can be corrupted if the cpu prefetch the vmcs's list before
reading vmcs->cpu. Meanwhile, we should remove vmcs from the list before
making vmcs->vcpu == -1 be visible
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Xiao Guangrong [Wed, 28 Nov 2012 12:53:15 +0000 (20:53 +0800)]
KVM: VMX: fix invalid cpu passed to smp_call_function_single
In loaded_vmcs_clear, loaded_vmcs->cpu is the fist parameter passed to
smp_call_function_single, if the target cpu is downing (doing cpu hot remove),
loaded_vmcs->cpu can become -1 then -1 is passed to smp_call_function_single
It can be triggered when vcpu is being destroyed, loaded_vmcs_clear is called
in the preemptionable context
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Gleb Natapov [Wed, 28 Nov 2012 13:19:08 +0000 (15:19 +0200)]
KVM: use is_idle_task() instead of idle_cpu() to decide when to halt in async_pf
As Frederic pointed idle_cpu() may return false even if async fault
happened in the idle task if wake up is pending. In this case the code
will try to put idle task to sleep. Fix this by using is_idle_task() to
check for idle task.
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:29:04 +0000 (23:29 -0200)]
KVM: x86: update pvclock area conditionally, on cpu migration
As requested by Glauber, do not update kvmclock area on vcpu->pcpu
migration, in case the host has stable TSC.
This is to reduce cacheline bouncing.
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:29:03 +0000 (23:29 -0200)]
KVM: x86: require matched TSC offsets for master clock
With master clock, a pvclock clock read calculates:
ret = system_timestamp + [ (rdtsc + tsc_offset) - tsc_timestamp ]
Where 'rdtsc' is the host TSC.
system_timestamp and tsc_timestamp are unique, one tuple
per VM: the "master clock".
Given a host with synchronized TSCs, its obvious that
guest TSC must be matched for the above to guarantee monotonicity.
Allow master clock usage only if guest TSCs are synchronized.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:29:02 +0000 (23:29 -0200)]
KVM: x86: add kvm_arch_vcpu_postcreate callback, move TSC initialization
TSC initialization will soon make use of online_vcpus.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:29:01 +0000 (23:29 -0200)]
KVM: x86: implement PVCLOCK_TSC_STABLE_BIT pvclock flag
KVM added a global variable to guarantee monotonicity in the guest.
One of the reasons for that is that the time between
1. ktime_get_ts(×pec);
2. rdtscll(tsc);
Is variable. That is, given a host with stable TSC, suppose that
two VCPUs read the same time via ktime_get_ts() above.
The time required to execute 2. is not the same on those two instances
executing in different VCPUS (cache misses, interrupts...).
If the TSC value that is used by the host to interpolate when
calculating the monotonic time is the same value used to calculate
the tsc_timestamp value stored in the pvclock data structure, and
a single <system_timestamp, tsc_timestamp> tuple is visible to all
vcpus simultaneously, this problem disappears. See comment on top
of pvclock_update_vm_gtod_copy for details.
Monotonicity is then guaranteed by synchronicity of the host TSCs
and guest TSCs.
Set TSC stable pvclock flag in that case, allowing the guest to read
clock from userspace.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:29:00 +0000 (23:29 -0200)]
KVM: x86: notifier for clocksource changes
Register a notifier for clocksource change event. In case
the host switches to clock other than TSC, disable master
clock usage.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:59 +0000 (23:28 -0200)]
time: export time information for KVM pvclock
As suggested by John, export time data similarly to how its
done by vsyscall support. This allows KVM to retrieve necessary
information to implement vsyscall support in KVM guests.
Acked-by: John Stultz <johnstul@us.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:58 +0000 (23:28 -0200)]
KVM: x86: pass host_tsc to read_l1_tsc
Allow the caller to pass host tsc value to kvm_x86_ops->read_l1_tsc().
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:57 +0000 (23:28 -0200)]
x86: vdso: pvclock gettime support
Improve performance of time system calls when using Linux pvclock,
by reading time info from fixmap visible copy of pvclock data.
Originally from Jeremy Fitzhardinge.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:56 +0000 (23:28 -0200)]
x86: kvm guest: pvclock vsyscall support
Hook into generic pvclock vsyscall code, with the aim to
allow userspace to have visibility into pvclock data.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:55 +0000 (23:28 -0200)]
x86: pvclock: generic pvclock vsyscall initialization
Originally from Jeremy Fitzhardinge.
Introduce generic, non hypervisor specific, pvclock initialization
routines.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:54 +0000 (23:28 -0200)]
sched: add notifier for cross-cpu migrations
Originally from Jeremy Fitzhardinge.
Acked-by: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:53 +0000 (23:28 -0200)]
x86: pvclock: add note about rdtsc barriers
As noted by Gleb, not advertising SSE2 support implies
no RDTSC barriers.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:52 +0000 (23:28 -0200)]
x86: pvclock: introduce helper to read flags
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:51 +0000 (23:28 -0200)]
x86: pvclock: create helper for pvclock data retrieval
Originally from Jeremy Fitzhardinge.
So code can be reused.
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:50 +0000 (23:28 -0200)]
x86: pvclock: remove pvclock_shadow_time
Originally from Jeremy Fitzhardinge.
We can copy the information directly from "struct pvclock_vcpu_time_info",
remove pvclock_shadow_time.
Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:49 +0000 (23:28 -0200)]
x86: pvclock: make sure rdtsc doesnt speculate out of region
Originally from Jeremy Fitzhardinge.
pvclock_get_time_values, which contains the memory barriers
will be removed by next patch.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:48 +0000 (23:28 -0200)]
x86: kvmclock: allocate pvclock shared memory area
We want to expose the pvclock shared memory areas, which
the hypervisor periodically updates, to userspace.
For a linear mapping from userspace, it is necessary that
entire page sized regions are used for array of pvclock
structures.
There is no such guarantee with per cpu areas, therefore move
to memblock_alloc based allocation.
Acked-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Wed, 28 Nov 2012 01:28:47 +0000 (23:28 -0200)]
KVM: x86: retain pvclock guest stopped bit in guest memory
Otherwise its possible for an unrelated KVM_REQ_UPDATE_CLOCK (such as due to CPU
migration) to clear the bit.
Noticed by Paolo Bonzini.
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Glauber Costa <glommer@parallels.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Guo Chao [Fri, 2 Nov 2012 10:33:23 +0000 (18:33 +0800)]
KVM: remove unnecessary return value check
No need to check return value before breaking switch.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Guo Chao [Fri, 2 Nov 2012 10:33:22 +0000 (18:33 +0800)]
KVM: x86: fix return value of kvm_vm_ioctl_set_tss_addr()
Return value of this function will be that of ioctl().
#include <stdio.h>
#include <linux/kvm.h>
int main () {
int fd;
fd = open ("/dev/kvm", 0);
fd = ioctl (fd, KVM_CREATE_VM, 0);
ioctl (fd, KVM_SET_TSS_ADDR, 0xfffff000);
perror ("");
return 0;
}
Output is "Operation not permitted". That's not what
we want.
Return -EINVAL in this case.
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Guo Chao [Fri, 2 Nov 2012 10:33:21 +0000 (18:33 +0800)]
KVM: do not kfree error pointer
We should avoid kfree()ing error pointer in kvm_vcpu_ioctl() and
kvm_arch_vcpu_ioctl().
Signed-off-by: Guo Chao <yan@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Thu, 1 Nov 2012 01:21:57 +0000 (23:21 -0200)]
Merge branch 'for-queue' of https://github.com/agraf/linux-2.6 into queue
* 'for-queue' of https://github.com/agraf/linux-2.6:
PPC: ePAPR: Convert hcall header to uapi (round 2)
KVM: PPC: Book3S HV: Fix thinko in try_lock_hpte()
KVM: PPC: Book3S HV: Allow DTL to be set to address 0, length 0
KVM: PPC: Book3S HV: Fix accounting of stolen time
KVM: PPC: Book3S HV: Run virtual core whenever any vcpus in it can run
KVM: PPC: Book3S HV: Fixes for late-joining threads
KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock
KVM: PPC: Book3S HV: Fix some races in starting secondary threads
KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online
PPC: ePAPR: Convert header to uapi
KVM: PPC: Move mtspr/mfspr emulation into own functions
KVM: Documentation: Fix reentry-to-be-consistent paragraph
KVM: PPC: 44x: fix DCR read/write
Joerg Roedel [Mon, 29 Oct 2012 18:08:21 +0000 (19:08 +0100)]
KVM: SVM: update MAINTAINERS entry
I have no access to my AMD email address anymore. Update
entry in MAINTAINERS to the new address.
Cc: Avi Kivity <avi@redhat.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Alexander Graf [Wed, 31 Oct 2012 12:37:59 +0000 (13:37 +0100)]
PPC: ePAPR: Convert hcall header to uapi (round 2)
The new uapi framework splits kernel internal and user space exported
bits of header files more cleanly. Adjust the ePAPR header accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Wed, 31 Oct 2012 12:36:18 +0000 (13:36 +0100)]
Merge commit 'origin/queue' into for-queue
Conflicts:
arch/powerpc/include/asm/Kbuild
arch/powerpc/include/uapi/asm/Kbuild
Paul Mackerras [Mon, 15 Oct 2012 01:20:50 +0000 (01:20 +0000)]
KVM: PPC: Book3S HV: Fix thinko in try_lock_hpte()
This fixes an error in the inline asm in try_lock_hpte() where we
were erroneously using a register number as an immediate operand.
The bug only affects an error path, and in fact the code will still
work as long as the compiler chooses some register other than r0
for the "bits" variable. Nevertheless it should still be fixed.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:18:37 +0000 (01:18 +0000)]
KVM: PPC: Book3S HV: Allow DTL to be set to address 0, length 0
Commit
55b665b026 ("KVM: PPC: Book3S HV: Provide a way for userspace
to get/set per-vCPU areas") includes a check on the length of the
dispatch trace log (DTL) to make sure the buffer is at least one entry
long. This is appropriate when registering a buffer, but the
interface also allows for any existing buffer to be unregistered by
specifying a zero address. In this case the length check is not
appropriate. This makes the check conditional on the address being
non-zero.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:18:07 +0000 (01:18 +0000)]
KVM: PPC: Book3S HV: Fix accounting of stolen time
Currently the code that accounts stolen time tends to overestimate the
stolen time, and will sometimes report more stolen time in a DTL
(dispatch trace log) entry than has elapsed since the last DTL entry.
This can cause guests to underflow the user or system time measured
for some tasks, leading to ridiculous CPU percentages and total runtimes
being reported by top and other utilities.
In addition, the current code was designed for the previous policy where
a vcore would only run when all the vcpus in it were runnable, and so
only counted stolen time on a per-vcore basis. Now that a vcore can
run while some of the vcpus in it are doing other things in the kernel
(e.g. handling a page fault), we need to count the time when a vcpu task
is preempted while it is not running as part of a vcore as stolen also.
To do this, we bring back the BUSY_IN_HOST vcpu state and extend the
vcpu_load/put functions to count preemption time while the vcpu is
in that state. Handling the transitions between the RUNNING and
BUSY_IN_HOST states requires checking and updating two variables
(accumulated time stolen and time last preempted), so we add a new
spinlock, vcpu->arch.tbacct_lock. This protects both the per-vcpu
stolen/preempt-time variables, and the per-vcore variables while this
vcpu is running the vcore.
Finally, we now don't count time spent in userspace as stolen time.
The task could be executing in userspace on behalf of the vcpu, or
it could be preempted, or the vcpu could be genuinely stopped. Since
we have no way of dividing up the time between these cases, we don't
count any of it as stolen.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:17:42 +0000 (01:17 +0000)]
KVM: PPC: Book3S HV: Run virtual core whenever any vcpus in it can run
Currently the Book3S HV code implements a policy on multi-threaded
processors (i.e. POWER7) that requires all of the active vcpus in a
virtual core to be ready to run before we run the virtual core.
However, that causes problems on reset, because reset stops all vcpus
except vcpu 0, and can also reduce throughput since all four threads
in a virtual core have to wait whenever any one of them hits a
hypervisor page fault.
This relaxes the policy, allowing the virtual core to run as soon as
any vcpu in it is runnable. With this, the KVMPPC_VCPU_STOPPED state
and the KVMPPC_VCPU_BUSY_IN_HOST state have been combined into a single
KVMPPC_VCPU_NOTREADY state, since we no longer need to distinguish
between them.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:17:17 +0000 (01:17 +0000)]
KVM: PPC: Book3S HV: Fixes for late-joining threads
If a thread in a virtual core becomes runnable while other threads
in the same virtual core are already running in the guest, it is
possible for the latecomer to join the others on the core without
first pulling them all out of the guest. Currently this only happens
rarely, when a vcpu is first started. This fixes some bugs and
omissions in the code in this case.
First, we need to check for VPA updates for the latecomer and make
a DTL entry for it. Secondly, if it comes along while the master
vcpu is doing a VPA update, we don't need to do anything since the
master will pick it up in kvmppc_run_core. To handle this correctly
we introduce a new vcore state, VCORE_STARTING. Thirdly, there is
a race because we currently clear the hardware thread's hwthread_req
before waiting to see it get to nap. A latecomer thread could have
its hwthread_req cleared before it gets to test it, and therefore
never increment the nap_count, leading to messages about wait_for_nap
timeouts.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:16:48 +0000 (01:16 +0000)]
KVM: PPC: Book3s HV: Don't access runnable threads list without vcore lock
There were a few places where we were traversing the list of runnable
threads in a virtual core, i.e. vc->runnable_threads, without holding
the vcore spinlock. This extends the places where we hold the vcore
spinlock to cover everywhere that we traverse that list.
Since we possibly need to sleep inside kvmppc_book3s_hv_page_fault,
this moves the call of it from kvmppc_handle_exit out to
kvmppc_vcpu_run, where we don't hold the vcore lock.
In kvmppc_vcore_blocked, we don't actually need to check whether
all vcpus are ceded and don't have any pending exceptions, since the
caller has already done that. The caller (kvmppc_run_vcpu) wasn't
actually checking for pending exceptions, so we add that.
The change of if to while in kvmppc_run_vcpu is to make sure that we
never call kvmppc_remove_runnable() when the vcore state is RUNNING or
EXITING.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:16:14 +0000 (01:16 +0000)]
KVM: PPC: Book3S HV: Fix some races in starting secondary threads
Subsequent patches implementing in-kernel XICS emulation will make it
possible for IPIs to arrive at secondary threads at arbitrary times.
This fixes some races in how we start the secondary threads, which
if not fixed could lead to occasional crashes of the host kernel.
This makes sure that (a) we have grabbed all the secondary threads,
and verified that they are no longer in the kernel, before we start
any thread, (b) that the secondary thread loads its vcpu pointer
after clearing the IPI that woke it up (so we don't miss a wakeup),
and (c) that the secondary thread clears its vcpu pointer before
incrementing the nap count. It also removes unnecessary setting
of the vcpu and vcore pointers in the paca in kvmppc_core_vcpu_load.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Paul Mackerras [Mon, 15 Oct 2012 01:15:41 +0000 (01:15 +0000)]
KVM: PPC: Book3S HV: Allow KVM guests to stop secondary threads coming online
When a Book3S HV KVM guest is running, we need the host to be in
single-thread mode, that is, all of the cores (or at least all of
the cores where the KVM guest could run) to be running only one
active hardware thread. This is because of the hardware restriction
in POWER processors that all of the hardware threads in the core
must be in the same logical partition. Complying with this restriction
is much easier if, from the host kernel's point of view, only one
hardware thread is active.
This adds two hooks in the SMP hotplug code to allow the KVM code to
make sure that secondary threads (i.e. hardware threads other than
thread 0) cannot come online while any KVM guest exists. The KVM
code still has to check that any core where it runs a guest has the
secondary threads offline, but having done that check it can now be
sure that they will not come online while the guest is running.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Sat, 27 Oct 2012 17:26:14 +0000 (19:26 +0200)]
PPC: ePAPR: Convert header to uapi
The new uapi framework splits kernel internal and user space exported
bits of header files more cleanly. Adjust the ePAPR header accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Sat, 6 Oct 2012 21:19:01 +0000 (23:19 +0200)]
KVM: PPC: Move mtspr/mfspr emulation into own functions
The mtspr/mfspr emulation code became quite big over time. Move it
into its own function so things stay more readable.
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Sun, 7 Oct 2012 13:22:59 +0000 (15:22 +0200)]
KVM: Documentation: Fix reentry-to-be-consistent paragraph
All user space offloaded instruction emulation needs to reenter kvm
to produce consistent state again. Fix the section in the documentation
to mention all of them.
Signed-off-by: Alexander Graf <agraf@suse.de>
Alexander Graf [Sat, 6 Oct 2012 01:56:35 +0000 (03:56 +0200)]
KVM: PPC: 44x: fix DCR read/write
When remembering the direction of a DCR transaction, we should write
to the same variable that we interpret on later when doing vcpu_run
again.
Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: stable@vger.kernel.org
Xiao Guangrong [Tue, 16 Oct 2012 12:10:59 +0000 (20:10 +0800)]
KVM: do not treat noslot pfn as a error pfn
This patch filters noslot pfn out from error pfns based on Marcelo comment:
noslot pfn is not a error pfn
After this patch,
- is_noslot_pfn indicates that the gfn is not in slot
- is_error_pfn indicates that the gfn is in slot but the error is occurred
when translate the gfn to pfn
- is_error_noslot_pfn indicates that the pfn either it is error pfns or it
is noslot pfn
And is_invalid_pfn can be removed, it makes the code more clean
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Marcelo Tosatti [Mon, 29 Oct 2012 21:15:32 +0000 (19:15 -0200)]
Merge remote-tracking branch 'master' into queue
Merge reason: development work has dependency on kvm patches merged
upstream.
Conflicts:
arch/powerpc/include/asm/Kbuild
arch/powerpc/include/asm/kvm_para.h
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Linus Torvalds [Mon, 29 Oct 2012 15:49:25 +0000 (08:49 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/sage/ceph-client
Pull Ceph fixes form Sage Weil:
"There are two fixes in the messenger code, one that can trigger a NULL
dereference, and one that error in refcounting (extra put). There is
also a trivial fix that in the fs client code that is triggered by NFS
reexport."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
ceph: fix dentry reference leak in encode_fh()
libceph: avoid NULL kref_put when osd reset races with alloc_msg
rbd: reset BACKOFF if unable to re-queue
David Zafman [Thu, 18 Oct 2012 21:01:43 +0000 (14:01 -0700)]
ceph: fix dentry reference leak in encode_fh()
Call to d_find_alias() needs a corresponding dput()
This fixes http://tracker.newdream.net/issues/3271
Signed-off-by: David Zafman <david.zafman@inktank.com>
Reviewed-by: Sage Weil <sage@inktank.com>
Linus Torvalds [Sun, 28 Oct 2012 21:15:09 +0000 (14:15 -0700)]
Merge branch 'i2c-for-linus' of git://git./linux/kernel/git/jdelvare/staging
Pull i2c subsystem fixes from Jean Delvare.
* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
i2c-i801: Fix comment
i2c-i801: Simplify dependency towards GPIOLIB
i2c-stub: Move to drivers/i2c
Jean Delvare [Sun, 28 Oct 2012 20:37:01 +0000 (21:37 +0100)]
i2c-i801: Fix comment
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Jean Delvare [Sun, 28 Oct 2012 20:37:01 +0000 (21:37 +0100)]
i2c-i801: Simplify dependency towards GPIOLIB
Arbitrarily selecting GPIOLIB causes trouble on some architectures,
so don't do that. Instead, just make the optional multiplexing code
depend on CONFIG_I2C_MUX_GPIO instead of CONFIG_I2C_MUX for now. We
can revisit if the i2c-i801 driver ever supports other multiplexing
flavors.
Also make that optional code depend on DMI, as it won't do anything
without that.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Jean Delvare [Sun, 28 Oct 2012 20:37:00 +0000 (21:37 +0100)]
i2c-stub: Move to drivers/i2c
Move the i2c-stub driver to drivers/i2c, to match the Kconfig entry.
This is less confusing that way.
I also fixed all checkpatch warnings and errors.
Signed-off-by: Jean Delvare <khali@linux-fr.org>
Cc: Peter Huewe <peterhuewe@gmx.de>
Linus Torvalds [Sun, 28 Oct 2012 19:24:48 +0000 (12:24 -0700)]
Linux 3.7-rc3
Linus Torvalds [Sun, 28 Oct 2012 18:14:52 +0000 (11:14 -0700)]
Merge tag 'ktest-v3.7-rc2' of git://git./linux/kernel/git/rostedt/linux-ktest
Pull ktest confusion fix from Steven Rostedt:
"With the v3.7-rc2 kernel, the network cards on my target boxes were
not being brought up.
I found that the modules for the network was not being installed.
This was due to the config CONFIG_MODULES_USE_ELF_RELA that came
before CONFIG_MODULES, and confused ktest in thinking that
CONFIG_MODULES=y was not found.
Ktest needs to test all configs and not just stop if something starts
with CONFIG_MODULES."
* tag 'ktest-v3.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
ktest: Fix ktest confusion with CONFIG_MODULES_USE_ELF_RELA
Linus Torvalds [Sun, 28 Oct 2012 18:13:54 +0000 (11:13 -0700)]
Merge tag 'spi-mxs' of git://git./linux/kernel/git/broonie/misc
Pull minor spi MXS fixes from Mark Brown:
"These fixes are both pretty minor ones and are driver local."
* tag 'spi-mxs' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/misc:
spi: mxs: Terminate DMA in case of DMA timeout
spi: mxs: Assign message status after transfer finished
Linus Torvalds [Sun, 28 Oct 2012 18:12:38 +0000 (11:12 -0700)]
Merge tag 'fixes-for-3.7' of git://git./linux/kernel/git/arm/arm-soc
Pull arm-soc fixes from Arnd Bergmann:
"Bug fixes for a number of ARM platforms, mostly OMAP, imx and at91.
These come a little later than I had hoped but unfortunately we had a
few of these patches cause regressions themselves and had to work out
how to deal with those in the meantime."
* tag 'fixes-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (38 commits)
Revert "ARM i.MX25: Fix PWM per clock lookups"
ARM: versatile: fix versatile_defconfig
ARM: mvebu: update defconfig with 3.7 changes
ARM: at91: fix at91x40 build
ARM: socfpga: Fix socfpga compilation with early_printk() enabled
ARM: SPEAr: Remove unused empty files
MAINTAINERS: Add arm-soc tree entry
ARM: dts: mxs: add the "clock-names" for gpmi-nand
ARM: ux500: Correct SDI5 address and add some format changes
ARM: ux500: Specify AMBA Primecell IDs for Nomadik I2C in DT
ARM: ux500: Fix build error relating to IRQCHIP_SKIP_SET_WAKE
ARM: at91: drop duplicated config SOC_AT91SAM9 entry
ARM: at91/i2c: change id to let i2c-at91 work
ARM: at91/i2c: change id to let i2c-gpio work
ARM: at91/dts: at91sam9g20ek_common: Fix typos in buttons labels.
ARM: at91: fix external interrupt specification in board code
ARM: at91: fix external interrupts in non-DT case
ARM: at91: at91sam9g10: fix SOC type detection
ARM: at91/tc: fix typo in the DT document
ARM: AM33XX: Fix configuration of dmtimer parent clock by dmtimer driverDate:Wed, 17 Oct 2012 13:55:55 -0500
...
Mikulas Patocka [Mon, 15 Oct 2012 21:20:17 +0000 (17:20 -0400)]
Lock splice_read and splice_write functions
Functions generic_file_splice_read and generic_file_splice_write access
the pagecache directly. For block devices these functions must be locked
so that block size is not changed while they are in progress.
This patch is an additional fix for commit
b87570f5d349 ("Fix a crash
when block device is read and block size is changed at the same time")
that locked aio_read, aio_write and mmap against block size change.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Mon, 22 Oct 2012 23:39:16 +0000 (19:39 -0400)]
percpu-rw-semaphores: use rcu_read_lock_sched
Use rcu_read_lock_sched / rcu_read_unlock_sched / synchronize_sched
instead of rcu_read_lock / rcu_read_unlock / synchronize_rcu.
This is an optimization. The RCU-protected region is very small, so
there will be no latency problems if we disable preempt in this region.
So we use rcu_read_lock_sched / rcu_read_unlock_sched that translates
to preempt_disable / preempt_disable. It is smaller (and supposedly
faster) than preemptible rcu_read_lock / rcu_read_unlock.
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Mikulas Patocka [Mon, 22 Oct 2012 23:37:47 +0000 (19:37 -0400)]
percpu-rw-semaphores: use light/heavy barriers
This patch introduces new barrier pair light_mb() and heavy_mb() for
percpu rw semaphores.
This patch fixes a bug in percpu-rw-semaphores where a barrier was
missing in percpu_up_write.
This patch improves performance on the read path of
percpu-rw-semaphores: on non-x86 cpus, there was a smp_mb() in
percpu_up_read. This patch changes it to a compiler barrier and removes
the "#if defined(X86) ..." condition.
From: Lai Jiangshan <laijs@cn.fujitsu.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Arnd Bergmann [Sat, 27 Oct 2012 15:41:13 +0000 (17:41 +0200)]
Revert "ARM i.MX25: Fix PWM per clock lookups"
This reverts commit
92063cee118655d25b50d04eb77b012f3287357a, it
was applied prematurely, causing this build error for
imx_v4_v5_defconfig:
arch/arm/mach-imx/clk-imx25.c: In function 'mx25_clocks_init':
arch/arm/mach-imx/clk-imx25.c:206:26: error: 'pwm_ipg_per' undeclared (first use in this function)
arch/arm/mach-imx/clk-imx25.c:206:26: note: each undeclared identifier is reported only once for each function it appears in
Sascha Hauer explains:
> There are several gates missing in clk-imx25.c. I have a patch which
> adds support for them and I seem to have missed that the above depends
> on it.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 26 Oct 2012 21:06:43 +0000 (23:06 +0200)]
ARM: versatile: fix versatile_defconfig
With the introduction of CONFIG_ARCH_MULTIPLATFORM, versatile is
no longer the default platform, so we need to enable
CONFIG_ARCH_VERSATILE explicitly in order for that to be selected
rather than the multiplatform configuration.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Thomas Petazzoni [Tue, 23 Oct 2012 08:17:49 +0000 (10:17 +0200)]
ARM: mvebu: update defconfig with 3.7 changes
The split of 370 and XP into two Kconfig options and the multiplatform
kernel support has changed a few Kconfig symbols, so let's update the
mvebu_defconfig file with the latest changes.
Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Arnd Bergmann [Fri, 26 Oct 2012 20:49:09 +0000 (22:49 +0200)]
ARM: at91: fix at91x40 build
patch
738a0fd7 "ARM: at91: fix external interrupts in non-DT case"
fixed a run-time error on some at91 platforms but did not apply
the same change to at91x40, which now doesn't build.
This changes at91x40 in the same way that the other platforms
were changed.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Linus Torvalds [Fri, 26 Oct 2012 22:00:48 +0000 (15:00 -0700)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
"This is what we usually expect at this stage of the game, lots of
little things, mostly in drivers. With the occasional 'oops didn't
mean to do that' kind of regressions in the core code."
1) Uninitialized data in __ip_vs_get_timeouts(), from Arnd Bergmann
2) Reject invalid ACK sequences in Fast Open sockets, from Jerry Chu.
3) Lost error code on return from _rtl_usb_receive(), from Christian
Lamparter.
4) Fix reset resume on USB rt2x00, from Stanislaw Gruszka.
5) Release resources on error in pch_gbe driver, from Veaceslav Falico.
6) Default hop limit not set correctly in ip6_template_metrics[], fix
from Li RongQing.
7) Gianfar PTP code requests wrong kind of resource during probe, fix
from Wei Yang.
8) Fix VHOST net driver on big-endian, from Michael S Tsirkin.
9) Mallenox driver bug fixes from Jack Morgenstein, Or Gerlitz, Moni
Shoua, Dotan Barak, and Uri Habusha.
10) usbnet leaks memory on TX path, fix from Hemant Kumar.
11) Use socket state test, rather than presence of FIN bit packet, to
determine FIONREAD/SIOCINQ value. Fix from Eric Dumazet.
12) Fix cxgb4 build failure, from Vipul Pandya.
13) Provide a SYN_DATA_ACKED state to complement SYN_FASTOPEN in socket
info dumps. From Yuchung Cheng.
14) Fix leak of security path in kfree_skb_partial(). Fix from Eric
Dumazet.
15) Handle RX FIFO overflows more resiliently in pch_gbe driver, from
Veaceslav Falico.
16) Fix MAINTAINERS file pattern for networking drivers, from Jean
Delvare.
17) Add iPhone5 IDs to IPHETH driver, from Jay Purohit.
18) VLAN device type change restriction is too strict, and should not
trigger for the automatically generated vlan0 device. Fix from Jiri
Pirko.
19) Make PMTU/redirect flushing work properly again in ipv4, from
Steffen Klassert.
20) Fix memory corruptions by using kfree_rcu() in netlink_release().
From Eric Dumazet.
21) More qmi_wwan device IDs, from Bjørn Mork.
22) Fix unintentional change of SNAT/DNAT hooks in generic NAT
infrastructure, from Elison Niven.
23) Fix 3.6.x regression in xt_TEE netfilter module, from Eric Dumazet.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (57 commits)
tilegx: fix some issues in the SW TSO support
qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan
net: usb: Fix memory leak on Tx data path
net/mlx4_core: Unmap UAR also in the case of error flow
net/mlx4_en: Don't use vlan tag value as an indication for vlan presence
net/mlx4_en: Fix double-release-range in tx-rings
bas_gigaset: fix pre_reset handling
vhost: fix mergeable bufs on BE hosts
gianfar_ptp: use iomem, not ioports resource tree in probe
ipv6: Set default hoplimit as zero.
NET_VENDOR_TI: make available for am33xx as well
pch_gbe: fix error handling in pch_gbe_up()
b43: Fix oops on unload when firmware not found
mwifiex: clean up scan state on error
mwifiex: return -EBUSY if specific scan request cannot be honored
brcmfmac: fix potential NULL dereference
Revert "ath9k_hw: Updated AR9003 tx gain table for 5GHz"
ath9k_htc: Add PID/VID for a Ubiquiti WiFiStation
rt2x00: usb: fix reset resume
rtlwifi: pass rx setup error code to caller
...
Linus Torvalds [Fri, 26 Oct 2012 21:59:01 +0000 (14:59 -0700)]
Merge branch 'fixes' of git://git.infradead.org/users/vkoul/slave-dma
Pull slave-dmaengine fixes from Vinod Koul:
"Three fixes for slave dmanegine.
Two are for typo omissions in sifr dmaengine driver and the last one
is for the imx driver fixing a missing unlock"
* 'fixes' of git://git.infradead.org/users/vkoul/slave-dma:
dmaengine: sirf: fix a typo in moving running dma_desc to active queue
dmaengine: sirf: fix a typo in dma_prep_interleaved
dmaengine: imx-dma: fix missing unlock on error in imxdma_xfer_desc()
Linus Torvalds [Fri, 26 Oct 2012 21:23:35 +0000 (14:23 -0700)]
Merge tag 'pm+acpi-for-3.7-rc3' of git://git./linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael J Wysocki:
- Fix for a memory leak in acpi_bind_one() from Jesper Juhl.
- Fix for an error code path memory leak in pm_genpd_attach_cpuidle()
from Jonghwan Choi.
- Fix for smp_processor_id() usage in preemptible code in powernow-k8
from Andreas Herrmann.
- Fix for a suspend-related memory leak in cpufreq stats from Xiaobing
Tu.
- Freezer fix for failure to clear PF_NOFREEZE along with PF_KTHREAD in
flush_old_exec() from Oleg Nesterov.
- acpi_processor_notify() fix from Alan Cox.
* tag 'pm+acpi-for-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: missing break
freezer: exec should clear PF_NOFREEZE along with PF_KTHREAD
Fix memory leak in cpufreq stats.
cpufreq / powernow-k8: Remove usage of smp_processor_id() in preemptible code
PM / Domains: Fix memory leak on error path in pm_genpd_attach_cpuidle
ACPI: Fix memory leak in acpi_bind_one()
Wei Yongjun [Wed, 17 Oct 2012 15:03:42 +0000 (23:03 +0800)]
KVM: ia64: remove unused variable in kvm_release_vm_pages()
The variable base_gfn is initialized but never used
otherwise, so remove the unused variable.
dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)
Acked-by: Xiantao Zhang <xiantao.zhang@intel.com>
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Linus Torvalds [Fri, 26 Oct 2012 20:46:41 +0000 (13:46 -0700)]
Merge tag 'rdma-for-linus' of git://git./linux/kernel/git/roland/infiniband
Pull infiniband fixes from Roland Dreier:
"Small batch of fixes for 3.7:
- Fix crash in error path in cxgb4
- Fix build error on 32 bits in mlx4
- Fix SR-IOV bugs in mlx4"
* tag 'rdma-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
mlx4_core: Perform correct resource cleanup if mlx4_QUERY_ADAPTER() fails
mlx4_core: Remove annoying debug messages from SR-IOV flow
RDMA/cxgb4: Don't free chunk that we have failed to allocate
IB/mlx4: Synchronize cleanup of MCGs in MCG paravirtualization
IB/mlx4: Fix QP1 P_Key processing in the Primary Physical Function (PPF)
IB/mlx4: Fix build error on platforms where UL is not 64 bits
Linus Torvalds [Fri, 26 Oct 2012 17:26:36 +0000 (10:26 -0700)]
Merge tag 'usb-3.7-rc3' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes from Greg Kroah-Hartman:
"Here are a bunch of USB fixes for the 3.7-rc tree.
There's a lot of small USB serial driver fixes, and one larger one
(the mos7840 driver changes are mostly just moving code around to fix
problems.) Thanks to Johan Hovold for finding the problems and fixing
them all up.
Other than those, there is the usual new device ids, xhci bugfixes,
and gadget driver fixes, nothing out of the ordinary.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'usb-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (49 commits)
xhci: trivial: Remove assigned but unused ep_ctx.
xhci: trivial: Remove assigned but unused slot_ctx.
xhci: Fix missing break in xhci_evaluate_context_result.
xhci: Fix potential NULL ptr deref in command cancellation.
ehci: Add yet-another Lucid nohandoff pci quirk
ehci: fix Lucid nohandoff pci quirk to be more generic with BIOS versions
USB: mos7840: fix port_probe flow
USB: mos7840: fix port-data memory leak
USB: mos7840: remove invalid disconnect handling
USB: mos7840: remove NULL-urb submission
USB: qcserial: fix interface-data memory leak in error path
USB: option: fix interface-data memory leak in error path
USB: ipw: fix interface-data memory leak in error path
USB: mos7840: fix port-device leak in error path
USB: mos7840: fix urb leak at release
USB: sierra: fix port-data memory leak
USB: sierra: fix memory leak in probe error path
USB: sierra: fix memory leak in attach error path
USB: usb-wwan: fix multiple memory leaks in error paths
USB: keyspan: fix NULL-pointer dereferences and memory leaks
...
Linus Torvalds [Fri, 26 Oct 2012 17:26:08 +0000 (10:26 -0700)]
Merge tag 'tty-3.7-rc3' of git://git./linux/kernel/git/gregkh/tty
Pull serial fix from Greg Kroah-Hartman:
"Here is one patch, a revert of a omap serial driver patch that was
causing problems, for your 3.7-rc tree.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'tty-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: omap: fix software flow control"
Linus Torvalds [Fri, 26 Oct 2012 17:25:31 +0000 (10:25 -0700)]
Merge tag 'staging-3.7-rc2' of git://git./linux/kernel/git/gregkh/staging
Pull staging driver fixes from Greg Kroah-Hartman:
"Here are some staging driver fixes for your 3.7-rc tree.
Nothing major here, a number of iio driver fixups that were causing
problems, some comedi driver bugfixes, and a bunch of tidspbridge
warning squashing and other regressions fixed from the 3.6 release.
All have been in the linux-next releases for a bit.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'staging-3.7-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (32 commits)
staging: tidspbridge: delete unused mmu functions
staging: tidspbridge: ioremap physical address of the stack segment in shm
staging: tidspbridge: ioremap dsp sync addr
staging: tidspbridge: change type to __iomem for per and core addresses
staging: tidspbridge: drop const from custom mmu implementation
staging: tidspbridge: request the right irq for mmu
staging: ipack: add missing include (implicit declaration of function 'kfree')
staging: ramster: depends on NET
staging: omapdrm: fix allocation size for page addresses array
staging: zram: Fix handling of incompressible pages
Staging: android: binder: Allow using highmem for binder buffers
Staging: android: binder: Fix memory leak on thread/process exit
staging: comedi: ni_labpc: fix possible NULL deref during detach
staging: comedi: das08: fix possible NULL deref during detach
staging: comedi: amplc_pc263: fix possible NULL deref during detach
staging: comedi: amplc_pc236: fix possible NULL deref during detach
staging: comedi: amplc_pc236: fix invalid register access during detach
staging: comedi: amplc_dio200: fix possible NULL deref during detach
staging: comedi: 8255_pci: fix possible NULL deref during detach
staging: comedi: ni_daq_700: fix dio subdevice regression
...
Linus Torvalds [Fri, 26 Oct 2012 17:24:51 +0000 (10:24 -0700)]
Merge tag 'driver-core-3.7-rc3' of git://git./linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg Kroah-Hartman:
"Here are a number of firmware core fixes for 3.7, and some other minor
fixes. And some documentation updates thrown in for good measure.
All have been in the linux-next tree for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'driver-core-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
Documentation:Chinese translation of Documentation/arm64/memory.txt
Documentation:Chinese translation of Documentation/arm64/booting.txt
Documentation:Chinese translation of Documentation/IRQ.txt
firmware loader: document kernel direct loading
sysfs: sysfs_pathname/sysfs_add_one: Use strlcat() instead of strcat()
dynamic_debug: Remove unnecessary __used
firmware loader: sync firmware cache by async_synchronize_full_domain
firmware loader: let direct loading back on 'firmware_buf'
firmware loader: fix one reqeust_firmware race
firmware loader: cancel uncache work before caching firmware
Linus Torvalds [Fri, 26 Oct 2012 17:24:19 +0000 (10:24 -0700)]
Merge tag 'char-misc-3.7-rc3' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg Kroah-Hartman:
"Here are some driver fixes for 3.7. They include extcon driver fixes,
a hyper-v bugfix, and two other minor driver fixes.
All of these have been in the linux-next releases for a while.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>"
* tag 'char-misc-3.7-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
sonypi: suspend/resume callbacks should be conditionally compiled on CONFIG_PM_SLEEP
Drivers: hv: Cleanup error handling in vmbus_open()
extcon : register for cable interest by cable name
extcon: trivial: kfree missed from remove path
extcon: driver model release call not needed
extcon: MAX77693: Add platform data for MUIC device to initialize registers
extcon: max77693: Use max77693_update_reg for rmw operations
extcon: Fix kerneldoc for extcon_set_cable_state and extcon_set_cable_state_
extcon: adc-jack: Add missing MODULE_LICENSE
extcon: adc-jack: Fix checking return value of request_any_context_irq
extcon: Fix return value in extcon_register_interest()
extcon: unregister compat link on cleanup
extcon: Unregister compat class at module unload to fix oops
extcon: optimising the check_mutually_exclusive function
extcon: standard cable names definition and declaration changed
extcon-max8997: remove usage of ret in max8997_muic_handle_charger_type_detach
extcon: Remove duplicate inclusion of extcon.h header file
Linus Torvalds [Fri, 26 Oct 2012 17:05:07 +0000 (10:05 -0700)]
VFS: don't do protected {sym,hard}links by default
In commit
800179c9b8a1 ("This adds symlink and hardlink restrictions to
the Linux VFS"), the new link protections were enabled by default, in
the hope that no actual application would care, despite it being
technically against legacy UNIX (and documented POSIX) behavior.
However, it does turn out to break some applications. It's rare, and
it's unfortunate, but it's unacceptable to break existing systems, so
we'll have to default to legacy behavior.
In particular, it has broken the way AFD distributes files, see
http://www.dwd.de/AFD/
along with some legacy scripts.
Distributions can end up setting this at initrd time or in system
scripts: if you have security problems due to link attacks during your
early boot sequence, you have bigger problems than some kernel sysctl
setting. Do:
echo 1 > /proc/sys/fs/protected_symlinks
echo 1 > /proc/sys/fs/protected_hardlinks
to re-enable the link protections.
Alternatively, we may at some point introduce a kernel config option
that sets these kinds of "more secure but not traditional" behavioural
options automatically.
Reported-by: Nick Bowler <nbowler@elliptictech.com>
Reported-by: Holger Kiehl <Holger.Kiehl@dwd.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org # v3.6
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 26 Oct 2012 17:03:22 +0000 (10:03 -0700)]
Merge tag 'sound-3.7' of git://git./linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"Slightly a high amount of commits come from Adrian Knoth's HDSPM
driver fixes. Other than that, all small trival fixes or quirks that
are pretty driver-specific."
* tag 'sound-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: wm8994: Only enable extra BCLK cycles when required
ALSA: als3000: check for the kzalloc return value
ALSA: sound/isa/opti9xx/miro.c: eliminate possible double free
ALSA: hda - Fix silent headphone output from Toshiba P200
ALSA: hdspm - Fix coding style in CTL_ELEM macros
ALSA: hdspm - Fix typo in kcontrol element on RME MADI cards
ALSA: hdspm - Fix sync_in detection on AES/AES32
ALSA: hdspm - Fix sync_in reporting on RME MADI cards
ALSA: hdspm - Also report autosync_sample_rate on MADI and MADIface
ALSA: hdspm - Fix reported autosync_sample_rate
ALSA: hdspm - Fix sync check reporting on all RME HDSPM cards
ALSA: hdspm - Report external rate in slave mode on PCI MADI
ALSA: hdspm - Allow DDS/Varispeed to be set from userspace
ALSA: hda - add dock support for Thinkpad T430
ASoC: ux500_msp_i2s: Fix devm_* and return code merge error
ASoC: Ux500: Dispose of device nodes correctly
Linus Torvalds [Fri, 26 Oct 2012 17:01:43 +0000 (10:01 -0700)]
Merge branch 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping
Pull DMA-mapping revert from Marek Szyprowski:
"Due to my mistake, my previous pull request (merged as commit
cff7b8ba60e3: "Merge branch 'fixes_for_linus' ..") contained a patch
which is aimed for v3.8 and lacks its dependences. This pull request
reverts it and fixes build break of ARM architecture."
* 'fixes_for_linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
Revert "ARM: dma-mapping: support debug_dma_mapping_error"
Linus Torvalds [Fri, 26 Oct 2012 16:35:46 +0000 (09:35 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
"This fixes a couple of nasty page table initialization bugs which were
causing kdump regressions. A clean rearchitecturing of the code is in
the works - meanwhile these are reverts that restore the
best-known-working state of the kernel.
There's also EFI fixes and other small fixes."
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, mm: Undo incorrect revert in arch/x86/mm/init.c
x86: efi: Turn off efi_enabled after setup on mixed fw/kernel
x86, mm: Find_early_table_space based on ranges that are actually being mapped
x86, mm: Use memblock memory loop instead of e820_RAM
x86, mm: Trim memory in memblock to be page aligned
x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
x86/efi: Fix oops caused by incorrect set_memory_uc() usage
x86-64: Fix page table accounting
Revert "x86/mm: Fix the size calculation of mapping tables"
MAINTAINERS: Add EFI git repository location
Linus Torvalds [Fri, 26 Oct 2012 16:35:00 +0000 (09:35 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
Pull perf fixes from Ingo Molnar:
"Most of the kernel diffstat relates to a group of Intel P6 and KNC
(Xeon-Phi Knights Corner) PMU driver fixes, neither of which is in
heavy use, so we took the fixes.
The rest is diverse smallish fixes to the tooling and kernel side."
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86: Remove unused variable in nhmex_rbox_alter_er()
perf/x86: Enable overflow on Intel KNC with a custom knc_pmu_handle_irq()
perf/x86: Remove cpuc->enable check on Intl KNC event enable/disable
perf/x86: Make Intel KNC use full 40-bit width of counters
perf/x86/uncore: Handle pci_read_config_dword() errors
perf/x86: Remove P6 cpuc->enabled check
perf/x86: Update/fix generic events on P6 PMU
perf/x86: Fix P6 FP_ASSIST event constraint
perf, cpu hotplug: Use cached value of smp_processor_id()
perf, cpu hotplug: Run CPU_STARTING notifiers with irqs disabled
x86/perf: Fix virtualization sanity check
perf test: Fix exclude_guest parse events tests
perf tools: do not flush maps on COMM for perf report
perf help: Fix --help for builtins
perf trace: Check if sample raw_data field is set
perf trace: Validate syscall id before growing syscall table
Linus Torvalds [Fri, 26 Oct 2012 16:34:04 +0000 (09:34 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/mason/linux-btrfs
Pull btrfs fixes from Chris Mason:
"This has our series of fixes for the next rc. The biggest batch is
from Jan Schmidt, fixing up some problems in our subvolume quota code
and fixing btrfs send/receive to work with the new extended inode
refs."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
Btrfs: do not bug when we fail to commit the transaction
Btrfs: fix memory leak when cloning root's node
Btrfs: Use btrfs_update_inode_fallback when creating a snapshot
Btrfs: Send: preserve ownership (uid and gid) also for symlinks.
Btrfs: fix deadlock caused by the nested chunk allocation
btrfs: Return EINVAL when length to trim is less than FSB
Btrfs: fix memory leak in btrfs_quota_enable()
Btrfs: send correct rdev and mode in btrfs-send
Btrfs: extended inode refs support for send mechanism
Btrfs: Fix wrong error handling code
Fix a sign bug causing invalid memory access in the ino_paths ioctl.
Btrfs: comment for loop in tree_mod_log_insert_move
Btrfs: fix extent buffer reference for tree mod log roots
Btrfs: determine level of old roots
Btrfs: tree mod log's old roots could still be part of the tree
Btrfs: fix a tree mod logging issue for root replacement operations
Btrfs: don't put removals from push_node_left into tree mod log twice
John W. Linville [Fri, 26 Oct 2012 14:32:13 +0000 (10:32 -0400)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless into for-davem
Arnd Bergmann [Fri, 26 Oct 2012 13:11:30 +0000 (15:11 +0200)]
Merge branch 'fixes' of git://git./linux/kernel/git/horms/renesas into fixes
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas:
ARM: shmobile: r8a7779: I/O address abuse cleanup
Arnd Bergmann [Fri, 26 Oct 2012 12:45:06 +0000 (14:45 +0200)]
Merge branch 'v3.7-samsung-fixes-2' of git://git./linux/kernel/git/kgene/linux-samsung into fixes
From Kukjin Kim <kgene.kim@samsung.com>:
One is spi stuff for fix the device names for the different subtypes of
the spi controller. And the other is adding missing .smp field for
exynos4-dt and fixing memory sections for exynos4210-trats board.
* 'v3.7-samsung-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kgene/linux-samsung:
ARM: EXYNOS: Set .smp field of machine descriptor for exynos4-dt
ARM: dts: Split memory into 4 sections for exynos4210-trats
ARM: SAMSUNG: Add naming of s3c64xx-spi devices
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Ingo Molnar [Fri, 26 Oct 2012 08:17:38 +0000 (10:17 +0200)]
Merge tag 'efi-for-3.7' of git://git./linux/kernel/git/mfleming/efi into x86/urgent
Pull EFI fixes from Matt Fleming:
"Fix oops with EFI variables on mixed 32/64-bit firmware/kernels and
document EFI git repository location on kernel.org."
Conflicts:
arch/x86/include/asm/efi.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Chris Metcalf [Thu, 25 Oct 2012 07:25:20 +0000 (07:25 +0000)]
tilegx: fix some issues in the SW TSO support
This change correctly computes the header length and data length in
the fragments to avoid a bug where we would end up with extremely
slow performance. Also adopt use of skb_frag_size() accessor.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Cc: stable@vger.kernel.org [v3.6]
Signed-off-by: David S. Miller <davem@davemloft.net>
Dan Williams [Wed, 24 Oct 2012 12:10:34 +0000 (12:10 +0000)]
qmi_wwan/cdc_ether: move Novatel 551 and E362 to qmi_wwan
These devices provide QMI and ethernet functionality via a standard CDC
ethernet descriptor. But when driven by cdc_ether, the QMI
functionality is unavailable because only cdc_ether can claim the USB
interface. Thus blacklist the devices in cdc_ether and add their IDs to
qmi_wwan, which enables both QMI and ethernet simultaneously.
Signed-off-by: Dan Williams <dcbw@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hemant Kumar [Thu, 25 Oct 2012 18:17:54 +0000 (18:17 +0000)]
net: usb: Fix memory leak on Tx data path
Driver anchors the tx urbs and defers the urb submission if
a transmit request comes when the interface is suspended.
Anchoring urb increments the urb reference count. These
deferred urbs are later accessed by calling usb_get_from_anchor()
for submission during interface resume. usb_get_from_anchor()
unanchors the urb but urb reference count remains same.
This causes the urb reference count to remain non-zero
after usb_free_urb() gets called and urb never gets freed.
Hence call usb_put_urb() after anchoring the urb to properly
balance the reference count for these deferred urbs. Also,
unanchor these deferred urbs during disconnect, to free them
up.
Signed-off-by: Hemant Kumar <hemantk@codeaurora.org>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Dotan Barak [Thu, 25 Oct 2012 01:12:49 +0000 (01:12 +0000)]
net/mlx4_core: Unmap UAR also in the case of error flow
If a failure takes place during the EQ creation, we need to unmap the
UAR memory block too.
Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il>
Signed-off-by: Uri Habusha <urih@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Moni Shoua [Thu, 25 Oct 2012 01:12:48 +0000 (01:12 +0000)]
net/mlx4_en: Don't use vlan tag value as an indication for vlan presence
The vlan tag can be zero. This is why it can't serve as an indication
that packet requires VLAN header in the TX flow.
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jack Morgenstein [Thu, 25 Oct 2012 01:12:47 +0000 (01:12 +0000)]
net/mlx4_en: Fix double-release-range in tx-rings
The QP range is reserved as a single block. However, when freeing the
en resources, the tx-ring QPs are released both in mlx4_en_destroy_tx_ring
(one at a time) and in mlx4_en_free_resources (as a block release).
Fix by eliminating the one-at-a-time release in mlx4_en_destroy_tx_ring.
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>