Konrad Rzeszutek Wilk [Wed, 19 Jan 2011 01:15:21 +0000 (20:15 -0500)]
xen/mmu: Add the notion of identity (1-1) mapping.
Our P2M tree structure is a three-level tree. On the leaf nodes
we set the Machine Frame Number (MFN) of the PFN. What this means
is that when one does: pfn_to_mfn(pfn), which is used when creating
PTE entries, you get the real MFN of the hardware. When Xen sets
up a guest it initially populates an array which has descending
(or ascending) MFN values, as so:
idx: 0, 1, 2
[0x290F, 0x290E, 0x290D, ..]
so pfn_to_mfn(2)==0x290D. If you start and restart many guests, that list
starts looking quite random.
We graft this structure onto our P2M tree structure and stick
those MFNs in the leaves. But for all other leaf entries, or for the top
root, or middle one, for which there is a void entry, we assume it is
"missing". So
pfn_to_mfn(0xc0000)=INVALID_P2M_ENTRY.
We add the possibility of setting 1-1 mappings on certain regions, so
that:
pfn_to_mfn(0xc0000)=0xc0000
The benefit of this is, that we can assume for non-RAM regions (think
PCI BARs, or ACPI spaces), we can create mappings easily b/c we
get the PFN value to match the MFN.
For this to work efficiently we introduce one new page p2m_identity and
allocate (via reserved_brk) any other pages we need to cover the sides
(1GB or 4MB boundary violations). All entries in p2m_identity are set to
INVALID_P2M_ENTRY type (Xen toolstack only recognizes that and MFNs,
no other fancy value).
On lookup we spot that the entry points to p2m_identity and return the identity
value instead of dereferencing and returning INVALID_P2M_ENTRY. If the entry
points to an allocated page, we just proceed as before and return the PFN.
If the PFN has IDENTITY_FRAME_BIT set we unmask that in appropriate functions
(pfn_to_mfn).
The reason for having the IDENTITY_FRAME_BIT instead of just returning the
PFN is that we could find ourselves where pfn_to_mfn(pfn)==pfn for a
non-identity pfn. To protect ourselves against that, we elect to set (and get) the
IDENTITY_FRAME_BIT on all identity mapped PFNs.
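The lookup rule described above can be sketched in user-space C as follows. This is only a toy model, not the kernel code: the toy_ names are hypothetical, a single middle page stands in for the whole tree, 512 entries per page assumes a 64-bit build, and the position of IDENTITY_FRAME_BIT (second bit from the top of an unsigned long) is an assumption.

```c
#include <stddef.h>

#define P2M_PER_PAGE        512UL	/* entries per page; assumes 64-bit */
#define INVALID_P2M_ENTRY   (~0UL)
/* Assumed bit position: second from the top of an unsigned long. */
#define IDENTITY_FRAME_BIT  (1UL << (sizeof(unsigned long) * 8 - 2))
#define IDENTITY_FRAME(pfn) ((pfn) | IDENTITY_FRAME_BIT)

/*
 * Shared sentinel leaves. Both hold only INVALID_P2M_ENTRY; they are
 * told apart by address, not by content.
 */
unsigned long toy_p2m_missing[P2M_PER_PAGE];
unsigned long toy_p2m_identity[P2M_PER_PAGE];

void toy_p2m_init(void)
{
	for (size_t i = 0; i < P2M_PER_PAGE; i++) {
		toy_p2m_missing[i] = INVALID_P2M_ENTRY;
		toy_p2m_identity[i] = INVALID_P2M_ENTRY;
	}
}

/* Raw lookup in one middle page (an array of leaf pointers). */
unsigned long toy_get_phys_to_machine(unsigned long **mid, unsigned long pfn)
{
	unsigned long *leaf = mid[(pfn / P2M_PER_PAGE) % P2M_PER_PAGE];

	if (leaf == toy_p2m_identity)
		return IDENTITY_FRAME(pfn);	/* sentinel: do not dereference */
	return leaf[pfn % P2M_PER_PAGE];
}

/* Lookup that strips the marker bit, as pfn_to_mfn is described to do. */
unsigned long toy_pfn_to_mfn(unsigned long **mid, unsigned long pfn)
{
	unsigned long mfn = toy_get_phys_to_machine(mid, pfn);

	if (mfn != INVALID_P2M_ENTRY)
		mfn &= ~IDENTITY_FRAME_BIT;
	return mfn;
}
```

Because the marker bit survives in the raw lookup, a caller can tell an identity PFN apart from a RAM PFN whose MFN merely happens to equal it.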
This simplistic diagram is used to explain the more subtle piece of code.
There is also a diagram of the P2M at the end that can help.
Imagine your E820 looking as so:
                   1GB                                           2GB
/-------------------+---------\/----\         /----------\    /---+-----\
| System RAM        | Sys RAM ||ACPI|         | reserved |    | Sys RAM |
\-------------------+---------/\----/         \----------/    \---+-----/
                              ^- 1029MB                       ^- 2001MB
[1029MB = 263424 (0x40500), 2001MB = 512256 (0x7D100), 2048MB = 524288 (0x80000)]
And dom0_mem=max:3GB,1GB is passed in to the guest, meaning memory past 1GB
is actually not present (would have to kick the balloon driver to put it in).
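The PFN values quoted for this example follow from the page size. A minimal sketch of the arithmetic, assuming 4KB pages (mb_to_pfn is a hypothetical helper, not a kernel function):

```c
#define PAGE_SHIFT 12	/* assumes 4KB pages */

/* Convert a size in MB to the PFN of the first page at that offset. */
unsigned long mb_to_pfn(unsigned long mb)
{
	return mb << (20 - PAGE_SHIFT);
}
```

With these assumptions, mb_to_pfn(1029) is 263424 (0x40500) and mb_to_pfn(2001) is 512256 (0x7D100), matching the numbers above.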
When we are told to set the PFNs for identity mapping (see patch: "xen/setup:
Set identity mapping for non-RAM E820 and E820 gaps.") we pass in the start
of the PFN and the end PFN (263424 and 512256 respectively). The first step is
to reserve_brk a top leaf page if the p2m[1] is missing. The top leaf page
covers 512^2 of page estate (1GB) and in case the start or end PFN is not
aligned on 512^2*PAGE_SIZE (1GB) we loop on aligned 1GB PFNs from start pfn to
end pfn. We reserve_brk top leaf pages if they are missing (means they point
to p2m_mid_missing).
With the E820 example above, 263424 is not 1GB aligned so we allocate a
reserve_brk page which will cover the PFNs estate from 0x40000 to 0x80000.
Each entry in the allocated page is "missing" (points to p2m_missing).
Next stage is to determine if we need to do a more granular boundary check
on the 4MB (or 2MB depending on architecture) off the start and end pfn's.
We check if the start pfn and end pfn violate that boundary check, and if
so reserve_brk a middle (p2m[x][y]) leaf page. This way we have a much finer
granularity of setting which PFNs are missing and which ones are identity.
In our example 263424 and 512256 both fail the check so we reserve_brk two
pages. Populate them with INVALID_P2M_ENTRY (so they both have "missing" values)
and assign them to p2m[1][2] and p2m[1][488] respectively.
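The p2m[1][2] and p2m[1][488] slots can be checked with a little index arithmetic. The sketch below assumes 512 entries per page at every level (a 64-bit build, so a leaf covers 2MB); the helper names are illustrative only.

```c
#define P2M_PER_PAGE      512UL	/* PFNs per leaf page; assumes 64-bit */
#define P2M_MID_PER_PAGE  512UL	/* leaf pointers per middle page */

unsigned long top_index(unsigned long pfn)
{
	return pfn / (P2M_MID_PER_PAGE * P2M_PER_PAGE);
}

unsigned long mid_index(unsigned long pfn)
{
	return (pfn / P2M_PER_PAGE) % P2M_MID_PER_PAGE;
}

unsigned long leaf_index(unsigned long pfn)
{
	return pfn % P2M_PER_PAGE;
}

/* Nonzero when pfn does not sit on a 512^2-PFN (1GB) boundary. */
int off_top_boundary(unsigned long pfn)
{
	return pfn % (P2M_MID_PER_PAGE * P2M_PER_PAGE) != 0;
}

/* Nonzero when pfn does not sit on a 512-PFN (2MB) boundary. */
int off_leaf_boundary(unsigned long pfn)
{
	return leaf_index(pfn) != 0;
}
```

For 263424 this gives top 1, middle 2, leaf slot 256; for 512256 it gives top 1, middle 488, leaf slot 256. Both PFNs are off the 1GB and the 2MB boundaries, which is why a middle page and two leaf pages are needed in the example.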
At this point we would at minimum reserve_brk one page, but could be up to
three. Each call to set_phys_range_identity has at maximum a three page
cost. If we were to query the P2M at this stage, all those entries from
start PFN through end PFN (so 1029MB -> 2001MB) would return INVALID_P2M_ENTRY
("missing").
The next step is to walk from the start pfn to the end pfn setting
the IDENTITY_FRAME_BIT on each PFN. This is done in 'set_phys_range_identity'.
If we find that the middle leaf is pointing to p2m_missing we can swap it over
to p2m_identity - this way covering 4MB (or 2MB) PFN space. At this point we
do not need to worry about boundary alignment (so no need to reserve_brk a middle
page, figure out which PFNs are "missing" and which ones are identity), as that
has been done earlier. If we find that the middle leaf is not occupied by
p2m_identity or p2m_missing, we dereference that page (which covers
512 PFNs) and set the appropriate PFN with IDENTITY_FRAME_BIT. In our example
263424 and 512256 end up there, and we set from p2m[1][2][256->511] and
p2m[1][488][0->256] with IDENTITY_FRAME_BIT set.
All other regions that are void (or not filled) either point to p2m_missing
(considered missing) or have the default value of INVALID_P2M_ENTRY (also
considered missing). In our case, p2m[1][2][0->255] and p2m[1][488][257->511]
contain the INVALID_P2M_ENTRY value and are considered "missing."
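How a partially covered leaf ends up half "missing" and half identity can be sketched as below. This is an illustration under stated assumptions, not the set_phys_range_identity code: the helper names are hypothetical, the slot range is inclusive to follow the notation used above, and the IDENTITY_FRAME_BIT position is assumed.

```c
#include <stddef.h>

#define P2M_PER_PAGE        512UL	/* assumes 64-bit */
#define INVALID_P2M_ENTRY   (~0UL)
/* Assumed bit position: second from the top of an unsigned long. */
#define IDENTITY_FRAME_BIT  (1UL << (sizeof(unsigned long) * 8 - 2))

/* A freshly reserved leaf starts out entirely "missing". */
void leaf_init_missing(unsigned long *leaf)
{
	for (size_t i = 0; i < P2M_PER_PAGE; i++)
		leaf[i] = INVALID_P2M_ENTRY;
}

/*
 * Mark slots first..last (inclusive) as identity. base_pfn is the PFN
 * that slot 0 of this leaf stands for.
 */
void leaf_set_identity(unsigned long *leaf, unsigned long base_pfn,
		       size_t first, size_t last)
{
	for (size_t i = first; i <= last && i < P2M_PER_PAGE; i++)
		leaf[i] = (base_pfn + i) | IDENTITY_FRAME_BIT;
}
```

For the p2m[1][2] leaf, slot 0 stands for PFN 0x40400, so marking slots 256 through 511 leaves slots 0 through 255 as INVALID_P2M_ENTRY and puts 0x40500 with the identity bit in slot 256.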
This is what the p2m ends up looking like (for the E820 above) with this
fabulous drawing:
   p2m         /--------------\
 /-----\       | &mfn_list[0],|                           /-----------------\
 |  0  |------>| &mfn_list[1],|    /---------------\      | ~0, ~0, ..      |
 |-----|       |  ..., ~0, ~0 |    | ~0, ~0, [x]---+----->| IDENTITY [@256] |
 |  1  |---\   \--------------/    | [p2m_identity]+\     | IDENTITY [@257] |
 |-----|    \                      | [p2m_identity]+\\    | ....            |
 |  2  |--\  \-------------------->|  ...          | \\   \-----------------/
 |-----|   \                       \---------------/  \\
 |  3  |\   \                                          \\  p2m_identity
 |-----| \   \-------------------->/---------------\   /-----------------\
 | ..  +->+                        | [p2m_identity]+-->| ~0, ~0, ~0, ... |
 \-----/ /                         | [p2m_identity]+-->| ..., ~0         |
        / /---------------\        | ....          |   \-----------------/
       /  | IDENTITY[@0]  |      /-+-[x], ~0, ~0.. |
      /   | IDENTITY[@256]|<----/  \---------------/
     /    | ~0, ~0, ....  |
    |     \---------------/
    |
    p2m_missing            p2m_missing
/------------------\     /------------\
| [p2m_mid_missing]+---->| ~0, ~0, ~0 |
| [p2m_mid_missing]+---->| ..., ~0    |
\------------------/     \------------/
where ~0 is INVALID_P2M_ENTRY. IDENTITY is (PFN | IDENTITY_BIT)
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
[v5: Changed code to use ranges, added ASCII art]
[v6: Rebased on top of xen->p2m code split]
[v4: Squished patches in just this one]
[v7: Added RESERVE_BRK for potentially allocated pages]
[v8: Fixed alignment problem]
[v9: Changed 1<<3X to 1<<BITS_PER_LONG-X]
[v10: Copied git commit description in the p2m code + Add Review tag]
[v11: Title had '2-1' - should be '1-1' mapping]
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Konrad Rzeszutek Wilk [Wed, 19 Jan 2011 01:09:41 +0000 (20:09 -0500)]
xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.
With this patch, we diligently set regions that will be used by the
balloon driver to be INVALID_P2M_ENTRY and under the ownership
of the balloon driver. We are OK using the __set_phys_to_machine
as we do not expect to be allocating any P2M middle or entries pages.
The set_phys_to_machine has the side-effect of potentially allocating
new pages and we do not want that at this stage.
We can do this because xen_build_mfn_list_list will have already
allocated all such pages up to xen_max_p2m_pfn.
We also move the check for auto translated physmap down the
stack so it is present in __set_phys_to_machine.
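The difference between the two setters can be illustrated with a small user-space model. Everything here is hypothetical (the toy_ names, the flat array of four leaves, the use of malloc); it only shows the shape of "a setter that never allocates" wrapped by "a setter that allocates and retries".

```c
#include <stdbool.h>
#include <stdlib.h>

#define P2M_PER_PAGE       512UL
#define TOY_NR_LEAVES      4UL
#define INVALID_P2M_ENTRY  (~0UL)

unsigned long *toy_leaves[TOY_NR_LEAVES];	/* NULL: leaf not allocated */
unsigned long toy_nr_allocs;

/* Models the non-allocating setter: fails if the leaf is absent. */
bool toy_set_noalloc(unsigned long pfn, unsigned long mfn)
{
	unsigned long idx = pfn / P2M_PER_PAGE;

	if (idx >= TOY_NR_LEAVES || !toy_leaves[idx])
		return false;
	toy_leaves[idx][pfn % P2M_PER_PAGE] = mfn;
	return true;
}

/* Models the allocating setter: allocate the leaf, then retry. */
bool toy_set(unsigned long pfn, unsigned long mfn)
{
	unsigned long idx = pfn / P2M_PER_PAGE;

	if (toy_set_noalloc(pfn, mfn))
		return true;
	if (idx >= TOY_NR_LEAVES)
		return false;
	toy_leaves[idx] = malloc(P2M_PER_PAGE * sizeof(unsigned long));
	if (!toy_leaves[idx])
		return false;
	for (unsigned long i = 0; i < P2M_PER_PAGE; i++)
		toy_leaves[idx][i] = INVALID_P2M_ENTRY;
	toy_nr_allocs++;
	return toy_set_noalloc(pfn, mfn);
}
```

Once every leaf up to the maximum PFN already exists, as the message says is the case here, the non-allocating variant always succeeds and no allocation can happen behind the caller's back.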
[v2: Rebased with mmu->p2m code split]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Linus Torvalds [Sat, 22 Jan 2011 03:01:34 +0000 (19:01 -0800)]
Linux 2.6.38-rc2
Linus Torvalds [Sat, 22 Jan 2011 00:50:31 +0000 (16:50 -0800)]
Merge branch 'media_fixes' of git://git./linux/kernel/git/mchehab/linux-2.6
* 'media_fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-2.6: (101 commits)
[media] staging/lirc: fix mem leaks and ptr err usage
[media] hdpvr: reduce latency of i2c read/write w/recycled buffer
[media] hdpvr: enable IR part
[media] rc/mceusb: timeout should be in ns, not us
[media] v4l2-device: fix 'use-after-freed' oops
[media] v4l2-dev: don't memset video_device.dev
[media] zoran: use video_device_alloc instead of kmalloc
[media] w9966: zero device state after a detach
[media] v4l: Fix a use-before-set in the control framework
[media] v4l: Include linux/videodev2.h in media/v4l2-ctrls.h
[media] DocBook/v4l: update V4L2 revision and update copyright years
[media] DocBook/v4l: fix validation error in dev-rds.xml
[media] v4l2-ctrls: queryctrl shouldn't attempt to replace V4L2_CID_PRIVATE_BASE IDs
[media] v4l2-ctrls: fix missing 'read-only' check
[media] pvrusb2: Provide more information about IR units to lirc_zilog and ir-kbd-i2c
[media] ir-kbd-i2c: Add back defaults setting for Zilog Z8's at addr 0x71
[media] lirc_zilog: Update TODO.lirc_zilog
[media] lirc_zilog: Add Andy Walls to copyright notice and authors list
[media] lirc_zilog: Remove useless struct i2c_driver.command function
[media] lirc_zilog: Remove unneeded tests for existence of the IR Tx function
...
David Howells [Thu, 20 Jan 2011 16:38:33 +0000 (16:38 +0000)]
KEYS: Fix up comments in key management code
Fix up comments in the key management code. No functional changes.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
David Howells [Thu, 20 Jan 2011 16:38:27 +0000 (16:38 +0000)]
KEYS: Do some style cleanup in the key management code.
Do a bit of a style clean up in the key management code. No functional
changes.
Done using:
perl -p -i -e 's!^/[*]*/\n!!' security/keys/*.c
perl -p -i -e 's!} /[*] end [a-z0-9_]*[(][)] [*]/\n!}\n!' security/keys/*.c
sed -i -s -e ": next" -e N -e 's/^\n[}]$/}/' -e t -e P -e 's/^.*\n//' -e "b next" security/keys/*.c
To remove /*****/ lines, remove comments on the closing brace of a
function to name the function and remove blank lines before the closing
brace of a function.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Fri, 21 Jan 2011 21:44:07 +0000 (13:44 -0800)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
cifs: fix up CIFSSMBEcho for unaligned access
cifs: fix unaligned accesses in cifsConvertToUCS
cifs: clean up unaligned accesses in cifs_unicode.c
cifs: fix unaligned access in check2ndT2 and coalesce_t2
cifs: clean up unaligned accesses in validate_t2
cifs: use get/put_unaligned functions to access ByteCount
cifs: move time field in cifsInodeInfo
cifs: TCP_Server_Info diet
CIFS: Implement cifs_strict_readv (try #4)
CIFS: Implement cifs_file_strict_mmap (try #2)
CIFS: Implement cifs_strict_fsync
CIFS: Make cifsFileInfo_put work with strict cache mode
Linus Torvalds [Fri, 21 Jan 2011 21:43:21 +0000 (13:43 -0800)]
Merge branch 'fixes-2.6.38' of git://git./linux/kernel/git/tj/percpu
* 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
x86,percpu: Move out of place 64 bit ops into X86_64 section
Linus Torvalds [Fri, 21 Jan 2011 21:38:57 +0000 (13:38 -0800)]
Merge branch 'fixes-2.6.38' of git://git./linux/kernel/git/tj/wq
* 'fixes-2.6.38' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
workqueue: note the nested NOT_RUNNING test in worker_clr_flags() isn't a noop
workqueue: relax lockdep annotation on flush_work()
Linus Torvalds [Fri, 21 Jan 2011 21:38:26 +0000 (13:38 -0800)]
Merge branch 'irq-cleanup-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'irq-cleanup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (37 commits)
um: Use generic irq Kconfig
tile: Use generic irq Kconfig
sparc: Use generic irq Kconfig
score: Use generic irq Kconfig
powerpc: Use generic irq Kconfig
parisc: Use generic irq Kconfig
mn10300: Use generic irq Kconfig
microblaze: Use generic irq Kconfig
m68knommu: Use generic irq Kconfig
ia64: Use generic irq Kconfig
frv: Use generic irq Kconfig
blackfin: Use generic irq Kconfig
alpha: Use generic irq Kconfig
genirq: Remove __do_IRQ
m32r: Convert to generic irq Kconfig
m32r: Convert usrv platform irq handling
m32r: Convert opsput_lcdpld irq chip
m32r: Convert opsput lanpld irq chip
m32r: Convert opsput pld irq chip
m32r: Convert opsput irq chip
...
Linus Torvalds [Fri, 21 Jan 2011 21:35:10 +0000 (13:35 -0800)]
Merge branch 'stable/bug-fixes-rc1' of git://git./linux/kernel/git/konrad/xen
* 'stable/bug-fixes-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: p2m: correctly initialize partial p2m leaf
xen: fix non-ANSI function warning in irq.c
Linus Torvalds [Fri, 21 Jan 2011 21:34:39 +0000 (13:34 -0800)]
Merge branches 'fixes' and 'fwnet' of git://git./linux/kernel/git/ieee1394/linux1394-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: core: fix unstable I/O with Canon camcorder
* 'fwnet' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
firewire: net: is not experimental anymore
firewire: net: invalidate ARP entries of removed nodes
Linus Torvalds [Fri, 21 Jan 2011 21:24:33 +0000 (13:24 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound-2.6
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
ALSA: hda - Fix EAPD to low on CZC P10T tablet computer with ALC662
ALSA: HDA: Add SKU ignore for another Thinkpad Edge 14
ALSA: hda - Fix "unused variable" compile warning
ALSA: hda - Add quirk for HP Z-series workstation
Revert "ALSA: HDA: Create mixers on ALC887"
ASoC: PXA: Fix codec address on Zipit Z2
ASoC: PXA: Fix jack detection on Zipit Z2
ASoC: Blackfin: fix DAI/SPORT config dependency issues
ASoC: Blackfin TDM: use external frame syncs
ASoC: Blackfin AC97: fix build error after multi-component update
ASoC: Blackfin TDM: fix missed snd_soc_dai_get_drvdata update
ASoC: documentation updates
ALSA: ice1712 delta - initialize SPI clock
Linus Torvalds [Fri, 21 Jan 2011 21:24:16 +0000 (13:24 -0800)]
Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
powerpc/83xx: fix build failures on dt compatible list.
Linus Torvalds [Fri, 21 Jan 2011 21:23:52 +0000 (13:23 -0800)]
Merge branch 'merge' of git://git./linux/kernel/git/benh/powerpc
* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (34 commits)
powerpc/mpic: Fix mask/unmask timeout message
powerpc/pseries: Add BNX2=m to defconfig
powerpc: Enable 64kB pages and 1024 threads in pseries config
powerpc: Disable mcount tracers in pseries defconfig
powerpc/boot/dts: Install dts from the right directory
powerpc: machine_check_generic is wrong on 64bit
powerpc: Check RTAS extended log flag before checking length
powerpc: Fix corruption when grabbing FWNMI data
powerpc: Rework pseries machine check handler
powerpc: Don't silently handle machine checks from userspace
powerpc: Remove duplicate debugger hook in machine_check_exception
powerpc: Never halt RTAS error logging after receiving an unrecoverable machine check
powerpc: Don't force MSR_RI in machine_check_exception
powerpc: Print 32 bits of DSISR in show_regs
powerpc/kdump: Disable ftrace during kexec
powerpc/kdump: Move crash_kexec_stop_spus to kdump crash handler
powerpc/kexec: Remove empty ppc_md.machine_kexec_prepare
powerpc/kexec: Don't initialise kexec hooks to default handlers
powerpc/kdump: Remove ppc_md.machine_crash_shutdown
powerpc/kexec: Remove ppc_md.machine_kexec
...
Michal Simek [Fri, 21 Jan 2011 07:49:56 +0000 (08:49 +0100)]
mm: System without MMU do not need pte_mkwrite
The patch "thp: export maybe_mkwrite" (commit 14fd403f2146) breaks
systems without MMU.
Error log:
CC arch/microblaze/mm/init.o
In file included from include/linux/mman.h:14,
from arch/microblaze/mm/consistent.c:24:
include/linux/mm.h: In function 'maybe_mkwrite':
include/linux/mm.h:482: error: implicit declaration of function 'pte_mkwrite'
include/linux/mm.h:482: error: incompatible types in assignment
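One plausible shape for this kind of fix is to compile the helper only when an MMU is configured. The following stand-alone sketch is an assumption, not the actual patch: the types, the flag values and the simplified maybe_mkwrite signature (taking vm_flags directly) are all illustrative.

```c
#define CONFIG_MMU 1	/* remove this line to model a build without an MMU */

typedef struct { unsigned long pte; } pte_t;

#define TOY_VM_WRITE  0x2UL	/* illustrative flag values */
#define TOY_PAGE_RW   0x2UL

#ifdef CONFIG_MMU
/* Only architectures with an MMU provide this. */
pte_t pte_mkwrite(pte_t pte)
{
	pte.pte |= TOY_PAGE_RW;
	return pte;
}

/* Compiled only when pte_mkwrite() exists, so no implicit declaration. */
pte_t maybe_mkwrite(pte_t pte, unsigned long vm_flags)
{
	if (vm_flags & TOY_VM_WRITE)
		pte = pte_mkwrite(pte);
	return pte;
}
#endif
```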
Signed-off-by: Michal Simek <monstr@monstr.eu>
CC: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roland Dreier [Fri, 21 Jan 2011 00:23:08 +0000 (16:23 -0800)]
MAINTAINERS: Update Roland Dreier's email address
The cisco.com address will stop working soon, and besides no one can
remember the second "d" in "rolandd" or how to spell "rdreier."
Signed-off-by: Roland Dreier <rolandd@cisco.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Stefan Bader [Thu, 20 Jan 2011 14:38:23 +0000 (15:38 +0100)]
xen: p2m: correctly initialize partial p2m leaf
After changing the p2m mapping to a tree by
commit 58e05027b530ff081ecea68e38de8d59db8f87e0
xen: convert p2m to a 3 level tree
and trying to boot a DomU with 615MB of memory, the following crash was
observed in the dump:
kernel direct mapping tables up to 26f00000 @ 1ec4000-1fff000
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: [<c0107397>] xen_set_pte+0x27/0x60
*pdpt = 0000000000000000 *pde = 0000000000000000
Adding further debug statements showed that when trying to set up
pfn=0x26700 the returned mapping was invalid.
pfn=0x266ff calling set_pte(0xc1fe77f8, 0x6b3003)
pfn=0x26700 calling set_pte(0xc1fe7800, 0x3)
Although the last_pfn obtained from the startup info is 0x26700, which
should in turn not be hit, the additional 8MB which are added as extra
memory normally seem to be ok. This led to looking into the initial
p2m tree construction, which uses the smaller value and assuming that
there is other code handling the extra memory.
When the p2m tree is set up, the leaves are directly pointed to the
array which the domain builder set up. But if the mapping is not on a
boundary that fits into one p2m page, this will result in the last leaf
being only partially valid. And as the invalid entries are not
initialized in that case, things go badly wrong.
I am trying to fix that by checking whether the current leaf is a
complete map and if not, allocate a completely new page and copy only
the valid pointers there. This may not be the most efficient or elegant
solution, but at least it seems to allow me booting DomUs with memory
assignments all over the range.
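The approach described above can be sketched as follows. This is a user-space illustration with hypothetical helper names, not the patch itself; the number of entries per p2m page is passed in as a parameter, since it depends on the architecture (1024 would be the assumption for a 32-bit build with 4KB pages).

```c
#include <stddef.h>

#define INVALID_P2M_ENTRY (~0UL)

/* Number of valid entries in the last leaf; 0 means it is complete. */
size_t valid_in_last_leaf(unsigned long max_pfn, size_t per_page)
{
	return max_pfn % per_page;
}

/*
 * Build a full leaf from a partial one: copy the valid MFNs and mark
 * every remaining slot as invalid instead of leaving it uninitialized.
 */
void fill_partial_leaf(unsigned long *leaf, const unsigned long *mfn_list,
		       size_t nr_valid, size_t per_page)
{
	for (size_t i = 0; i < per_page; i++)
		leaf[i] = (i < nr_valid) ? mfn_list[i] : INVALID_P2M_ENTRY;
}
```

With last_pfn 0x26700 and 1024 entries per page, the last leaf holds 768 valid entries, so 256 slots would otherwise be read from whatever follows the domain builder's array.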
BugLink: http://bugs.launchpad.net/bugs/686692
[v2: Redid a bit of commit wording and fixed a compile warning]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Linus Torvalds [Fri, 21 Jan 2011 15:33:37 +0000 (07:33 -0800)]
Merge branch 'for_linus' of git://git./linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
quota: Fix deadlock during path resolution
Thomas Gleixner [Wed, 19 Jan 2011 19:46:24 +0000 (20:46 +0100)]
um: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Jeff Dike <jdike@addtoit.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:44:43 +0000 (20:44 +0100)]
tile: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Chris Metcalf <cmetcalf@tilera.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:43:56 +0000 (20:43 +0100)]
sparc: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: "David S. Miller" <davem@davemloft.net>
Thomas Gleixner [Wed, 19 Jan 2011 19:41:19 +0000 (20:41 +0100)]
score: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Chen Liqin <liqin.chen@sunplusct.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:39:39 +0000 (20:39 +0100)]
powerpc: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Thomas Gleixner [Wed, 19 Jan 2011 19:38:30 +0000 (20:38 +0100)]
parisc: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: James Bottomley <James.Bottomley@HansenPartnership.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:36:02 +0000 (20:36 +0100)]
mn10300: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Howells <dhowells@redhat.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:35:05 +0000 (20:35 +0100)]
microblaze: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Simek <monstr@monstr.eu>
Thomas Gleixner [Wed, 19 Jan 2011 19:34:21 +0000 (20:34 +0100)]
m68knommu: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg Ungerer <gerg@uclinux.org>
Thomas Gleixner [Wed, 19 Jan 2011 19:32:46 +0000 (20:32 +0100)]
ia64: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Tony Luck <tony.luck@intel.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:32:04 +0000 (20:32 +0100)]
frv: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: David Howells <dhowells@redhat.com>
Thomas Gleixner [Wed, 19 Jan 2011 19:29:58 +0000 (20:29 +0100)]
blackfin: Use generic irq Kconfig
No functional change.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Frysinger <vapier@gentoo.org>
Thomas Gleixner [Wed, 19 Jan 2011 19:27:11 +0000 (20:27 +0100)]
alpha: Use generic irq Kconfig
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Henderson <rth@twiddle.net>
Thomas Gleixner [Wed, 19 Jan 2011 18:41:35 +0000 (19:41 +0100)]
genirq: Remove __do_IRQ
All architectures are finally converted. Remove the cruft.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Mike Frysinger <vapier@gentoo.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Greg Ungerer <gerg@uclinux.org>
Cc: Michal Simek <monstr@monstr.eu>
Acked-by: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Chen Liqin <liqin.chen@sunplusct.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Jeff Dike <jdike@addtoit.com>
Thomas Gleixner [Wed, 19 Jan 2011 18:17:10 +0000 (19:17 +0100)]
m32r: Convert to generic irq Kconfig
Use the generic irq Kconfig. Select GENERIC_HARDIRQS_NO_DEPRECATED as
we have converted all irq_chip functions.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 18:10:18 +0000 (19:10 +0100)]
m32r: Convert usrv platform irq handling
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 18:01:23 +0000 (19:01 +0100)]
m32r: Convert opsput_lcdpld irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:58:45 +0000 (18:58 +0100)]
m32r: Convert opsput lanpld irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:55:09 +0000 (18:55 +0100)]
m32r: Convert opsput pld irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:48:15 +0000 (18:48 +0100)]
m32r: Convert opsput irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:44:10 +0000 (18:44 +0100)]
m32r: Convert oaks32r irq chips
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:39:27 +0000 (18:39 +0100)]
m32r: Convert mappi3 irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:34:51 +0000 (18:34 +0100)]
m32r: Convert mappi2 irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:27:59 +0000 (18:27 +0100)]
m32r: Convert mappi irq chips
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:19:42 +0000 (18:19 +0100)]
m32r: Convert m32700ut lcdpld irq chip
Convert the irq chip to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 17:14:21 +0000 (18:14 +0100)]
m32r: Convert m32700ut lanpld irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 16:41:51 +0000 (17:41 +0100)]
m32r: Convert m32700ut pld irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Tue, 11 Jan 2011 09:43:49 +0000 (10:43 +0100)]
m32r: Convert m32104ut irq chip
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 16:02:29 +0000 (17:02 +0100)]
m32r: Convert m32104ut irq handling
Convert the irq chips to the new functions and use proper flow
handlers. handle_level_irq is appropriate.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 22 Sep 2010 17:13:16 +0000 (19:13 +0200)]
m32r: Cleanup direct irq_desc access
The irq descriptors are already initialized by the generic
code. Remove the redundant init code and set the irq chip with the
proper accessor function.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 13:20:13 +0000 (14:20 +0100)]
cris: Use generic irq Kconfig
Use the generic irq Kconfig. Select GENERIC_HARDIRQS_NO_DEPRECATED as
we have converted all irq_chip functions. Fix the fallout in
show_interrupts().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
Thomas Gleixner [Wed, 19 Jan 2011 13:05:30 +0000 (14:05 +0100)]
cris: Convert V32 interrupt handling
Convert the irq chip functions and install handle_simple_irq for each
interrupt to get rid of __do_IRQ()
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
Thomas Gleixner [Wed, 19 Jan 2011 12:54:54 +0000 (13:54 +0100)]
cris: Convert V10 interrupt handling
Convert the irq_chip functions and install handle_simple_irq for each
interrupt. This converts V10 to the flow handling and lets us remove
__do_IRQ().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
Thomas Gleixner [Wed, 19 Jan 2011 12:59:01 +0000 (13:59 +0100)]
cris: Use irq handling wrapper
Use the wrapper around __do_IRQ() so we can convert V10 and V32
separately.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Mikael Starvik <starvik@axis.com>
Thomas Gleixner [Wed, 19 Jan 2011 11:26:32 +0000 (12:26 +0100)]
h8300: Use generic irq Kconfig
Switch to the generic irq Kconfig. h8300 has all irq chips converted
to the new functions, so select the GENERIC_HARDIRQS_NO_DEPRECATED
switch as well. Fixup the resulting fallout in show_interrupts().
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 11:18:57 +0000 (12:18 +0100)]
h8300: Convert interrupt handling to flow handler
__do_IRQ is deprecated so h8300 needs to be converted to proper flow
handling. The irq chip is simple and does not require any
mask/ack/eoi functions, so we can use handle_simple_irq.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Thomas Gleixner [Wed, 19 Jan 2011 11:15:29 +0000 (12:15 +0100)]
h8300: Convert to new irq_chip functions
No functional change, just straightforward conversion.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Paul Mundt <lethal@linux-sh.org>
Takashi Iwai [Fri, 21 Jan 2011 07:10:14 +0000 (08:10 +0100)]
Merge branch 'fix/asoc' into for-linus
Takashi Iwai [Fri, 21 Jan 2011 07:10:09 +0000 (08:10 +0100)]
Merge branch 'fix/misc' into for-linus
Scott Wood [Mon, 17 Jan 2011 12:10:41 +0000 (12:10 +0000)]
powerpc/mpic: Fix mask/unmask timeout message
Don't say that enable timed out when it was disable, and
show which IRQ had the problem.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Nishanth Aravamudan [Thu, 13 Jan 2011 13:22:39 +0000 (13:22 +0000)]
powerpc/pseries: Add BNX2=m to defconfig
Upcoming servers will include a Broadcom NIC, add to the defconfig to
increase testing coverage and make sure mainline builds come up with
networking.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Wed, 12 Jan 2011 02:14:32 +0000 (02:14 +0000)]
powerpc: Enable 64kB pages and 1024 threads in pseries config
- Enable 64kB pages so it gets some regular testing.
- The largest POWER7 has 1024 threads so bump NR_CPUS to match.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Wed, 12 Jan 2011 02:12:43 +0000 (02:12 +0000)]
powerpc: Disable mcount tracers in pseries defconfig
IRQSOFF_TRACER and STACK_TRACER force the kernel to be built with -pg
which is a substantial overhead.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Ben Hutchings [Sat, 8 Jan 2011 14:24:01 +0000 (14:24 +0000)]
powerpc/boot/dts: Install dts from the right directory
The dts-installed variable is initialised using a wildcard path that
will be expanded relative to the build directory. Use the existing
variable dtstree to generate an absolute wildcard path that will work
when building in a separate directory.
Reported-by: Gerhard Pircher <gerhard_pircher@gmx.net>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Tested-by: Gerhard Pircher <gerhard_pircher@gmx.net> [against 2.6.32]
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:52:31 +0000 (19:52 +0000)]
powerpc: machine_check_generic is wrong on 64bit
Decoding machine checks is CPU specific and so machine_check_generic doesn't
do the right thing on 64bit chips. Luckily we never call into this code
because we call ppc_md.machine_check_exception instead if available.
Since we check cur_cpu_spec->machine_check before calling it, we may as
well remove machine_check_generic from 64bit archs.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:51:31 +0000 (19:51 +0000)]
powerpc: Check RTAS extended log flag before checking length
The spec suggests we should first check the extended log flag before checking
the length field.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:50:51 +0000 (19:50 +0000)]
powerpc: Fix corruption when grabbing FWNMI data
The FWNMI code uses a global buffer without any locks to read the RTAS error
information. If two CPUs take a machine check at once then we will corrupt
this buffer.
Since most FWNMI rtas messages are not of the extended type, we can create a
64bit percpu buffer and use it where possible. If we do receive an extended
RTAS log then we fall back to the old behaviour of using the global buffer.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
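The buffer selection described in the FWNMI commit above can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: model_save_log(), MODEL_NR_CPUS, MODEL_GLOBAL_SIZE and the two arrays are hypothetical names, not the kernel's, and the real handler works on RTAS data rather than plain memcpy() sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_NR_CPUS     4	/* hypothetical CPU count for the model */
#define MODEL_GLOBAL_SIZE 2048	/* hypothetical size of the shared buffer */

/* One 64-bit slot per CPU: enough for a non-extended log. */
static uint64_t percpu_log[MODEL_NR_CPUS];

/* Shared buffer, still unprotected; only used for extended logs. */
static unsigned char global_log[MODEL_GLOBAL_SIZE];

/*
 * Copy an error log of len bytes from src into the buffer chosen for cpu
 * and return a pointer to the copy.
 */
static void *model_save_log(int cpu, const void *src, size_t len, bool extended)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	assert(src != NULL);

	if (!extended && len <= sizeof(percpu_log[cpu])) {
		/* Common case: private slot that no other CPU writes to. */
		memcpy(&percpu_log[cpu], src, len);
		return &percpu_log[cpu];
	}

	/* Extended log: fall back to the shared buffer, truncating if needed. */
	if (len > sizeof(global_log))
		len = sizeof(global_log);
	memcpy(global_log, src, len);
	return global_log;
}
```

Two CPUs saving non-extended logs at the same time get distinct pointers in this model, which is the property the commit relies on; the extended path keeps the old shared-buffer behaviour and its race.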
Anton Blanchard [Tue, 11 Jan 2011 19:49:19 +0000 (19:49 +0000)]
powerpc: Rework pseries machine check handler
Rework pseries machine check handler:
- If MSR_RI isn't set, we cannot recover even if the machine check was fully
recovered
- Rename nonfatal to recovered
- Handle RTAS_DISP_LIMITED_RECOVERY
- Use BUS_MCEERR_AR instead of BUS_ADRERR
- Don't check all the RTAS error log fields when receiving a synchronous
machine check. Recent versions of the pseries firmware do not fill them
in during a machine check and instead send a follow up error log with
the detailed information. If we see a synchronous machine check, and we
came from userspace then kill the task.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:48:14 +0000 (19:48 +0000)]
powerpc: Don't silently handle machine checks from userspace
If a machine check comes from userspace we send a SIGBUS to the task and
fail to printk anything.
If we are taking machine checks due to bad hardware we want to know about
it right away. Furthermore if we don't complain loudly then it will look
a lot like a bug in the userspace application, potentially causing a lot
of confusion.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:47:20 +0000 (19:47 +0000)]
powerpc: Remove duplicate debugger hook in machine_check_exception
We are calling debugger_fault_handler twice in machine_check_exception.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:46:29 +0000 (19:46 +0000)]
powerpc: Never halt RTAS error logging after receiving an unrecoverable machine check
Newer versions of the System p firmware send a partial RTAS error log in the
machine check handler with a more detailed response appearing sometime later
via check event.
This means at machine check time we do not have enough information to
ascertain exactly what went on. Furthermore, I have found the RTAS error
logs in the machine check handler contain no useful information, so halting on
them makes little sense. If we want to halt it would make more sense to do
it following the error log received sometime later via check event.
In light of this, never halt the error log in the pseries machine
check handler.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:45:31 +0000 (19:45 +0000)]
powerpc: Don't force MSR_RI in machine_check_exception
We should never force MSR_RI on. If we take a machine check with MSR_RI off
then we have no chance of recovering safely.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Tue, 11 Jan 2011 19:44:30 +0000 (19:44 +0000)]
powerpc: Print 32 bits of DSISR in show_regs
We were printing 64 bits of DSISR in show_regs even though it is 32 bit.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 18:00:36 +0000 (18:00 +0000)]
powerpc/kdump: Disable ftrace during kexec
We should disable ftrace during kexec, some of the tracers are very invasive
and we do not want them going off while doing the low level work of swapping
one kernel out for another. This mirrors what we do on x86.
Even though we cannot return from a kexec on powerpc (since we do not implement
CONFIG_KEXEC_JUMP), add the restore code in case we do one day.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Fri, 21 Jan 2011 02:43:59 +0000 (13:43 +1100)]
powerpc/kdump: Move crash_kexec_stop_spus to kdump crash handler
Use the crash handler hooks to run the SPU stop code, just like we do for
ehea and cell RAS code.
While I'm here I noticed "CPUSs reliabally"
so fix the spelling MISTAKESs reliabally.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 17:58:36 +0000 (17:58 +0000)]
powerpc/kexec: Remove empty ppc_md.machine_kexec_prepare
We check for a valid handler before calling ppc_md.machine_kexec_prepare
so we can just remove these empty handlers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 17:57:03 +0000 (17:57 +0000)]
powerpc/kexec: Don't initialise kexec hooks to default handlers
There's no need to initialise ppc_md.machine_kexec and
ppc_md.machine_kexec_prepare to the default handlers.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 17:56:09 +0000 (17:56 +0000)]
powerpc/kdump: Remove ppc_md.machine_crash_shutdown
No one uses ppc_md.machine_crash_shutdown, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 17:55:36 +0000 (17:55 +0000)]
powerpc/kexec: Remove ppc_md.machine_kexec
No one uses ppc_md.machine_kexec, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 17:54:58 +0000 (17:54 +0000)]
powerpc/kexec: Remove ppc_md.machine_kexec_cleanup
No one uses ppc_md.machine_kexec_cleanup, so remove it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Anton Blanchard [Thu, 6 Jan 2011 17:54:15 +0000 (17:54 +0000)]
powerpc/kexec: Move all ppc_md kexec function pointers together
Move all the kexec handlers together.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tejun Heo [Mon, 3 Jan 2011 03:49:25 +0000 (03:49 +0000)]
powerpc/cell: Use system_wq in cpufreq_spudemand
With cmwq, there's no reason to use a separate workqueue in
cpufreq_spudemand. Use system_wq instead. The work items are already
sync canceled on stop, so it's already guaranteed that no work is
running when spu_gov_exit() is entered.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Dave Jones <davej@redhat.com>
Cc: cpufreq@vger.kernel.org
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
roel kluin [Fri, 31 Dec 2010 04:57:46 +0000 (04:57 +0000)]
powerpc/macintosh: Fix wrong test in fan_{read,write}_reg()
Fix error test in fan_{read,write}_reg()
Signed-off-by: Roel Kluin <roel.kluin@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Akinobu Mita [Fri, 24 Dec 2010 20:03:59 +0000 (20:03 +0000)]
powerpc/rtas_flash: Use simple_read_from_buffer
Simplify read file operation for /proc/powerpc/rtas/* interface
by using simple_read_from_buffer.
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
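For readers unfamiliar with the helper named in the rtas_flash commit above, the read pattern it replaces looks roughly like this userspace model. It is a sketch only: model_read_from_buffer() is a hypothetical stand-in, it uses memcpy() where the kernel copies to user space, and it returns -1 where the kernel helper returns a negative errno.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy up to count bytes from the buffer `from` (holding `available`
 * bytes), starting at offset *ppos. Advances *ppos and returns the
 * number of bytes copied, or 0 once the end has been reached.
 */
static long model_read_from_buffer(void *to, size_t count, long *ppos,
				   const void *from, size_t available)
{
	long pos;

	assert(to != NULL && ppos != NULL && from != NULL);
	pos = *ppos;

	if (pos < 0)
		return -1;	/* invalid offset */
	if ((size_t)pos >= available || count == 0)
		return 0;	/* nothing left to read */
	if (count > available - (size_t)pos)
		count = available - (size_t)pos;	/* clamp to what remains */

	memcpy(to, (const char *)from + pos, count);
	*ppos = pos + (long)count;
	return (long)count;
}
```

A read handler built on such a helper only has to format its data into a buffer and make one call, instead of open-coding the offset and length checks each time.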
Akinobu Mita [Fri, 24 Dec 2010 20:03:56 +0000 (20:03 +0000)]
powerpc/spufs: Use simple_write_to_buffer
Simplify several write file operations for spufs by using
simple_write_to_buffer().
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Steven Rostedt [Wed, 22 Dec 2010 16:42:56 +0000 (16:42 +0000)]
powerpc/ppc32/tracing: Add stack frame to calls of trace_hardirqs_on/off
32-bit variant of the previous patch for 64-bit:
<<
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
with a one-level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack....
>>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Steven Rostedt [Thu, 23 Dec 2010 19:46:06 +0000 (19:46 +0000)]
powerpc/ppc64/tracing: Add stack frame to calls of trace_hardirqs_on/off
When an interrupt occurs in userspace, we can call trace_hardirqs_on/off()
with a one-level stack. But if we have irqsoff tracing enabled,
it checks both CALLER_ADDR0 and CALLER_ADDR1. The second call
goes two stack frames up. If this is from user space, then there may
not exist a second stack.
Add a second stack when calling trace_hardirqs_on/off() otherwise
the following oops might occur:
Oops: Kernel access of bad area, sig: 11 [#1]
PREEMPT SMP NR_CPUS=2 PA Semi PWRficient
last sysfs file: /sys/block/sda/size
Modules linked in: ohci_hcd ehci_hcd usbcore
NIP: c0000000000e1c00 LR: c0000000000034d4 CTR: 000000011012c440
REGS: c00000003e2f3af0 TRAP: 0300 Not tainted (2.6.37-rc6+)
MSR: 9000000000001032 <ME,IR,DR> CR: 48044444 XER: 20000000
DAR: 00000001ffb9db50, DSISR: 0000000040000000
TASK = c00000003e1a00a0[2088] 'emacs' THREAD: c00000003e2f0000 CPU: 1
GPR00: 0000000000000001 c00000003e2f3d70 c00000000084e0d0 c0000000008816e8
GPR04: 000000001034c678 000000001032e8f9 0000000010336540 0000000040020000
GPR08: 0000000040020000 00000001ffb9db40 c00000003e2f3e30 0000000060000000
GPR12: 100000000000f032 c00000000fff0280 000000001032e8c9 0000000000000008
GPR16: 00000000105be9c0 00000000105be950 00000000105be9b0 00000000105be950
GPR20: 00000000ffb9dc50 00000000ffb9dbf0 00000000102f0000 00000000102f0000
GPR24: 00000000102e0000 00000000102f0000 0000000010336540 c0000000009ded38
GPR28: 00000000102e0000 c0000000000034d4 c0000000007ccb10 c00000003e2f3d70
NIP [c0000000000e1c00] .trace_hardirqs_off+0xb0/0x1d0
LR [c0000000000034d4] decrementer_common+0xd4/0x100
Call Trace:
[c00000003e2f3d70] [c00000003e2f3e30] 0xc00000003e2f3e30 (unreliable)
[c00000003e2f3e30] [c0000000000034d4] decrementer_common+0xd4/0x100
Instruction dump:
81690000 7f8b0000 419e0018 f84a0028 60000000 60000000 60000000 e95f0000
80030000 e92a0000 eb6301f8 2f800000 <eb890010> 41fe00dc a06d000a eb1e8050
---[ end trace 4ec7fd2be9240928 ]---
Reported-by: Joerg Sommer <joerg@alea.gnuu.de>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Michael Ellerman [Sun, 7 Nov 2010 18:22:29 +0000 (18:22 +0000)]
powerpc: Ensure the else case of feature sections will fit
When we create an alternative feature section, the else case must be the
same size or smaller than the body. This is because when we patch the
else case in we just overwrite the body, so there must be room.
Up to now we just did this by inspection, but it's quite easy to enforce
it in the assembler, so we should.
The only change is to add the ifgt block, but that affects the alignment
of the tabs and so the whole macro is modified.
Also add a test, but #if 0 it because we don't want to break the build.
Anyone who's modifying the feature macros should enable the test.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
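The size constraint enforced by the feature-section commit above can be illustrated with a small C model of the patching step. This is a sketch, not the kernel's patching code: model_patch_alt() is a hypothetical name, the commit itself enforces the rule at assembly time rather than at run time, and padding the leftover space with nops is an assumption of this model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NOP 0x60000000u	/* PowerPC nop encoding (ori 0,0,0) */

/*
 * Overwrite body (body_len instructions) with alt (alt_len instructions)
 * and pad whatever is left with nops. Returns 0 on success, or -1
 * without touching body when the else case would not fit.
 */
static int model_patch_alt(uint32_t *body, size_t body_len,
			   const uint32_t *alt, size_t alt_len)
{
	size_t i;

	assert(body != NULL && alt != NULL);

	if (alt_len > body_len)
		return -1;	/* the case the assembler check rejects */

	for (i = 0; i < alt_len; i++)
		body[i] = alt[i];
	for (; i < body_len; i++)
		body[i] = MODEL_NOP;
	return 0;
}
```

Because the else case is written over the body in place, an oversized else case would spill into the instructions that follow the section; rejecting it at build time turns that into an error instead of silent corruption.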
Linus Torvalds [Fri, 21 Jan 2011 02:30:37 +0000 (18:30 -0800)]
Merge branch 'core-fixes-for-linus' of git://git./linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
smp: Allow on_each_cpu() to be called while early_boot_irqs_disabled
lockdep: Move early boot local IRQ enable/disable status to init/main.c
Rafael J. Wysocki [Wed, 19 Jan 2011 21:27:55 +0000 (22:27 +0100)]
ACPI / PM: Call suspend_nvs_free() earlier during resume
It turns out that some device drivers map pages from the ACPI NVS region
during resume using ioremap(), which conflicts with ioremap_cache() used
for mapping those pages by the NVS save/restore code in nvs.c.
Make the NVS pages mapped by the code in nvs.c be unmapped before device
drivers' resume routines run.
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Rafael J. Wysocki [Wed, 19 Jan 2011 21:27:14 +0000 (22:27 +0100)]
ACPI: Introduce acpi_os_ioremap()
Commit ca9b600be38c ("ACPI / PM: Make suspend_nvs_save() use
acpi_os_map_memory()") attempted to prevent the code in osl.c and nvs.c
from using different ioremap() variants by making the latter use
acpi_os_map_memory() for mapping the NVS pages. However, that also
requires acpi_os_unmap_memory() to be used for unmapping them, which
causes synchronize_rcu() to be executed many times in a row
unnecessarily and introduces substantial delays during resume on some
systems.
Instead of using acpi_os_map_memory() for mapping the NVS pages in nvs.c
introduce acpi_os_ioremap() calling ioremap_cache() and make the code in
both osl.c and nvs.c use it.
Reported-by: Jeff Chua <jeff.chua.linux@gmail.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jeff Layton [Fri, 21 Jan 2011 02:19:25 +0000 (21:19 -0500)]
cifs: fix up CIFSSMBEcho for unaligned access
Make sure that CIFSSMBEcho can handle unaligned fields. Also fix a minor
bug that causes this warning:
fs/cifs/cifssmb.c: In function 'CIFSSMBEcho':
fs/cifs/cifssmb.c:740: warning: large integer implicitly truncated to unsigned type
...WordCount is u8, not __le16, so no need to convert it.
This patch should apply cleanly on top of the rest of the patchset to
clean up unaligned access.
Signed-off-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: Steve French <sfrench@us.ibm.com>
Steve French [Fri, 21 Jan 2011 02:19:30 +0000 (02:19 +0000)]
Merge branch 'for-next'
Linus Torvalds [Fri, 21 Jan 2011 01:02:14 +0000 (17:02 -0800)]
Merge branch 'akpm'
* akpm:
kernel/smp.c: consolidate writes in smp_call_function_interrupt()
kernel/smp.c: fix smp_call_function_many() SMP race
memcg: correctly order reading PCG_USED and pc->mem_cgroup
backlight: fix 88pm860x_bl macro collision
drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking
MAINTAINERS: update Atmel AT91 entry
mm: fix truncate_setsize() comment
memcg: fix rmdir, force_empty with THP
memcg: fix LRU accounting with THP
memcg: fix USED bit handling at uncharge in THP
memcg: modify accounting function for supporting THP better
fs/direct-io.c: don't try to allocate more than BIO_MAX_PAGES in a bio
mm: compaction: prevent division-by-zero during user-requested compaction
mm/vmscan.c: remove duplicate include of compaction.h
memblock: fix memblock_is_region_memory()
thp: keep highpte mapped until it is no longer needed
kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT
Milton Miller [Thu, 20 Jan 2011 22:44:34 +0000 (14:44 -0800)]
kernel/smp.c: consolidate writes in smp_call_function_interrupt()
We have to test the cpu mask in the interrupt handler before checking the
refs, otherwise we can start to follow an entry before it's deleted and
find it partially initialized for the next trip. Presently we also clear
the cpumask bit before executing the called function, which implies
getting write access to the line. After the function is called we then
decrement refs, and if they go to zero we then unlock the structure.
However, this implies getting write access to the call function data
before and after the function is called. If we can assert that no
smp_call_function execution function is allowed to enable interrupts, then
we can move both writes to after the function is called, hopefully allowing
both writes with one cache line bounce.
On a 256 thread system with a kernel compiled for 1024 threads, the time
to execute testcase in the "smp_call_function_many race" changelog was
reduced by about 30-40ms out of about 545 ms.
I decided to keep this as WARN because it's now a buggy function, even
though the stack trace is of no value -- a simple printk would give us the
information needed.
Raw data:
Without patch:
ipi_test startup took 1219366ns complete 539819014ns total 541038380ns
ipi_test startup took 1695754ns complete 543439872ns total 545135626ns
ipi_test startup took 7513568ns complete 539606362ns total 547119930ns
ipi_test startup took 13304064ns complete 533898562ns total 547202626ns
ipi_test startup took 8668192ns complete 544264074ns total 552932266ns
ipi_test startup took 4977626ns complete 548862684ns total 553840310ns
ipi_test startup took 2144486ns complete 541292318ns total 543436804ns
ipi_test startup took 21245824ns complete 530280180ns total 551526004ns
With patch:
ipi_test startup took 5961748ns complete 500859628ns total 506821376ns
ipi_test startup took 8975996ns complete 495098924ns total 504074920ns
ipi_test startup took 19797750ns complete 492204740ns total 512002490ns
ipi_test startup took 14824796ns complete 487495878ns total 502320674ns
ipi_test startup took 11514882ns complete 494439372ns total 505954254ns
ipi_test startup took 8288084ns complete 502570774ns total 510858858ns
ipi_test startup took 6789954ns complete 493388112ns total 500178066ns
#include <linux/module.h>
#include <linux/init.h>
#include <linux/sched.h> /* sched clock */
#define ITERATIONS 100
static void do_nothing_ipi(void *dummy)
{
}
static void do_ipis(struct work_struct *dummy)
{
	int i;
	for (i = 0; i < ITERATIONS; i++)
		smp_call_function(do_nothing_ipi, NULL, 1);
	printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
}
static struct work_struct work[NR_CPUS];
static int __init testcase_init(void)
{
	int cpu;
	u64 start, started, done;
	start = local_clock();
	for_each_online_cpu(cpu) {
		INIT_WORK(&work[cpu], do_ipis);
		schedule_work_on(cpu, &work[cpu]);
	}
	started = local_clock();
	for_each_online_cpu(cpu)
		flush_work(&work[cpu]);
	done = local_clock();
	pr_info("ipi_test startup took %lldns complete %lldns total %lldns\n",
		started-start, done-started, done-start);
	return 0;
}
static void __exit testcase_exit(void)
{
}
module_init(testcase_init)
module_exit(testcase_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Anton Blanchard <anton@samba.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
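The ordering described in the smp_call_function_interrupt() commit above (read-only checks first, both writes after the callback) can be modelled with C11 atomics. This is a simplified single-entry sketch, not the kernel code: struct model_call_data, model_handle_entry() and model_count_call() are hypothetical names, and acquire loads stand in for the kernel's own barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_call_data {
	_Atomic uint32_t cpumask;	/* one bit per target CPU */
	atomic_int refs;		/* CPUs that still have to run func */
	void (*func)(void *info);
	void *info;
};

/* Example callback for the model: counts how often it was invoked. */
static void model_count_call(void *info)
{
	++*(int *)info;
}

/* Handle one queue entry on cpu. Returns true if func was run. */
static bool model_handle_entry(struct model_call_data *d, unsigned int cpu)
{
	uint32_t bit;

	assert(d != NULL && cpu < 32);
	bit = UINT32_C(1) << cpu;

	/* Read-only checks first: the mask, then refs. */
	if (!(atomic_load_explicit(&d->cpumask, memory_order_acquire) & bit))
		return false;
	if (atomic_load_explicit(&d->refs, memory_order_acquire) == 0)
		return false;

	d->func(d->info);

	/* Both writes happen together, after the callback has run. */
	atomic_fetch_and(&d->cpumask, ~bit);
	atomic_fetch_sub(&d->refs, 1);
	return true;
}
```

Until the callback returns, the handler only reads the shared entry, so the cache line holding it does not have to be taken for writing twice. That only works if the callback cannot re-enter the handler, which is the no-interrupt-enabling assertion the commit makes.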
Anton Blanchard [Thu, 20 Jan 2011 22:44:33 +0000 (14:44 -0800)]
kernel/smp.c: fix smp_call_function_many() SMP race
I noticed a failure where we hit the following WARN_ON in
generic_smp_call_function_interrupt:
	if (!cpumask_test_and_clear_cpu(cpu, data->cpumask))
		continue;
	data->csd.func(data->csd.info);
	refs = atomic_dec_return(&data->refs);
	WARN_ON(refs < 0); <-------------------------
We atomically tested and cleared our bit in the cpumask, and yet the
number of cpus left (ie refs) was 0. How can this be?
It turns out commit 54fdade1c3332391948ec43530c02c4794a38172
("generic-ipi: make struct call_function_data lockless") is at fault. It
removes locking from smp_call_function_many and in doing so creates a
rather complicated race.
The problem comes about because:
- The smp_call_function_many interrupt handler walks call_function.queue
without any locking.
- We reuse a percpu data structure in smp_call_function_many.
- We do not wait for any RCU grace period before starting the next
smp_call_function_many.
Imagine a scenario where CPU A does two smp_call_functions back to back,
and CPU B does an smp_call_function in between. We concentrate on how CPU
C handles the calls:
CPU A               CPU B               CPU C                         CPU D
smp_call_function
                                        smp_call_function_interrupt
                                          walks
                                          call_function.queue sees
                                          data from CPU A on list
                    smp_call_function
                                        smp_call_function_interrupt
                                          walks
                                          call_function.queue sees
                                          (stale) CPU A on list
                                                                      smp_call_function int
                                                                      clears last ref on A
                                                                      list_del_rcu, unlock
smp_call_function reuses
percpu *data A
                                        data->cpumask sees and
                                        clears bit in cpumask
                                        might be using old or new fn!
                                        decrements refs below 0
set data->refs (too late!)
The important thing to note is since the interrupt handler walks a
potentially stale call_function.queue without any locking, then another
cpu can view the percpu *data structure at any time, even when the owner
is in the process of initialising it.
The following test case hits the WARN_ON 100% of the time on my PowerPC
box (having 128 threads does help :)
#include <linux/module.h>
#include <linux/init.h>
#define ITERATIONS 100
static void do_nothing_ipi(void *dummy)
{
}
static void do_ipis(struct work_struct *dummy)
{
	int i;
	for (i = 0; i < ITERATIONS; i++)
		smp_call_function(do_nothing_ipi, NULL, 1);
	printk(KERN_DEBUG "cpu %d finished\n", smp_processor_id());
}
static struct work_struct work[NR_CPUS];
static int __init testcase_init(void)
{
	int cpu;
	for_each_online_cpu(cpu) {
		INIT_WORK(&work[cpu], do_ipis);
		schedule_work_on(cpu, &work[cpu]);
	}
	return 0;
}
static void __exit testcase_exit(void)
{
}
module_init(testcase_init)
module_exit(testcase_exit)
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Anton Blanchard");
I tried to fix it by ordering the read and the write of ->cpumask and
->refs. In doing so I missed a critical case but Paul McKenney was able
to spot my bug thankfully :) To ensure we aren't viewing previous
iterations the interrupt handler needs to read ->refs then ->cpumask then
->refs _again_.
Thanks to Milton Miller and Paul McKenney for helping to debug this issue.
[miltonm@bga.com: add WARN_ON and BUG_ON, remove extra read of refs before initial read of mask that doesn't help (also noted by Peter Zijlstra), adjust comments, hopefully clarify scenario ]
[miltonm@bga.com: remove excess tests]
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Milton Miller <miltonm@bga.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: <stable@kernel.org> [2.6.32+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Johannes Weiner [Thu, 20 Jan 2011 22:44:31 +0000 (14:44 -0800)]
memcg: correctly order reading PCG_USED and pc->mem_cgroup
The placement of the read-side barrier is confused: the writer first
sets pc->mem_cgroup, then PCG_USED. The read-side barrier has to be
between testing PCG_USED and reading pc->mem_cgroup.
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Cc: Balbir Singh <balbir@linux.vnet.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
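The pairing described in the memcg commit above (writer: pointer first, then flag; reader: flag first, then pointer) is the usual publish/consume pattern. A minimal userspace sketch with C11 atomics follows; struct model_page_cgroup, MODEL_USED, model_commit() and model_lookup() are hypothetical stand-ins, and release/acquire operations take the place of the kernel's explicit write and read barriers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_USED 0x1u	/* stand-in for the PCG_USED flag */

struct model_page_cgroup {
	atomic_uint flags;
	void *owner;		/* stand-in for pc->mem_cgroup */
};

/* Writer: store the owner first, then set the flag with release order. */
static void model_commit(struct model_page_cgroup *pc, void *owner)
{
	assert(pc != NULL);
	pc->owner = owner;
	atomic_fetch_or_explicit(&pc->flags, MODEL_USED, memory_order_release);
}

/* Reader: test the flag with acquire order, and only then read the owner. */
static void *model_lookup(struct model_page_cgroup *pc)
{
	assert(pc != NULL);
	if (!(atomic_load_explicit(&pc->flags, memory_order_acquire) & MODEL_USED))
		return NULL;
	return pc->owner;
}
```

The ordering on the read side sits between the flag test and the pointer read. Placed anywhere else, a reader could observe the flag as set and still load a stale owner pointer.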
Randy Dunlap [Thu, 20 Jan 2011 22:44:31 +0000 (14:44 -0800)]
backlight: fix 88pm860x_bl macro collision
Fix collision with kernel-supplied #define:
drivers/video/backlight/88pm860x_bl.c:24:1: warning: "CURRENT_MASK" redefined
arch/x86/include/asm/page_64_types.h:6:1: warning: this is the location of the previous definition
Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Haojian Zhuang <haojian.zhuang@marvell.com>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Janusz Krzysztofik [Thu, 20 Jan 2011 22:44:29 +0000 (14:44 -0800)]
drivers/leds/ledtrig-gpio.c: make output match input, tighten input checking
Replicate changes made to drivers/leds/ledtrig-backlight.c.
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Nicolas Ferre [Thu, 20 Jan 2011 22:44:27 +0000 (14:44 -0800)]
MAINTAINERS: update Atmel AT91 entry
Add two co-maintainers and update the entry with new information.
Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Acked-by: Andrew Victor <linux@maxim.org.za>
Acked-by: Jean-Christophe PLAGNIOL-VILLARD <plagnioj@jcrosoft.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Jan Kara [Thu, 20 Jan 2011 22:44:26 +0000 (14:44 -0800)]
mm: fix truncate_setsize() comment
Contrary to what the comment says, truncate_setsize() should be called
*before* the filesystem truncates blocks.
Signed-off-by: Jan Kara <jack@suse.cz>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>