openwrt/staging/blogic.git
18 years ago[PATCH] sched: implement smpnice
Peter Williams [Tue, 27 Jun 2006 09:54:34 +0000 (02:54 -0700)]
[PATCH] sched: implement smpnice

Problem:

The introduction of separate run queues per CPU has brought with it "nice"
enforcement problems that are best described by a simple example.

For the sake of argument suppose that on a single CPU machine with a
nice==19 hard spinner and a nice==0 hard spinner running that the nice==0
task gets 95% of the CPU and the nice==19 task gets 5% of the CPU.  Now
suppose that there is a system with 2 CPUs and 2 nice==19 hard spinners and
2 nice==0 hard spinners running.  The user of this system would be entitled
to expect that the nice==0 tasks each get 95% of a CPU and the nice==19
tasks only get 5% each.  However, whether this expectation is met is pretty
much down to luck as there are four equally likely distributions of the
tasks to the CPUs that the load balancing code will consider to be balanced
with loads of 2.0 for each CPU.  Two of these distributions involve one
nice==0 and one nice==19 task per CPU and in these circumstances the users
expectations will be met.  The other two distributions both involve both
nice==0 tasks being on one CPU and both nice==19 being on the other CPU and
each task will get 50% of a CPU and the user's expectations will not be
met.

Solution:

The solution to this problem that is implemented in the attached patch is
to use weighted loads when determining if the system is balanced and, when
an imbalance is detected, to move an amount of weighted load between run
queues (as opposed to a number of tasks) to restore the balance.  Once
again, the easiest way to explain why both of these measures are necessary
is to use a simple example.  Suppose that (in a slight variation of the
above example) that we have a two CPU system with 4 nice==0 and 4 nice=19
hard spinning tasks running and that the 4 nice==0 tasks are on one CPU and
the 4 nice==19 tasks are on the other CPU.  The weighted loads for the two
CPUs would be 4.0 and 0.2 respectively and the load balancing code would
move 2 tasks resulting in one CPU with a load of 2.0 and the other with
load of 2.2.  If this was considered to be a big enough imbalance to
justify moving a task and that task was moved using the current
move_tasks() then it would move the highest priority task that it found and
this would result in one CPU with a load of 3.0 and the other with a load
of 1.2 which would result in the movement of a task in the opposite
direction and so on -- infinite loop.  If, on the other hand, an amount of
load to be moved is calculated from the imbalance (in this case 0.1) and
move_tasks() skips tasks until it find ones whose contributions to the
weighted load are less than this amount it would move two of the nice==19
tasks resulting in a system with 2 nice==0 and 2 nice=19 on each CPU with
loads of 2.1 for each CPU.

One of the advantages of this mechanism is that on a system where all tasks
have nice==0 the load balancing calculations would be mathematically
identical to the current load balancing code.

Notes:

struct task_struct:

has a new field load_weight which (in a trade off of space for speed)
stores the contribution that this task makes to a CPU's weighted load when
it is runnable.

struct runqueue:

has a new field raw_weighted_load which is the sum of the load_weight
values for the currently runnable tasks on this run queue.  This field
always needs to be updated when nr_running is updated so two new inline
functions inc_nr_running() and dec_nr_running() have been created to make
sure that this happens.  This also offers a convenient way to optimize away
this part of the smpnice mechanism when CONFIG_SMP is not defined.

int try_to_wake_up():

in this function the value SCHED_LOAD_BALANCE is used to represent the load
contribution of a single task in various calculations in the code that
decides which CPU to put the waking task on.  While this would be a valid
on a system where the nice values for the runnable tasks were distributed
evenly around zero it will lead to anomalous load balancing if the
distribution is skewed in either direction.  To overcome this problem
SCHED_LOAD_SCALE has been replaced by the load_weight for the relevant task
or by the average load_weight per task for the queue in question (as
appropriate).

int move_tasks():

The modifications to this function were complicated by the fact that
active_load_balance() uses it to move exactly one task without checking
whether an imbalance actually exists.  This precluded the simple
overloading of max_nr_move with max_load_move and necessitated the addition
of the latter as an extra argument to the function.  The internal
implementation is then modified to move up to max_nr_move tasks and
max_load_move of weighted load.  This slightly complicates the code where
move_tasks() is called and if ever active_load_balance() is changed to not
use move_tasks() the implementation of move_tasks() should be simplified
accordingly.

struct sched_group *find_busiest_group():

Similar to try_to_wake_up(), there are places in this function where
SCHED_LOAD_SCALE is used to represent the load contribution of a single
task and the same issues are created.  A similar solution is adopted except
that it is now the average per task contribution to a group's load (as
opposed to a run queue) that is required.  As this value is not directly
available from the group it is calculated on the fly as the queues in the
groups are visited when determining the busiest group.

A key change to this function is that it is no longer to scale down
*imbalance on exit as move_tasks() uses the load in its scaled form.

void set_user_nice():

has been modified to update the task's load_weight field when it's nice
value and also to ensure that its run queue's raw_weighted_load field is
updated if it was runnable.

From: "Siddha, Suresh B" <suresh.b.siddha@intel.com>

With smpnice, sched groups with highest priority tasks can mask the imbalance
between the other sched groups with in the same domain.  This patch fixes some
of the listed down scenarios by not considering the sched groups which are
lightly loaded.

a) on a simple 4-way MP system, if we have one high priority and 4 normal
   priority tasks, with smpnice we would like to see the high priority task
   scheduled on one cpu, two other cpus getting one normal task each and the
   fourth cpu getting the remaining two normal tasks.  but with current
   smpnice extra normal priority task keeps jumping from one cpu to another
   cpu having the normal priority task.  This is because of the
   busiest_has_loaded_cpus, nr_loaded_cpus logic..  We are not including the
   cpu with high priority task in max_load calculations but including that in
   total and avg_load calcuations..  leading to max_load < avg_load and load
   balance between cpus running normal priority tasks(2 Vs 1) will always show
   imbalanace as one normal priority and the extra normal priority task will
   keep moving from one cpu to another cpu having normal priority task..

b) 4-way system with HT (8 logical processors).  Package-P0 T0 has a
   highest priority task, T1 is idle.  Package-P1 Both T0 and T1 have 1 normal
   priority task each..  P2 and P3 are idle.  With this patch, one of the
   normal priority tasks on P1 will be moved to P2 or P3..

c) With the current weighted smp nice calculations, it doesn't always make
   sense to look at the highest weighted runqueue in the busy group..
   Consider a load balance scenario on a DP with HT system, with Package-0
   containing one high priority and one low priority, Package-1 containing one
   low priority(with other thread being idle)..  Package-1 thinks that it need
   to take the low priority thread from Package-0.  And find_busiest_queue()
   returns the cpu thread with highest priority task..  And ultimately(with
   help of active load balance) we move high priority task to Package-1.  And
   same continues with Package-0 now, moving high priority task from package-1
   to package-0..  Even without the presence of active load balance, load
   balance will fail to balance the above scenario..  Fix find_busiest_queue
   to use "imbalance" when it is lightly loaded.

[kernel@kolivas.org: sched: store weighted load on up]
[kernel@kolivas.org: sched: add discrete weighted cpu load function]
[suresh.b.siddha@intel.com: sched: remove dead code]
Signed-off-by: Peter Williams <pwil3058@bigpond.com.au>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "Chen, Kenneth W" <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Con Kolivas <kernel@kolivas.org>
Cc: John Hawkes <hawkes@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sched: CPU hotplug race vs. set_cpus_allowed()
Kirill Korotaev [Tue, 27 Jun 2006 09:54:32 +0000 (02:54 -0700)]
[PATCH] sched: CPU hotplug race vs. set_cpus_allowed()

There is a race between set_cpus_allowed() and move_task_off_dead_cpu().
__migrate_task() doesn't report any err code, so task can be left on its
runqueue if its cpus_allowed mask changed so that dest_cpu is not longer a
possible target.  Also, chaning cpus_allowed mask requires rq->lock being
held.

Signed-off-by: Kirill Korotaev <dev@openvz.org>
Acked-By: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] unnecessary long index i in sched
Steven Rostedt [Tue, 27 Jun 2006 09:54:31 +0000 (02:54 -0700)]
[PATCH] unnecessary long index i in sched

Unless we expect to have more than 2G CPUs, there's no reason to have 'i'
as a long long here.

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sched: fix interactive ceiling code
Con Kolivas [Tue, 27 Jun 2006 09:54:30 +0000 (02:54 -0700)]
[PATCH] sched: fix interactive ceiling code

The relationship between INTERACTIVE_SLEEP and the ceiling is not perfect
and not explicit enough.  The sleep boost is not supposed to be any larger
than without this code and the comment is not clear enough about what
exactly it does, just the reason it does it.  Fix it.

There is a ceiling to the priority beyond which tasks that only ever sleep
for very long periods cannot surpass.  Fix it.

Prevent the on-runqueue bonus logic from defeating the idle sleep logic.

Opportunity to micro-optimise.

Signed-off-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Mike Galbraith <efault@gmx.de>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sched: simplify bitmap definition
Steven Rostedt [Tue, 27 Jun 2006 09:54:29 +0000 (02:54 -0700)]
[PATCH] sched: simplify bitmap definition

Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sched: fix smt nice lock contention and optimization
Chen, Kenneth W [Tue, 27 Jun 2006 09:54:28 +0000 (02:54 -0700)]
[PATCH] sched: fix smt nice lock contention and optimization

Initial report and lock contention fix from Chris Mason:

Recent benchmarks showed some performance regressions between 2.6.16 and
2.6.5.  We tracked down one of the regressions to lock contention in
schedule heavy workloads (~70,000 context switches per second)

kernel/sched.c:dependent_sleeper() was responsible for most of the lock
contention, hammering on the run queue locks.  The patch below is more of a
discussion point than a suggested fix (although it does reduce lock
contention significantly).  The dependent_sleeper code looks very expensive
to me, especially for using a spinlock to bounce control between two
different siblings in the same cpu.

It is further optimized:

* perform dependent_sleeper check after next task is determined
* convert wake_sleeping_dependent to use trylock
* skip smt runqueue check if trylock fails
* optimize double_rq_lock now that smt nice is converted to trylock
* early exit in searching first SD_SHARE_CPUPOWER domain
* speedup fast path of dependent_sleeper

[akpm@osdl.org: cleanup]
Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
Acked-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Con Kolivas <kernel@kolivas.org>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: Chris Mason <mason@suse.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add proper Kconfig, Makefile entries
Jim Cromie [Tue, 27 Jun 2006 09:54:27 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add proper Kconfig, Makefile entries

Replace the temp makefile hacks with proper CONFIG entries, which are also
added to Kconfig.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: display pin values in/out in gpio_dump
Jim Cromie [Tue, 27 Jun 2006 09:54:26 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: display pin values in/out in gpio_dump

Add current pin settings to gpio_dump() output.  This adds the last 'word' to
the syslog lines, which displays the input and output values that the pin is
set to.

  pc8736x_gpio.0: io00: 0x0044 TS OD PUE  EDGE LO DEBOUNCE        io:1/1

The 2 values may differ for a number of reasons:
1- the pin output circuitry is diaabled, (as the above 'TS' indicates)
2- it needs a pullup resistor to drive the attached circuit,
3- the external circuit needs a pullup so the open-drain has something
   to pull-down
4- the pin is wired to Vcc or Ground

It might be appropriate to add a WARN for 2,3,4, since they could
damage the chip and/or circuit, esp if misconfig goes unnoticed.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] gpio-patchset-fixups: include linux/io.h
Jim Cromie [Tue, 27 Jun 2006 09:54:25 +0000 (02:54 -0700)]
[PATCH] gpio-patchset-fixups: include linux/io.h

Hmm.  Im somewhat ambivalent about this patch, since with it, driver wont
build for vanilla 17 or older.

Its also only 1/2 of your suggestion - when I tried it, I was building against
vanilla 17, and asm/uaccess.h cause compilation failure.  Looking back, Im
perplexed as to why linux/io.h didnt cause same failure ?!?

use linux/io.h rather than asm/io.h

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: replace spinlocks w mutexes
Jim Cromie [Tue, 27 Jun 2006 09:54:25 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: replace spinlocks w mutexes

Replace spinlocks guarding gpio config ops with mutexes.  This is a me-too
patch, and is justifiable insofar as mutexes have stricter semantics and
better debugging support, so are preferred where they are applicable.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: fix gpio_current, use shadow regs
Jim Cromie [Tue, 27 Jun 2006 09:54:24 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: fix gpio_current, use shadow regs

Add a working gpio_current() to pc8736x_gpio.c (the previous implementation
just threw a dev_warn), and fix gpio_change() to use gpio_current() rather
than the incorrect (and temporary) gpio_get().  Initialize shadow-regs so this
all works.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: use dev_dbg in common module
Jim Cromie [Tue, 27 Jun 2006 09:54:23 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: use dev_dbg in common module

Use of dev_dbg() and friends is considered good practice.  dev_dbg() needs a
struct device *devp, but nsc_gpio is only a helper module, so it doesnt
have/need its own.  To provide devp to the user-modules (scx200 & pc8736x
_gpio), we add it to the vtable, and set it during init.

Also squeeze nsc_gpio_dump()'s format a little.

[  199.259879]  pc8736x_gpio.0: io09: 0x0044 TS OD PUE  EDGE LO DEBOUNCE

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add platform_device for use w dev_dbg
Jim Cromie [Tue, 27 Jun 2006 09:54:22 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add platform_device for use w dev_dbg

Adds platform-device to (just introduced) driver, and uses it to replace many
printks with dev_dbg() etc.  This could trivially be merged into previous
patch, but this way matches better with the corresponding patch that does the
same change to scx200_gpio.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add new pc8736x_gpio module
Jim Cromie [Tue, 27 Jun 2006 09:54:21 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add new pc8736x_gpio module

Add the brand new pc8736x_gpio driver.  This is mostly based upon
scx200_gpio.c, but the platform_dev is treated separately, since its fairly
big too.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: migrate gpio_dump to common module
Jim Cromie [Tue, 27 Jun 2006 09:54:20 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: migrate gpio_dump to common module

Since the meaning of config-bits is the same for scx200 and pc8736x _gpios, we
can share a function to deliver this to user.  Since it is called via the
vtable, its also completely replaceable.  For now, we keep using printk...

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: migrate file-ops to common module
Jim Cromie [Tue, 27 Jun 2006 09:54:20 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: migrate file-ops to common module

Now that the read(), write() file-ops are dispatching gpio-ops via the vtable,
they are generic, and can be moved 'verbatim' to the nsc_gpio common-support
module.  After the move, various symbols are renamed to update 'scx200_' to
'nsc_', and headers are adjusted accordingly.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add empty common-module
Jim Cromie [Tue, 27 Jun 2006 09:54:19 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add empty common-module

Add the nsc_gpio common-support module as an empty shell.  Next patch starts
the migration of the common gpio support routines.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: dispatch via vtable
Jim Cromie [Tue, 27 Jun 2006 09:54:18 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: dispatch via vtable

Now actually call the gpio operations thru the vtable.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add gpio-ops vtable
Jim Cromie [Tue, 27 Jun 2006 09:54:18 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add gpio-ops vtable

Abstract the gpio operations into a new nsc_gpio_ops vtable.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: refactor scx200_probe to better segregat...
Jim Cromie [Tue, 27 Jun 2006 09:54:17 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: refactor scx200_probe to better segregate _gpio initialization

Pull shadow-reg initialization into separate function now, rather than doing
it 2x later (scx200, pc8736x).  When we revisit 2nd drvr below, it will be to
reimplement an init function, rather than another refactor.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add 'v' command to device-file
Jim Cromie [Tue, 27 Jun 2006 09:54:16 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add 'v' command to device-file

Add a new driver command: 'v' which calls gpio_dump() on the pin.  The output
goes to the log, like all other INFO messages in the original driver.  Giving
the user control over the feedback they 'need' is construed to be a
user-friendly feature, and allows us (later) to dial down many INFO messages
to DEBUG log-level.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: put gpio_dump on a diet
Jim Cromie [Tue, 27 Jun 2006 09:54:16 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: put gpio_dump on a diet

Shrink scx200_gpio_dump() to a single printk with ternary ops.  The function
is still ifdef'd out, this is corrected in next patch, when it is actually
used.

The patch 'inadvertently' changed loglevel from DEBUG to INFO.  This is Good,
because in next patch, its wired to a 'command' which the user can invoke when
they want.  When they do so, its because they want INFO to support their
developement effort, and we want to give it to them without compiling a DEBUG
version of the driver.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: device minor numbers are unsigned ints
Jim Cromie [Tue, 27 Jun 2006 09:54:15 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: device minor numbers are unsigned ints

Per kernel headers, device minor numbers are unsigned ints.  Do the same in
this driver.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: add platforn_device for use w dev_dbg
Jim Cromie [Tue, 27 Jun 2006 09:54:14 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: add platforn_device for use w dev_dbg

Add a platform-device to scx200_gpio, and use its struct device dev member
(ie: devp) in dev_dbg() once.

There are 2 alternatives here (Im soliciting guidance/commentary):

- use isa_device, if/when its added to the kernel.

- alter scx200.c to EXPORT_GPL its private devp so that both scx200_gpio,
  and the (to be added) nsc_gpio module can use it.  Since the available devp
  is in 'grandparent', this seems like too much 'action at a distance'.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: modernize driver init to 2.6 api
Jim Cromie [Tue, 27 Jun 2006 09:54:14 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: modernize driver init to 2.6 api

Adopt many modern 2.6 coding practices, ala LDD3, chapter 3.  Changes are
limited to initialization calls from module init, ie: cdev_init, cdev_add,
*_chrdev_region, mkdev.

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] chardev: GPIO for SCx200 & PC-8736x: whitespace pre-clean
Jim Cromie [Tue, 27 Jun 2006 09:54:13 +0000 (02:54 -0700)]
[PATCH] chardev: GPIO for SCx200 & PC-8736x: whitespace pre-clean

GPIO SUPPORT FOR SCx200 & PC8736x

The patch-set reworks the 2.4 vintage scx200_gpio driver for modern 2.6, and
refactors GPIO support to reuse it in a new driver for the GPIO on PC-8736x
chips.  Its handy for the Soekris.com net-4801, which has both chips.

These patches have been seen recently on Kernel-Mentors, and then
Kernel-Newbies ML, where Jesper Juhl kindly reviewed it.  His feedback has
been incorporated.  Thanks Jesper !

Its also gone to soekris-tech@soekris.com for possible testing by linux folks,
I've gotten 1 promise so far.  Theyre mostly BSD folk over there, but we'll
see..

Device-file & Sysfs

The driver preserves the existing device-file interface, including the
write/cmd set, but adds v to 'view' the pin-settings & configs by inducing,
via gpio_dump(), a dev_info() call.  Its a fairly crappy way to get status,
but it sticks to the syslog approach, conservatively.

Allowing users to voluntarily trigger logging is good, it gives them a
familiar way to confirm their app's control & use of the pins, and I've thus
reduced the pin-mode-updates from dev_info to dev_dbg.

I've recently bolted on a proto sysfs interface for both new drivers.  Im not
including those patches here; they (the patch + doc-pre-patch) are still quite
raw (and unreviewed on KNML), and since they 'invent' a convention for GPIO, a
proper vetting is needed.  Since this patchset is much bigger than my previous
ones, Id like to keep things simpler, and address it 1st, before bolting on
more stuff.

The driver-split

The Geode CPU and the PC-87366 Super-IO chip have GPIO units which share a
common pin-architecture (same pin features, with same bits controlling), but
with different addressing mechanics and port organizations.

The vintage driver expresses the pin capabilities with pin-mode commands
[OoPpTt],etc that change the pin configurations, and since the 2 chips share
pin-arch, we can reuse the read(), write() commands, once the implementation
is suitably adjusted.

The patchset adds a vtable: struct nsc_gpio_ops, to abstract the existing gpio
operations, then adjusts fileops.write() code to invoke operations via that
vtable.  Driver specific open()s set private_data to the vtable so its
available for use by write().

The vtable gets the gpio_dump() too, since its user-friendly, and (could be
construed as) part of the current device-file interface.  To support use of
dev_dbg() in write() & _dump(), the vtable gets a dev ptr too, set by both
scx200 & pc8736x _gpio drivers.

heres how the pins are presented in syslog:

[ 1890.176223]  scx200_gpio.0: io00: 0x0044 TS OD PUE  EDGE LO DEBOUNCE
[ 1890.287223]  scx200_gpio.0: io01: 0x0003 OE PP PUD  EDGE LO

nsc_gpio.c: new file is new home of several file-ops methods, which are
modified to get their vtable from filp->private_data, and use it where needed.

scx200_gpio.c: keeps some of its existing gpio routines, but now wires them up
via the vtable (they're invoked by nsc_gpio.c:nsc_gpio_write() thru this
vtable).  A driver-spcific open() initializes filp->private_data with the
vtable.

Once the split is clean, and the scx200_gpio driver is working, we copy and
modify the function and variable names, and rework the access-method bodies
for the different addressing scheme.

Heres a working overview of the patchset:

# series file for GPIO

# Spring Cleaning
gpio-scx/patch.preclean        # scripts/Lindent fixes, editor-ctrl comments

# API Modernization

gpio-scx/patch.api26        # what I learned from LDD3
gpio-scx/patch.platform-dev-2    # get pdev, support for dev_dbg()
gpio-scx/patch.unsigned-minor    # fix to match std practice

# Debuggability

gpio-scx/patch.dump-diet    # shrink gpio_dump()
gpio-scx/patch.viewpins        # add new 'command' to call dump()
gpio-scx/patch.init-refactor    # pull shadow-register init to sub

# Access-Abstraction (add vtable)

gpio-scx/patch.access-vtable    # introduce nsg_gpio_ops vtable, w dump
gpio-scx/patch.vtable-calls    # add & use the vtable in scx200_gpio
gpio-scx/patch.nscgpio-shell    # add empty driver for common-fops

# move code under abstraction
gpio-scx/patch.migrate-fops    # move file-ops methods from scx200_gpio
gpio-scx/patch.common-dump    # mv scx200.c:scx200_gpio_dump() to nsc_gpio.c
gpio-scx/patch.add-pc8736x-gpio    # add new driver, like old, w chip adapt
# gpio-scx/patch.add-DEBUG    # enable all dev_dbg()s

# Cleanups

# finish printk -> dev_dbg() etc
gpio-scx/patch.pdev-pc8736x    # new drvr needs pdev too,
gpio-scx/patch.devdbg-nscgpio    # add device to 'vtable', use in dev_dbg()

# gpio-scx/patch.pin-config-view    # another 'c' 'command'
# gpio-scx/quiet-getset        # take out excess dbg stuff (pretty quiet
now)
gpio-scx/patch.shadow-current    # imitate scx200_gpio's shadow regs in
pc87*

# post KMentors-post patches ..

gpio-scx/patch.mutexes        # use mutexes for config-locks
gpio-scx/patch.viewpins-values    # extend dump to obsolete separate 'c' cmd

gpio-scx/patch.kconfig        # add stuff for kbuild

# TBC
# combine api26 with pdev, which is just one step.
# merge c&v commands to single do-all-fn
# delay viewpins, dump-diet should also un-ifdef it too.

diff.sys-gpio-rollup-1

This patch:

Removed editor format-control comments, and used scripts/Lindent to clean up
whitespace, then deleted the bogus chunks :-(

Signed-off-by: Jim Cromie <jim.cromie@gmail.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: use hotplug version of cpu notifier in appropriate places
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:11 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: use hotplug version of cpu notifier in appropriate places

Make use the of newly defined hotplug version of cpu_notifier functionality
wherever appropriate.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: add hotplug versions of cpu_notifier
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:10 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: add hotplug versions of cpu_notifier

Define new macros register_hotcpu_notifier() and unregister_hotcpu_notifier()
that redefines register_cpu_notifier() and unregister_cpu_notifier() for use
only when HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: make cpu_notifier related notifier calls __cpuinit only
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:10 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: make cpu_notifier related notifier calls __cpuinit only

Make notifier_calls associated with cpu_notifier as __cpuinit.

__cpuinit makes sure that the function is init time only unless
CONFIG_HOTPLUG_CPU is defined.

[akpm@osdl.org: section fix]
Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:09 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: make cpu_notifier related notifier blocks __cpuinit only

Make notifier_blocks associated with cpu_notifier as __cpuinitdata.

__cpuinitdata makes sure that the data is init time only unless
CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: make [un]register_cpu_notifier init time only
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:08 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: make [un]register_cpu_notifier init time only

CPUs come online only at init time (unless CONFIG_HOTPLUG_CPU is defined).
So, cpu_notifier functionality need to be available only at init time.

This patch makes register_cpu_notifier() available only at init time, unless
CONFIG_HOTPLUG_CPU is defined.

This patch exports register_cpu_notifier() and unregister_cpu_notifier() only
if CONFIG_HOTPLUG_CPU is defined.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: revert initdata patch submitted for 2.6.17
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:07 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: revert initdata patch submitted for 2.6.17

This patch reverts notifier_block changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] cpu hotplug: revert init patch submitted for 2.6.17
Chandra Seetharaman [Tue, 27 Jun 2006 09:54:07 +0000 (02:54 -0700)]
[PATCH] cpu hotplug: revert init patch submitted for 2.6.17

In 2.6.17, there was a problem with cpu_notifiers and XFS.  I provided a
band-aid solution to solve that problem.  In the process, i undid all the
changes you both were making to ensure that these notifiers were available
only at init time (unless CONFIG_HOTPLUG_CPU is defined).

We deferred the real fix to 2.6.18.  Here is a set of patches that fixes the
XFS problem cleanly and makes the cpu notifiers available only at init time
(unless CONFIG_HOTPLUG_CPU is defined).

If CONFIG_HOTPLUG_CPU is defined then cpu notifiers are available at run
time.

This patch reverts the notifier_call changes made in 2.6.17

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] rtc: fix idr locking
Sonny Rao [Tue, 27 Jun 2006 09:54:06 +0000 (02:54 -0700)]
[PATCH] rtc: fix idr locking

We need to serialize access to the global rtc_idr even in this error path.

Signed-off-by: Sonny Rao <sonny@burdell.org>
Acked-by: Alessandro Zummo <a.zummo@towertech.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] stallion clean up
Alan Cox [Tue, 27 Jun 2006 09:54:05 +0000 (02:54 -0700)]
[PATCH] stallion clean up

There are two locking sets involved.  One locks the board mappings and the
other is the tty open/close locking.  The low level code was clearly
designed to be ported to OS's with spin locks already so pretty much comes
out in the wash

Signed-off-by: Alan Cox <alan@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] IPMI: use schedule in kthread
akpm@osdl.org [Tue, 27 Jun 2006 09:54:04 +0000 (02:54 -0700)]
[PATCH] IPMI: use schedule in kthread

Corey Minyard <minyard@acm.org>

The kthread used to speed up polling for IPMI was using udelay in its
busy-wait polling loop when the lower-level state machine told it to do a
short delay.  This just used CPU and didn't help scheduling, thus causing
bad problems with other tasks.  Call schedule() instead.

Signed-off-by: Corey Minyard <minyard@acm.org>
Acked-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] rcutorture: add call_rcu_bh() operations
Paul E. McKenney [Tue, 27 Jun 2006 09:54:04 +0000 (02:54 -0700)]
[PATCH] rcutorture: add call_rcu_bh() operations

Add operations for the call_rcu_bh() variant of RCU.  Also add an
rcu_batches_completed_bh() function, which is needed by rcutorture.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] rcutorture: add ops vector and Classic RCU ops
Paul E. McKenney [Tue, 27 Jun 2006 09:54:03 +0000 (02:54 -0700)]
[PATCH] rcutorture: add ops vector and Classic RCU ops

Add an ops vector to rcutorture, and add the ops for Classic RCU.  Update
the rcutorture documentation to reflect slight change to the dmesg formats.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] rcutorture: catchup doc fixes for idle-hz tests
Paul E. McKenney [Tue, 27 Jun 2006 09:54:02 +0000 (02:54 -0700)]
[PATCH] rcutorture: catchup doc fixes for idle-hz tests

This just catches the RCU torture documentation up with the recent fixes
that test RCU for architectures that turn of the scheduling-clock interrupt
for idle CPUs and the addition of a SUCCESS/FAILURE indication, fixing up
an obsolete comment as well.

Signed-off-by: Paul E. McKenney <paulmck@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix kernel-doc in kernel/ dir
Randy Dunlap [Tue, 27 Jun 2006 09:54:01 +0000 (02:54 -0700)]
[PATCH] fix kernel-doc in kernel/ dir

Fix kernel-doc parameters in kernel/

Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1376): No description found for parameter 'u_abs_timeout'
Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_msg_prio'
Warning(/var/linsrc/linux-2617-g9//kernel/auditsc.c:1420): No description found for parameter 'u_abs_timeout'
Warning(/var/linsrc/linux-2617-g9//kernel/acct.c:526): No description found for parameter 'pacct'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] tty: fix TCSBRK comment
Paul Fulghum [Tue, 27 Jun 2006 09:54:00 +0000 (02:54 -0700)]
[PATCH] tty: fix TCSBRK comment

Fix TCSBRK comment to prevent confusion or accidental removal.

Signed-off-by: Paul Fulghum <paulkf@microgate.com>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] ufs: ufs_read_inode cleanup
Evgeniy Dushistov [Tue, 27 Jun 2006 09:53:59 +0000 (02:53 -0700)]
[PATCH] ufs: ufs_read_inode cleanup

Add missed ufsi->i_dir_start_lookup initialization in ufs_read_inode in
UFS2 case.  Also it cleans ufs_read_inode function to prevent such kind of
situation in the future: it move depend on UFS type parts of code into
separate functions and leaves in ufs_read_inode only generic code.  It
cleans code and avoids duplication.

Signed-off-by: Evgeniy Dushistov <dushistov@mail.ru>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] RTC: Add a comment for ENOIOCTLCMD in ds1553_rtc_ioctl
Atsushi Nemoto [Tue, 27 Jun 2006 09:53:58 +0000 (02:53 -0700)]
[PATCH] RTC: Add a comment for ENOIOCTLCMD in ds1553_rtc_ioctl

Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] generic_file_buffered_write(): deadlock on vectored write
Vladimir V. Saveliev [Tue, 27 Jun 2006 09:53:57 +0000 (02:53 -0700)]
[PATCH] generic_file_buffered_write(): deadlock on vectored write

generic_file_buffered_write() prefaults in user pages in order to avoid
deadlock on copying from the same page as write goes to.

However, it looks like there is a problem when write is vectored:
fault_in_pages_readable brings in current segment or its part (maxlen).
OTOH, filemap_copy_from_user_iovec is called to copy number of bytes
(bytes) which may exceed current segment, so filemap_copy_from_user_iovec
switches to the next segment which is not brought in yet.  Pagefault is
generated.  That causes the deadlock if pagefault is for the same page
write goes to: page being written is locked and not uptodate, pagefault
will deadlock trying to lock locked page.

[akpm@osdl.org: somewhat rewritten]
Cc: Neil Brown <neilb@suse.de>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Remove gratuitous inclusion of <linux/config.h> from <linux/dmaengine.h>
David Woodhouse [Tue, 27 Jun 2006 09:53:56 +0000 (02:53 -0700)]
[PATCH] Remove gratuitous inclusion of <linux/config.h> from <linux/dmaengine.h>

We include config.h on the compiler command line. There's no need for it
to be included again.

Signed-off-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] spin/rwlock init cleanups
Ingo Molnar [Tue, 27 Jun 2006 09:53:55 +0000 (02:53 -0700)]
[PATCH] spin/rwlock init cleanups

locking init cleanups:

 - convert " = SPIN_LOCK_UNLOCKED" to spin_lock_init() or DEFINE_SPINLOCK()
 - convert rwlocks in a similar manner

this patch was generated automatically.

Motivation:

 - cleanliness
 - lockdep needs control of lock initialization, which the open-coded
   variants do not give
 - it's also useful for -rt and for lock debugging in general

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fs/buffer.c: cleanups
Adrian Bunk [Tue, 27 Jun 2006 09:53:54 +0000 (02:53 -0700)]
[PATCH] fs/buffer.c: cleanups

- add a proper prototype for the following global function:
  - buffer_init()

- make the following needlessly global function static:
  - end_buffer_async_write()

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] poison: add & use more constants
Randy Dunlap [Tue, 27 Jun 2006 09:53:54 +0000 (02:53 -0700)]
[PATCH] poison: add & use more constants

Add more poison values to include/linux/poison.h.  It's not clear to me
whether some others should be added or not, so I haven't added any of
these:

./include/linux/libata.h:#define ATA_TAG_POISON 0xfafbfcfdU
./arch/ppc/8260_io/fcc_enet.c:1918: memset((char *)(&(immap->im_dprambase[(mem_addr+64)])), 0x88, 32);
./drivers/usb/mon/mon_text.c:429: memset(mem, 0xe5, sizeof(struct mon_event_text));
./drivers/char/ftape/lowlevel/ftape-ctl.c:738: memset(ft_buffer[i]->address, 0xAA, FT_BUFF_SIZE);
./drivers/block/sx8.c:/* 0xf is just arbitrary, non-zero noise; this is sorta like poisoning */

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] update two drivers for poison.h
Randy Dunlap [Tue, 27 Jun 2006 09:53:53 +0000 (02:53 -0700)]
[PATCH] update two drivers for poison.h

Update two drivers to use poison.h.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] add poison.h and patch primary users
Randy Dunlap [Tue, 27 Jun 2006 09:53:52 +0000 (02:53 -0700)]
[PATCH] add poison.h and patch primary users

Localize poison values into one header file for better documentation and
easier/quicker debugging and so that the same values won't be used for
multiple purposes.

Use these constants in core arch., mm, driver, and fs code.

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Acked-by: Matt Mackall <mpm@selenic.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] vdso: randomize the i386 vDSO by moving it into a vma
Ingo Molnar [Tue, 27 Jun 2006 09:53:50 +0000 (02:53 -0700)]
[PATCH] vdso: randomize the i386 vDSO by moving it into a vma

Move the i386 VDSO down into a vma and thus randomize it.

Besides the security implications, this feature also helps debuggers, which
can COW a vma-backed VDSO just like a normal DSO and can thus do
single-stepping and other debugging features.

It's good for hypervisors (Xen, VMWare) too, which typically live in the same
high-mapped address space as the VDSO, hence whenever the VDSO is used, they
get lots of guest pagefaults and have to fix such guest accesses up - which
slows things down instead of speeding things up (the primary purpose of the
VDSO).

There's a new CONFIG_COMPAT_VDSO (default=y) option, which provides support
for older glibcs that still rely on a prelinked high-mapped VDSO.  Newer
distributions (using glibc 2.3.3 or later) can turn this option off.  Turning
it off is also recommended for security reasons: attackers cannot use the
predictable high-mapped VDSO page as syscall trampoline anymore.

There is a new vdso=[0|1] boot option as well, and a runtime
/proc/sys/vm/vdso_enabled sysctl switch, that allows the VDSO to be turned
on/off.

(This version of the VDSO-randomization patch also has working ELF
coredumping, the previous patch crashed in the coredumping code.)

This code is a combined work of the exec-shield VDSO randomization
code and Gerd Hoffmann's hypervisor-centric VDSO patch. Rusty Russell
started this patch and i completed it.

[akpm@osdl.org: cleanups]
[akpm@osdl.org: compile fix]
[akpm@osdl.org: compile fix 2]
[akpm@osdl.org: compile fix 3]
[akpm@osdl.org: revernt MAXMEM change]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Cc: Gerd Hoffmann <kraxel@suse.de>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Zachary Amsden <zach@vmware.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] voyager: fix compile after setup rework
James Bottomley [Tue, 27 Jun 2006 09:53:50 +0000 (02:53 -0700)]
[PATCH] voyager: fix compile after setup rework

The following

[PATCH] Clean up and refactor i386 sub-architecture setup

Doesn't quite work, since it leaves out an include of asm/io.h, without
which the use of inb/outb in the setup file won.t work.  This corrects that
and also removes a spurious acpi reference that apparently crept in ages
ago but should never have been there.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix subarchitecture breakage with CONFIG_SCHED_SMT
James Bottomley [Tue, 27 Jun 2006 09:53:49 +0000 (02:53 -0700)]
[PATCH] fix subarchitecture breakage with CONFIG_SCHED_SMT

Commit 1e9f28fa1eb9773bf65bae08288c6a0a38eef4a7 ("[PATCH] sched: new
sched domain for representing multi-core") incorrectly made SCHED_SMT
and some of the structures it uses dependent on SMP.

However, this is wrong, the structures are only defined if X86_HT, so
SCHED_SMT has to depend on that as well.

The patch broke voyager, since it doesn't provide any of the multi-core
or hyperthreading structures.

Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix broken vm86 interrupt/signal handling
Aleksey Gorelov [Tue, 27 Jun 2006 09:53:48 +0000 (02:53 -0700)]
[PATCH] fix broken vm86 interrupt/signal handling

Commit c3ff8ec31c1249d268cd11390649768a12bec1b9 ("[PATCH] i386: Don't
miss pending signals returning to user mode after signal processing")
meant that vm86 interrupt/signal handling got broken for the case when
vm86 is called from kernel space.

In this scenario, if signal is pending because of vm86 interrupt,
do_notify_resume/do_signal exits immediately due to user_mode() check,
without processing any signals.  Thus, resume_userspace handler is spinning
in a tight loop with signal pending and TIF_SIGPENDING is set.  Previously
everything worked Ok.

No in-tree usage of vm86() from kernel space exists, but I've heard
about a number of projects out there which use vm86 calls from kernel,
one of them being this, for instance:

http://dev.gentoo.org/~spock/projects/vesafb-tng/

The following patch fixes the issue.

Signed-off-by: Aleksey Gorelov <aleksey_gorelov@phoenix.com>
Cc: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Cc: Roland McGrath <roland@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] i386: use C code for current_thread_info()
Chuck Ebbert [Tue, 27 Jun 2006 09:53:47 +0000 (02:53 -0700)]
[PATCH] i386: use C code for current_thread_info()

Using C code for current_thread_info() lets the compiler optimize it.
With gcc 4.0.2, kernel is smaller:

    text           data     bss     dec     hex filename
 3645212         555556  312024 4512792  44dc18 2.6.17-rc6-nb-post/vmlinux
 3647276         555556  312024 4514856  44e428 2.6.17-rc6-nb/vmlinux
 -------
   -2064

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86
Rohit Seth [Tue, 27 Jun 2006 09:53:46 +0000 (02:53 -0700)]
[PATCH] i386: move phys_proc_id and cpu_core_id to cpuinfo_x86

Move the phys_core_id and cpu_core_id to cpuinfo_x86 structure.  Similar
patch for x86_64 is already accepted by Andi earlier this week.

[akpm@osdl.org: fix warning]
Signed-off-by: Rohit Seth <rohitseth@google.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86: constify some parts of arch/i386/kernel/cpu/
Andreas Mohr [Tue, 27 Jun 2006 09:53:45 +0000 (02:53 -0700)]
[PATCH] x86: constify some parts of arch/i386/kernel/cpu/

Signed-off-by: Andreas Mohr <andi@lisas.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86: increase interrupt vector range
Rusty Russell [Tue, 27 Jun 2006 09:53:44 +0000 (02:53 -0700)]
[PATCH] x86: increase interrupt vector range

Remove the limit of 256 interrupt vectors by changing the value stored in
orig_{e,r}ax to be the complemented interrupt vector.  The orig_{e,r}ax
needs to be < 0 to allow the signal code to distinguish between return from
interrupt and return from syscall.  With this change applied, NR_IRQS can
be > 256.

Xen extends the IRQ numbering space to include room for dynamically
allocated virtual interrupts (in the range 256-511), which requires a more
permissive interface to do_IRQ.

Signed-off-by: Ian Pratt <ian.pratt@xensource.com>
Signed-off-by: Christian Limpach <Christian.Limpach@cl.cam.ac.uk>
Signed-off-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: "Protasevich, Natalie" <Natalie.Protasevich@UNISYS.com>
Cc: Andi Kleen <ak@muc.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] x86: cpu_init(): avoid GFP_KERNEL allocation while atomic
Shaohua Li [Tue, 27 Jun 2006 09:53:43 +0000 (02:53 -0700)]
[PATCH] x86: cpu_init(): avoid GFP_KERNEL allocation while atomic

The patch fixes two issues:

1.  cpu_init is called with interrupt disabled.  Allocating gdt table
   there isn't good at runtime.

2. gdt table page cause memory leak in CPU hotplug case.

Signed-off-by: Shaohua Li <shaohua.li@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Zachary Amsden <zach@vmware.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] selinux: inherit /proc/self/attr/keycreate across fork
Michael LeMay [Tue, 27 Jun 2006 09:53:42 +0000 (02:53 -0700)]
[PATCH] selinux: inherit /proc/self/attr/keycreate across fork

Update SELinux to cause the keycreate process attribute held in
/proc/self/attr/keycreate to be inherited across a fork and reset upon
execve.  This is consistent with the handling of the other process
attributes provided by SELinux and also makes it simpler to adapt logon
programs to properly handle the keycreate attribute.

Signed-off-by: Michael LeMay <mdlemay@epoch.ncsc.mil>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
Acked-by: James Morris <jmorris@namei.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] node hotplug: register cpu: remove node struct
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:41 +0000 (02:53 -0700)]
[PATCH] node hotplug: register cpu: remove node struct

With Goto-san's patch, we can add new pgdat/node at runtime.  I'm now
considering node-hot-add with cpu + memory on ACPI.

I found acpi container, which describes node, could evaluate cpu before
memory. This means cpu-hot-add occurs before memory hot add.

In most part, cpu-hot-add doesn't depend on node hot add.  But register_cpu(),
which creates symbolic link from node to cpu, requires that node should be
onlined before register_cpu().  When a node is onlined, its pgdat should be
there.

This patch-set holds off creating symbolic link from node to cpu
until node is onlined.

This removes node arguments from register_cpu().

Now, register_cpu() requires 'struct node' as its argument.  But the array of
struct node is now unified in driver/base/node.c now (By Goto's node hotplug
patch).  We can get struct node in generic way.  So, this argument is not
necessary now.

This patch also guarantees add cpu under node only when node is onlined.  It
is necessary for node-hot-add vs.  cpu-hot-add patch following this.

Moreover, register_cpu calculates cpu->node_id by cpu_to_node() without regard
to its 'struct node *root' argument.  This patch removes it.

Also modify callers of register_cpu()/unregister_cpu, whose args are changed
by register-cpu-remove-node-struct patch.

[Brice.Goglin@ens-lyon.org: fix it]
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Brice Goglin <Brice.Goglin@ens-lyon.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate pgdat and...
Yasunori Goto [Tue, 27 Jun 2006 09:53:40 +0000 (02:53 -0700)]
[PATCH] pgdat allocation and update for ia64 of memory hotplug: allocate pgdat and per node data

This is a patch to allocate pgdat and per node data area for ia64.  The size
for them can be calculated by compute_pernodesize().

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat address...
Yasunori Goto [Tue, 27 Jun 2006 09:53:39 +0000 (02:53 -0700)]
[PATCH] pgdat allocation and update for ia64 of memory hotplug: update pgdat address array

This is to refresh node_data[] array for ia64.  As I mentioned previous
patches, ia64 has copies of information of pgdat address array on each node as
per node data.

At v2 of node_add, this function used stop_machine_run() to update them.  (I
wished that they were copied safety as much as possible.) But, in this patch,
this arrays are just copied simply, and set node_online_map bit after
completion of pgdat initialization.

So, kernel must touch NODE_DATA() macro after checking node_online_map().
(Current code has already done it.) This is more simple way for just
hot-add.....

Note : It will be problem when hot-remove will occur,
       because, even if online_map bit is set, kernel may
       touch NODE_DATA() due to race condition. :-(

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation and update for ia64 of memory hotplug: hold pgdat address...
Yasunori Goto [Tue, 27 Jun 2006 09:53:38 +0000 (02:53 -0700)]
[PATCH] pgdat allocation and update for ia64 of memory hotplug: hold pgdat address at system running

This is a preparatory patch to make common code for updating of NODE_DATA() of
ia64 between boottime and hotplug.

Current code remembers pgdat address in mem_data which is used at just boot
time.  But its information can be used at hotplug time by moving to global
value.  The next patch uses this array.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Register sysfs file for hotplugged new node
Yasunori Goto [Tue, 27 Jun 2006 09:53:38 +0000 (02:53 -0700)]
[PATCH] Register sysfs file for hotplugged new node

When new node becomes enable by hot-add, new sysfs file must be created for
new node.  So, if new node is enabled by add_memory(), register_one_node() is
called to create it.  In addition, I386's arch_register_node() and a part of
register_nodes() of powerpc are consolidated to register_one_node() as a
generic_code().

This is tested by Tiger4(IPF) with node hot-plug emulation.

Signed-off-by: Keiichiro Tokunaga <tokuanga.keiich@jp.fujitsu.com>
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] sparc64: support sparsemem and !memory hotplug
Yasunori Goto [Tue, 27 Jun 2006 09:53:37 +0000 (02:53 -0700)]
[PATCH] sparc64: support sparsemem and !memory hotplug

Fix "undefined reference to `arch_add_memory'" on sparc64 allmodconfig.

sparc64 doesn't support memory hotplug.  But we want it to support
sparsemem.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] catch valid mem range at onlining memory
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:36 +0000 (02:53 -0700)]
[PATCH] catch valid mem range at onlining memory

This patch allows hot-add memory which is not aligned to section.

Now, hot-added memory has to be aligned to section size.  Considering big
section sized archs, this is not useful.

When hot-added memory is registerd as iomem resoruce by iomem resource
patch, we can make use of that information to detect valid memory range.

Note: With this, not-aligned memory can be registerd. To allow hot-add
      memory with holes, we have to do more work around add_memory().
      (It doesn't allows add memory to already existing mem section.)

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] register hot-added memory to iomem resource
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:35 +0000 (02:53 -0700)]
[PATCH] register hot-added memory to iomem resource

Register hot-added memory to iomem_resource.  With this, /proc/iomem can
show hot-added memory.

Note: kdump uses /proc/iomem to catch memory range when it is installed.
      So, kdump should be re-installed after /proc/iomem change.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Vivek Goyal <vgoyal@in.ibm.com>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (call pgdat allocation)
Yasunori Goto [Tue, 27 Jun 2006 09:53:34 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (call pgdat allocation)

Add node-hot-add support to add_memory().

node hotadd uses this sequence.
1. allocate pgdat.
2. refresh NODE_DATA()
3. call free_area_init_node() to initialize
4. create sysfs entry
5. add memory (old add_memory())
6. set node online
7. run kswapd for new node.
(8). update zonelist after pages are onlined. (This is already merged in -mm
   due to update phase is difference.)

Note:
  To make common function as much as possible,
  there is 2 changes from v2.
    - The old add_memory(), which is defiend by each archs,
      is renamed to arch_add_memory(). New add_memory becomes
      caller of arch dependent function as a common code.

    - This patch changes add_memory()'s interface
        From: add_memory(start, end)
        TO  : add_memory(nid, start, end).
      It was cause of similar code that finding node id from
      physical address is inside of old add_memory() on each arch.

      In addition, acpi memory hotplug driver can find node id easier.
      In v2, it must walk DSDT'S _CRS by matching physical address to
      get the handle of its memory device, then get _PXM and node id.
      Because input is just physical address.
      However, in v3, the acpi driver can use handle to get _PXM and node id
      for the new memory device. It can pass just node id to add_memory().

Fix interface of arch_add_memory() is in next patche.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (export kswapd start func)
Yasunori Goto [Tue, 27 Jun 2006 09:53:33 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (export kswapd start func)

When node is hot-added, kswapd for the node should start.  This export kswapd
start function as kswapd_run() to use at add_memory().

[akpm@osdl.org: daemonize() isn't needed when using the kthread API]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (refresh node_data[])
Yasunori Goto [Tue, 27 Jun 2006 09:53:33 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (refresh node_data[])

Refresh NODE_DATA() for generic archs.  In this case, NODE_DATA(nid) ==
node_data[nid].  node_data[] is array of address of pgdat.  So, refresh is
quite simple.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (generic alloc node_data)
Yasunori Goto [Tue, 27 Jun 2006 09:53:32 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (generic alloc node_data)

For node hotplug, basically we have to allocate new pgdat.  But, there are
several types of implementations of pgdat.

1. Allocate only pgdat.
   This style allocate only pgdat area.
   And its address is recorded in node_data[].
   It is most popular style.

2. Static array of pgdat
   In this case, all of pgdats are static array.
   Some archs use this style.

3. Allocate not only pgdat, but also per node data.
   To increase performance, each node has copy of some data as
   a per node data. So, this area must be allocated too.

   Ia64 is this style. Ia64 has the copies of node_data[] array
   on each per node data to increase performance.

In this series of patches, treat (1) as generic arch.

generic archs can use generic function. (2) and (3) should have
its own if necessary.

This patch defines pgdat allocator.
Updating NODE_DATA() macro function is in other patch.

Signed-off-by: Yasonori Goto <y-goto@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (get node id by acpi)
Yasunori Goto [Tue, 27 Jun 2006 09:53:31 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (get node id by acpi)

This is to find node id from acpi's handle of memory_device in DSDT.  _PXM for
the new node can be found by acpi_get_pxm() by using new memory's handle.  So,
node id can be found by pxm_to_nid_map[].

  This patch becomes simpler than v2 of node hot-add patch.
  Because old add_memory() function doesn't have node id parameter.
  So, kernel must find its handle by physical address via DSDT again.
  But, v3 just give node id to add_memory() now.

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pgdat allocation for new node add (specify node id)
Yasunori Goto [Tue, 27 Jun 2006 09:53:30 +0000 (02:53 -0700)]
[PATCH] pgdat allocation for new node add (specify node id)

Change the name of old add_memory() to arch_add_memory.  And use node id to
get pgdat for the node at NODE_DATA().

Note: Powerpc's old add_memory() is defined as __devinit. However,
      add_memory() is usually called only after bootup.
      I suppose it may be redundant. But, I'm not well known about powerpc.
      So, I keep it. (But, __meminit is better at least.)

Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Catch notification of memory add event of ACPI via container driver. (avoid...
Yasunori Goto [Tue, 27 Jun 2006 09:53:29 +0000 (02:53 -0700)]
[PATCH] Catch notification of memory add event of ACPI via container driver. (avoid redundant call add_memory)

When acpi_memory_device_init() is called at boottime to register struct
memory acpi_memory_device, acpi_bus_add() are called via
acpi_driver_attach().

But it also calls ops->start() function.  It is called even if the memory
blocks are initialized at early boottime.  In this case add_memory() return
-EEXIST, and the memory blocks becomes INVALID state even if it is normal.

This is patch to avoid calling add_memory() for already available memory.

[akpm@osdl.org: coding cleanups]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] Catch notification of memory add event of ACPI via container driver. (registe...
Yasunori Goto [Tue, 27 Jun 2006 09:53:28 +0000 (02:53 -0700)]
[PATCH] Catch notification of memory add event of ACPI via container driver. (register start func for memory device)

This is a patch to call add_memroy() when notify reaches for new node's add
event.

When new node is added, notify of ACPI reaches container device which means
the node.

Container device driver calls acpi_bus_scan() to find and add belonging
devices (which means cpu, memory and so on).  Its function calls add and
start function of belonging devices's driver.

Howevever, current memory hotplug driver just register add function to
create sysfs file for its memory.  But, acpi_memory_enable_device() is not
called because it is considered just the case that notify reaches memory
device directly.  So, if notify reaches container device nothing can call
add_memory().

This is a patch to create start function which calls add_memory().
add_memory() can be called by this when notify reaches container device.

[akpm@osdl.org: coding cleanups]
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: "Brown, Len" <len.brown@intel.com>
Cc: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] acpi memory hotplug cannot manage _CRS with plural resoureces
KAMEZAWA Hiroyuki [Tue, 27 Jun 2006 09:53:27 +0000 (02:53 -0700)]
[PATCH] acpi memory hotplug cannot manage _CRS with plural resoureces

Current acpi memory hotplug just looks into the first entry of resources in
_CRS.  But, _CRS can contain plural resources.  So, if _CRS contains plural
resoureces, acpi memory hot add cannot add all memory.

With this patch, acpi memory hotplug can deal with Memory Device, whose
_CRS contains plural resources.

Tested on ia64 memory hotplug test envrionment (not emulation, uses alpha
version firmware which supports dynamic reconfiguration of NUMA.)

Note: Microsoft's Windows Server 2003 requires big (>4G)resoureces to be
      divided into small (<4G) resources. looks crazy, but not invalid.
      (See http://www.microsoft.com/whdc/system/pnppwr/hotadd/hotaddmem.mspx)
      For this reason, a firmware vendor who supports Windows writes plural
      resources in a _CRS even if they are contiguous.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] pm_trace is dangerous
Andrew Morton [Tue, 27 Jun 2006 09:53:26 +0000 (02:53 -0700)]
[PATCH] pm_trace is dangerous

CONFIG_PM_TRACES scrogs your RTC.  Mark it as experimental, and defaulting to
`off'.

Also beef up the help message a bit.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] zlib inflate: fix function definitions
Randy Dunlap [Tue, 27 Jun 2006 09:53:26 +0000 (02:53 -0700)]
[PATCH] zlib inflate: fix function definitions

Fix function definitions to be ANSI-compliant:
lib/zlib_inflate/inffast.c:68:1: warning: non-ANSI definition of function 'inflate_fast'
lib/zlib_inflate/inftrees.c:33:1: warning: non-ANSI definition of function 'zlib_inflate_table'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] kernel/acct: fix function definition
Randy Dunlap [Tue, 27 Jun 2006 09:53:25 +0000 (02:53 -0700)]
[PATCH] kernel/acct: fix function definition

kernel/acct.c:579:19: warning: non-ANSI function declaration of function 'acct_process'

Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years ago[PATCH] fix static linking of NFS
David Brownell [Tue, 27 Jun 2006 19:59:15 +0000 (12:59 -0700)]
[PATCH] fix static linking of NFS

Builds on ARM report link problems with common configurations like
statically linked NFS (for nfsroot).  The symptom is that __init
section code references __exit section code; that won't work since
the exit sections are discarded (since they can never be called).

The best fix for these particular cases would be an "__init_or_exit"
section annotation.

Signed-off-by: David Brownell <dbrownell@users.sourceforge.net>
Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoInput: fix resetting name, phys and uniq when unregistering device
Dmitry Torokhov [Tue, 27 Jun 2006 12:30:31 +0000 (08:30 -0400)]
Input: fix resetting name, phys and uniq when unregistering device

It should be done before calling class_device_unregister() because
it will destroy the device and free memory if there are no other
references to the device.

Signed-off-by: Dmitry Torokhov <dtor@mail.ru>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoRevert "kbuild: fix make -rR breakage"
Linus Torvalds [Mon, 26 Jun 2006 23:59:26 +0000 (16:59 -0700)]
Revert "kbuild: fix make -rR breakage"

This reverts commit e5c44fd88c146755da6941d047de4d97651404a9.

Thanks to Daniel Ritz and Michal Piotrowski for noticing the problem.

Daniel says:

  "[The] reason is a recent change that made modules always shows as
   module.mod.  it breaks modprobe and probably many scripts..besides
   lsmod looking horrible

   stuff like this in modprobe.conf:
        install pcmcia_core /sbin/modprobe --ignore-install pcmcia_core; /sbin/modprobe pcmcia
   makes modprobe fork/exec endlessly calling itself...until oom
   interrupts it"

Signed-off-by: Linus Torvalds <torvalds@osdl.org>
18 years agoMerge branch 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfashe...
Linus Torvalds [Mon, 26 Jun 2006 23:06:08 +0000 (16:06 -0700)]
Merge branch 'upstream-linus' of git://git./linux/kernel/git/mfasheh/ocfs2

* 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mfasheh/ocfs2: (56 commits)
  [PATCH] fs/ocfs2/dlm/: cleanups
  ocfs2: fix compiler warnings in dlm_convert_lock_handler()
  ocfs2: dlm_print_one_mle() needs to be defined
  ocfs2: remove whitespace in dlmunlock.c
  ocfs2: move dlm work to a private work queue
  ocfs2: fix incorrect error returns
  ocfs2: tune down some noisy messages during dlm recovery
  ocfs2: display message before waiting for recovery to complete
  ocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR
  ocfs2: retry operations when a lock is marked in recovery
  ocfs2: use cond_resched() in dlm_thread()
  ocfs2: use GFP_NOFS in some dlm operations
  ocfs2: wait for recovery when starting lock mastery
  ocfs2: continue recovery when a dead node is encountered
  ocfs2: remove unneccesary spin_unlock() in dlm_remaster_locks()
  ocfs2: dlm_remaster_locks() should never exit without completing
  ocfs2: special case recovery lock in dlmlock_remote()
  ocfs2: pending mastery asserts and migrations should block each other
  ocfs2: temporarily disable automatic lock migration
  ocfs2: do not unconditionally purge the lockres in dlmlock_remote()
  ...

18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-arm
Linus Torvalds [Mon, 26 Jun 2006 22:01:05 +0000 (15:01 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-arm

* master.kernel.org:/home/rmk/linux-2.6-arm:
  [ARM] 3657/1: S3C24XX: Documentation update of Overview.txt
  [ARM] Update mach-types
  [ARM] 3656/1: S3C2412: Add S3C2412 and S3C2413 documenation
  [ARM] 3654/1: add ajeco 1arm sbc support
  [ARM] fix drivers/mfd/ucb1x00-core.c IRQ probing bug
  [ARM] 3651/1: S3C24XX: Make arch list more detailed
  [ARM] 3650/1: S3C2412: Update s3c2410_defconfig
  [ARM] 3649/1: S3C24XX: Fix capitalisation of CPU on SMDK2440
  [ARM] 3612/1: make pci bus optional for ixp4xx platform
  [ARM] Remove MODE_(SVC|IRQ|FIQ|USR) and DEFAULT_FIQ
  [ARM] Remove save_lr/restore_pc macros
  [ARM] Remove partial non-v6 binutils compatibility
  [ARM] Remove LOADREGS macro
  [ARM] Remove RETINSTR macro

18 years agoMerge master.kernel.org:/home/rmk/linux-2.6-serial
Linus Torvalds [Mon, 26 Jun 2006 22:00:33 +0000 (15:00 -0700)]
Merge master.kernel.org:/home/rmk/linux-2.6-serial

* master.kernel.org:/home/rmk/linux-2.6-serial:
  [SERIAL] 8250_pnp: add support for other Wacom tablets

18 years ago[ARM] 3657/1: S3C24XX: Documentation update of Overview.txt
Ben Dooks [Mon, 26 Jun 2006 21:51:08 +0000 (22:51 +0100)]
[ARM] 3657/1: S3C24XX: Documentation update of Overview.txt

Patch from Ben Dooks

Update the list of supported devices, and remove the
changelog. Add SMDK2413 information.--

Signed-off-by: Ben Dooks <ben-linux@fluff.org>Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
18 years ago[ARM] Update mach-types
Russell King [Mon, 26 Jun 2006 21:50:21 +0000 (22:50 +0100)]
[ARM] Update mach-types

Usual mach-types update.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
18 years ago[PATCH] fs/ocfs2/dlm/: cleanups
Adrian Bunk [Tue, 16 May 2006 15:26:41 +0000 (17:26 +0200)]
[PATCH] fs/ocfs2/dlm/: cleanups

This patch #if 0's the no longer used dlm_dump_lock_resources().

Since this makes dlmdebug.h empty, this patch also removes this header.

Additionally, the needlessly global dlm_is_node_recovered() is made
static.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: fix compiler warnings in dlm_convert_lock_handler()
Mark Fasheh [Mon, 1 May 2006 21:56:57 +0000 (14:56 -0700)]
ocfs2: fix compiler warnings in dlm_convert_lock_handler()

We need to cast to unsigned long long.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: dlm_print_one_mle() needs to be defined
Mark Fasheh [Mon, 1 May 2006 21:55:10 +0000 (14:55 -0700)]
ocfs2: dlm_print_one_mle() needs to be defined

Fixes compile breakage.

Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: remove whitespace in dlmunlock.c
Kurt Hackel [Mon, 1 May 2006 21:39:57 +0000 (14:39 -0700)]
ocfs2: remove whitespace in dlmunlock.c

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: move dlm work to a private work queue
Kurt Hackel [Mon, 1 May 2006 21:39:29 +0000 (14:39 -0700)]
ocfs2: move dlm work to a private work queue

The work that is done can block for long periods of time and so is not
appropriate for keventd.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: fix incorrect error returns
Kurt Hackel [Mon, 1 May 2006 21:34:08 +0000 (14:34 -0700)]
ocfs2: fix incorrect error returns

Use DLM_REJECTED instead of DLM_RECOVERING.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: tune down some noisy messages during dlm recovery
Kurt Hackel [Mon, 1 May 2006 21:31:37 +0000 (14:31 -0700)]
ocfs2: tune down some noisy messages during dlm recovery

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: display message before waiting for recovery to complete
Kurt Hackel [Mon, 1 May 2006 21:30:39 +0000 (14:30 -0700)]
ocfs2: display message before waiting for recovery to complete

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR
Kurt Hackel [Mon, 1 May 2006 21:29:59 +0000 (14:29 -0700)]
ocfs2: mlog in dlm_convert_lock_handler() should be ML_ERROR

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: retry operations when a lock is marked in recovery
Kurt Hackel [Mon, 1 May 2006 21:29:28 +0000 (14:29 -0700)]
ocfs2: retry operations when a lock is marked in recovery

Before checking for a nonexistent lock, make sure the lockres is not marked
RECOVERING. The caller will just retry and the state should be fixed up when
recovery completes.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: use cond_resched() in dlm_thread()
Kurt Hackel [Mon, 1 May 2006 21:27:41 +0000 (14:27 -0700)]
ocfs2: use cond_resched() in dlm_thread()

yield() does not yield.  cond_resched() does.

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>
18 years agoocfs2: use GFP_NOFS in some dlm operations
Kurt Hackel [Mon, 1 May 2006 21:25:21 +0000 (14:25 -0700)]
ocfs2: use GFP_NOFS in some dlm operations

Signed-off-by: Kurt Hackel <kurt.hackel@oracle.com>
Signed-off-by: Mark Fasheh <mark.fasheh@oracle.com>