openwrt/staging/blogic.git
12 years agoperf tools: Fix display of first level of callchains
Frederic Weisbecker [Fri, 23 Mar 2012 18:06:50 +0000 (19:06 +0100)]
perf tools: Fix display of first level of callchains

The callchain stdio mode display was written using a sorted by symbol
report. In this mode we have only one callchain root per hist so we
forgot to handle cases where we have multiple callchain root, as in per
dso sorting for example.

Fix this by handling these roots like any other branch, with the hist as
the parent.

Before:

     1.97%  libpthread-2.12.1.so
            |
            --- __libc_write
                create_worker
                bench_sched_messaging
                cmd_bench
                run_builtin
                main
                __libc_start_main

            |
            --- __libc_read
                create_worker
                bench_sched_messaging
                cmd_bench
                run_builtin
                main
                __libc_start_main

After:

     1.97%  libpthread-2.12.1.so
            |
            |--36.97%-- __libc_write
            |          create_worker
            |          bench_sched_messaging
            |          cmd_bench
            |          run_builtin
            |          main
            |          __libc_start_main
            |
            |--31.47%-- __libc_read
            |          create_worker
            |          bench_sched_messaging
            |          cmd_bench
            |          run_builtin
            |          main
            |          __libc_start_main
           ...

Single roots keep their entry without percentage because they have
the same overhead than the hist they refer to. ie: 100% in fractal
mode and the percentage of the hist in graph mode:

     0.00%  [k] reschedule_interrupt
            |
            --- default_idle
                amd_e400_idle
                cpu_idle
                start_secondary

Reported-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1332526010-15400-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf: Move mmap page data_head offset assertion out of header
Jiri Olsa [Fri, 23 Mar 2012 14:41:20 +0000 (15:41 +0100)]
perf: Move mmap page data_head offset assertion out of header

Having the build time assertion in header is making the perf
build fail on x86 with:

  ../../include/linux/perf_event.h:411:32: error: variably modified \
‘__assert_mmap_data_head_offset’ at file scope [-Werror]

I'm moving the build time validation out of the header, because
I think it's better than to lessen the perf build warn/error
check.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: acme@redhat.com
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Cc: cjashfor@linux.vnet.ibm.com
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/1332513680-7870-1-git-send-email-jolsa@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoMerge branch 'tip/perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/roste...
Ingo Molnar [Sat, 24 Mar 2012 07:19:09 +0000 (08:19 +0100)]
Merge branch 'tip/perf/urgent' of git://git./linux/kernel/git/rostedt/linux-trace into perf/urgent

12 years agoperf: Fix mmap_page capabilities and docs
Peter Zijlstra [Thu, 22 Mar 2012 16:26:36 +0000 (17:26 +0100)]
perf: Fix mmap_page capabilities and docs

Complete the syscall-less self-profiling feature and address
all complaints, namely:

 - capabilities, so we can detect what is actually available at runtime

     Add a capabilities field to perf_event_mmap_page to indicate
     what is actually available for use.

 - on x86: RDPMC weirdness due to being 40/48 bits and not sign-extending
   properly.

 - ABI documentation as to how all this stuff works.

Also improve the documentation for the new features.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vince Weaver <vweaver1@eecs.utk.edu>
Cc: Arnaldo Carvalho de Melo <acme@infradead.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1332433596.2487.33.camel@twins
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Fri, 23 Mar 2012 08:16:03 +0000 (09:16 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent

Cleanups and fixes for perf/core:

. Short term fix for 'diff' tool breakage related to perf.data files
  with multiple events. From Jiri Olsa

. Cleanup for event id tracepoint reading routine, from Borislav Petkov

. 32-bit compilation fixes from Jiri Olsa

. Event parsing modifier assignment fixes from Jiri Olsa

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
12 years agoperf diff: Fix to work with new hists design
Jiri Olsa [Thu, 22 Mar 2012 13:37:26 +0000 (14:37 +0100)]
perf diff: Fix to work with new hists design

The perf diff command is broken since:
  perf hists: Threaded addition and sorting of entries
  commit 1980c2ebd7020d82c024b8c4046849b38e78e7da

Several places were broken:
  - hists data need to be collected into opened sessions instead
    of into events
  - session's hists data need to be initialized properly when the
    session is created
  - hist_entry__pcnt_snprintf: the percentage and displacement
    buffer preparation must not use 'ret' because it's used
    as a pointer to the final buffer

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120322133726.GB1601@m.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Fix modifier to be applied on correct events
Jiri Olsa [Tue, 20 Mar 2012 18:15:40 +0000 (19:15 +0100)]
perf tools: Fix modifier to be applied on correct events

The event modifier needs to be applied only on the event definition it
is attached to.

The current state is that in case of multiple events definition (in
single '-e' option, separated by ',') all will get modifier of the last
one.

Fixing this by adding separated list for each event definition, so the
modifier is applied only to proper event(s). Added automated test to
catch this, plus some other modifier tests.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1332267341-26338-3-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Fix various casting issues for 32 bits
Jiri Olsa [Tue, 20 Mar 2012 18:15:39 +0000 (19:15 +0100)]
perf tools: Fix various casting issues for 32 bits

- util/parse-events.c(parse_events_add_breakpoint)
  need to use unsigned long instead of u64, otherwise
  we get following gcc error on 32 bits:
     error: cast from pointer to integer of different size

- util/header.c(print_event_desc)
  cannot retype to signed type, otherwise we get following
  gcc error on 32 bits:
     error: comparison between signed and unsigned integer expressions

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1332267341-26338-2-git-send-email-jolsa@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Simplify event_read_id exit path
Borislav Petkov [Wed, 21 Mar 2012 14:15:47 +0000 (15:15 +0100)]
perf tools: Simplify event_read_id exit path

We're freeing the token in any case so simplify the exit path by
unifying it.

No functional change.

Signed-off-by: Borislav Petkov <bp@amd64.org>
Link: http://lkml.kernel.org/r/1332339347-21342-1-git-send-email-bp@amd64.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoMerge branch 'perf/urgent' into perf/core
Arnaldo Carvalho de Melo [Thu, 22 Mar 2012 18:09:08 +0000 (15:09 -0300)]
Merge branch 'perf/urgent' into perf/core

Merge Reason: to pick the fix:

 commit e7f01d1
     perf tools: Use scnprintf where applicable

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agotracing: Fix ftrace stack trace entries
Wolfgang Mauerer [Thu, 22 Mar 2012 10:18:20 +0000 (11:18 +0100)]
tracing: Fix ftrace stack trace entries

8 hex characters tell only half the tale for 64 bit CPUs,
so use the appropriate length.

Link: http://lkml.kernel.org/r/1332411501-8059-2-git-send-email-wolfgang.mauerer@siemens.com
Cc: stable@vger.kernel.org
Signed-off-by: Wolfgang Mauerer <wolfgang.mauerer@siemens.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
12 years agoMerge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Ingo Molnar [Thu, 22 Mar 2012 08:17:57 +0000 (09:17 +0100)]
Merge branch 'tip/perf/core' of git://git./linux/kernel/git/rostedt/linux-trace into perf/urgent

12 years agoAFS: checking wrong bit in afs_readpages()
Dan Carpenter [Tue, 20 Mar 2012 16:58:06 +0000 (16:58 +0000)]
AFS: checking wrong bit in afs_readpages()

We should be testing "if (vnode->flags & (1 << 4))" instead of
"if (vnode->flags & 4) {".  The current test checks if the data was
modified instead of deleted.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 20 Mar 2012 17:32:09 +0000 (10:32 -0700)]
Merge branch 'timers-core-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer changes for v3.4 from Ingo Molnar

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  ntp: Fix integer overflow when setting time
  math: Introduce div64_long
  cs5535-clockevt: Allow the MFGPT IRQ to be shared
  cs5535-clockevt: Don't ignore MFGPT on SMP-capable kernels
  x86/time: Eliminate unused irq0_irqs counter
  clocksource: scx200_hrt: Fix the build
  x86/tsc: Reduce the TSC sync check time for core-siblings
  timer: Fix bad idle check on irq entry
  nohz: Remove ts->Einidle checks before restarting the tick
  nohz: Remove update_ts_time_stat from tick_nohz_start_idle
  clockevents: Leave the broadcast device in shutdown mode when not needed
  clocksource: Load the ACPI PM clocksource asynchronously
  clocksource: scx200_hrt: Convert scx200 to use clocksource_register_hz
  clocksource: Get rid of clocksource_calc_mult_shift()
  clocksource: dbx500: convert to clocksource_register_hz()
  clocksource: scx200_hrt:  use pr_<level> instead of printk
  time: Move common updates to a function
  time: Reorder so the hot data is together
  time: Remove most of xtime_lock usage in timekeeping.c
  ntp: Add ntp_lock to replace xtime_locking
  ...

12 years agoMerge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 20 Mar 2012 17:31:44 +0000 (10:31 -0700)]
Merge branch 'sched-core-for-linus' of git://git./linux/kernel/git/tip/tip

Pull scheduler changes for v3.4 from Ingo Molnar

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  printk: Make it compile with !CONFIG_PRINTK
  sched/x86: Fix overflow in cyc2ns_offset
  sched: Fix nohz load accounting -- again!
  sched: Update yield() docs
  printk/sched: Introduce special printk_sched() for those awkward moments
  sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
  sched: Cleanup cpu_active madness
  sched: Fix load-balance wreckage
  sched: Clean up parameter passing of proc_sched_autogroup_set_nice()
  sched: Ditch per cgroup task lists for load-balancing
  sched: Rename load-balancing fields
  sched: Move load-balancing arguments into helper struct
  sched/rt: Do not submit new work when PI-blocked
  sched/rt: Prevent idle task boosting
  sched/wait: Add __wake_up_all_locked() API
  sched/rt: Document scheduler related skip-resched-check sites
  sched/rt: Use schedule_preempt_disabled()
  sched/rt: Add schedule_preempt_disabled()
  sched/rt: Do not throttle when PI boosting
  sched/rt: Keep period timer ticking when rt throttling is active
  ...

12 years agoMerge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 20 Mar 2012 17:29:15 +0000 (10:29 -0700)]
Merge branch 'perf-core-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf events changes for v3.4 from Ingo Molnar:

 - New "hardware based branch profiling" feature both on the kernel and
   the tooling side, on CPUs that support it.  (modern x86 Intel CPUs
   with the 'LBR' hardware feature currently.)

   This new feature is basically a sophisticated 'magnifying glass' for
   branch execution - something that is pretty difficult to extract from
   regular, function histogram centric profiles.

   The simplest mode is activated via 'perf record -b', and the result
   looks like this in perf report:

$ perf record -b any_call,u -e cycles:u branchy

$ perf report -b --sort=symbol
    52.34%  [.] main                   [.] f1
    24.04%  [.] f1                     [.] f3
    23.60%  [.] f1                     [.] f2
     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
     0.01%  [k] __printf               [k] _IO_vfprintf_internal
     0.01%  [k] main                   [k] __printf

   This output shows from/to branch columns and shows the highest
   percentage (from,to) jump combinations - i.e.  the most likely taken
   branches in the system.  "branches" can also include function calls
   and any other synchronous and asynchronous transitions of the
   instruction pointer that are not 'next instruction' - such as system
   calls, traps, interrupts, etc.

   This feature comes with (hopefully intuitive) flat ascii and TUI
   support in perf report.

 - Various 'perf annotate' visual improvements for us assembly junkies.
   It will now recognize function calls in the TUI and by hitting enter
   you can follow the call (recursively) and back, amongst other
   improvements.

 - Multiple threads/processes recording support in perf record, perf
   stat, perf top - which is activated via a comma-list of PIDs:

perf top -p 21483,21485
perf stat -p 21483,21485 -ddd
perf record -p 21483,21485

 - Support for per UID views, via the --uid paramter to perf top, perf
   report, etc.  For example 'perf top --uid mingo' will only show the
   tasks that I am running, excluding other users, root, etc.

 - Jump label restructurings and improvements - this includes the
   factoring out of the (hopefully much clearer) include/linux/static_key.h
   generic facility:

struct static_key key = STATIC_KEY_INIT_FALSE;

...

if (static_key_false(&key))
        do unlikely code
else
        do likely code

...
static_key_slow_inc();
...
static_key_slow_inc();
...

   The static_key_false() branch will be generated into the code with as
   little impact to the likely code path as possible.  the
   static_key_slow_*() APIs flip the branch via live kernel code patching.

   This facility can now be used more widely within the kernel to
   micro-optimize hot branches whose likelihood matches the static-key
   usage and fast/slow cost patterns.

 - SW function tracer improvements: perf support and filtering support.

 - Various hardenings of the perf.data ABI, to make older perf.data's
   smoother on newer tool versions, to make new features integrate more
   smoothly, to support cross-endian recording/analyzing workflows
   better, etc.

 - Restructuring of the kprobes code, the splitting out of 'optprobes',
   and a corner case bugfix.

 - Allow the tracing of kernel console output (printk).

 - Improvements/fixes to user-space RDPMC support, allowing user-space
   self-profiling code to extract PMU counts without performing any
   system calls, while playing nice with the kernel side.

 - 'perf bench' improvements

 - ... and lots of internal restructurings, cleanups and fixes that made
   these features possible.  And, as usual this list is incomplete as
   there were also lots of other improvements

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
  perf report: Fix annotate double quit issue in branch view mode
  perf report: Remove duplicate annotate choice in branch view mode
  perf/x86: Prettify pmu config literals
  perf report: Enable TUI in branch view mode
  perf report: Auto-detect branch stack sampling mode
  perf record: Add HEADER_BRANCH_STACK tag
  perf record: Provide default branch stack sampling mode option
  perf tools: Make perf able to read files from older ABIs
  perf tools: Fix ABI compatibility bug in print_event_desc()
  perf tools: Enable reading of perf.data files from different ABI rev
  perf: Add ABI reference sizes
  perf report: Add support for taken branch sampling
  perf record: Add support for sampling taken branch
  perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
  x86/kprobes: Split out optprobe related code to kprobes-opt.c
  x86/kprobes: Fix a bug which can modify kernel code permanently
  x86/kprobes: Fix instruction recovery on optimized path
  perf: Add callback to flush branch_stack on context switch
  perf: Disable PERF_SAMPLE_BRANCH_* when not supported
  perf/x86: Add LBR software filter support for Intel CPUs
  ...

12 years agoMerge branch 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 20 Mar 2012 17:28:56 +0000 (10:28 -0700)]
Merge branch 'irq-core-for-linus' of git://git./linux/kernel/git/tip/tip

Pull irq/core changes for v3.4 from Ingo Molnar

* 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  genirq: Remove paranoid warnons and bogus fixups
  genirq: Flush the irq thread on synchronization
  genirq: Get rid of unnecessary IRQTF_DIED flag
  genirq: No need to check IRQTF_DIED before stopping a thread handler
  genirq: Get rid of unnecessary irqaction field in task_struct
  genirq: Fix incorrect check for forced IRQ thread handler
  softirq: Reduce invoke_softirq() code duplication
  genirq: Fix long-term regression in genirq irq_set_irq_type() handling
  x86-32/irq: Don't switch to irq stack for a user-mode irq

12 years agoMerge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Tue, 20 Mar 2012 00:12:34 +0000 (17:12 -0700)]
Merge branch 'core-rcu-for-linus' of git://git./linux/kernel/git/tip/tip

Pull RCU changes for v3.4 from Ingo Molnar.  The major features of this
series are:

 - making RCU more aggressive about entering dyntick-idle mode in order
   to improve energy efficiency

 - converting a few more call_rcu()s to kfree_rcu()s

 - applying a number of rcutree fixes and cleanups to rcutiny

 - removing CONFIG_SMP #ifdefs from treercu

 - allowing RCU CPU stall times to be set via sysfs

 - adding CPU-stall capability to rcutorture

 - adding more RCU-abuse diagnostics

 - updating documentation

 - fixing yet more issues located by the still-ongoing top-to-bottom
   inspection of RCU, this time with a special focus on the CPU-hotplug
   code path.

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
  rcu: Stop spurious warnings from synchronize_sched_expedited
  rcu: Hold off RCU_FAST_NO_HZ after timer posted
  rcu: Eliminate softirq-mediated RCU_FAST_NO_HZ idle-entry loop
  rcu: Add RCU_NONIDLE() for idle-loop RCU read-side critical sections
  rcu: Allow nesting of rcu_idle_enter() and rcu_idle_exit()
  rcu: Remove redundant check for rcu_head misalignment
  PTR_ERR should be called before its argument is cleared.
  rcu: Convert WARN_ON_ONCE() in rcu_lock_acquire() to lockdep
  rcu: Trace only after NULL-pointer check
  rcu: Call out dangers of expedited RCU primitives
  rcu: Rework detection of use of RCU by offline CPUs
  lockdep: Add CPU-idle/offline warning to lockdep-RCU splat
  rcu: No interrupt disabling for rcu_prepare_for_idle()
  rcu: Move synchronize_sched_expedited() to rcutree.c
  rcu: Check for illegal use of RCU from offlined CPUs
  rcu: Update stall-warning documentation
  rcu: Add CPU-stall capability to rcutorture
  rcu: Make documentation give more realistic rcutorture duration
  rcutorture: Permit holding off CPU-hotplug operations during boot
  rcu: Print scheduling-clock information on RCU CPU stall-warning messages
  ...

12 years agotracing: Move the tracing_on/off() declarations into CONFIG_TRACING
Steven Rostedt [Tue, 20 Mar 2012 16:28:29 +0000 (12:28 -0400)]
tracing: Move the tracing_on/off() declarations into CONFIG_TRACING

The tracing_on/off() declarations were under CONFIG_RING_BUFFER, but
the functions are now only defined under CONFIG_TRACING as they are
specific to ftrace and not the ring buffer.

But the declarations were still defined under the ring buffer and
this caused the build to fail when CONFIG_RING_BUFFER was set but
CONFIG_TRACING was not.

Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
12 years agoMerge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 20 Mar 2012 00:11:15 +0000 (17:11 -0700)]
Merge branch 'core-locking-for-linus' of git://git./linux/kernel/git/tip/tip

Pull core/locking changes for v3.4 from Ingo Molnar

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Simplify return logic
  futex: Cover all PI opcodes with cmpxchg enabled check

12 years agoMerge branch 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Tue, 20 Mar 2012 00:10:38 +0000 (17:10 -0700)]
Merge branch 'core-iommu-for-linus' of git://git./linux/kernel/git/tip/tip

Pull core/iommu changes for v3.4 from Ingo Molnar

* 'core-iommu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/iommu/intel: Increase the number of iommus supported to MAX_IO_APICS
  x86/iommu/intel: Fix identity mapping for sandy bridge

12 years agoMerge branch 'dcache-word-accesses'
Linus Torvalds [Mon, 19 Mar 2012 23:37:28 +0000 (16:37 -0700)]
Merge branch 'dcache-word-accesses'

* branch 'dcache-word-accesses':
  vfs: use 'unsigned long' accesses for dcache name comparison and hashing

This does the name hashing and lookup using word-sized accesses when
that is efficient, namely on x86 (although any little-endian machine
with good unaligned accesses would do).

It does very much depend on little-endian logic, but it's a very hot
couple of functions under some real loads, and this patch improves the
performance of __d_lookup_rcu() and link_path_walk() by up to about 30%.
Giving a 10% improvement on some very pathname-heavy benchmarks.

Because we do make unaligned accesses past the filename, the
optimization is disabled when CONFIG_DEBUG_PAGEALLOC is active, and we
effectively depend on the fact that on x86 we don't really ever have the
last page of usable RAM followed immediately by any IO memory (due to
ACPI tables, BIOS buffer areas etc).

Some of the bit operations we do are a bit "subtle".  It's commented,
but you do need to really think about the code.  Or just consider it
black magic.

Thanks to people on G+ for some of the optimized bit tricks.

12 years agovfs: get rid of batshit-insane pointless dentry hash calculations
Linus Torvalds [Mon, 19 Mar 2012 23:19:53 +0000 (16:19 -0700)]
vfs: get rid of batshit-insane pointless dentry hash calculations

For some odd historical reason, the final mixing round for the dentry
cache hash table lookup had an insane "xor with big constant" logic.  In
two places.

The big constant that is being xor'ed is GOLDEN_RATIO_PRIME, which is a
fairly random-looking number that is designed to be *multiplied* with so
that the bits get spread out over a whole long-word.

But xor'ing with it is insane.  It doesn't really even change the hash -
it really only shifts the hash around in the hash table.  To make
matters worse, the insane big constant is different on 32-bit and 64-bit
builds, even though the name hash bits we use are always 32-bit (and the
bits from the pointer we mix in effectively are too).

It's all total voodoo programming, in other words.

Now, some testing and analysis of the hash chains shows that the rest of
the hash function seems to be fairly good.  It does pick the right bits
of the parent dentry pointer, for example, and while it's generally a
bad idea to use an xor to mix down the upper bits (because if there is a
repeating pattern, the xor can cause "destructive interference"), it
seems to not have been a disaster.

For example, replacing the hash with the normal "hash_long()" code (that
uses the GOLDEN_RATIO_PRIME constant correctly, btw) actually just makes
the hash worse.  The hand-picked hash knew which bits of the pointer had
the highest entropy, and hash_long() ends up mixing bits less optimally
at least in some trivial tests.

So the hash function overall seems fine, it just has that really odd
"shift result around by a constant xor".

So get rid of the silly xor, and replace the down-mixing of the bits
with an add instead of an xor that tends to not have the same kind of
destructive interference issues.  Some stats on the resulting hash
chains shows that they look statistically identical before and after,
but the code is simpler and no longer makes you go "WTF?".

Also, the incoming hash really is just "unsigned int", not a long, and
there's no real point to worry about the high 26 bits of the dentry
pointer for the 64-bit case, because they are all going to be identical
anyway.

So also change the hashing to be done in the more natural 'unsigned int'
that is the real size of the actual hashed data anyway.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Mon, 19 Mar 2012 19:14:46 +0000 (20:14 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Fixes for the last batch from Namhyung and the initial GTK report browser
from Pekka.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf report: Add a simple GTK2-based 'perf report' browser
Pekka Enberg [Mon, 19 Mar 2012 18:13:29 +0000 (15:13 -0300)]
perf report: Add a simple GTK2-based 'perf report' browser

This patch adds a simple GTK2-based browser to 'perf report' that's
based on the TTY-based browser in builtin-report.c.

To launch "perf report" using the new GTK interface just type:

  $ perf report --gtk

The interface is somewhat limited in features at the moment:

  - No callgraph support

  - No KVM guest profiling support

  - No color coding for percentages

  - No sorting from the UI

  - ..and many, many more!

That said, I think this patch a reasonable start to build future features on.

Signed-off-by: Pekka Enberg <penberg@kernel.org>
Cc: Colin Walters <walters@verbum.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ingo Molnar <mingo@kernel.org>
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202231952410.6689@tux.localdomain
[ committer note: Added #pragma to make gtk no strict prototype problem go
  away as suggested by Colin Walters modulo avoiding push/pop ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf report: Document --symbol-filter option
Namhyung Kim [Mon, 19 Mar 2012 02:53:48 +0000 (11:53 +0900)]
perf report: Document --symbol-filter option

Add missing description of --symbol-filter in Documentation/perf-report.txt.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1332125628-23088-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf ui browser: Clean lines inside of the input window
Namhyung Kim [Mon, 19 Mar 2012 02:46:20 +0000 (11:46 +0900)]
perf ui browser: Clean lines inside of the input window

As Arnaldo pointed out, it should be cleared to prevent the window from
displaying overlapped strings on the region.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1332125180-23041-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoLinux 3.3
Linus Torvalds [Sun, 18 Mar 2012 23:15:34 +0000 (16:15 -0700)]
Linux 3.3

12 years agoDon't limit non-nested epoll paths
Jason Baron [Fri, 16 Mar 2012 20:34:03 +0000 (16:34 -0400)]
Don't limit non-nested epoll paths

Commit 28d82dc1c4ed ("epoll: limit paths") that I did to limit the
number of possible wakeup paths in epoll is causing a few applications
to longer work (dovecot for one).

The original patch is really about limiting the amount of epoll nesting
(since epoll fds can be attached to other fds). Thus, we probably can
allow an unlimited number of paths of depth 1. My current patch limits
it at 1000. And enforce the limits on paths that have a greater depth.

This is captured in: https://bugzilla.redhat.com/show_bug.cgi?id=681578

Signed-off-by: Jason Baron <jbaron@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 18 Mar 2012 02:22:24 +0000 (19:22 -0700)]
Merge git://git./linux/kernel/git/davem/net

Pull networking changes from David Miller:
 "1) icmp6_dst_alloc() returns NULL instead of ERR_PTR() leading to
     crashes, particularly during shutdown.  Reported by Dave Jones and
     fixed by Eric Dumazet.

  2) hyperv and wimax/i2400m return NETDEV_TX_BUSY when they have
     already freed the SKB, which causes crashes as to the caller this
     means requeue the packet.  Fixes from Eric Dumazet.

  3) usbnet driver doesn't allocate the right amount of headroom on
     fresh RX SKBs, fix from Eric Dumazet.

  4) Fix regression in ip6_mc_find_dev_rcu(), as an RCU lookup it
     abolutely should not take a reference to 'dev', this leads to
     leaks.  Fix from RonQing Li.

  5) Fix netfilter ctnetlink race between delete and timeout expiration.
     From Pablo Neira Ayuso.

  6) Revert SFQ change which causes regressions, specifically queueing
     to tail can lead to unavoidable flow starvation.  From Eric
     Dumazet.

  7) Fix a memory leak and a crash on corrupt firmware files in bnx2x,
     from Michal Schmidt."

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  netfilter: ctnetlink: fix race between delete and timeout expiration
  ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
  wimax/i2400m: fix erroneous NETDEV_TX_BUSY use
  net/hyperv: fix erroneous NETDEV_TX_BUSY use
  net/usbnet: reserve headroom on rx skbs
  bnx2x: fix memory leak in bnx2x_init_firmware()
  bnx2x: fix a crash on corrupt firmware file
  sch_sfq: revert dont put new flow at the end of flows
  ipv6: fix icmp6_dst_alloc()

12 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 17 Mar 2012 16:54:16 +0000 (09:54 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar.

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf tools, x86: Build perf on older user-space as well
  perf tools: Use scnprintf where applicable
  perf tools: Incorrect use of snprintf results in SEGV

12 years agonetfilter: ctnetlink: fix race between delete and timeout expiration
Pablo Neira Ayuso [Fri, 16 Mar 2012 02:00:34 +0000 (02:00 +0000)]
netfilter: ctnetlink: fix race between delete and timeout expiration

Kerin Millar reported hardlockups while running `conntrackd -c'
in a busy firewall. That system (with several processors) was
acting as backup in a primary-backup setup.

After several tries, I found a race condition between the deletion
operation of ctnetlink and timeout expiration. This patch fixes
this problem.

Tested-by: Kerin Millar <kerframil@gmail.com>
Reported-by: Kerin Millar <kerframil@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.
RongQing.Li [Thu, 15 Mar 2012 22:54:14 +0000 (22:54 +0000)]
ipv6: Don't dev_hold(dev) in ip6_mc_find_dev_rcu.

ip6_mc_find_dev_rcu() is called with rcu_read_lock(), so don't
need to dev_hold().
With dev_hold(), not corresponding dev_put(), will lead to leak.

[ bug introduced in 96b52e61be1 (ipv6: mcast: RCU conversions) ]

Signed-off-by: RongQing.Li <roy.qing.li@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMerge branch 'akpm' (more patches from Andrew)
Linus Torvalds [Sat, 17 Mar 2012 00:14:55 +0000 (17:14 -0700)]
Merge branch 'akpm' (more patches from Andrew)

Merge some more email patches from Andrew Morton:
 "A couple of nilfs fixes"

* emailed from Andrew Morton <akpm@linux-foundation.org>:
  nilfs2: fix NULL pointer dereference in nilfs_load_super_block()
  nilfs2: clamp ns_r_segments_percentage to [1, 99]

12 years agonilfs2: fix NULL pointer dereference in nilfs_load_super_block()
Ryusuke Konishi [Sat, 17 Mar 2012 00:08:39 +0000 (17:08 -0700)]
nilfs2: fix NULL pointer dereference in nilfs_load_super_block()

According to the report from Slicky Devil, nilfs caused kernel oops at
nilfs_load_super_block function during mount after he shrank the
partition without resizing the filesystem:

 BUG: unable to handle kernel NULL pointer dereference at 00000048
 IP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2]
 *pde = 00000000
 Oops: 0000 [#1] PREEMPT SMP
 ...
 Call Trace:
  [<d0d7a87b>] init_nilfs+0x4b/0x2e0 [nilfs2]
  [<d0d6f707>] nilfs_mount+0x447/0x5b0 [nilfs2]
  [<c0226636>] mount_fs+0x36/0x180
  [<c023d961>] vfs_kern_mount+0x51/0xa0
  [<c023ddae>] do_kern_mount+0x3e/0xe0
  [<c023f189>] do_mount+0x169/0x700
  [<c023fa9b>] sys_mount+0x6b/0xa0
  [<c04abd1f>] sysenter_do_call+0x12/0x28
 Code: 53 18 8b 43 20 89 4b 18 8b 4b 24 89 53 1c 89 43 24 89 4b 20 8b 43
 20 c7 43 2c 00 00 00 00 23 75 e8 8b 50 68 89 53 28 8b 54 b3 20 <8b> 72
 48 8b 7a 4c 8b 55 08 89 b3 84 00 00 00 89 bb 88 00 00 00
 EIP: [<d0d7a08e>] nilfs_load_super_block+0x17e/0x280 [nilfs2] SS:ESP 0068:ca9bbdcc
 CR2: 0000000000000048

This turned out due to a defect in an error path which runs if the
calculated location of the secondary super block was invalid.

This patch fixes it and eliminates the reported oops.

Reported-by: Slicky Devil <slicky.dvl@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Tested-by: Slicky Devil <slicky.dvl@gmail.com>
Cc: <stable@vger.kernel.org> [2.6.30+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agonilfs2: clamp ns_r_segments_percentage to [1, 99]
Haogang Chen [Sat, 17 Mar 2012 00:08:38 +0000 (17:08 -0700)]
nilfs2: clamp ns_r_segments_percentage to [1, 99]

ns_r_segments_percentage is read from the disk.  Bogus or malicious
value could cause integer overflow and malfunction due to meaningless
disk usage calculation.  This patch reports error when mounting such
bogus volumes.

Signed-off-by: Haogang Chen <haogangchen@gmail.com>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris...
Linus Torvalds [Sat, 17 Mar 2012 00:04:02 +0000 (17:04 -0700)]
Merge branch 'for-linus' of git://git./linux/kernel/git/jmorris/linux-security

Pull maintainer update from James Morris:
 "Please pull this patch which adds Serge as maintainer of the
  capabilities code, as discussed on lwn and the lsm list.

  New capabilities must be signed off by the maintainer, and new uses of
  any capabilities should at be cc'd to the maintainer."

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  MAINTAINERS: Add Serge as maintainer of capabilities

12 years agoMerge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming
Linus Torvalds [Sat, 17 Mar 2012 00:03:15 +0000 (17:03 -0700)]
Merge tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming

Pull c6x bugfix from Mark Salter:
 "Remove dead code from entry.S which causes a build failure when using
  a newer assembler (v2.22 complains about it, v2.20 ignores it)."

* tag 'for-linus' of git://linux-c6x.org/git/projects/linux-c6x-upstreaming:
  C6X: remove dead code from entry.S

12 years agoafs: Remote abort can cause BUG in rxrpc code
Anton Blanchard [Fri, 16 Mar 2012 10:28:19 +0000 (10:28 +0000)]
afs: Remote abort can cause BUG in rxrpc code

When writing files to afs I sometimes hit a BUG:

kernel BUG at fs/afs/rxrpc.c:179!

With a backtrace of:

afs_free_call
afs_make_call
afs_fs_store_data
afs_vnode_store_data
afs_write_back_from_locked_page
afs_writepages_region
afs_writepages

The cause is:

ASSERT(skb_queue_empty(&call->rx_queue));

Looking at a tcpdump of the session the abort happens because we
are exceeding our disk quota:

rx abort fs reply store-data error diskquota exceeded (32)

So the abort error is valid. We hit the BUG because we haven't
freed all the resources for the call.

By freeing any skbs in call->rx_queue before calling afs_free_call
we avoid hitting leaking memory and avoid hitting the BUG.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoafs: Read of file returns EBADMSG
Anton Blanchard [Fri, 16 Mar 2012 10:28:07 +0000 (10:28 +0000)]
afs: Read of file returns EBADMSG

A read of a large file on an afs mount failed:

# cat junk.file > /dev/null
cat: junk.file: Bad message

Looking at the trace, call->offset wrapped since it is only an
unsigned short. In afs_extract_data:

        _enter("{%u},{%zu},%d,,%zu", call->offset, len, last, count);
...

        if (call->offset < count) {
                if (last) {
                        _leave(" = -EBADMSG [%d < %zu]", call->offset, count);
                        return -EBADMSG;
                }

Which matches the trace:

[cat   ] ==> afs_extract_data({65132},{524},1,,65536)
[cat   ] <== afs_extract_data() = -EBADMSG [0 < 65536]

call->offset went from 65132 to 0. Fix this by making call->offset an
unsigned int.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoperf report: Treat an argument as a symbol filter
Namhyung Kim [Fri, 16 Mar 2012 08:50:55 +0000 (17:50 +0900)]
perf report: Treat an argument as a symbol filter

As Ingo requested, it'd be better off treating first (and the only)
argument as a symbol filter, so that user doesn't need to input the
symbol on the dialog window on TUI.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-5-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf report: Add --symbol-filter option
Namhyung Kim [Fri, 16 Mar 2012 08:50:54 +0000 (17:50 +0900)]
perf report: Add --symbol-filter option

Add new --symbol-filter command line option to set appropriate filter
string.

Its short version is missing as I couldn't find an ideal one and
--filter option of perf record also has no short version.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-4-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf ui browser: Add 's' key to filter by symbol name
Namhyung Kim [Fri, 16 Mar 2012 08:50:53 +0000 (17:50 +0900)]
perf ui browser: Add 's' key to filter by symbol name

Now user can enter symbol name interested via ui_browser__input_window,
and perf can process it using hists__filter_by_symbol(). Giving empty
symbol (by pressing 's' followed by ENTER) will disable the filtering.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-3-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf ui browser: Introduce ui_browser__input_window
Namhyung Kim [Fri, 16 Mar 2012 08:50:52 +0000 (17:50 +0900)]
perf ui browser: Introduce ui_browser__input_window

The ui_browser__input_window() function is to get user's key input.
Current implementation can handle maximum 49 characters.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf hists: Add hists__filter_by_symbol
Namhyung Kim [Fri, 16 Mar 2012 08:50:51 +0000 (17:50 +0900)]
perf hists: Add hists__filter_by_symbol

This function will be used for simple (sub-)string matching filter based
on user input.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887855-874-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Do not disable members of group event
Namhyung Kim [Fri, 16 Mar 2012 08:42:20 +0000 (17:42 +0900)]
perf tools: Do not disable members of group event

When event group is enabled for forked task (i.e. no target task/cpu
was specified) all events were disabled and marked ->enable_on_exec.
However they wouldn't be counted at all since only group leader will
be enabled on exec actually.

In contrast to perf stat, perf record doesn't have a real problem
as it enables all the event before proceeding. But it needs to be
fixed anyway IMHO.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887340-32448-2-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf stat: Fix event grouping on forked task
Namhyung Kim [Fri, 16 Mar 2012 08:42:19 +0000 (17:42 +0900)]
perf stat: Fix event grouping on forked task

When event group is enabled for forked task (i.e. no target task was
specified) all events were disabled and marked ->enable_on_exec.
However they are not counted at all since only group leader will be
enabled on exec actually. So the result looked like below:

 $ ./perf stat --group -- sleep 1

 Performance counter stats for 'sleep 1':

          0.554926 task-clock                #    0.001 CPUs utilized
     <not counted> context-switches
     <not counted> CPU-migrations
     <not counted> page-faults
     <not counted> cycles
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     <not counted> instructions
     <not counted> branches
     <not counted> branch-misses

       1.001228093 seconds time elapsed

Fix it by disabling group leader only.

Cc: Ingo Molnar <mingo@elte.hu>
Cc: Namhyung Kim <namhyung@gmail.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1331887340-32448-1-git-send-email-namhyung.kim@lge.com
Signed-off-by: Namhyung Kim <namhyung.kim@lge.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Add support to specify pmu style event
Jiri Olsa [Thu, 15 Mar 2012 19:09:18 +0000 (20:09 +0100)]
perf tools: Add support to specify pmu style event

Added new event rule to the event definition grammar:

event_def: event_pmu |
           ...
event_pmu: PE_NAME '/' event_config '/'

Using this rule, event could be now specified like:
  cpu/config=1,config1=2,config2=3/u

where pmu name 'cpu' is looked up via following path:
  ${sysfs_mount}/bus/event_source/devices/${pmu}

and config options are bound to the pmu's format definiton:
  ${sysfs_mount}/bus/event_source/devices/${pmu}/format

The hardcoded config options still stays and have precedence
over any format field defined with same name.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-50d8nr94f8k4wkezutrxvthe@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Add perf pmu object to access pmu format definition
Jiri Olsa [Thu, 15 Mar 2012 19:09:17 +0000 (20:09 +0100)]
perf tools: Add perf pmu object to access pmu format definition

Adding pmu object which provides interface to pmu's sysfs
event format definition located at:
  ${sysfs_mount}/bus/event_source/devices/${pmu}/format

Following interface is exported:
  struct perf_pmu* perf_pmu__find(char *name);
  - this function returns pmu object, which is then
    passed as a handle to other interface functions

  int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
                       struct list_head *head_terms);
  - this function configures perf_event_attr struct based
    on pmu's format definitions and config terms data,
    containined in head_terms list.

Parser generator is used to retrive the pmu's format definition.
The generated parser is part of the patch. Added makefile rule
'pmu-parser' to generate the parser code out of the bison/flex
sources.

Added builtin test 'Test perf pmu format parsing', which could
be run like:
perf test pmu

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-errz96u1668gj9wlop1zhpht@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Add config options support for event parsing
Jiri Olsa [Thu, 15 Mar 2012 19:09:16 +0000 (20:09 +0100)]
perf tools: Add config options support for event parsing

Adding a new rule to the event grammar to be able to specify
values of additional attributes of symbolic event.

The new syntax for event symbolic definition is:

event_legacy_symbol:  PE_NAME_SYM '/' event_config '/' |
                      PE_NAME_SYM sep_slash_dc

event_config:         event_config ',' event_term | event_term

event_term:           PE_NAME '=' PE_NAME |
                      PE_NAME '=' PE_VALUE
                      PE_NAME

sep_slash_dc: '/' | ':' |

At the moment the config options are hardcoded to be used for legacy
symbol events to define several perf_event_attr fields. It is:

  'config'   to define perf_event_attr::config
  'config1'  to define perf_event_attr::config1
  'config2'  to define perf_event_attr::config2
  'period'   to define perf_event_attr::sample_period

Legacy events could be now specified as:
  cycles/period=100000/

If term is specified without the value assignment, then 1 is
assigned by default.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-mgkavww9790jbt2jdkooyv4q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Add parser generator for events parsing
Jiri Olsa [Thu, 15 Mar 2012 19:09:15 +0000 (20:09 +0100)]
perf tools: Add parser generator for events parsing

Changing event parsing to use flex/bison parse generator.
The event syntax stays as it was.

grammar description:

events: events ',' event | event

event:  event_def PE_MODIFIER_EVENT | event_def

event_def: event_legacy_symbol sep_dc     |
           event_legacy_cache sep_dc      |
           event_legacy_breakpoint sep_dc |
           event_legacy_tracepoint sep_dc |
           event_legacy_numeric sep_dc    |
           event_legacy_raw sep_dc

event_legacy_symbol:      PE_NAME_SYM

event_legacy_cache:       PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT '-' PE_NAME_CACHE_OP_RESULT |
                          PE_NAME_CACHE_TYPE '-' PE_NAME_CACHE_OP_RESULT  |
                          PE_NAME_CACHE_TYPE

event_legacy_raw:         PE_SEP_RAW PE_VALUE

event_legacy_numeric:     PE_VALUE ':' PE_VALUE

event_legacy_breakpoint:  PE_SEP_BP ':' PE_VALUE ':' PE_MODIFIER_BP

event_breakpoint_type:    PE_MODIFIER_BPTYPE | empty

PE_NAME_SYM:              cpu-cycles|cycles                              |
                          stalled-cycles-frontend|idle-cycles-frontend   |
                          stalled-cycles-backend|idle-cycles-backend     |
                          instructions                                   |
                          cache-references                               |
                          cache-misses                                   |
                          branch-instructions|branches                   |
                          branch-misses                                  |
                          bus-cycles                                     |
                          cpu-clock                                      |
                          task-clock                                     |
                          page-faults|faults                             |
                          minor-faults                                   |
                          major-faults                                   |
                          context-switches|cs                            |
                          cpu-migrations|migrations                      |
                          alignment-faults                               |
                          emulation-faults

PE_NAME_CACHE_TYPE:       L1-dcache|l1-d|l1d|L1-data             |
                          L1-icache|l1-i|l1i|L1-instruction      |
                          LLC|L2                                 |
                          dTLB|d-tlb|Data-TLB                    |
                          iTLB|i-tlb|Instruction-TLB             |
                          branch|branches|bpu|btb|bpc            |
                          node

PE_NAME_CACHE_OP_RESULT:  load|loads|read                        |
                          store|stores|write                     |
                          prefetch|prefetches                    |
                          speculative-read|speculative-load      |
                          refs|Reference|ops|access              |
                          misses|miss

PE_MODIFIER_EVENT:        [ukhp]{0,5}

PE_MODIFIER_BP:           [rwx]

PE_SEP_BP:                'mem'

PE_SEP_RAW:               'r'

sep_dc:                   ':' |

Added flex/bison files for event grammar parsing. The generated
parser is part of the patch. Added makefile rule 'event-parser'
to generate the parser code out of the bison/flex sources.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-u4pfig5waq3ll2bfcdex8fgi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf: Adding sysfs group format attribute for pmu device
Jiri Olsa [Thu, 15 Mar 2012 19:09:14 +0000 (20:09 +0100)]
perf: Adding sysfs group format attribute for pmu device

Adding sysfs group 'format' attribute for pmu device that
contains a syntax description on how to construct raw events.

The event configuration is described in following
struct pefr_event_attr attributes:

  config
  config1
  config2

Each sysfs attribute within the format attribute group,
describes mapping of name and bitfield definition within
one of above attributes.

eg:
  "/sys/...<dev>/format/event" contains "config:0-7"
  "/sys/...<dev>/format/umask" contains "config:8-15"
  "/sys/...<dev>/format/usr"   contains "config:16"

the attribute value syntax is:

  line:      config ':' bits
  config:    'config' | 'config1' | 'config2"
  bits:      bits ',' bit_term | bit_term
  bit_term:  VALUE '-' VALUE | VALUE

Adding format attribute definitions for x86 cpu pmus.

Acked-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/n/tip-vhdk5y2hyype9j63prymty36@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoC6X: remove dead code from entry.S
Mark Salter [Fri, 16 Mar 2012 13:27:57 +0000 (09:27 -0400)]
C6X: remove dead code from entry.S

The ENDPROC() on sys_fadvise64_c6x() in arch/c6x/kernel/entry.S is
outside of the conditional block with the matching ENTRY() macro. This
leads a newer (v2.22 vs. v2.20) assembler to complain:

  /tmp/ccGZBaPT.s: Assembler messages:
  /tmp/ccGZBaPT.s: Error: .size expression for sys_fadvise64_c6x does not evaluate to a constant

The conditional block became dead code when c6x switched to generic
unistd.h and should be removed along with the offending ENDPROC().

Signed-off-by: Mark Salter <msalter@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
12 years agogenirq: Remove paranoid warnons and bogus fixups
Thomas Gleixner [Thu, 15 Mar 2012 21:55:21 +0000 (22:55 +0100)]
genirq: Remove paranoid warnons and bogus fixups

Alexander pointed out that the warnons in the regular exit path are
bogus and the thread_mask one actually could be triggered when
__setup_irq() hands out that thread_mask again after __free_irq()
dropped irq_desc->lock.

Thinking more about it, neither IRQTF_RUNTHREAD nor the bit in
thread_mask can be set as this is the regular exit path. We come here
due to:
__free_irq()
   remove action from desc
   synchronize_irq()
   kthread_stop()

So synchronize_irq() makes sure that the thread finished running and
cleaned up both the thread_active count and thread_mask. After that
point nothing can set IRQTF_RUNTHREAD on this action. So the warnons
and the cleanups are pointless.

Reported-by: Alexander Gordeev <agordeev@redhat.com>
Cc: Ido Yariv <ido@wizery.com>
Link: http://lkml.kernel.org/r/20120315190755.GA6732@dhcp-26-207.brq.redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
12 years agowimax/i2400m: fix erroneous NETDEV_TX_BUSY use
Eric Dumazet [Wed, 14 Mar 2012 09:21:44 +0000 (09:21 +0000)]
wimax/i2400m: fix erroneous NETDEV_TX_BUSY use

A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.

In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK

Also increments tx_dropped counter

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Inaky Perez-Gonzalez <inaky.perez-gonzalez@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/hyperv: fix erroneous NETDEV_TX_BUSY use
Eric Dumazet [Wed, 14 Mar 2012 08:53:34 +0000 (08:53 +0000)]
net/hyperv: fix erroneous NETDEV_TX_BUSY use

A driver start_xmit() method cannot free skb and return NETDEV_TX_BUSY,
since caller is going to reuse freed skb.

This is mostly a revert of commit bf769375c (staging: hv: fix the return
status of netvsc_start_xmit())

In fact netif_tx_stop_queue() / netif_stop_queue() is needed before
returning NETDEV_TX_BUSY or you can trigger a ksoftirqd fatal loop.

In case of memory allocation error, only safe way is to drop the packet
and return NETDEV_TX_OK

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agonet/usbnet: reserve headroom on rx skbs
Eric Dumazet [Wed, 14 Mar 2012 06:56:25 +0000 (06:56 +0000)]
net/usbnet: reserve headroom on rx skbs

network drivers should reserve some headroom on incoming skbs so that we
dont need expensive reallocations, eg forwarding packets in tunnels.

This NET_SKB_PAD padding is done in various helpers, like
__netdev_alloc_skb_ip_align() in this patch, combining NET_SKB_PAD and
NET_IP_ALIGN magic.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Oliver Neukum <oneukum@suse.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: fix memory leak in bnx2x_init_firmware()
Michal Schmidt [Thu, 15 Mar 2012 14:08:29 +0000 (14:08 +0000)]
bnx2x: fix memory leak in bnx2x_init_firmware()

When cycling the interface down and up, bnx2x_init_firmware() knows that
the firmware is already loaded, but nevertheless it allocates certain
arrays anew (init_data, init_ops, init_ops_offsets, iro_arr). The old
arrays are leaked.

Fix the leaks by returning early if the firmware was already loaded.
Because if the firmware is loaded, so are the arrays.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agobnx2x: fix a crash on corrupt firmware file
Michal Schmidt [Thu, 15 Mar 2012 14:08:28 +0000 (14:08 +0000)]
bnx2x: fix a crash on corrupt firmware file

If the requested firmware is deemed corrupt and then released, reset the
pointer to NULL in order to avoid double-freeing it in
bnx2x_release_firmware() or dereferencing it in bnx2x_init_firmware().

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Acked-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agosch_sfq: revert dont put new flow at the end of flows
Eric Dumazet [Tue, 13 Mar 2012 18:04:25 +0000 (18:04 +0000)]
sch_sfq: revert dont put new flow at the end of flows

This reverts commit d47a0ac7b6 (sch_sfq: dont put new flow at the end of
flows)

As Jesper found out, patch sounded great but has bad side effects.

In stress situation, pushing new flows in front of the queue can prevent
old flows doing any progress. Packets can stay in SFQ queue for
unlimited amount of time.

It's possible to add heuristics to limit this problem, but this would
add complexity outside of SFQ scope.

A more sensible answer to Dave Taht concerns (who reported the issued I
tried to solve in original commit) is probably to use a qdisc hierarchy
so that high prio packets dont enter a potentially crowded SFQ qdisc.

Reported-by: Jesper Dangaard Brouer <jdb@comx.dk>
Cc: Dave Taht <dave.taht@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoipv6: fix icmp6_dst_alloc()
Eric Dumazet [Wed, 14 Mar 2012 21:13:11 +0000 (21:13 +0000)]
ipv6: fix icmp6_dst_alloc()

commit 87a115783 ( ipv6: Move xfrm_lookup() call down into
icmp6_dst_alloc().) forgot to convert one error path, leading
to crashes in mld_sendpack()

Many thanks to Dave Jones for providing a very complete bug report.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
12 years agoMAINTAINERS: Add Serge as maintainer of capabilities
James Morris [Fri, 16 Mar 2012 01:05:48 +0000 (12:05 +1100)]
MAINTAINERS: Add Serge as maintainer of capabilities

Add Serge as maintainer of capabilities, per suggestion on LWN:
http://lwn.net/Articles/486306/

Signed-off-by: James Morris <james.l.morris@oracle.com>
12 years agoMerge branch 'akpm' (Andrew's patch-bomb)
Linus Torvalds [Fri, 16 Mar 2012 00:16:22 +0000 (17:16 -0700)]
Merge branch 'akpm' (Andrew's patch-bomb)

Merge patches from Andrew Morton:
 "Nine patches - some bug fixes and some MAINTAINERS fiddling."

* emailed from Andrew Morton <akpm@linux-foundation.org>:
  drivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode
  MAINTAINERS: add entry for exynos mipi display drivers
  MAINTAINERS: fix link to Gustavo Padovans tree
  MAINTAINERS: add Johan to Bluetooth maintainers
  MAINTAINERS: Gustavo has moved
  prctl: use CAP_SYS_RESOURCE for PR_SET_MM option
  rapidio/tsi721: fix bug in register offset definitions
  MAINTAINERS: update ST's Mailing list for SPEAr
  memcg: free mem_cgroup by RCU to fix oops

12 years agoMerge branch 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvar...
Linus Torvalds [Fri, 16 Mar 2012 00:14:35 +0000 (17:14 -0700)]
Merge branch 'i2c-for-linus' of git://git./linux/kernel/git/jdelvare/staging

Pull i2c subsystem fixes from Jean Delvare.

* 'i2c-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  i2c-algo-bit: Fix spurious SCL timeouts under heavy load
  i2c-core: Comment says "transmitted" but means "received"

12 years agoMerge tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck...
Linus Torvalds [Fri, 16 Mar 2012 00:13:39 +0000 (17:13 -0700)]
Merge tag 'hwmon-for-linus' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck.

* tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (zl6100) Enable interval between chip accesses for all chips
  hwmon: (w83627ehf) Describe undocumented pwm attributes
  hwmon: (w83627ehf) Fix temp2 source for W83627UHG
  hwmon: (w83627ehf) Fix memory leak in probe function
  hwmon: (w83627ehf) Fix writing into fan_stop_time for NCT6775F/NCT6776F

12 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 16 Mar 2012 00:07:25 +0000 (17:07 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm exynos/intel updates from Dave Airlie:
 "Two minor updates from Jesse for Intel SNB fixes, and a few fixes from
  Samsung for exynos.  The pull req has Alan's commit in it since Intel
  based their tree on my tree at that time, but it all seems fine wrt
  merging."

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm exynos: use drm_fb_helper_set_par directly
  drm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion
  drm/exynos: fix runtime_pm fimd device state on probe
  drm/exynos: use correct 'exynos-drm' name for platform device
  drm/i915: support 32 bit BGR formats in sprite planes
  drm/i915: fix color order for BGR formats on SNB
  drm/gma500: Fix Cedarview boot failures in 3.3-rc

12 years agoMerge branch 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab...
Linus Torvalds [Fri, 16 Mar 2012 00:06:05 +0000 (17:06 -0700)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:
 "For 4 fixes for 3.3 (all trivial):
       - uvc video driver: fixes a division by zero;
       - davinci: add module.h to fix compilation;
       - smsusb: fix the delivery system setting;
       - smsdvb: the get_frontend implementation there is broken.

  The smsdvb patch has 127 lines, but it is trivial: instead of
  returning a cache of the set_frontend (with is wrong, as it doesn't
  have the updated values for the data, and the implementation there is
  buggy), it copies the information of the detected DVB parameters from
  the smsdvb private structures into the corresponding DVBv5 struct
  fields."

* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  [media] smsdvb: fix get_frontend
  [media] smsusb: fix the default delivery system setting
  [media] media: davinci: added module.h to resolve unresolved macros
  [media] [FOR,v3.3] uvcvideo: Avoid division by 0 in timestamp calculation

12 years agoMerge branch '3.3-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target...
Linus Torvalds [Fri, 16 Mar 2012 00:04:56 +0000 (17:04 -0700)]
Merge branch '3.3-urgent' of git://git./linux/kernel/git/nab/target-pending

Pull target fixes from Nicholas Bellinger:
 "This series addresses two recently reported regression bugs related to
  legacy SCSI reservation usage in target core, and iscsi-target
  reservation conflict handling.

  The second patch in particular addresses possible data-corruption with
  SCSI reservations that is specific to iscsi-target fabric LUNs with
  multiple client writers.  Both patches need to go into v3.2 stable
  ASAP, and the branch based on the last target-pending/3.3-rc-fixes
  HEAD.

  Again, thanks to Martin Svec for his help to identify and address this
  regression bug with iscsi-target."

* '3.3-urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending:
  iscsi-target: Fix reservation conflict -EBUSY response handling bug
  target: Fix compatible reservation handling (CRH=1) with legacy RESERVE/RELEASE

12 years agodrivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode
Dan Carpenter [Thu, 15 Mar 2012 22:17:12 +0000 (15:17 -0700)]
drivers/video/backlight/s6e63m0.c: fix corruption storing gamma mode

strict_strtoul() writes a long but ->gamma_mode only has space to store an
int, so on 64 bit systems we end up scribbling over ->gamma_table_count as
well.  I've changed it to use kstrtouint() instead.

Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMAINTAINERS: add entry for exynos mipi display drivers
Donghwa Lee [Thu, 15 Mar 2012 22:17:11 +0000 (15:17 -0700)]
MAINTAINERS: add entry for exynos mipi display drivers

I'd like to add Inki Dae, Donghwa Lee and Kyungmin Park as maintainers
who developers for exynos mipi display drivers for
video/driver/exynos/exynos_mipi* and include/video/exynos_mipi*.

Signed-off-by: Donghwa Lee <dh09.lee@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Cc: Florian Tobias Schandinat <FlorianSchandinat@gmx.de>
Cc: Richard Purdie <rpurdie@rpsys.net>
Cc: Kukjin Kim <kgene.kim@samsung.com>
Cc: Jingoo Han <jg1.han@samsung.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMAINTAINERS: fix link to Gustavo Padovans tree
Johan Hedberg [Thu, 15 Mar 2012 22:17:11 +0000 (15:17 -0700)]
MAINTAINERS: fix link to Gustavo Padovans tree

Gustavo's tree is called just bluetooth.git and not bluetooth-2.6.git
anymore.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: "Gustavo F. Padovan" <padovan@profusion.mobi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMAINTAINERS: add Johan to Bluetooth maintainers
Johan Hedberg [Thu, 15 Mar 2012 22:17:11 +0000 (15:17 -0700)]
MAINTAINERS: add Johan to Bluetooth maintainers

I've been coordinating Bluetooth patches in my tree for some time and
it's possible I'll do it in the future too, so add myself to the
Bluetooth sections as well as mention my tree there.

Signed-off-by: Johan Hedberg <johan.hedberg@intel.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Cc: "Gustavo F. Padovan" <padovan@profusion.mobi>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMAINTAINERS: Gustavo has moved
Gustavo Padovan [Thu, 15 Mar 2012 22:17:10 +0000 (15:17 -0700)]
MAINTAINERS: Gustavo has moved

This is going to be the primary e-mail for kernel development.

Signed-off-by: Gustavo Padovan <gustavo@padovan.org>
Cc: Johan Hedberg <johan.hedberg@gmail.com>
Cc: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoprctl: use CAP_SYS_RESOURCE for PR_SET_MM option
Cyrill Gorcunov [Thu, 15 Mar 2012 22:17:10 +0000 (15:17 -0700)]
prctl: use CAP_SYS_RESOURCE for PR_SET_MM option

CAP_SYS_ADMIN is already overloaded left and right, so to have more
fine-grained access control use CAP_SYS_RESOURCE here.

The CAP_SYS_RESOUCE is chosen because this prctl option allows a current
process to adjust some fields of memory map descriptor which rather
represents what the process owns: pointers to code, data, stack
segments, command line, auxiliary vector data and etc.

Suggested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Bolle <pebolle@tiscali.nl>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agorapidio/tsi721: fix bug in register offset definitions
Alexandre Bounine [Thu, 15 Mar 2012 22:17:09 +0000 (15:17 -0700)]
rapidio/tsi721: fix bug in register offset definitions

Fix indexed register offset definitions that use decimal (wrong) instead
of hexadecimal (correct) notation for indexing multipliers.

Incorrect definitions do not affect Tsi721 driver in its current default
configuration because it uses only IDB queue 0.  Loss of inbound
doorbell functionality should be observed if queue other than 0 is used.

Signed-off-by: Alexandre Bounine <alexandre.bounine@idt.com>
Cc: Matt Porter <mporter@kernel.crashing.org>
Cc: Chul Kim <chul.kim@idt.com>
Cc: <stable@vger.kernel.org> [3.2+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agoMAINTAINERS: update ST's Mailing list for SPEAr
Viresh Kumar [Thu, 15 Mar 2012 22:17:09 +0000 (15:17 -0700)]
MAINTAINERS: update ST's Mailing list for SPEAr

We have created a ST's Mailing list for SPEAr.  This can be accessed
from non-st email ids.  I want people to cc this list, when they have
changes specific to SPEAr.  So, its better to get this updated in
MAINTAINERS file.

linux-arm-kernel@lists.infradead.org is also added for SPEAr.

Signed-off-by: Viresh Kumar <viresh.kumar@st.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agomemcg: free mem_cgroup by RCU to fix oops
Hugh Dickins [Thu, 15 Mar 2012 22:17:07 +0000 (15:17 -0700)]
memcg: free mem_cgroup by RCU to fix oops

After fixing the GPF in mem_cgroup_lru_del_list(), three times one
machine running a similar load (moving and removing memcgs while
swapping) has oopsed in mem_cgroup_zone_nr_lru_pages(), when retrieving
memcg zone numbers for get_scan_count() for shrink_mem_cgroup_zone():
this is where a struct mem_cgroup is first accessed after being chosen
by mem_cgroup_iter().

Just what protects a struct mem_cgroup from being freed, in between
mem_cgroup_iter()'s css_get_next() and its css_tryget()? css_tryget()
fails once css->refcnt is zero with CSS_REMOVED set in flags, yes: but
what if that memory is freed and reused for something else, which sets
"refcnt" non-zero? Hmm, and scope for an indefinite freeze if refcnt is
left at zero but flags are cleared.

It's tempting to move the css_tryget() into css_get_next(), to make it
really "get" the css, but I don't think that actually solves anything:
the same difficulty in moving from css_id found to stable css remains.

But we already have rcu_read_lock() around the two, so it's easily fixed
if __mem_cgroup_free() just uses kfree_rcu() to free mem_cgroup.

However, a big struct mem_cgroup is allocated with vzalloc() instead of
kzalloc(), and we're not allowed to vfree() at interrupt time: there
doesn't appear to be a general vfree_rcu() to help with this, so roll
our own using schedule_work().  The compiler decently removes
vfree_work() and vfree_rcu() when the config doesn't need them.

Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Ying Han <yinghan@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 years agontp: Fix integer overflow when setting time
Sasha Levin [Thu, 15 Mar 2012 16:36:14 +0000 (12:36 -0400)]
ntp: Fix integer overflow when setting time

'long secs' is passed as divisor to div_s64, which accepts a 32bit
divisor. On 64bit machines that value is trimmed back from 8 bytes
back to 4, causing a divide by zero when the number is bigger than
(1 << 32) - 1 and all 32 lower bits are 0.

Use div64_long() instead.

Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Cc: johnstul@us.ibm.com
Link: http://lkml.kernel.org/r/1331829374-31543-2-git-send-email-levinsasha928@gmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
12 years agomath: Introduce div64_long
Sasha Levin [Thu, 15 Mar 2012 16:36:13 +0000 (12:36 -0400)]
math: Introduce div64_long

Add a div64_long macro which is used to devide a 64bit number by a long (which
can be 4 bytes on 32bit systems and 8 bytes on 64bit systems).

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
Cc: johnstul@us.ibm.com
Link: http://lkml.kernel.org/r/1331829374-31543-1-git-send-email-levinsasha928@gmail.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
12 years agoperf tools: Adjust make rules
Jan Beulich [Thu, 8 Mar 2012 09:29:28 +0000 (09:29 +0000)]
perf tools: Adjust make rules

Add rules to generate pre-processed files (just like are available for
the normal kernel build), and adjust the rule to create assembly files
from C ones to produce its output in the output directory rather than
in the source tree.

E.g.

  $ make -C tools/perf/ O=/tmp/perf /tmp/perf/builtin-top.i
  make: Entering directory `/home/git/linux/tools/perf'
      CC /tmp/perf/builtin-top.i
  make: Leaving directory `/home/git/linux/tools/perf'
  $ wc -l /tmp/perf/builtin-top.i
  31379 /tmp/perf/builtin-top.i
  $

Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/4F588A0802000078000770FF@nat28.tlf.novell.com
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoi2c-algo-bit: Fix spurious SCL timeouts under heavy load
Ville Syrjala [Thu, 15 Mar 2012 17:11:05 +0000 (18:11 +0100)]
i2c-algo-bit: Fix spurious SCL timeouts under heavy load

When the system is under heavy load, there can be a significant delay
between the getscl() and time_after() calls inside sclhi(). That delay
may cause the time_after() check to trigger after SCL has gone high,
causing sclhi() to return -ETIMEDOUT.

To fix the problem, double check that SCL is still low after the
timeout has been reached, before deciding to return -ETIMEDOUT.

Signed-off-by: Ville Syrjala <syrjala@sci.fi>
Cc: stable@vger.kernel.org
Signed-off-by: Jean Delvare <khali@linux-fr.org>
12 years agoi2c-core: Comment says "transmitted" but means "received"
Wolfram Sang [Thu, 15 Mar 2012 17:11:05 +0000 (18:11 +0100)]
i2c-core: Comment says "transmitted" but means "received"

Fix that. Also convert this and the related comment to proper commenting
style.

Signed-off-by: Wolfram Sang <w.sang@pengutronix.de>
Signed-off-by: Jean Delvare <khali@linux-fr.org>
12 years agoprintk: Make it compile with !CONFIG_PRINTK
Peter Zijlstra [Thu, 15 Mar 2012 11:35:37 +0000 (12:35 +0100)]
printk: Make it compile with !CONFIG_PRINTK

Commit 3ccf3e830615 ("printk/sched: Introduce special
printk_sched() for those awkward moments") overlooked
an #ifdef, so move code around to respect these directives.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Randy Dunlap <rdunlap@xenotime.net>
Link: http://lkml.kernel.org/r/1331811337.18960.179.camel@twins
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agoMerge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung...
Dave Airlie [Thu, 15 Mar 2012 09:41:26 +0000 (09:41 +0000)]
Merge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung into drm-fixes

* 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
  drm exynos: use drm_fb_helper_set_par directly
  drm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion
  drm/exynos: fix runtime_pm fimd device state on probe
  drm/exynos: use correct 'exynos-drm' name for platform device

12 years agodrm exynos: use drm_fb_helper_set_par directly
Sascha Hauer [Wed, 14 Mar 2012 10:44:54 +0000 (19:44 +0900)]
drm exynos: use drm_fb_helper_set_par directly

info->fix.visual already is correctly set from drm_fb_helper_fill_fix.
info->fix.line_length is also set from drm_fb_helper_fill_fix,
so drm_fb_helper_set_par directly instead of a custom
exynos_drm_fbdev_set_par.

Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
12 years agodrm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion
Laurent Pinchart [Fri, 9 Mar 2012 00:45:21 +0000 (09:45 +0900)]
drm/exynos: Fix fb_videomode <-> drm_mode_modeinfo conversion

The fb_videomode structure stores the front porch and back porch in the
right_margin and left_margin fields respectively. right_margin should
thus be computed with hsync_start - hdisplay, and left_margin with
htotal - hsync_end. The same holds for the vertical direction.

       Active               Front           Sync            Back
       Region               Porch                           Porch
<-------------------><----------------><-------------><---------------->

  //////////////////|
 ////////////////// |
//////////////////  |..................               ..................
                                       _______________

<------ xres -------><- right_margin -><- hsync_len -><- left_margin -->

<---- hdisplay ----->
<------------ hsync_start ------------>
<--------------------- hsync_end -------------------->
<--------------------------------- htotal ----------------------------->

Fix the fb_videomode <-> drm_mode_modeinfo conversion functions
accordingly.

Signed-off-by: Laurent Pinchart <laurent.pinchart@ideasonboard.com>
Acked-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
12 years agodrm/exynos: fix runtime_pm fimd device state on probe
Marek Szyprowski [Thu, 8 Mar 2012 01:28:56 +0000 (10:28 +0900)]
drm/exynos: fix runtime_pm fimd device state on probe

A call to pm_runtime_set_active() forces device to be at the active
state and skips calling its runtime suspend/resume callbacks. This
results in a freeze with a new power domain code based on gen_pd. Fimd
driver does all required runtime power management calls, so this
pm_runtime_set_active call is buggy. This patch removes it and corrects
clock management in probe function (clocks are now enabled by
pm_runtime_get_sync() call).

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
12 years agodrm/exynos: use correct 'exynos-drm' name for platform device
Marek Szyprowski [Mon, 5 Mar 2012 11:02:30 +0000 (12:02 +0100)]
drm/exynos: use correct 'exynos-drm' name for platform device

Currently Exynos DRM driver uses DRIVER_NAME ('exynos') name for the
core platform device. This is confusing, because it doesn't refer to the
function the platform device is performing. This patch renames the
platform device to the 'exynos-drm', which matches the convention for
naming the platform devices. The name used inside DRM subsystem has not
been changed.

Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
12 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 15 Mar 2012 00:16:45 +0000 (17:16 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "Been sitting on this for a while, but lets get this out the door.
  This fixes various important bugs for 3.3 final, along with a few more
  trivial ones.  Please pull!"

* 'for-linus' of git://git.kernel.dk/linux-block:
  block: fix ioc leak in put_io_context
  block, sx8: fix pointer math issue getting fw version
  Block: use a freezable workqueue for disk-event polling
  drivers/block/DAC960: fix -Wuninitialized warning
  drivers/block/DAC960: fix DAC960_V2_IOCTL_Opcode_T -Wenum-compare warning
  block: fix __blkdev_get and add_disk race condition
  block: Fix setting bio flags in drivers (sd_dif/floppy)
  block: Fix NULL pointer dereference in sd_revalidate_disk
  block: exit_io_context() should call elevator_exit_icq_fn()
  block: simplify ioc_release_fn()
  block: replace icq->changed with icq->flags

12 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator
Linus Torvalds [Thu, 15 Mar 2012 00:16:02 +0000 (17:16 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/broonie/regulator

Pull regulator fixes from Mark Brown:
 "Another small batch of driver specific bug fixes, a couple more errors
  in the da9052 driver and a bad return value in the tps6524x driver."

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator:
  regulator: da9052: Ensure the selected voltage falls within the specified range
  regulator: Set n_voltages for da9052 regulators
  regulator: Fix setting selector in tps6524x set_voltage function

12 years agoMerge branch 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux...
Linus Torvalds [Thu, 15 Mar 2012 00:13:49 +0000 (17:13 -0700)]
Merge branch 'stable' of git://git./linux/kernel/git/cmetcalf/linux-tile

Pull arch/tile update to run "make minconfig" on the tile defconfigs
from Chris Metcalf.

This removes almost three thousand lines of inane defconfig chatter.

* 'stable' of git://git.kernel.org/pub/scm/linux/kernel/git/cmetcalf/linux-tile:
  arch/tile/configs: convert to minimal configs via "make savedefconfig"

12 years agoarch/tile/configs: convert to minimal configs via "make savedefconfig"
Chris Metcalf [Wed, 14 Mar 2012 18:33:16 +0000 (14:33 -0400)]
arch/tile/configs: convert to minimal configs via "make savedefconfig"

Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
12 years agoMerge branch 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keith...
Dave Airlie [Wed, 14 Mar 2012 18:32:27 +0000 (18:32 +0000)]
Merge branch 'drm-intel-fixes' of git://git./linux/kernel/git/keithp/linux into drm-fixes

* 'drm-intel-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/keithp/linux:
  drm/i915: support 32 bit BGR formats in sprite planes
  drm/i915: fix color order for BGR formats on SNB
  drm/gma500: Fix Cedarview boot failures in 3.3-rc

12 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 14 Mar 2012 17:49:05 +0000 (18:49 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent

Some corner case fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
12 years agohwmon: (zl6100) Enable interval between chip accesses for all chips
Guenter Roeck [Tue, 13 Mar 2012 16:05:14 +0000 (09:05 -0700)]
hwmon: (zl6100) Enable interval between chip accesses for all chips

Intersil reports that all chips supported by the zl6100 driver require
an interval between chip accesses, even ZL2004 and ZL6105 which were thought
to be safe.

Reported-by: Vivek Gani <vgani@intersil.com>
Cc: stable@vger.kernel.org # 3.2+
Signed-off-by: Guenter Roeck <guenter.roeck@ericsson.com>
12 years agoperf tools, x86: Build perf on older user-space as well
Ingo Molnar [Wed, 14 Mar 2012 15:42:34 +0000 (12:42 -0300)]
perf tools, x86: Build perf on older user-space as well

On ancient systems I get this build failure:

  util/../../../arch/x86/include/asm/unistd.h:67:29: error: asm/unistd_64.h: No such file or directory
  In file included from util/cache.h:7,
                   from builtin-test.c:8:
  util/../perf.h: In function ‘sys_perf_event_open’:In file included from util/../perf.h:16
  perf.h:170: error: ‘__NR_perf_event_open’ undeclared (first use in this function)

The reason is that this old system does not have the split
unistd.h headers yet, from which to pick up the syscall
definitions.

Add the syscall numbers to the already existing i386 and x86_64
blocks in perf.h, and also provide empty include file stubs.

With this patch perf builds and works fine on 5 years old
user-space as well.

Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lkml.kernel.org/n/tip-jctwg64le1w47tuaoeyftsg9@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Use scnprintf where applicable
Arnaldo Carvalho de Melo [Wed, 14 Mar 2012 15:29:29 +0000 (12:29 -0300)]
perf tools: Use scnprintf where applicable

Several places were expecting that the value returned was the number of
characters printed, not what would be printed if there was space.

Fix it by using the scnprintf and vscnprintf variants we inherited from
the kernel sources.

Some corner cases where the number of printed characters were not
accounted were fixed too.

Reported-by: Anton Blanchard <anton@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/n/tip-kwxo2eh29cxmd8ilixi2005x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoperf tools: Incorrect use of snprintf results in SEGV
Anton Blanchard [Wed, 7 Mar 2012 00:42:49 +0000 (11:42 +1100)]
perf tools: Incorrect use of snprintf results in SEGV

I have a workload where perf top scribbles over the stack and we SEGV.
What makes it interesting is that an snprintf is causing this.

The workload is a c++ gem that has method names over 3000 characters
long, but snprintf is designed to avoid overrunning buffers. So what
went wrong?

The problem is we assume snprintf returns the number of characters
written:

    ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", self->level);
...
    ret += repsep_snprintf(bf + ret, size - ret, "%s", self->ms.sym->name);

Unfortunately this is not how snprintf works. snprintf returns the
number of characters that would have been written if there was enough
space. In the above case, if the first snprintf returns a value larger
than size, we pass a negative size into the second snprintf and happily
scribble over the stack. If you have 3000 character c++ methods thats a
lot of stack to trample.

This patch fixes repsep_snprintf by clamping the value at size - 1 which
is the maximum snprintf can write before adding the NULL terminator.

I get the sinking feeling that there are a lot of other uses of snprintf
that have this same bug, we should audit them all.

Cc: David Ahern <dsahern@gmail.com>
Cc: Eric B Munson <emunson@mgebm.net>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yanmin Zhang <yanmin_zhang@linux.intel.com>
Cc: stable@kernel.org
Link: http://lkml.kernel.org/r/20120307114249.44275ca3@kryten
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
12 years agoblock: fix ioc leak in put_io_context
Xiaotian Feng [Wed, 14 Mar 2012 14:34:48 +0000 (15:34 +0100)]
block: fix ioc leak in put_io_context

When put_io_context is called, if ioc->icq_list is empty and refcount
is 1, kernel will not free the ioc.

This is caught by following kmemleak:

unreferenced object 0xffff880036349fe0 (size 216):
  comm "sh", pid 2137, jiffies 4294931140 (age 290579.412s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    01 00 01 00 ad 4e ad de ff ff ff ff 00 00 00 00  .....N..........
  backtrace:
    [<ffffffff8169f926>] kmemleak_alloc+0x26/0x50
    [<ffffffff81195a9c>] kmem_cache_alloc_node+0x1cc/0x2a0
    [<ffffffff81356b67>] create_io_context_slowpath+0x27/0x130
    [<ffffffff81356d2b>] get_task_io_context+0xbb/0xf0
    [<ffffffff81055f0e>] copy_process+0x188e/0x18b0
    [<ffffffff8105609b>] do_fork+0x11b/0x420
    [<ffffffff810247f8>] sys_clone+0x28/0x30
    [<ffffffff816d3373>] stub_clone+0x13/0x20
    [<ffffffffffffffff>] 0xffffffffffffffff

ioc should be freed if ioc->icq_list is empty.
Signed-off-by: Xiaotian Feng <dannyfeng@tencent.com>
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
12 years agoperf: Add ifdef to remove unused enum switch warnings
Jiri Olsa [Tue, 13 Mar 2012 23:03:02 +0000 (00:03 +0100)]
perf: Add ifdef to remove unused enum switch warnings

Fix for unused symbols in switch warnings.

Link: http://lkml.kernel.org/r/20120313230302.GA1514@m.redhat.com
Cc: Mark Brown <broonie@opensource.wolfsonmicro.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>