openwrt/staging/blogic.git
8 years agoperf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]
Borislav Petkov [Wed, 10 Feb 2016 09:55:13 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel_pt.[ch] ...... => x86/events/intel/pt.[ch]

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-8-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c
Borislav Petkov [Wed, 10 Feb 2016 09:55:12 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel_lbr.c ........ => x86/events/intel/lbr.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-7-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c
Borislav Petkov [Wed, 10 Feb 2016 09:55:11 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel_ds.c ......... => x86/events/intel/ds.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c
Borislav Petkov [Wed, 10 Feb 2016 09:55:10 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel_cstate.c ..... => x86/events/intel/cstate.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c
Borislav Petkov [Wed, 10 Feb 2016 09:55:09 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel_cqm.c ........ => x86/events/intel/cqm.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c
Borislav Petkov [Wed, 10 Feb 2016 09:55:08 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel.c ............ => x86/events/intel/core.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c
Borislav Petkov [Wed, 10 Feb 2016 09:55:07 +0000 (10:55 +0100)]
perf/x86: Move perf_event_intel_bts.c ........ => x86/events/intel/bts.c

Start moving the Intel bits.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1455098123-11740-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Wed, 17 Feb 2016 07:36:41 +0000 (08:36 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

 - Make 'perf record' collect CPU cache info in the perf.data file header:

  $ perf record usleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.017 MB perf.data (7 samples) ]
  $ perf report --header-only -I | tail -10 | head -8
  # CPU cache info:
  #  L1 Data                 32K [0-1]
  #  L1 Instruction          32K [0-1]
  #  L1 Data                 32K [2-3]
  #  L1 Instruction          32K [2-3]
  #  L2 Unified             256K [0-1]
  #  L2 Unified             256K [2-3]
  #  L3 Unified            4096K [0-3]
  $

  Will be used in 'perf c2c' and eventually in 'perf diff' to allow, for instance
  running the same workload in multiple machines and then when using 'diff' show
  the hardware difference. (Jiri Olsa)

 - 'perf stat' now shows shadow metrics (insn per cycle, etc) in
   interval mode too. E.g:

    # perf stat -I 1000 -e instructions,cycles sleep 1
    #         time   counts unit events
       1.000215928  519,620      instructions     #  0.69 insn per cycle
       1.000215928  752,003      cycles
    <SNIP>

Infrastructure changes:

 - libapi now can also use pr_{warning,info,debug}() and that can be
   set by tools using it (Jiri Olsa)

 - libapi adopts filename__read_str() from perf, adds sysfs__read_str() (Jiri Olsa)

 - Add check for java alternatives cmd in jvmti Makefile, so that it manages
   to automatically find the right path for the JDK devel files in Ubuntu like
   systems in addition to Fedora like ones (Stephane Eranian)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf stat: Move noise/running printing into printout
Andi Kleen [Sat, 30 Jan 2016 17:06:51 +0000 (09:06 -0800)]
perf stat: Move noise/running printing into printout

Move the running/noise printing into printout to avoid duplicated code
in the callers.

v2: Merged with other patches. Remove unnecessary hunk.
    Readd hunk that ended in earlier patch.
v3: Fix noise/running output in CSV mode
v4: Merge with later patch that also moves not supported printing.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-4-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf stat: Add support for metrics in interval mode
Andi Kleen [Sat, 30 Jan 2016 17:06:50 +0000 (09:06 -0800)]
perf stat: Add support for metrics in interval mode

Now that we can modify the metrics printout functions easily, it's
straight forward to support metric printing for interval mode.  All that
is needed is to print the time stamp on every new line.  Pass the prefix
into the context and print it out.

v2: Move wrong hunk to here.

Committer note:

Before:

  [root@jouet ~]# perf stat -I 1000 -e instructions,cycles sleep 1
  #           time    counts unit events
       1.000168216   538,913      instructions
       1.000168216   748,765      cycles
       1.000660048   153,741      instructions
       1.000660048   214,066      cycles

After:

  # perf stat -I 1000 -e instructions,cycles sleep 1
  #           time    counts unit events
       1.000215928   519,620      instructions              #    0.69  insn per cycle
       1.000215928   752,003      cycles
       1.000946033   148,502      instructions              #    0.33  insn per cycle
       1.000946033   160,104      cycles

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-3-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf stat: Abstract stat metrics printing
Andi Kleen [Sat, 30 Jan 2016 17:06:49 +0000 (09:06 -0800)]
perf stat: Abstract stat metrics printing

Abstract the printing of shadow metrics. Instead of every metric calling
fprintf directly and taking care of indentation, use two call backs: one
to print metrics and another to start a new line.

This will allow adding metrics to CSV mode and also using them for other
purposes.

The computation of padding is now done in the central callback, instead
of every metric doing it manually.  This makes it easier to add new
metrics.

v2: Refactor functions, printout now does more. Move
shadow printing. Improve fallback callbacks. Don't
use void * callback data.
v3: Remove unnecessary hunk. Add typedef for new_line
v4: Remove unnecessary hunk. Don't print metrics for CSV/interval
mode yet.  Move printout change to separate patch.
v5: Fix bisect bugs. Avoid bogus frontend cycles printing.
Fix indentation in different aggregation modes.
v6: Delay newline handling

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454173616-17710-2-git-send-email-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add perf data cache feature
Jiri Olsa [Tue, 16 Feb 2016 15:01:43 +0000 (16:01 +0100)]
perf tools: Add perf data cache feature

Storing CPU cache details under perf data. It's stored as new
HEADER_CACHE feature and it's displayed under header info with -I
option:

  $ perf report --header-only -I
  ...
  # CPU cache info:
  #  L1 Data                 32K [0-1]
  #  L1 Instruction          32K [0-1]
  #  L1 Data                 32K [2-3]
  #  L1 Instruction          32K [2-3]
  #  L2 Unified             256K [0-1]
  #  L2 Unified             256K [2-3]
  #  L3 Unified            4096K [0-3]
  ...

All distinct caches are stored/displayed.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20160216150143.GA7119@krava.brq.redhat.com
[ Fixed leak on process_caches(), s/cache_level/cpu_cache_level/g ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Initialize libapi debug output
Jiri Olsa [Sun, 14 Feb 2016 16:03:45 +0000 (17:03 +0100)]
perf tools: Initialize libapi debug output

Setting libapi debug output functions to use perf functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-5-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf debug: Rename __eprintf(va_list args) to veprintf
Arnaldo Carvalho de Melo [Tue, 16 Feb 2016 14:48:38 +0000 (11:48 -0300)]
perf debug: Rename __eprintf(va_list args) to veprintf

Adhering to the naming convention used when va_args is in a printf like
function, e.g. stdio.h.

Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-b5l3wt77ct28dcnriguxtvn6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib api fs: Add sysfs__read_str function
Jiri Olsa [Sun, 14 Feb 2016 16:03:44 +0000 (17:03 +0100)]
tools lib api fs: Add sysfs__read_str function

Adding sysfs__read_str function to ease up reading string files from
sysfs. New interface is:

  int sysfs__read_str(const char *entry, char **buf, size_t *sizep);

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-4-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib api fs: Adopt filename__read_str from perf
Jiri Olsa [Sun, 14 Feb 2016 16:03:43 +0000 (17:03 +0100)]
tools lib api fs: Adopt filename__read_str from perf

We already moved similar functions in here, also it'll be useful for
sysfs__read_str addition in following patch.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-3-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib api: Add debug output support
Jiri Olsa [Sun, 14 Feb 2016 16:03:42 +0000 (17:03 +0100)]
tools lib api: Add debug output support

Adding support for warning/info/debug output within libapi code. Adding
following macros:

  pr_warning(fmt, ...)
  pr_info(fmt, ...)
  pr_debug(fmt, ...)

Also adding libapi_set_print function to set above functions. This will
be used in perf to set standard debug handlers for libapi.

Adding 2 header files:
  debug.h
    - to be used outside libapi, contains
      libapi_set_print interface

  debug-internal.h
    - to be used within libapi, contains
      pr_warning/pr_info/pr_debug definitions

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1455465826-8426-2-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf jvmti: Add check for java alternatives cmd in Makefile
Stephane Eranian [Tue, 16 Feb 2016 06:37:41 +0000 (07:37 +0100)]
perf jvmti: Add check for java alternatives cmd in Makefile

This patch modifies the jvmti makefile to check if the
/usr/sbin/java-update-alternatives utility is present.  If so, then use
it, if not then use the altenatives command.

This helps handle the difference between Ubuntu and Fedora Linux
distributions.

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1455604661-9357-1-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Tue, 16 Feb 2016 07:45:56 +0000 (08:45 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Do not print trailing spaces in the hists browser (top, report) to
    avoid line wrapping issues when long C++ demangled functions are
    sampled (Arnaldo Carvalho de Melo)

  - Allow 'perf config' to show --system or --user settings (Taeung Song)

  - Add better warning about the need to install the audit-lib-python
    package when using perf python scripts (Taeung Song)

  - Fix symbol resolution when kernel modules files are only in the
    build id cache (~/.debug) (Wang Nan)

Build fixes:

  - Fix 'perf test' build on older systems where 'signal' is reserved (Arnaldo Carvalho de Melo)

Infrastructure changes:

  - Free the terms list_head in parse_events__free_terms(), also unlink the entries
    when deleting them (Wang Nan)

  - Fix releasing event_class in 'perf data' fixing integration with
    libbabeltrace (Wang Nan)

  - Add EXTRA_LDFLAGS option to Makefile (Zubair Lutfullah Kakakhel)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf tests: Fix build on older systems where 'signal' is reserved
Arnaldo Carvalho de Melo [Fri, 12 Feb 2016 21:30:01 +0000 (18:30 -0300)]
perf tests: Fix build on older systems where 'signal' is reserved

fixing the following problems, for instance, on RHEL6.7:

    CC       /tmp/build/perf/tests/bp_signal.o
  cc1: warnings being treated as errors
  tests/bp_signal.c: In function ‘__event’:
  tests/bp_signal.c:106: error: declaration of ‘signal’ shadows a global declaration
  /usr/include/signal.h:101: error: shadowed declaration is here
  tests/bp_signal.c: In function ‘bp_event’:
  tests/bp_signal.c:144: error: declaration of ‘signal’ shadows a global declaration
  /usr/include/signal.h:101: error: shadowed declaration is here
  tests/bp_signal.c: In function ‘wp_event’:
  tests/bp_signal.c:149: error: declaration of ‘signal’ shadows a global declaration
  /usr/include/signal.h:101: error: shadowed declaration is here
  mv: cannot stat `/tmp/build/perf/tests/.bp_signal.o.tmp': No such file or directory
  make[3]: *** [/tmp/build/perf/tests/bp_signal.o] Error 1
  make[2]: *** [tests] Error 2
  make[1]: *** [/tmp/build/perf/perf-in.o] Error 2
  make[1]: *** Waiting for unfinished jobs....

Reported-by: Vinson Lee <vlee@freedesktop.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: He Kuang <hekuang@huawei.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: pi3orama@163.com
Fixes: 8fd34e1cce18 ("perf test: Improve bp_signal")
Link: http://lkml.kernel.org/n/tip-wlpx6tik1b0jirlkw64bv400@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf data: Fix releasing event_class
Wang Nan [Fri, 5 Feb 2016 14:01:30 +0000 (14:01 +0000)]
perf data: Fix releasing event_class

A new patch in libbabeltrace [1] reveals a object leak problem in
'perf data' CTF support: perf code never releases the event_class
which is allocated in add_event() and stored in evsel's private field.

If libbabeltrace has the above patch applied, leaking event_class
prevents the writer from being destroyed and flushing metadata. For
example:

  $ perf record ls
  perf.data
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.012 MB perf.data (12 samples) ]
  $ perf data convert --to-ctf ./out.ctf
  [ perf data convert: Converted 'perf.data' into CTF data './out.ctf' ]
  [ perf data convert: Converted and wrote 0.000 MB (12 samples) ]
  $ cat ./out.ctf/metadata
  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 0 Jan 27 10:49 ./out.ctf/metadata

The correct result should be:
  ...
  $ cat ./out.ctf/metadata
  /* CTF 1.8 */

  trace {
  [SNIP]

  $ ls -l  ./out.ctf/metadata
  -rw-r----- 1 w00229757 mm 2446 Jan 27 10:52 ./out.ctf/metadata

The full story is:

Patch [1] of babeltrace redesigns its reference counting scheme. In that
patch:

 * writer <- trace (bt_ctf_writer_create)
 * trace <- stream_class (bt_ctf_trace_add_stream_class)
 * stream_class <- event_class (bt_ctf_stream_class_add_event_class)
 ('<-' means 'is a parent of')

Holding of event_class causes reference count of corresponding 'writer'
to increase through parent chain. Perf expects that 'writer' is released
(so metadata is flushed) through bt_ctf_writer_put() in
ctf_writer__cleanup(). However, since it never releases event_class, the
reference of 'writer' won't be dropped, so bt_ctf_writer_put() won't
lead to the release of writer.

Before this CTF patch, !(writer <- trace). Even with event_class leaking,
the writer ends up being released.

[1] https://github.com/efficios/babeltrace/commit/e6a8e8e4744633807083a077ff9f101eb97d9801

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Rename parse_events__free_terms() to parse_events_terms__delete()
Arnaldo Carvalho de Melo [Fri, 12 Feb 2016 20:09:17 +0000 (17:09 -0300)]
perf tools: Rename parse_events__free_terms() to parse_events_terms__delete()

To follow convention used in other tools/perf/ areas. Also remove the
need to check if it is NULL before calling the destructor, again, to
follow convention that goes back to free().

Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-w6owu7rb8a46gvunlinxaqwx@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Free the terms list_head in parse_events__free_terms()
Wang Nan [Fri, 12 Feb 2016 20:01:17 +0000 (17:01 -0300)]
perf tools: Free the terms list_head in parse_events__free_terms()

Fixing a leak, since code calling parse_events__free_terms() expect it
to free the list_head too.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Use perf_event_terms__purge() for non-malloced terms
Arnaldo Carvalho de Melo [Fri, 12 Feb 2016 19:48:00 +0000 (16:48 -0300)]
perf tools: Use perf_event_terms__purge() for non-malloced terms

In these two cases, a 'perf test' entry and in the PMU code the
list_head is on the stack, so we can't use perf_event__free_terms()
(soon to be renamed to perf_event_terms__delete()), because it will
free the list_head as well.

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-i956ryjhz97gnnqe8iqe7m7s@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Introduce parse_events_terms__purge()
Arnaldo Carvalho de Melo [Fri, 12 Feb 2016 19:43:02 +0000 (16:43 -0300)]
perf tools: Introduce parse_events_terms__purge()

Purges 'struct parse_event_term' entries from a list_head.

Some users need this because they don't allocate space for the list
head, it maybe on the stack or embedded into some other struct.

Next patch will convert users that need just purging and then the
perf_events__free_terms() routine will free the list head as well,
finally being renamed to perf_events_terms__delete().

Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/n/tip-4w3zl4ifcl0ed0j4bu3tckqp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Unlink entries from terms list
Wang Nan [Fri, 12 Feb 2016 19:31:23 +0000 (16:31 -0300)]
perf tools: Unlink entries from terms list

We were just freeing them, better unlink and init its nodes to catch
bugs faster if we keep dangling references to them.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: He Kuang <hekuang@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
[ Spun off from another patch, use list_del_init() instead of list_del() ]
Link: http://lkml.kernel.org/r/1454680939-24963-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Do column alignment on the format iterator
Arnaldo Carvalho de Melo [Thu, 11 Feb 2016 20:14:13 +0000 (17:14 -0300)]
perf hists: Do column alignment on the format iterator

We were doing column alignment in the format function for each cell,
returning a string padded with spaces so that when the next column is
printed the cursor is at its column alignment.

This ends up needlessly printing trailing spaces, do it at the format
iterator, that is where we know if it is needed, i.e. if there is more
columns to be printed.

This eliminates the need for triming lines when doing a dump using 'P'
in the TUI browser and also produces far saner results with things like
piping 'perf report' to 'less'.

Right now only the formatters for sym->name and the 'locked' column
(perf mem report), that are the ones that end up at the end of lines
in the default 'perf report', 'perf top' and 'perf mem report' tools,
the others will be done in a subsequent patch.

In the end the 'width' parameter for the formatters now mean, in
'printf' terms, the 'precision', where before it was the field 'width'.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-s7iwl2gj23w92l6tibnrcqzr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add comment explaining the repsep_snprintf function
Arnaldo Carvalho de Melo [Fri, 12 Feb 2016 14:27:51 +0000 (11:27 -0300)]
perf tools: Add comment explaining the repsep_snprintf function

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-4j67nvlfwbnkg85b969ewnkr@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf python scripting: Append examples to err msg about audit-libs-python
Taeung Song [Tue, 9 Feb 2016 11:53:10 +0000 (20:53 +0900)]
perf python scripting: Append examples to err msg about audit-libs-python

To print syscall names, the audit-libs-python package is required.. If
not installed, it prints this error string:

    # perf script syscall-counts
    Install the audit-libs-python package to get syscall names.

But the package name is different in Ubuntu, mention that in the error
message, similar to a error message of util/trace-event-scripting.c:

    # perf script syscall-counts
    Install the audit-libs-python package to get syscall names.
    For example:
      # apt-get install python-audit (Ubuntu)
      # yum install audit-libs-python (Fedora)
      etc.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1455018790-13425-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf build: Add EXTRA_LDFLAGS option to makefile
Zubair Lutfullah Kakakhel [Tue, 9 Feb 2016 13:33:38 +0000 (13:33 +0000)]
perf build: Add EXTRA_LDFLAGS option to makefile

To compile for little-endian systems, you need to pass -EL to CC and LD.

EXTRA_CFLAGS works to pass -EL to CC.
Add EXTRA_LDFLAGS to pass -EL to LD.

Signed-off-by: Zubair Lutfullah Kakakhel <Zubair.Kakakhel@imgtec.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1455024818-15842-1-git-send-email-Zubair.Kakakhel@imgtec.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf symbols: Fix symbols searching for module in buildid-cache
Wang Nan [Fri, 5 Feb 2016 14:01:27 +0000 (14:01 +0000)]
perf symbols: Fix symbols searching for module in buildid-cache

Before this patch, if a sample is triggered inside a module not in
/lib/modules/`uname -r`/, even if the module is in buildid-cache, 'perf
report' will still be unable to find the correct symbol.  For example:

  # rm -rf ~/.debug/
  # perf buildid-cache -a ./mymodule.ko
  # perf probe -m ./mymodule.ko -a get_mymodule_val
  Added new event:
    probe:get_mymodule_val (on get_mymodule_val in mymodule)

  You can now use it in all perf tools, such as:

  perf record -e probe:get_mymodule_val -aR sleep 1

  # perf record -e probe:get_mymodule_val cat /proc/mymodule
  mymodule:3
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.011 MB perf.data (1 samples) ]

  # perf report --stdio
  [SNIP]
  #
  # Overhead  Command  Shared Object     Symbol
  # ........  .......  ................  ......................
  #
    100.00%  cat      [mymodule]        [k] 0x0000000000000001

  # perf report -vvvv --stdio
  dso__load_sym: adjusting symbol: st_value: 0 sh_addr: 0 sh_offset: 0x70
  symbol__new: get_mymodule_val 0x70-0x8a
  [SNIP]

This is caused by dso__load() -> dso__load_sym(). In dso__load(), kmod
is true only when its file is found in some well know directories. All
files loaded from buildid-cache are treated as user programs. Following
dso__load_sym() set map->pgoff incorrectly.

This patch gives kernel modules in buildid-cache a chance to adjust
value of kmod. After dso__load() get the type of symbols, if it is
buildid, check the last 3 chars of original filename against '.ko', and
adjust the value of kmod if the file is a kernel module.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Cody P Schafer <dev@codyps.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jeremie Galarneau <jeremie.galarneau@efficios.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kirill Smelkov <kirr@nexedi.com>
Cc: Li Zefan <lizefan@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1454680939-24963-3-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Add '--system' and '--user' options to select which config file is used
Taeung Song [Wed, 10 Feb 2016 17:51:17 +0000 (02:51 +0900)]
perf config: Add '--system' and '--user' options to select which config file is used

The '--system' option means $(sysconfdir)/perfconfig and '--user' means
$HOME/.perfconfig. If none is used, both system and user config file are
read.  E.g.:

    # perf config [<file-option>] [options]

    With an specific config file:

    # perf config --user | --system

    or both user and system config file:

    # perf config

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1455126685-32367-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agokprobes: Optimize hot path by using percpu counter to collect 'nhit' statistics
Martin KaFai Lau [Wed, 3 Feb 2016 20:28:28 +0000 (12:28 -0800)]
kprobes: Optimize hot path by using percpu counter to collect 'nhit' statistics

When doing ebpf+kprobe on some hot TCP functions (e.g.
tcp_rcv_established), the kprobe_dispatcher() function
shows up in 'perf report'.

In kprobe_dispatcher(), there is a lot of cache bouncing
on 'tk->nhit++'.  'tk->nhit' and 'tk->tp.flags' also share
the same cacheline.

perf report (cycles:pp):

8.30%  ipv4_dst_check
4.74%  copy_user_enhanced_fast_string
3.93%  dst_release
2.80%  tcp_v4_rcv
2.31%  queued_spin_lock_slowpath
2.30%  _raw_spin_lock
1.88%  mlx4_en_process_rx_cq
1.84%  eth_get_headlen
1.81%  ip_rcv_finish
~~~~
1.71%  kprobe_dispatcher
~~~~
1.55%  mlx4_en_xmit
1.09%  __probe_kernel_read

perf report after patch:

9.15%  ipv4_dst_check
5.00%  copy_user_enhanced_fast_string
4.12%  dst_release
2.96%  tcp_v4_rcv
2.50%  _raw_spin_lock
2.39%  queued_spin_lock_slowpath
2.11%  eth_get_headlen
2.03%  mlx4_en_process_rx_cq
1.69%  mlx4_en_xmit
1.19%  ip_rcv_finish
1.12%  __probe_kernel_read
1.02%  ehci_hcd_cleanup

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Acked-by: Steven Rostedt <rostedt@goodmis.org>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Kernel Team <kernel-team@fb.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1454531308-2441898-1-git-send-email-kafai@fb.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Tue, 9 Feb 2016 09:38:40 +0000 (10:38 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible fixes:

 - Handle spaces in file names obtained from /proc/pid/maps (Marcin Ślusarz)

New features:

 - Improved support for Java, using the JVMTI agent library to do jitdumps
   that then will be inserted in synthesized PERF_RECORD_MMAP2 events via
   'perf inject' pointed to synthesized ELF files stored in ~/.debug and
   keyed with build-ids, to allow symbol resolution and even annotation with
   source line info, see the changeset comments to see how to use it (Stephane Eranian)

Documentation changes:

 - Document mmore variables in the 'perf config' man page (Taeung Song)

Infrastructure changes:

 - Improve a bit the 'make -C tools/perf build-test' output (Arnaldo Carvalho de Melo)

 - Do 'build-test' in parallel, using 'make -j' (Arnaldo Carvalho de Melo)

 - Fix handling of 'clean' in multi-target make invokations for parallell builds (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_amd_uncore.c .... => x86/events/amd/uncore.c
Borislav Petkov [Mon, 8 Feb 2016 16:09:08 +0000 (17:09 +0100)]
perf/x86: Move perf_event_amd_uncore.c .... => x86/events/amd/uncore.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1454947748-28629-6-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_amd_iommu.[ch] .. => x86/events/amd/iommu.[ch]
Borislav Petkov [Mon, 8 Feb 2016 16:09:07 +0000 (17:09 +0100)]
perf/x86: Move perf_event_amd_iommu.[ch] .. => x86/events/amd/iommu.[ch]

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1454947748-28629-5-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_amd_ibs.c ....... => x86/events/amd/ibs.c
Borislav Petkov [Mon, 8 Feb 2016 16:09:06 +0000 (17:09 +0100)]
perf/x86: Move perf_event_amd_ibs.c ....... => x86/events/amd/ibs.c

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1454947748-28629-4-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event_amd.c ........... => x86/events/amd/core.c
Borislav Petkov [Mon, 8 Feb 2016 16:09:05 +0000 (17:09 +0100)]
perf/x86: Move perf_event_amd.c ........... => x86/events/amd/core.c

We distribute those in vendor subdirs, starting with .../events/amd/.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1454947748-28629-3-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf/x86: Move perf_event.c ............... => x86/events/core.c
Borislav Petkov [Mon, 8 Feb 2016 16:09:04 +0000 (17:09 +0100)]
perf/x86: Move perf_event.c ............... => x86/events/core.c

Also, keep the churn at minimum by adjusting the include "perf_event.h"
when each file gets moved.

Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: http://lkml.kernel.org/r/1454947748-28629-2-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge branch 'x86/cpu' into perf/core, to pick up dependency
Ingo Molnar [Tue, 9 Feb 2016 09:23:35 +0000 (10:23 +0100)]
Merge branch 'x86/cpu' into perf/core, to pick up dependency

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf jit: add source line info support
Stephane Eranian [Mon, 30 Nov 2015 09:02:23 +0000 (10:02 +0100)]
perf jit: add source line info support

This patch adds source line information support to perf for jitted code.

The source line info must be emitted by the runtime, such as JVMTI.

Perf injects extract the source line info from the jitdump file and adds
the corresponding .debug_lines section in the ELF image generated for
each jitted function.

The source line enables matching any address in the profile with a
source file and line number.

The improvement is visible in perf annotate with the source code
displayed alongside the assembly code.

The dwarf code leverages the support from OProfile which is also
released under GPLv2.  Copyright 2007 OProfile authors.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-5-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: add JVMTI agent library
Stephane Eranian [Mon, 30 Nov 2015 09:02:22 +0000 (10:02 +0100)]
perf tools: add JVMTI agent library

This is a standalone JVMTI library to help  profile Java jitted code with perf
record/perf report. The library is not installed or compiled automatically by
perf Makefile. It is not used directly by perf. It is arch agnostic and has
been tested on X86 and ARM. It needs to be used with a Java runtime, such as
OpenJDK, as follows:

  $ java -agentpath:libjvmti.so .......

See the "Committer Notes" below on how to build it.

When used this way, java will generate a jitdump binary file in
$HOME/.debug/java/jit/java-jit-*

This binary dump file contains information to help symbolize and
annotate jitted code.

The jitdump information must be injected into the perf.data file
using:

  $ perf inject --jit -i perf.data -o perf.data.jitted

This injects the MMAP records to cover the jitted code and also generates
one ELF image for each jitted function. The ELF images are created in the
same subdir as the jitdump file. The MMAP records point there too.

Then, to visualize the function or asm profile, simply use the regular
perf commands:

  $ perf report -i perf.data.jitted

or

  $ perf annotate -i perf.data.jitted

JVMTI agent code adapted from the OProfile's opagent code.

This version of the JVMTI agent is using the CLOCK_MONOTONIC as the time
source to timestamp jit samples. To correlate with perf_events samples,
it needs to run on kernel 4.0.0-rc5+ or later with the following commit
from Peter Zijlstra:

  34f439278cef ("perf: Add per event clockid support")

With this patch recording jitted code is done as follows:

   $ perf record -k mono -- java -agentpath:libjvmti.so .......

 --------------------------------------------------------------------------

Committer Notes:

Extended testing instructions:

  $ cd tools/perf/jvmti/
  $ dnf install java-devel
  $ make

Then, create some simple java stuff to record some samples:

  $ cat hello.java
  public class hello {
public static void main(String[] args) {
                 System.out.println("Hello, World");
        }
  }
  $ javac hello.java
  $ java hello
  Hello, World
  $

And then record it using this jvmti thing:

  $ perf record -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hello
  java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jit-1908.dump
  Hello, World
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.030 MB perf.data (268 samples) ]
  $

Now lets insert the PERF_RECORD_MMAP2 records to point jitted mmaps to
files created by the agent:

  $ perf inject --jit -i perf.data -o perf.data.jitted

And finally see that it did its job:

  $ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | tail -5
  79197149129422 0xfe10 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428bd60(0x80) @ 0x40 fd:02 1840554 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-283.so
  79197149235701 0xfeb0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428ba60(0x180) @ 0x40 fd:02 1840555 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-284.so
  79197149250558 0xff50 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b860(0x180) @ 0x40 fd:02 1840556 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-285.so
  79197149714746 0xfff0 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b660(0x180) @ 0x40 fd:02 1840557 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-286.so
  79197149806558 0x10090 [0xa0]: PERF_RECORD_MMAP2 1908/1923: [0x7f172428b460(0x180) @ 0x40 fd:02 1840558 1]: --xs /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-287.so
  $

So:

  $ perf report -D -i perf.data | grep PERF_RECORD_MMAP2 | wc -l
  Failed to open /tmp/perf-1908.map, continuing without symbols
  21
  $ perf report -D -i perf.data.jitted | grep PERF_RECORD_MMAP2 | wc -l
  307
  $ echo $((307 - 21))
  286
  $

286 extra PERF_RECORD_MMAP2 records.

All for thise tiny, with just one function, ELF files:

  $ file /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so
  /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), corrupted program header size, BuildID[sha1]=ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f, not stripped
  $ readelf -sw /home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so

  Symbol table '.symtab' contains 2 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 0000000000000040     9 FUNC    LOCAL  DEFAULT    1 atomic_cmpxchg_long
  $

Inserted into the build-id cache:

  $ ls -la ~/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f
  lrwxrwxrwx. 1 acme acme 111 Feb  5 11:30 /home/acme/.debug/.build-id/ae/54a2ebc3ecf0ba547bfc8cabdea1519df5203f -> ../../home/acme/.debug/jit/java-jit-20160205.XXWIEDls/jitted-1908-9.so/ae54a2ebc3ecf0ba547bfc8cabdea1519df5203f

Note: check why 'file' reports that 'corrupted program header size'.

With a stupid java hog to do some profiling:

$ cat hog.java
  public class hog {
private static double do_something_else(int i) {
double total = 0;
while (i > 0) {
total += Math.log(i--);
}
return total;
}
private static double do_something(int i) {
double total = 0;
while (i > 0) {
total += Math.sqrt(i--) + do_something_else(i / 100);
}
return total;
}
public static void main(String[] args) {
System.out.println(String.format("%s=%f & %f", args[0],
   do_something(Integer.parseInt(args[0])),
   do_something_else(Integer.parseInt(args[1]))));
}
  }
  $ javac hog.java
  $ perf record -F 10000 -g -k mono java -agentpath:/home/acme/git/linux/tools/perf/jvmti/libjvmti.so hog 100000 2345000
  java: jvmti: jitdump in /home/acme/.debug/jit/java-jit-20160205.XX4sqd14/jit-8670.dump
  100000=291561592.669602 & 32050989.778714
  [ perf record: Woken up 6 times to write data ]
  [ perf record: Captured and wrote 1.536 MB perf.data (12538 samples) ]
  $ perf inject --jit -i perf.data -o perf.data.jitted

Looking at the 'perf report' TUI, at one expanded callchain leading
to the jitted code:

  $ perf report --no-children -i perf.data.jitted

Samples: 12K of event 'cycles:pp', Event count (approx.): 3829569932
  Overhead  Comm  Shared Object       Symbol
-   93.38%  java  jitted-8670-291.so  [.] class hog.do_something_else(int)
     class hog.do_something_else(int)
   - Interpreter
      - 75.86% call_stub
           JavaCalls::call_helper
           jni_invoke_static
           jni_CallStaticVoidMethod
           JavaMain
           start_thread
      - 17.52% JavaCalls::call_helper
           jni_invoke_static
           jni_CallStaticVoidMethod
           JavaMain
           start_thread

Signed-off-by: Stephane Eranian <eranian@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-4-git-send-email-eranian@google.com
[ Made it build on fedora23, added some build/usage instructions ]
[ Check if filename != NULL in compiled_method_load_cb, fixing segfault ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf inject: Add jitdump mmap injection support
Stephane Eranian [Mon, 30 Nov 2015 09:02:21 +0000 (10:02 +0100)]
perf inject: Add jitdump mmap injection support

This patch adds a --jit/-j option to perf inject.

This options injects MMAP records into the perf.data file to cover the
jitted code mmaps. It also emits ELF images for each function in the
jidump file.  Those images are created where the jitdump file is.  The
MMAP records point to that location as well.

Typical flow:

  $ perf record -k mono -- java -agentpath:libpjvmti.so java_class
  $ perf inject --jit -i perf.data -o perf.data.jitted
  $ perf report -i perf.data.jitted

Note that jitdump.h support is not limited to Java, it works with any
jitted environment modified to emit the jitdump file format, include
those where code can be jitted multiple times and moved around.

The jitdump.h format is adapted from the Oprofile project.

The genelf.c (ELF binary generation) depends on MD5 hash encoding for
the buildid. To enable this, libssl-dev must be installed. If not, then
genelf.c defaults to using urandom to generate the buildid, which is not
ideal.  The Makefile auto-detects the presence on libssl-dev.

This version mmaps the jitdump file to create a marker MMAP record in
the perf.data file. The marker is used to detect jitdump and cause perf
inject to inject the jitted mmaps and generate ELF images for jitted
functions.

In V8, the following fixes and changes were made among other things:

  -  the jidump header format include a new flags field to be used
     to carry information about the configuration of the runtime agent.
     Contributed by: Adrian Hunter <adrian.hunter@intel.com>

  - Fix mmap pgoff: MMAP event pgoff must be the offset within the ELF file
    at which the code resides.
    Contributed by: Adrian Hunter <adrian.hunter@intel.com>

  - Fix ELF virtual addresses: perf tools expect the ELF virtual addresses of dynamic
    objects to match the file offset.
    Contributed by: Adrian Hunter <adrian.hunter@intel.com>

  - JIT MMAP injection does not obey finished_round semantics. JIT MMAP injection injects all
    MMAP events in one go, so it does not obey finished_round semantics, so drop the
    finished_round events from the output perf.data file.
    Contributed by: Adrian Hunter <adrian.hunter@intel.com>

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ Moved inject.build_ids ordering bits to a separate patch, fixed the NO_LIBELF=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf inject: Make sure mmap records are ordered when injecting build_ids
Arnaldo Carvalho de Melo [Fri, 22 Jan 2016 21:41:00 +0000 (18:41 -0300)]
perf inject: Make sure mmap records are ordered when injecting build_ids

To make sure the mmap records are ordered correctly and so that the
correct especially due to jitted code mmaps.

We cannot generate the buildid hit list and inject the jit mmaps (will
come right after this patch) in at the same time for now.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ Carved out from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf build: Add libcrypto feature detection
Stephane Eranian [Mon, 30 Nov 2015 09:02:21 +0000 (10:02 +0100)]
perf build: Add libcrypto feature detection

Will be used to generate build-ids in the jitdump code.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-3-git-send-email-eranian@google.com
[ tools/perf/Makefile.perf comment about NO_LIBCRYPTO and added it to tests/make ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf symbols: add Java demangling support
Stephane Eranian [Mon, 30 Nov 2015 09:02:20 +0000 (10:02 +0100)]
perf symbols: add Java demangling support

Add Java function descriptor demangling support.  Something bfd cannot
do.

Use the JAVA_DEMANGLE_NORET flag to avoid decoding the return type of
functions.

Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Carl Love <cel@us.ibm.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John McCutchan <johnmccutchan@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pawel Moll <pawel.moll@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sonny Rao <sonnyrao@chromium.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1448874143-7269-2-git-send-email-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: handle spaces in file names obtained from /proc/pid/maps
Marcin Ślusarz [Tue, 19 Jan 2016 19:03:03 +0000 (20:03 +0100)]
perf tools: handle spaces in file names obtained from /proc/pid/maps

Steam frequently puts game binaries in folders with spaces.

Note: "(deleted)" markers are now treated as part of the file name.

Signed-off-by: Marcin Ślusarz <marcin.slusarz@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Fixes: 6064803313ba ("perf tools: Use sscanf for parsing /proc/pid/maps")
Link: http://lkml.kernel.org/r/20160119190303.GA17579@marcin-Inspiron-7720
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf build tests: Do parallell builds with 'build-test'
Arnaldo Carvalho de Melo [Wed, 3 Feb 2016 20:28:45 +0000 (17:28 -0300)]
perf build tests: Do parallell builds with 'build-test'

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-jhmnf9g7y9ryqcjql00unk5y@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Fix parallel build including 'clean' target
Jiri Olsa [Thu, 4 Feb 2016 11:30:36 +0000 (12:30 +0100)]
perf tools: Fix parallel build including 'clean' target

Do not parallelize 'clean' with other targets, figure out if it is
present and do it first, then the other targets.

Noticed with:

  tools/perf> make -j24 clean all

   LD       arch/libperf-in.o
   LD       plugin_xen-in.o
 arch//libperf-in.o: file not recognized: File truncated
 make[3]: *** [arch/libperf-in.o] Error 1
 make[2]: *** [arch] Error 2
 make[2]: *** Waiting for unfinished jobs....
   AR       libapi.a

Reported-and-Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-kb0qs29zbz7hxn32mc5zbsoz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document 'record.build-id' variable in man page
Taeung Song [Thu, 4 Feb 2016 09:25:13 +0000 (18:25 +0900)]
perf config: Document 'record.build-id' variable in man page

Explain 'record.build-id' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-9-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document 'kmem.default' variable in man page
Taeung Song [Thu, 4 Feb 2016 09:25:12 +0000 (18:25 +0900)]
perf config: Document 'kmem.default' variable in man page

Explain 'kmem.default' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-8-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document 'pager.<subcommand>' variables in man page
Taeung Song [Thu, 4 Feb 2016 09:25:11 +0000 (18:25 +0900)]
perf config: Document 'pager.<subcommand>' variables in man page

Explain 'pager.<subcommand>' variables.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-7-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document 'man.viewer' variable in man page
Taeung Song [Thu, 4 Feb 2016 09:25:10 +0000 (18:25 +0900)]
perf config: Document 'man.viewer' variable in man page

Explain 'man.viewer' variable and how to add new man viewer tools.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-6-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document 'top.children' variable in man page
Taeung Song [Thu, 4 Feb 2016 09:25:09 +0000 (18:25 +0900)]
perf config: Document 'top.children' variable in man page

Explain 'top.children' variable.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-5-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document variables for 'report' section in man page
Taeung Song [Thu, 4 Feb 2016 09:25:08 +0000 (18:25 +0900)]
perf config: Document variables for 'report' section in man page

Explain 'report' section's variables:

  'percent-limit', 'queue-size' and 'children'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-4-git-send-email-treeze.taeung@gmail.com
[ Fix some grammar issues, add some more info ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document variables for 'call-graph' section in man page
Taeung Song [Thu, 4 Feb 2016 09:25:07 +0000 (18:25 +0900)]
perf config: Document variables for 'call-graph' section in man page

Explain 'call-graph' section and its variables:

  'record-mode', 'dump-size', 'print-type', 'order', 'sort-key',
  'threshold' and 'print-limit'.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-3-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf config: Document 'ui.show-headers' variable in man page
Taeung Song [Thu, 4 Feb 2016 09:25:06 +0000 (18:25 +0900)]
perf config: Document 'ui.show-headers' variable in man page

This option controls display of column headers (like 'Overhead' and
'Symbol') in 'report' and 'top'. If this option is false, they are
hidden.  This option is only applied to TUI.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1454577913-16401-2-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf build tests: Move the feature related vars to the front of the make cmdline
Arnaldo Carvalho de Melo [Wed, 3 Feb 2016 20:24:11 +0000 (17:24 -0300)]
perf build tests: Move the feature related vars to the front of the make cmdline

So that we do less visual searching on the 'make build-test' output to
see the feature related variables:

After:

  $ make -C tools/perf build-test
  <SNIP>
           make_no_newt_O: cd . && make NO_NEWT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.dz55IX DESTDIR=/tmp/tmp.X29xxo
              make_tags_O: cd . && make tags FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.6ecLh8 DESTDIR=/tmp/tmp.6vIla578Ho
  make_util_pmu_bison_o_O: cd . && make util/pmu-bison.o FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP O=/tmp/tmp.SVPM2G DESTDIR=/tmp/tmp.C0oAam

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dx4krgzqa566v1pedrbrcchi@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf build tests: Elide "-f Makefile" from make invokation
Arnaldo Carvalho de Melo [Wed, 3 Feb 2016 20:16:32 +0000 (17:16 -0300)]
perf build tests: Elide "-f Makefile" from make invokation

Since this is the name that 'make' will look for if no explicit -f file
is passed.

This in turn makes the output of 'build-test' more compact:

Before:

  $ perf stat make -C tools/perf build-test
  <SNIP>
  cd . && make FEATURE_DUMP_COPY=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP feature-dump
          make_no_libaudit_O: cd . && make -f Makefile  O=/tmp/tmp.tHIa0Kkk2Y DESTDIR=/tmp/tmp.foK7rckkVi NO_LIBAUDIT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
  <SNIP>

After:

  $ perf stat make -C tools/perf build-test
  <SNIP>
          make_no_libaudit_O: cd . && make  O=/tmp/tmp.tHIa0Kkk2Y DESTDIR=/tmp/tmp.foK7rckkVi NO_LIBAUDIT=1 FEATURES_DUMP=/home/acme/git/linux/tools/perf/BUILD_TEST_FEATURE_DUMP
  <SNIP>

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-m440lb8dkfsywsyah0htif6t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 4 Feb 2016 07:58:01 +0000 (08:58 +0100)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

New features:

 - Add 'L' hotkey to dynamicly set the percent threshold for histogram
   entries and callchains, i.e. dynamicly do what the --percent-limit
   command line option to 'top' and 'report' does. (Namhyung Kim)

Infrastructure changes:

 - Per hists field and sort lists, that will be used, for instance,
   in the c2c tool (Jiri Olsa)

Documentation changes:

 - Update documentation of --sort and --perf-limit options
   for 'perf report' (Namhyung Kim)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge branch 'perf/urgent' into perf/core, to pick up fixes
Ingo Molnar [Thu, 4 Feb 2016 07:57:44 +0000 (08:57 +0100)]
Merge branch 'perf/urgent' into perf/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-urgent-for-mingo-2' of git://git.kernel.org/pub/scm/linux/kernel...
Ingo Molnar [Thu, 4 Feb 2016 07:56:54 +0000 (08:56 +0100)]
Merge tag 'perf-urgent-for-mingo-2' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fix from Arnaldo Carvalho de Melo:

  - Fix 'perf stat' interval output values (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 4 Feb 2016 07:55:00 +0000 (08:55 +0100)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

 - tracepoint_error() can receive e=NULL, robustify it, fixes a problem noticed
   with a very specific combination: Machine with Intel PT (e.g. Broadwell),
   kernel with no perf_event_attr.context_switch feature (e.g. 4.2) and unreadable
   tracefs (for instance !root users), making the fallback from
   perf_event_attr.context_switch to the sched:sched_switch tracepoint to fail
   reading its info from tracefs, fix it. (Adrian Hunter)

 - Fix segfault in intel PT, by making it follow the 'struct thread' lifetime cycle
   checking expectations, noticed for instance, when processing perf.data files with
   Intel PT data using 'perf script' and when exiting 'perf report' (Adrian Hunter)

 - Fix CFI usage from .eh_frame and .debug_frame, which sometimes requires that we
   fallback from .eh_frame to .debug_frame in architectures such as PowerPC (Hemant Kumar)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf stat: Fix interval output values
Jiri Olsa [Wed, 3 Feb 2016 07:43:56 +0000 (08:43 +0100)]
perf stat: Fix interval output values

We broke interval data displays with commit:

  3f416f22d1e2 ("perf stat: Do not clean event's private stats")

This commit removed stats cleaning, which is important for '-r' option
to carry counters data over the whole run. But it's necessary to clean
it for interval mode, otherwise the displayed value is avg of all
previous values.

Before:
  $ perf stat -e cycles -a -I 1000 record
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791        107,823,524      cycles

  $ perf stat report
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791         91,519,906      cycles

Now:
  $ perf stat report
  #           time             counts unit events
       1.000240796         75,216,287      cycles
       2.000512791        107,823,524      cycles

Notice the second value being bigger (91,.. < 107,..).

This could be easily verified by using perf script which displays raw
stat data:

  $ perf script
  CPU  THREAD       VAL         ENA         RUN        TIME EVENT
    0      -1  23855779  1000209530  1000209530  1000240796 cycles
    1      -1  33340397  1000224964  1000224964  1000240796 cycles
    2      -1  15835415  1000226695  1000226695  1000240796 cycles
    3      -1   2184696  1000228245  1000228245  1000240796 cycles
    0      -1  97014312  2000514533  2000514533  2000512791 cycles
    1      -1  46121497  2000543795  2000543795  2000512791 cycles
    2      -1  32269530  2000543566  2000543566  2000512791 cycles
    3      -1   7634472  2000544108  2000544108  2000512791 cycles

The sum of the first 4 values is the first interval aggregated value:

  23855779 + 33340397 + 15835415 + 2184696 = 75,216,287

The sum of the second 4 values minus first value is the second interval
aggregated value:

  97014312 + 46121497 + 32269530 + 7634472 - 75216287 = 107,823,524

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/1454485436-20639-1-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Wed, 3 Feb 2016 18:10:02 +0000 (10:10 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "18 fixes"

[ The 18 fixes turned into 17 commits, because one of the fixes was a
  fix for another patch in the series that I just folded in by editing
  the patch manually - hopefully correctly     - Linus ]

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm: fix memory leak in copy_huge_pmd()
  drivers/hwspinlock: fix race between radix tree insertion and lookup
  radix-tree: fix race in gang lookup
  mm/vmpressure.c: fix subtree pressure detection
  mm: polish virtual memory accounting
  mm: warn about VmData over RLIMIT_DATA
  Documentation: cgroup-v2: add memory.stat::sock description
  mm: memcontrol: drop superfluous entry in the per-memcg stats array
  drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
  proc: revert /proc/<pid>/maps [stack:TID] annotation
  numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390
  MAINTAINERS: update Seth email
  ocfs2/cluster: fix memory leak in o2hb_region_release
  lib/test-string_helpers.c: fix and improve string_get_size() tests
  thp: limit number of object to scan on deferred_split_scan()
  thp: change deferred_split_count() to return number of THP in queue
  thp: make split_queue per-node

8 years agoMerge tag 'for-linus-4.5-2' of git://git.code.sf.net/p/openipmi/linux-ipmi
Linus Torvalds [Wed, 3 Feb 2016 18:04:58 +0000 (10:04 -0800)]
Merge tag 'for-linus-4.5-2' of git://git.code.sf.net/p/openipmi/linux-ipmi

Pull IPMI fix from Corey Minyard:
 "Fix a compile error on IPMI when ACPI is disabled"

* tag 'for-linus-4.5-2' of git://git.code.sf.net/p/openipmi/linux-ipmi:
  ipmi: put acpi.h with the other headers

8 years agoMerge tag 'devicetree-fixes-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 3 Feb 2016 17:55:50 +0000 (09:55 -0800)]
Merge tag 'devicetree-fixes-for-4.5' of git://git./linux/kernel/git/robh/linux

Pull DeviceTree fixes from Rob Herring:

 - Fix build error with *_OF_DECLARE() when used in modules

 - Add missing platform maintainers for dts files in MAINTAINERS

* tag 'devicetree-fixes-for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  of: drop symbols declared by _OF_DECLARE() from modules
  MAINTAINERS: Add missing platform maintainers for dts files

8 years agoMerge tag 'nfs-for-4.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Wed, 3 Feb 2016 17:36:41 +0000 (09:36 -0800)]
Merge tag 'nfs-for-4.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfix and cleanup from Trond Myklebust:
 "Bugfix:
   - pNFS: Fix for missing layoutreturn calls

  Cleanup:
   - pNFS: rename NFS_LAYOUT_RETURN_BEFORE_CLOSE for code clarity"

* tag 'nfs-for-4.5-3' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  NFS: Cleanup - rename NFS_LAYOUT_RETURN_BEFORE_CLOSE
  pNFS: Fix missing layoutreturn calls

8 years agoMerge tag 'trace-v4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Wed, 3 Feb 2016 17:31:34 +0000 (09:31 -0800)]
Merge tag 'trace-v4.5-rc1-2' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fix from Steven Rostedt:
 "A cleanup to the stack tracer broke stack tracing on s390.  Here's a
  simple fix to correct that issue"

* tag 'trace-v4.5-rc1-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing/stacktrace: Show entire trace if passed in function not found

8 years agomm: retire GUP WARN_ON_ONCE that outlived its usefulness
Hugh Dickins [Sun, 31 Jan 2016 02:03:16 +0000 (18:03 -0800)]
mm: retire GUP WARN_ON_ONCE that outlived its usefulness

Trinity is now hitting the WARN_ON_ONCE we added in v3.15 commit
cda540ace6a1 ("mm: get_user_pages(write,force) refuse to COW in shared
areas").  The warning has served its purpose, nobody was harmed by that
change, so just remove the warning to generate less noise from Trinity.

Which reminds me of the comment I wrongly left behind with that commit
(but was spotted at the time by Kirill), which has since moved into a
separate function, and become even more obscure: delete it.

Reported-by: Dave Jones <davej@codemonkey.org.uk>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipmi: put acpi.h with the other headers
Tony Camuso [Tue, 2 Feb 2016 18:57:24 +0000 (13:57 -0500)]
ipmi: put acpi.h with the other headers

Enclosing '#include <linux/acpi.h>' within '#ifdef CONFIG_ACPI' is
unnecessary, since it has its own conditional compile for CONFIG_ACPI.

Commit 0fbcf4af7c83 ("ipmi: Convert the IPMI SI ACPI handling to a
platform device") exposed this as a problem for platforms that do not
support ACPI when it introduced a call to ACPI_PTR() macro outside of
the CONFIG_ACPI conditional compile. This would have been perfectly
acceptable if acpi.h were not conditionally excluded for the non-acpi
platform, because the conditional compile within acpi.h defines
ACPI_PTR() to return NULL when compiled for non acpi platforms.

Signed-off-by: Tony Camuso <tcamuso@redhat.com>
Fixed commit reference in header to conform to standard.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
8 years agomm: fix memory leak in copy_huge_pmd()
Matthew Wilcox [Wed, 3 Feb 2016 00:57:57 +0000 (16:57 -0800)]
mm: fix memory leak in copy_huge_pmd()

We allocate a pgtable but do not attach it to anything if the PMD is in
a DAX VMA, causing it to leak.

We certainly try to not free pgtables associated with the huge zero page
if the zero page is in a DAX VMA, so I think this is the right solution.
This needs to be properly audited.

Signed-off-by: Matthew Wilcox <matthew.r.wilcox@intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/hwspinlock: fix race between radix tree insertion and lookup
Matthew Wilcox [Wed, 3 Feb 2016 00:57:55 +0000 (16:57 -0800)]
drivers/hwspinlock: fix race between radix tree insertion and lookup

of_hwspin_lock_get_id() is protected by the RCU lock, which means that
insertions can occur simultaneously with the lookup.  If the radix tree
transitions from a height of 0, we can see a slot with the indirect_ptr
bit set, which will cause us to at least read random memory, and could
cause other havoc.

Fix this by using the newly introduced radix_tree_iter_retry().

Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoradix-tree: fix race in gang lookup
Matthew Wilcox [Wed, 3 Feb 2016 00:57:52 +0000 (16:57 -0800)]
radix-tree: fix race in gang lookup

If the indirect_ptr bit is set on a slot, that indicates we need to redo
the lookup.  Introduce a new function radix_tree_iter_retry() which
forces the loop to retry the lookup by setting 'slot' to NULL and
turning the iterator back to point at the problematic entry.

This is a pretty rare problem to hit at the moment; the lookup has to
race with a grow of the radix tree from a height of 0.  The consequences
of hitting this race are that gang lookup could return a pointer to a
radix_tree_node instead of a pointer to whatever the user had inserted
in the tree.

Fixes: cebbd29e1c2f ("radix-tree: rewrite gang lookup using iterator")
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Ohad Ben-Cohen <ohad@wizery.com>
Cc: Konstantin Khlebnikov <khlebnikov@openvz.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/vmpressure.c: fix subtree pressure detection
Vladimir Davydov [Wed, 3 Feb 2016 00:57:49 +0000 (16:57 -0800)]
mm/vmpressure.c: fix subtree pressure detection

When vmpressure is called for the entire subtree under pressure we
mistakenly use vmpressure->scanned instead of vmpressure->tree_scanned
when checking if vmpressure work is to be scheduled.  This results in
suppressing all vmpressure events in the legacy cgroup hierarchy.  Fix it.

Fixes: 8e8ae645249b ("mm: memcontrol: hook up vmpressure to socket pressure")
Signed-off-by: Vladimir Davydov <vdavydov@virtuozzo.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: polish virtual memory accounting
Konstantin Khlebnikov [Wed, 3 Feb 2016 00:57:46 +0000 (16:57 -0800)]
mm: polish virtual memory accounting

* add VM_STACK as alias for VM_GROWSUP/DOWN depending on architecture
* always account VMAs with flag VM_STACK as stack (as it was before)
* cleanup classifying helpers
* update comments and documentation

Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Tested-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: warn about VmData over RLIMIT_DATA
Konstantin Khlebnikov [Wed, 3 Feb 2016 00:57:43 +0000 (16:57 -0800)]
mm: warn about VmData over RLIMIT_DATA

This patch provides a way of working around a slight regression
introduced by commit 84638335900f ("mm: rework virtual memory
accounting").

Before that commit RLIMIT_DATA have control only over size of the brk
region.  But that change have caused problems with all existing versions
of valgrind, because it set RLIMIT_DATA to zero.

This patch fixes rlimit check (limit actually in bytes, not pages) and
by default turns it into warning which prints at first VmData misuse:

  "mmap: top (795): VmData 516096 exceed data ulimit 512000.  Will be forbidden soon."

Behavior is controlled by boot param ignore_rlimit_data=y/n and by sysfs
/sys/module/kernel/parameters/ignore_rlimit_data.  For now it set to "y".

[akpm@linux-foundation.org: tweak kernel-parameters.txt text[
Signed-off-by: Konstantin Khlebnikov <koct9i@gmail.com>
Link: http://lkml.kernel.org/r/20151228211015.GL2194@uranus
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cyrill Gorcunov <gorcunov@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Cc: Kees Cook <keescook@google.com>
Cc: Willy Tarreau <w@1wt.eu>
Cc: Pavel Emelyanov <xemul@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoDocumentation: cgroup-v2: add memory.stat::sock description
Johannes Weiner [Wed, 3 Feb 2016 00:57:41 +0000 (16:57 -0800)]
Documentation: cgroup-v2: add memory.stat::sock description

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: memcontrol: drop superfluous entry in the per-memcg stats array
Johannes Weiner [Wed, 3 Feb 2016 00:57:38 +0000 (16:57 -0800)]
mm: memcontrol: drop superfluous entry in the per-memcg stats array

MEM_CGROUP_STAT_NSTATS is just a delimiter for cgroup1 statistics, not
an actual array entry.  Reuse it for the first cgroup2 stat entry, like
in the event array.

Fixes: b2807f07f4f8 ("mm: memcontrol: add "sock" to cgroup2 memory.stat")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Vladimir Davydov <vdavydov@virtuozzo.com>
Cc: Michal Hocko <mhocko@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agodrivers/scsi/sg.c: mark VMA as VM_IO to prevent migration
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:35 +0000 (16:57 -0800)]
drivers/scsi/sg.c: mark VMA as VM_IO to prevent migration

Reduced testcase:

    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/mman.h>
    #include <numaif.h>

    #define SIZE 0x2000

    int main()
    {
        int fd;
        void *p;

        fd = open("/dev/sg0", O_RDWR);
        p = mmap(NULL, SIZE, PROT_EXEC, MAP_PRIVATE | MAP_LOCKED, fd, 0);
        mbind(p, SIZE, 0, NULL, 0, MPOL_MF_MOVE);
        return 0;
    }

We shouldn't try to migrate pages in sg VMA as we don't have a way to
update Sg_scatter_hold::pages accordingly from mm core.

Let's mark the VMA as VM_IO to indicate to mm core that the VMA is not
migratable.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Doug Gilbert <dgilbert@interlog.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Shiraz Hashim <shashim@codeaurora.org>
Cc: Hugh Dickins <hughd@google.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: syzkaller <syzkaller@googlegroups.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoproc: revert /proc/<pid>/maps [stack:TID] annotation
Johannes Weiner [Wed, 3 Feb 2016 00:57:29 +0000 (16:57 -0800)]
proc: revert /proc/<pid>/maps [stack:TID] annotation

Commit b76437579d13 ("procfs: mark thread stack correctly in
proc/<pid>/maps") added [stack:TID] annotation to /proc/<pid>/maps.

Finding the task of a stack VMA requires walking the entire thread list,
turning this into quadratic behavior: a thousand threads means a
thousand stacks, so the rendering of /proc/<pid>/maps needs to look at a
million combinations.

The cost is not in proportion to the usefulness as described in the
patch.

Drop the [stack:TID] annotation to make /proc/<pid>/maps (and
/proc/<pid>/numa_maps) usable again for higher thread counts.

The [stack] annotation inside /proc/<pid>/task/<tid>/maps is retained, as
identifying the stack VMA there is an O(1) operation.

Siddesh said:
 "The end users needed a way to identify thread stacks programmatically and
  there wasn't a way to do that.  I'm afraid I no longer remember (or have
  access to the resources that would aid my memory since I changed
  employers) the details of their requirement.  However, I did do this on my
  own time because I thought it was an interesting project for me and nobody
  really gave any feedback then as to its utility, so as far as I am
  concerned you could roll back the main thread maps information since the
  information is available in the thread-specific files"

Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: Siddhesh Poyarekar <siddhesh.poyarekar@gmail.com>
Cc: Shaohua Li <shli@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agonuma: fix /proc/<pid>/numa_maps for hugetlbfs on s390
Michael Holzheu [Wed, 3 Feb 2016 00:57:26 +0000 (16:57 -0800)]
numa: fix /proc/<pid>/numa_maps for hugetlbfs on s390

When working with hugetlbfs ptes (which are actually pmds) is not valid to
directly use pte functions like pte_present() because the hardware bit
layout of pmds and ptes can be different.  This is the case on s390.
Therefore we have to convert the hugetlbfs ptes first into a valid pte
encoding with huge_ptep_get().

Currently the /proc/<pid>/numa_maps code uses hugetlbfs ptes without
huge_ptep_get().  On s390 this leads to the following two problems:

1) The pte_present() function returns false (instead of true) for
   PROT_NONE hugetlb ptes. Therefore PROT_NONE vmas are missing
   completely in the "numa_maps" output.

2) The pte_dirty() function always returns false for all hugetlb ptes.
   Therefore these pages are reported as "mapped=xxx" instead of
   "dirty=xxx".

Therefore use huge_ptep_get() to correctly convert the hugetlb ptes.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: <stable@vger.kernel.org> [4.3+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMAINTAINERS: update Seth email
Seth Jennings [Wed, 3 Feb 2016 00:57:23 +0000 (16:57 -0800)]
MAINTAINERS: update Seth email

Update/unify my contact info.  The old email address will no longer work
soon.

Signed-off-by: Seth Jennings <sjenning@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoocfs2/cluster: fix memory leak in o2hb_region_release
Joseph Qi [Wed, 3 Feb 2016 00:57:21 +0000 (16:57 -0800)]
ocfs2/cluster: fix memory leak in o2hb_region_release

o2hb_region_release currently doesn't free o2hb_debug_buf
hr_db_elapsed_time and hr_db_pinned malloced in o2hb_debug_create.  Also
we should call debugfs_remove before freeing its data, to prevent the risk
accessing debugfs rightly after its data has been freed.

Signed-off-by: Joseph Qi <joseph.qi@huawei.com>
Reviewed-by: Jiufei Xue <xuejiufei@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agolib/test-string_helpers.c: fix and improve string_get_size() tests
Vitaly Kuznetsov [Wed, 3 Feb 2016 00:57:18 +0000 (16:57 -0800)]
lib/test-string_helpers.c: fix and improve string_get_size() tests

Recently added commit 564b026fbd0d ("string_helpers: fix precision loss
for some inputs") fixed precision issues for string_get_size() and broke
tests.

Fix and improve them: test both STRING_UNITS_2 and STRING_UNITS_10 at a
time, better failure reporting, test small an huge values.

Fixes: 564b026fbd0d28e9 ("string_helpers: fix precision loss for some inputs")
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: James Bottomley <JBottomley@Odin.com>
Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agothp: limit number of object to scan on deferred_split_scan()
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:15 +0000 (16:57 -0800)]
thp: limit number of object to scan on deferred_split_scan()

If we have a lot of pages in queue to be split, deferred_split_scan()
can spend unreasonable amount of time under spinlock with disabled
interrupts.

Let's cap number of pages to split on scan by sc->nr_to_scan.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agothp: change deferred_split_count() to return number of THP in queue
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:12 +0000 (16:57 -0800)]
thp: change deferred_split_count() to return number of THP in queue

I've got meaning of shrinker::count_objects() wrong: it should return
number of potentially freeable objects, which is not necessary correlate
with freeable memory.

Returning 256 per THP in queue is not reasonable:
shrinker::scan_objects() never called with nr_to_scan > 128 in my setup.

Let's return 1 per THP and correct scan_object accordingly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agothp: make split_queue per-node
Kirill A. Shutemov [Wed, 3 Feb 2016 00:57:08 +0000 (16:57 -0800)]
thp: make split_queue per-node

Andrea Arcangeli suggested to make split queue per-node to improve
scalability.  Let's do it.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Rik van Riel <riel@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.cz>
Cc: Jerome Marchand <jmarchan@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoperf hists browser: Add 'L' hotkey to change percent limit
Namhyung Kim [Wed, 3 Feb 2016 14:11:23 +0000 (23:11 +0900)]
perf hists browser: Add 'L' hotkey to change percent limit

Add 'L' key action to change the percent limit applied to both of hist
entries and callchains.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-4-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf report: Update documention of --percent-limit option
Namhyung Kim [Wed, 3 Feb 2016 14:11:21 +0000 (23:11 +0900)]
perf report: Update documention of --percent-limit option

The --percent-limit option was changed to be applied to callchains as
well as to hist entries recently, but it missed to update the doc.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-2-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf report: Update documentation of --sort option
Namhyung Kim [Wed, 3 Feb 2016 14:11:20 +0000 (23:11 +0900)]
perf report: Update documentation of --sort option

The description of the memory sort key (used by --mem-mode) was
misplaced.  Move it under the --sort option so that it can be referenced
properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/1454508683-5735-1-git-send-email-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Introduce hists__for_each_sort_list macro
Jiri Olsa [Mon, 18 Jan 2016 09:24:24 +0000 (10:24 +0100)]
perf hists: Introduce hists__for_each_sort_list macro

With the hist object having the perf_hpp_list we can now iterate sort
format entries based in the hists object. Adding
hists__for_each_sort_list macro to do that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-27-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Introduce hists__for_each_format macro
Jiri Olsa [Mon, 18 Jan 2016 09:24:23 +0000 (10:24 +0100)]
perf hists: Introduce hists__for_each_format macro

With the hist object having the perf_hpp_list we can now iterate output
format entries based in the hists object. Adding hists__for_each_format
macro to do that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-26-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add hpp_list into struct hists object
Jiri Olsa [Mon, 18 Jan 2016 09:24:22 +0000 (10:24 +0100)]
perf tools: Add hpp_list into struct hists object

Adding hpp_list into struct hists object.

Initializing struct hists_evsel hists object to carry global
perf_hpp_list list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-25-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Add struct perf_hpp_list argument to helper functions
Jiri Olsa [Mon, 18 Jan 2016 09:24:21 +0000 (10:24 +0100)]
perf hists: Add struct perf_hpp_list argument to helper functions

Adding struct perf_hpp_list argument to following helper functions:

  void perf_hpp__setup_output_field(struct perf_hpp_list *list);
  void perf_hpp__reset_output_field(struct perf_hpp_list *list);
  void perf_hpp__append_sort_keys(struct perf_hpp_list *list);

so they could be used on hists's hpp_list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-24-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Introduce perf_hpp_list__for_each_sort_list_safe macro
Jiri Olsa [Mon, 18 Jan 2016 09:24:20 +0000 (10:24 +0100)]
perf hists: Introduce perf_hpp_list__for_each_sort_list_safe macro

Introducing perf_hpp_list__for_each_sort_list_safe macro
to iterate perf_hpp_list object's sort entries safely.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-23-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Introduce perf_hpp_list__for_each_sort_list macro
Jiri Olsa [Mon, 18 Jan 2016 09:24:19 +0000 (10:24 +0100)]
perf hists: Introduce perf_hpp_list__for_each_sort_list macro

Introducing perf_hpp_list__for_each_sort_list macro to iterate
perf_hpp_list object's sort entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-22-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Introduce perf_hpp_list__for_each_format_safe macro
Jiri Olsa [Mon, 18 Jan 2016 09:24:18 +0000 (10:24 +0100)]
perf hists: Introduce perf_hpp_list__for_each_format_safe macro

Introducing perf_hpp_list__for_each_format_safe macro to iterate
perf_hpp_list object's output entries safely.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-21-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Introduce perf_hpp_list__for_each_format macro
Jiri Olsa [Mon, 18 Jan 2016 09:24:17 +0000 (10:24 +0100)]
perf hists: Introduce perf_hpp_list__for_each_format macro

Introducing perf_hpp_list__for_each_format macro to iterate
perf_hpp_list object's output entries.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-20-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf hists: Pass perf_hpp_list all the way through setup_output_list
Jiri Olsa [Mon, 18 Jan 2016 09:24:16 +0000 (10:24 +0100)]
perf hists: Pass perf_hpp_list all the way through setup_output_list

Passing perf_hpp_list all the way through setup_output_list so the
output entry could be added on the arbitrary list.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1453109064-1026-19-git-send-email-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>