Add new column that normalizes counter results by
'nanoseconds spent running' unit.
Before:
Performance counter stats for '/home/mingo/hackbench':
10469.403605 task clock ticks (msecs)
75502 context switches (events)
9501 CPU migrations (events)
36158 pagefaults (events)
31975676185 CPU cycles (events)
26257738659 instructions (events)
108740581 cache references (events)
54606088 cache misses (events)
Wall-clock time elapsed: 810.514504 msecs
After:
Performance counter stats for '/home/mingo/hackbench':
10469.403605 task clock ticks (msecs)
75502 context switches # 0.007 M/sec
9501 CPU migrations # 0.001 M/sec
36158 pagefaults # 0.003 M/sec
31975676185 CPU cycles # 3054.202 M/sec
26257738659 instructions # 2508.045 M/sec
108740581 cache references # 10.387 M/sec
54606088 cache misses # 5.216 M/sec
Wall-clock time elapsed: 810.514504 msecs
The advantage of that column is that it is characteristic of the
execution workflow, regardless of runtime. Hence 'hackbench 10'
will look similar to 'hackbench 15' - while the absolute counter
values are very different.
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Corey Ashford <cjashfor@linux.vnet.ibm.com>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Kacur <jkacur@redhat.com>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
static __u64 event_res[MAX_COUNTERS][3];
static __u64 event_scaled[MAX_COUNTERS];
+static __u64 runtime_nsecs;
+
static void create_perfstat_counter(int counter)
{
struct perf_counter_hw_event hw_event;
((double)count[0] * count[1] / count[2] + 0.5);
}
}
+ /*
+ * Save the full runtime - to allow normalization during printout:
+ */
+ if (event_id[counter] == EID(PERF_TYPE_SOFTWARE, PERF_COUNT_TASK_CLOCK))
+ runtime_nsecs = count[0];
}
/*
fprintf(stderr, " %14.6f %-20s (msecs)",
msecs, event_name(counter));
} else {
- fprintf(stderr, " %14Ld %-20s (events)",
+ fprintf(stderr, " %14Ld %-20s",
count[0], event_name(counter));
+ if (runtime_nsecs)
+ fprintf(stderr, " # %12.3f M/sec",
+ (double)count[0]/runtime_nsecs*1000.0);
}
if (scaled)
fprintf(stderr, " (scaled from %.2f%%)",