perf cs-etm: Correct synthesizing instruction samples
When 'etm->instructions_sample_period' is less than
'tidq->period_instructions', the function cs_etm__sample() cannot handle
this case properly with its logic.
Let's see below flow as an example:
- If we set itrace option '--itrace=i4', then function cs_etm__sample()
has variables with initialized values:
tidq->period_instructions = 0
etm->instructions_sample_period = 4
- When the first packet is coming:
packet->instr_count = 10; the number of instructions executed in this
packet is 10, thus update period_instructions as below:
tidq->period_instructions = 0 + 10 = 10
instrs_over = 10 - 4 = 6
offset = 10 - 6 - 1 = 3
tidq->period_instructions = instrs_over = 6
- When the second packet is coming:
packet->instr_count = 10; in the second pass, assume 10 instructions
in the trace sample again:
tidq->period_instructions = 6 + 10 = 16
instrs_over = 16 - 4 = 12
offset = 10 - 12 - 1 = -3 -> the negative value
tidq->period_instructions = instrs_over = 12
So after handle these two packets, there have below issues:
The first issue is that cs_etm__instr_addr() returns the address within
the current trace sample of the instruction related to offset, so the
offset is supposed to be always unsigned value. But in fact, function
cs_etm__sample() might calculate a negative offset value (in handling
the second packet, the offset is -3) and pass to cs_etm__instr_addr()
with u64 type with a big positive integer.
The second issue is it only synthesizes 2 samples for sample period = 4.
In theory, every packet has 10 instructions so the two packets have
total 20 instructions, 20 instructions should generate 5 samples
(4 x 5 = 20). This is because cs_etm__sample() only calls once
cs_etm__synth_instruction_sample() to generate instruction sample per
range packet.
This patch fixes the logic in function cs_etm__sample(); the basic
idea for handling coming packet is:
- To synthesize the first instruction sample, it combines the left
instructions from the previous packet and the head of the new
packet; then generate continuous samples with sample period;
- At the tail of the new packet, if it has the rest instructions,
these instructions will be left for the sequential sample.
Suggested-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Walker <robert.walker@arm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight ml <coresight@lists.linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200219021811.20067-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>