From: Kirill Smelkov Date: Fri, 2 Nov 2012 12:10:32 +0000 (-0300) Subject: [media] vivi: Move computations out of vivi_fillbuf linecopy loop X-Git-Url: http://git.lede-project.org./?a=commitdiff_plain;h=13908f330f91996b6d59355aa7860553dd1093d5;p=openwrt%2Fstaging%2Fblogic.git [media] vivi: Move computations out of vivi_fillbuf linecopy loop The "dev->mvcount % wmax" thing was showing high in profiles (we do it for each line which ~ 500 per frame) ? 000010c0 : ... 0,39 ? 70:???mov 0x3ff4(%edi),%esi 0,22 ? 76:? mov 0x2a0(%edi),%eax 0,30 ? ? mov -0x84(%ebp),%ebx 0,35 ? ? mov %eax,%edx 0,04 ? ? mov -0x7c(%ebp),%ecx 0,35 ? ? sar $0x1f,%edx 0,44 ? ? idivl -0x7c(%ebp) 21,68 ? ? imul %esi,%ecx 0,70 ? ? imul %esi,%ebx 0,52 ? ? add -0x88(%ebp),%ebx 1,65 ? ? mov %ebx,%eax 0,22 ? ? imul %edx,%esi 0,04 ? ? lea 0x3f4(%edi,%esi,1),%edx 2,18 ? ?? call vivi_fillbuff+0xa6 0,74 ? ? addl $0x1,-0x80(%ebp) 62,69 ? ? mov -0x7c(%ebp),%edx 1,18 ? ? mov -0x80(%ebp),%ecx 0,35 ? ? add %edx,-0x84(%ebp) 0,61 ? ? cmp %ecx,-0x8c(%ebp) 0,22 ? ???jne 70 so since all variables stay the same for all iterations let's move computations out of the loop: the abovementioned division and "width*pixelsize" too before: # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20 # # Samples: 49K of event 'cycles' # Event count (approx.): 16475832370 # # Overhead Command Shared Object # ........ ............... ...................... # 29.07% rawv libc-2.13.so [.] __memcpy_ssse3 20.57% vivi-* [kernel.kallsyms] [k] memcpy 10.20% Xorg [unknown] [.] 0xa7301494 5.16% vivi-* [vivi] [k] gen_text.constprop.6 4.43% rawv [vivi] [k] gen_twopix 4.36% vivi-* [vivi] [k] vivi_fillbuff 2.42% rawv [vivi] [k] precalculate_line 1.33% swapper [kernel.kallsyms] [k] read_hpet after: # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20 # # Samples: 46K of event 'cycles' # Event count (approx.): 15574200568 # # Overhead Command Shared Object # ........ ............... .................... # 27.99% rawv libc-2.13.so [.] __memcpy_ssse3 23.29% vivi-* [kernel.kallsyms] [k] memcpy 10.30% Xorg [unknown] [.] 0xa75c98f8 5.34% vivi-* [vivi] [k] gen_text.constprop.6 4.61% rawv [vivi] [k] gen_twopix 2.64% rawv [vivi] [k] precalculate_line 1.37% swapper [kernel.kallsyms] [k] read_hpet 0.79% Xorg [kernel.kallsyms] [k] read_hpet 0.64% Xorg [kernel.kallsyms] [k] unix_poll 0.45% Xorg [kernel.kallsyms] [k] fget_light 0.43% rawv libxcb.so.1.1.0 [.] 0x0000aae9 0.40% runsv [kernel.kallsyms] [k] ext2_try_to_allocate 0.36% Xorg [kernel.kallsyms] [k] _raw_spin_lock_irqsave 0.31% vivi-* [vivi] [k] vivi_fillbuff (i.e. vivi_fillbuff own overhead is almost gone) Signed-off-by: Kirill Smelkov Acked-by: Hans Verkuil Signed-off-by: Mauro Carvalho Chehab --- diff --git a/drivers/media/platform/vivi.c b/drivers/media/platform/vivi.c index a3090f0312c1..ec832e9461a8 100644 --- a/drivers/media/platform/vivi.c +++ b/drivers/media/platform/vivi.c @@ -574,21 +574,22 @@ static void gen_text(struct vivi_dev *dev, char *basep, static void vivi_fillbuff(struct vivi_dev *dev, struct vivi_buffer *buf) { - int wmax = dev->width; + int stride = dev->width * dev->pixelsize; int hmax = dev->height; void *vbuf = vb2_plane_vaddr(&buf->vb, 0); unsigned ms; char str[100]; int h, line = 1; + u8 *linestart; s32 gain; if (!vbuf) return; + linestart = dev->line + (dev->mv_count % dev->width) * dev->pixelsize; + for (h = 0; h < hmax; h++) - memcpy(vbuf + h * wmax * dev->pixelsize, - dev->line + (dev->mv_count % wmax) * dev->pixelsize, - wmax * dev->pixelsize); + memcpy(vbuf + h * stride, linestart, stride); /* Updates stream time */