powerpc: Fix races with irq_work
author Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tue, 14 Jan 2014 06:11:39 +0000 (17:11 +1100)
committer Benjamin Herrenschmidt <benh@kernel.crashing.org>
Wed, 15 Jan 2014 02:59:03 +0000 (13:59 +1100)

If we queue irq_work on a processor and then change the decrementer
value before that irq work has had a chance to be processed, we can
seriously delay the handling of that irq_work.

Fix it by checking for pending irq work in a few places: before
changing the decrementer in decrementer_set_next_event(), and after
changing it, both in that function and in timer_interrupt().

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
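
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index afb1b56ef4fa6481f940e85130e17b4efc585023..b3dab20acf34abe126e244d34a7dee39021457de 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c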
@@ -536,6 +536,9 @@ void timer_interrupt(struct pt_regs * regs)
                now = *next_tb - now;
                if (now <= DECREMENTER_MAX)
                        set_dec((int)now);
+               /* We may have raced with new irq work */
+               if (test_irq_work_pending())
+                       set_dec(1);
                __get_cpu_var(irq_stat).timer_irqs_others++;
        }
 
@@ -802,8 +805,16 @@ static void __init clocksource_init(void)
 static int decrementer_set_next_event(unsigned long evt,
                                      struct clock_event_device *dev)
 {
+       /* Don't adjust the decrementer if some irq work is pending */
+       if (test_irq_work_pending())
+               return 0;
        __get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
        set_dec(evt);
+
+       /* We may have raced with new irq work */
+       if (test_irq_work_pending())
+               set_dec(1);
+
        return 0;
 }
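
The race and the two checks added to decrementer_set_next_event() can be
illustrated with a small user-space model. This is only a sketch and not
kernel code: struct sim_cpu, sim_raise_irq_work() and the
sim_set_next_event_*() functions are hypothetical names invented for the
illustration, and it assumes that raising irq work sets a pending flag and
programs the decrementer to fire almost immediately. The
work_arrives_midway parameter stands in for irq work being queued between
the first check and the reprogramming.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one CPU; these fields are NOT kernel structures. */
struct sim_cpu {
	long dec;              /* ticks until the next decrementer interrupt */
	bool irq_work_pending; /* stands in for test_irq_work_pending() */
};

/* Model of queuing irq work: set the flag and request an early interrupt. */
void sim_raise_irq_work(struct sim_cpu *cpu)
{
	cpu->irq_work_pending = true;
	cpu->dec = 1;
}

/* Pre-patch behaviour: always reprogram, clobbering any earlier dec = 1. */
void sim_set_next_event_old(struct sim_cpu *cpu, long evt)
{
	assert(evt > 0);
	cpu->dec = evt;
}

/*
 * Patched behaviour. work_arrives_midway simulates irq work being queued
 * after the first check but before the decrementer is reprogrammed.
 */
void sim_set_next_event_new(struct sim_cpu *cpu, long evt,
			    bool work_arrives_midway)
{
	assert(evt > 0);

	/* Don't adjust the decrementer if some irq work is pending */
	if (cpu->irq_work_pending)
		return;

	if (work_arrives_midway)
		sim_raise_irq_work(cpu);

	cpu->dec = evt;

	/* We may have raced with new irq work */
	if (cpu->irq_work_pending)
		cpu->dec = 1;
}
```

In this model, calling sim_raise_irq_work() followed by
sim_set_next_event_old(&cpu, 1000) leaves cpu.dec at 1000, so the pending
work would wait for the full timer period. With sim_set_next_event_new()
cpu.dec stays at 1, both when the work was queued beforehand and when it
arrives midway.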