ARC: udelay: fix inline assembler by adding LP_COUNT to clobber list
authorVineet Gupta <vgupta@synopsys.com>
Tue, 24 Jan 2017 18:23:42 +0000 (10:23 -0800)
committerVineet Gupta <vgupta@synopsys.com>
Tue, 24 Jan 2017 18:54:24 +0000 (10:54 -0800)
commit 3c7c7a2fc8811bc ("ARC: Don't use "+l" inline asm constraint")
modified the inline assembly to setup LP_COUNT register manually and NOT
rely on gcc to do it (with the +l inline assembler contraint hint, now
being retired in the compiler)

However the fix was flawed as we didn't add LP_COUNT to asm clobber list,
meaning gcc doesn't know that LP_COUNT or zero-delay-loops are in action
in the inline asm.

This resulted in some fun - as nested ZOL loops were being generared

| mov lp_count,250000 ;16 # tmp235,
| lp .L__GCC__LP14 # <======= OUTER LOOP (gcc generated)
|   .L14:
|   ld r2, [r5] # MEM[(volatile u32 *)prephitmp_43], w
|   dmb 1
|   breq r2, -1, @.L21 #, w,,
|   bbit0 r2,1,@.L13 # w,,
|   ld r4,[r7] ;25 # loops_per_jiffy, loops_per_jiffy
|   mpymu r3,r4,r6 #, loops_per_jiffy, tmp234
|
|   mov lp_count, r3 #  <====== INNER LOOP (from inline asm)
|   lp 1f
|   nop
|   1:
|   nop_s
| .L__GCC__LP14: ; loop end, start is @.L14 #,

This caused issues with drivers relying on sane behaviour of udelay
friends.

With LP_COUNT added to clobber list, gcc doesn't generate the outer
loop in say above case.

Addresses STAR 9001146134

Reported-by: Joao Pinto <jpinto@synopsys.com>
Fixes: 3c7c7a2fc8811bc ("ARC: Don't use "+l" inline asm constraint")
Cc: stable@vger.kernel.org
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
arch/arc/include/asm/delay.h

index a36e8601114d2ca2970f257f9f8d7e5c03ec08a2..d5da2115d78a678e343da2abec51f6c8efbbe0a4 100644 (file)
@@ -26,7 +26,9 @@ static inline void __delay(unsigned long loops)
        "       lp  1f                  \n"
        "       nop                     \n"
        "1:                             \n"
-       : : "r"(loops));
+       :
+        : "r"(loops)
+        : "lp_count");
 }
 
 extern void __bad_udelay(void);