kvm: ioapic: Restrict lazy EOI update to edge-triggered interrupts
authorPaolo Bonzini <pbonzini@redhat.com>
Mon, 4 May 2020 16:19:45 +0000 (12:19 -0400)
committerPaolo Bonzini <pbonzini@redhat.com>
Mon, 4 May 2020 16:29:05 +0000 (12:29 -0400)
Commit f458d039db7e ("kvm: ioapic: Lazy update IOAPIC EOI") introduces
the following infinite loop:

BUG: stack guard page was hit at 000000008f595917 \
(stack is 00000000bdefe5a4..00000000ae2b06f5)
kernel stack overflow (double-fault): 0000 [#1] SMP NOPTI
RIP: 0010:kvm_set_irq+0x51/0x160 [kvm]
Call Trace:
 irqfd_resampler_ack+0x32/0x90 [kvm]
 kvm_notify_acked_irq+0x62/0xd0 [kvm]
 kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm]
 ioapic_set_irq+0x20e/0x240 [kvm]
 kvm_ioapic_set_irq+0x5c/0x80 [kvm]
 kvm_set_irq+0xbb/0x160 [kvm]
 ? kvm_hv_set_sint+0x20/0x20 [kvm]
 irqfd_resampler_ack+0x32/0x90 [kvm]
 kvm_notify_acked_irq+0x62/0xd0 [kvm]
 kvm_ioapic_update_eoi_one.isra.0+0x30/0x120 [kvm]
 ioapic_set_irq+0x20e/0x240 [kvm]
 kvm_ioapic_set_irq+0x5c/0x80 [kvm]
 kvm_set_irq+0xbb/0x160 [kvm]
 ? kvm_hv_set_sint+0x20/0x20 [kvm]
....

The re-entrancy happens because the irq state is the OR of
the interrupt state and the resamplefd state.  That is, we don't
want to show the state as 0 until we've had a chance to set the
resamplefd.  But if the interrupt has _not_ gone low then
ioapic_set_irq is invoked again, causing an infinite loop.

This can only happen for a level-triggered interrupt, otherwise
irqfd_inject would immediately set the KVM_USERSPACE_IRQ_SOURCE_ID high
and then low.  Fortunately, in the case of level-triggered interrupts the VMEXIT already happens because
TMR is set.  Thus, fix the bug by restricting the lazy invocation
of the ack notifier to edge-triggered interrupts, the only ones that
need it.

Tested-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Reported-by: borisvk@bstnet.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Link: https://www.spinics.net/lists/kvm/msg213512.html
Fixes: f458d039db7e ("kvm: ioapic: Lazy update IOAPIC EOI")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207489
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/ioapic.c

index 750ff0b294047cf83b2c28f72cfbd057375fc7b1..d057376bd3d33ceb24155d6b2d19a7e5fa18a71a 100644 (file)
@@ -225,12 +225,12 @@ static int ioapic_set_irq(struct kvm_ioapic *ioapic, unsigned int irq,
        }
 
        /*
-        * AMD SVM AVIC accelerate EOI write and do not trap,
-        * in-kernel IOAPIC will not be able to receive the EOI.
-        * In this case, we do lazy update of the pending EOI when
-        * trying to set IOAPIC irq.
+        * AMD SVM AVIC accelerate EOI write iff the interrupt is edge
+        * triggered, in which case the in-kernel IOAPIC will not be able
+        * to receive the EOI.  In this case, we do a lazy update of the
+        * pending EOI when trying to set IOAPIC irq.
         */
-       if (kvm_apicv_activated(ioapic->kvm))
+       if (edge && kvm_apicv_activated(ioapic->kvm))
                ioapic_lazy_update_eoi(ioapic, irq);
 
        /*