Convert Intel Many Integrated Core architecture docs to ReST.
The conversion is trivial: just add title and literal block
markups, and adjust some identation.
At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
--- /dev/null
+:orphan:
+
+=============================================
+Intel Many Integrated Core (MIC) architecture
+=============================================
+
+.. toctree::
+ :maxdepth: 1
+
+ mic_overview
+ scif_overview
+
+.. only:: subproject and html
+
+ Indices
+ =======
+
+ * :ref:`genindex`
--- /dev/null
+======================================================
+Intel Many Integrated Core (MIC) architecture overview
+======================================================
+
+An Intel MIC X100 device is a PCIe form factor add-in coprocessor
+card based on the Intel Many Integrated Core (MIC) architecture
+that runs a Linux OS. It is a PCIe endpoint in a platform and therefore
+implements the three required standard address spaces i.e. configuration,
+memory and I/O. The host OS loads a device driver as is typical for
+PCIe devices. The card itself runs a bootstrap after reset that
+transfers control to the card OS downloaded from the host driver. The
+host driver supports OSPM suspend and resume operations. It shuts down
+the card during suspend and reboots the card OS during resume.
+The card OS as shipped by Intel is a Linux kernel with modifications
+for the X100 devices.
+
+Since it is a PCIe card, it does not have the ability to host hardware
+devices for networking, storage and console. We provide these devices
+on X100 coprocessors thus enabling a self-bootable equivalent
+environment for applications. A key benefit of our solution is that it
+leverages the standard virtio framework for network, disk and console
+devices, though in our case the virtio framework is used across a PCIe
+bus. A Virtio Over PCIe (VOP) driver allows creating user space
+backends or devices on the host which are used to probe virtio drivers
+for these devices on the MIC card. The existing VRINGH infrastructure
+in the kernel is used to access virtio rings from the host. The card
+VOP driver allows card virtio drivers to communicate with their user
+space backends on the host via a device page. Ring 3 apps on the host
+can add, remove and configure virtio devices. A thin MIC specific
+virtio_config_ops is implemented which is borrowed heavily from
+previous similar implementations in lguest and s390.
+
+MIC PCIe card has a dma controller with 8 channels. These channels are
+shared between the host s/w and the card s/w. 0 to 3 are used by host
+and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
+a virtual bus called mic bus is created and virtual dma devices are
+created on it by the host/card drivers. On host the channels are private
+and used only by the host driver to transfer data for the virtio devices.
+
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
+low level communications API across PCIe currently implemented for MIC.
+More details are available at scif_overview.txt.
+
+The Coprocessor State Management (COSM) driver on the host allows for
+boot, shutdown and reset of Intel MIC devices. It communicates with a COSM
+"client" driver on the MIC cards over SCIF to perform these functions.
+
+Here is a block diagram of the various components described above. The
+virtio backends are situated on the host rather than the card given better
+single threaded performance for the host compared to MIC, the ability of
+the host to initiate DMA's to/from the card using the MIC DMA engine and
+the fact that the virtio block storage backend can only be on the host::
+
+ +----------+ | +----------+
+ | Card OS | | | Host OS |
+ +----------+ | +----------+
+ |
+ +-------+ +--------+ +------+ | +---------+ +--------+ +--------+
+ | Virtio| |Virtio | |Virtio| | |Virtio | |Virtio | |Virtio |
+ | Net | |Console | |Block | | |Net | |Console | |Block |
+ | Driver| |Driver | |Driver| | |backend | |backend | |backend |
+ +---+---+ +---+----+ +--+---+ | +---------+ +----+---+ +--------+
+ | | | | | | |
+ | | | |User | | |
+ | | | |------|------------|--+------|-------
+ +---------+---------+ |Kernel |
+ | | |
+ +---------+ +---+----+ +------+ | +------+ +------+ +--+---+ +-------+
+ |MIC DMA | | VOP | | SCIF | | | SCIF | | COSM | | VOP | |MIC DMA|
+ +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+
+ | | | | | | |
+ +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+
+ |MIC | | VOP | |SCIF | | |SCIF | | COSM | | VOP | | MIC |
+ |HW Bus | | HW Bus| |HW Bus| | |HW Bus| | Bus | |HW Bus| |HW Bus |
+ +---------+ +--------+ +--+---+ | +--+---+ +------+ +------+ +-------+
+ | | | | | | |
+ | +-----------+--+ | | | +---------------+ |
+ | |Intel MIC | | | | |Intel MIC | |
+ | |Card Driver | | | | |Host Driver | |
+ +---+--------------+------+ | +----+---------------+-----+
+ | | |
+ +-------------------------------------------------------------+
+ | |
+ | PCIe Bus |
+ +-------------------------------------------------------------+
+++ /dev/null
-An Intel MIC X100 device is a PCIe form factor add-in coprocessor
-card based on the Intel Many Integrated Core (MIC) architecture
-that runs a Linux OS. It is a PCIe endpoint in a platform and therefore
-implements the three required standard address spaces i.e. configuration,
-memory and I/O. The host OS loads a device driver as is typical for
-PCIe devices. The card itself runs a bootstrap after reset that
-transfers control to the card OS downloaded from the host driver. The
-host driver supports OSPM suspend and resume operations. It shuts down
-the card during suspend and reboots the card OS during resume.
-The card OS as shipped by Intel is a Linux kernel with modifications
-for the X100 devices.
-
-Since it is a PCIe card, it does not have the ability to host hardware
-devices for networking, storage and console. We provide these devices
-on X100 coprocessors thus enabling a self-bootable equivalent
-environment for applications. A key benefit of our solution is that it
-leverages the standard virtio framework for network, disk and console
-devices, though in our case the virtio framework is used across a PCIe
-bus. A Virtio Over PCIe (VOP) driver allows creating user space
-backends or devices on the host which are used to probe virtio drivers
-for these devices on the MIC card. The existing VRINGH infrastructure
-in the kernel is used to access virtio rings from the host. The card
-VOP driver allows card virtio drivers to communicate with their user
-space backends on the host via a device page. Ring 3 apps on the host
-can add, remove and configure virtio devices. A thin MIC specific
-virtio_config_ops is implemented which is borrowed heavily from
-previous similar implementations in lguest and s390.
-
-MIC PCIe card has a dma controller with 8 channels. These channels are
-shared between the host s/w and the card s/w. 0 to 3 are used by host
-and 4 to 7 by card. As the dma device doesn't show up as PCIe device,
-a virtual bus called mic bus is created and virtual dma devices are
-created on it by the host/card drivers. On host the channels are private
-and used only by the host driver to transfer data for the virtio devices.
-
-The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a
-low level communications API across PCIe currently implemented for MIC.
-More details are available at scif_overview.txt.
-
-The Coprocessor State Management (COSM) driver on the host allows for
-boot, shutdown and reset of Intel MIC devices. It communicates with a COSM
-"client" driver on the MIC cards over SCIF to perform these functions.
-
-Here is a block diagram of the various components described above. The
-virtio backends are situated on the host rather than the card given better
-single threaded performance for the host compared to MIC, the ability of
-the host to initiate DMA's to/from the card using the MIC DMA engine and
-the fact that the virtio block storage backend can only be on the host.
-
- +----------+ | +----------+
- | Card OS | | | Host OS |
- +----------+ | +----------+
- |
- +-------+ +--------+ +------+ | +---------+ +--------+ +--------+
- | Virtio| |Virtio | |Virtio| | |Virtio | |Virtio | |Virtio |
- | Net | |Console | |Block | | |Net | |Console | |Block |
- | Driver| |Driver | |Driver| | |backend | |backend | |backend |
- +---+---+ +---+----+ +--+---+ | +---------+ +----+---+ +--------+
- | | | | | | |
- | | | |User | | |
- | | | |------|------------|--+------|-------
- +---------+---------+ |Kernel |
- | | |
- +---------+ +---+----+ +------+ | +------+ +------+ +--+---+ +-------+
- |MIC DMA | | VOP | | SCIF | | | SCIF | | COSM | | VOP | |MIC DMA|
- +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+
- | | | | | | |
- +---+-----+ +---+----+ +--+---+ | +--+---+ +--+---+ +------+ +----+--+
- |MIC | | VOP | |SCIF | | |SCIF | | COSM | | VOP | | MIC |
- |HW Bus | | HW Bus| |HW Bus| | |HW Bus| | Bus | |HW Bus| |HW Bus |
- +---------+ +--------+ +--+---+ | +--+---+ +------+ +------+ +-------+
- | | | | | | |
- | +-----------+--+ | | | +---------------+ |
- | |Intel MIC | | | | |Intel MIC | |
- | |Card Driver | | | | |Host Driver | |
- +---+--------------+------+ | +----+---------------+-----+
- | | |
- +-------------------------------------------------------------+
- | |
- | PCIe Bus |
- +-------------------------------------------------------------+
--- /dev/null
+========================================
+Symmetric Communication Interface (SCIF)
+========================================
+
+The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
+level communications API across PCIe currently implemented for MIC. Currently
+SCIF provides inter-node communication within a single host platform, where a
+node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
+communicating over the PCIe bus while providing an API that is symmetric
+across all the nodes in the PCIe network. An important design objective for SCIF
+is to deliver the maximum possible performance given the communication
+abilities of the hardware. SCIF has been used to implement an offload compiler
+runtime and OFED support for MPI implementations for MIC coprocessors.
+
+SCIF API Components
+===================
+
+The SCIF API has the following parts:
+
+1. Connection establishment using a client server model
+2. Byte stream messaging intended for short messages
+3. Node enumeration to determine online nodes
+4. Poll semantics for detection of incoming connections and messages
+5. Memory registration to pin down pages
+6. Remote memory mapping for low latency CPU accesses via mmap
+7. Remote DMA (RDMA) for high bandwidth DMA transfers
+8. Fence APIs for RDMA synchronization
+
+SCIF exposes the notion of a connection which can be used by peer processes on
+nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
+process in a SCIF node initiates a SCIF connection to a peer process on a
+different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
+which are similar to connection oriented socket APIs. Connected SCIF endpoints
+can also register local memory which is followed by data transfer using either
+DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
+kernel mode clients which are functionally equivalent.
+
+SCIF Performance for MIC
+========================
+
+DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus
+SCIF shows the performance advantages of SCIF for HPC applications and
+runtimes::
+
+ Comparison of TCP and SCIF based BW
+
+ Throughput (GB/sec)
+ 8 + PCIe Bandwidth ******
+ + TCP ######
+ 7 + ************************************** SCIF %%%%%%
+ | %%%%%%%%%%%%%%%%%%%
+ 6 + %%%%
+ | %%
+ | %%%
+ 5 + %%
+ | %%
+ 4 + %%
+ | %%
+ 3 + %%
+ | %
+ 2 + %%
+ | %%
+ | %
+ 1 +
+ + ######################################
+ 0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
+ 1 10 100 1000 10000 100000
+ Transfer Size (KBytes)
+
+SCIF allows memory sharing via mmap(..) between processes on different PCIe
+nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
+latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
+
+SCIF has a user space library which is a thin IOCTL wrapper providing a user
+space API similar to the kernel API in scif.h. The SCIF user space library
+is distributed @ https://software.intel.com/en-us/mic-developer
+
+Here is some pseudo code for an example of how two applications on two PCIe
+nodes would typically use the SCIF API::
+
+ Process A (on node A) Process B (on node B)
+
+ /* get online node information */
+ scif_get_node_ids(..) scif_get_node_ids(..)
+ scif_open(..) scif_open(..)
+ scif_bind(..) scif_bind(..)
+ scif_listen(..)
+ scif_accept(..) scif_connect(..)
+ /* SCIF connection established */
+
+ /* Send and receive short messages */
+ scif_send(..)/scif_recv(..) scif_send(..)/scif_recv(..)
+
+ /* Register memory */
+ scif_register(..) scif_register(..)
+
+ /* RDMA */
+ scif_readfrom(..)/scif_writeto(..) scif_readfrom(..)/scif_writeto(..)
+
+ /* Fence DMAs */
+ scif_fence_signal(..) scif_fence_signal(..)
+
+ mmap(..) mmap(..)
+
+ /* Access remote registered memory */
+
+ /* Close the endpoints */
+ scif_close(..) scif_close(..)
+++ /dev/null
-The Symmetric Communication Interface (SCIF (pronounced as skiff)) is a low
-level communications API across PCIe currently implemented for MIC. Currently
-SCIF provides inter-node communication within a single host platform, where a
-node is a MIC Coprocessor or Xeon based host. SCIF abstracts the details of
-communicating over the PCIe bus while providing an API that is symmetric
-across all the nodes in the PCIe network. An important design objective for SCIF
-is to deliver the maximum possible performance given the communication
-abilities of the hardware. SCIF has been used to implement an offload compiler
-runtime and OFED support for MPI implementations for MIC coprocessors.
-
-==== SCIF API Components ====
-The SCIF API has the following parts:
-1. Connection establishment using a client server model
-2. Byte stream messaging intended for short messages
-3. Node enumeration to determine online nodes
-4. Poll semantics for detection of incoming connections and messages
-5. Memory registration to pin down pages
-6. Remote memory mapping for low latency CPU accesses via mmap
-7. Remote DMA (RDMA) for high bandwidth DMA transfers
-8. Fence APIs for RDMA synchronization
-
-SCIF exposes the notion of a connection which can be used by peer processes on
-nodes in a SCIF PCIe "network" to share memory "windows" and to communicate. A
-process in a SCIF node initiates a SCIF connection to a peer process on a
-different node via a SCIF "endpoint". SCIF endpoints support messaging APIs
-which are similar to connection oriented socket APIs. Connected SCIF endpoints
-can also register local memory which is followed by data transfer using either
-DMA, CPU copies or remote memory mapping via mmap. SCIF supports both user and
-kernel mode clients which are functionally equivalent.
-
-==== SCIF Performance for MIC ====
-DMA bandwidth comparison between the TCP (over ethernet over PCIe) stack versus
-SCIF shows the performance advantages of SCIF for HPC applications and runtimes.
-
- Comparison of TCP and SCIF based BW
-
- Throughput (GB/sec)
- 8 + PCIe Bandwidth ******
- + TCP ######
- 7 + ************************************** SCIF %%%%%%
- | %%%%%%%%%%%%%%%%%%%
- 6 + %%%%
- | %%
- | %%%
- 5 + %%
- | %%
- 4 + %%
- | %%
- 3 + %%
- | %
- 2 + %%
- | %%
- | %
- 1 +
- + ######################################
- 0 +++---+++--+--+-+--+--+-++-+--+-++-+--+-++-+-
- 1 10 100 1000 10000 100000
- Transfer Size (KBytes)
-
-SCIF allows memory sharing via mmap(..) between processes on different PCIe
-nodes and thus provides bare-metal PCIe latency. The round trip SCIF mmap
-latency from the host to an x100 MIC for an 8 byte message is 0.44 usecs.
-
-SCIF has a user space library which is a thin IOCTL wrapper providing a user
-space API similar to the kernel API in scif.h. The SCIF user space library
-is distributed @ https://software.intel.com/en-us/mic-developer
-
-Here is some pseudo code for an example of how two applications on two PCIe
-nodes would typically use the SCIF API:
-
-Process A (on node A) Process B (on node B)
-
-/* get online node information */
-scif_get_node_ids(..) scif_get_node_ids(..)
-scif_open(..) scif_open(..)
-scif_bind(..) scif_bind(..)
-scif_listen(..)
-scif_accept(..) scif_connect(..)
-/* SCIF connection established */
-
-/* Send and receive short messages */
-scif_send(..)/scif_recv(..) scif_send(..)/scif_recv(..)
-
-/* Register memory */
-scif_register(..) scif_register(..)
-
-/* RDMA */
-scif_readfrom(..)/scif_writeto(..) scif_readfrom(..)/scif_writeto(..)
-
-/* Fence DMAs */
-scif_fence_signal(..) scif_fence_signal(..)
-
-mmap(..) mmap(..)
-
-/* Access remote registered memory */
-
-/* Close the endpoints */
-scif_close(..) scif_close(..)