Initial commit of Sphinx docs

Change-Id: I9fca8fb98502dffc2555f9de7f507b6f006e0e77
Signed-off-by: John DeNisco <jdenisco@cisco.com>
diff --git a/docs/overview/whatisvpp/dataplane.rst b/docs/overview/whatisvpp/dataplane.rst
new file mode 100644
index 0000000..9113038
--- /dev/null
+++ b/docs/overview/whatisvpp/dataplane.rst
@@ -0,0 +1,34 @@
+.. _packet-processing:
+
+=================
+Packet Processing
+=================
+
+* Layer 2 - 4 Network Stack
+
+  * Fast lookup tables for routes, bridge entries
+  * Arbitrary n-tuple classifiers 
+  * Control Plane, Traffic Management and Overlays
+ 
+* `Linux <https://en.wikipedia.org/wiki/Linux>`_ and `FreeBSD <https://en.wikipedia.org/wiki/FreeBSD>`_ support
+
+  * Wide support for standard Operating System Interfaces such as AF_Packet, Tun/Tap & Netmap.
+
+* Wide network and cryptographic hardware support with `DPDK <https://www.dpdk.org/>`_.
+* Container and Virtualization support
+
+  * Para-virtualized interfaces; Vhost and Virtio
+  * Network Adapters over PCI passthrough
+  * Native container interfaces; MemIF
+  
+* Universal Data Plane: one code base, for many use cases
+ 
+  * Discrete appliances; such as `Routers <https://en.wikipedia.org/wiki/Router_(computing)>`_ and `Switches <https://en.wikipedia.org/wiki/Network_switch>`_.
+  * `Cloud Infrastructure and Virtual Network Functions <https://en.wikipedia.org/wiki/Network_function_virtualization>`_
+  * `Cloud Native Infrastructure <https://www.cncf.io/>`_
+  * The same binary package for all use cases. 
+
+* Out-of-the-box production quality, thanks to `CSIT <https://wiki.fd.io/view/CSIT#Start_Here>`_.
+
+For the complete list of features, see :ref:`features`.
+
diff --git a/docs/overview/whatisvpp/developer.rst b/docs/overview/whatisvpp/developer.rst
new file mode 100644
index 0000000..a0bb2d4
--- /dev/null
+++ b/docs/overview/whatisvpp/developer.rst
@@ -0,0 +1,24 @@
+.. _developer-friendly:
+
+==================
+Developer Friendly
+==================
+
+* Extensive runtime counters; throughput, `instructions per cycle <https://en.wikipedia.org/wiki/Instructions_per_cycle>`_, errors, events, etc.
+* Integrated pipeline tracing facilities
+* Multi-language API bindings
+* Integrated command line for debugging
+* Fault-tolerant and upgradable
+
+  * Runs as a standard user-space process for fault tolerance; software crashes seldom require more than a process restart.
+  * Improved fault tolerance and upgradability when compared to running similar packet processing in the kernel; software updates never require system reboots.
+  * Development experience is easier than with similar kernel code
+  * Hardware isolation and protection (`IOMMU <https://en.wikipedia.org/wiki/Input%E2%80%93output_memory_management_unit>`_)
+
+* Built for security
+
+  * Extensive white-box testing
+  * Image segment base address randomization
+  * Shared-memory segment base address randomization
+  * Stack bounds checking
+  * Static analysis with `Coverity <https://en.wikipedia.org/wiki/Coverity>`_
diff --git a/docs/overview/whatisvpp/extensible.rst b/docs/overview/whatisvpp/extensible.rst
new file mode 100644
index 0000000..c271dad
--- /dev/null
+++ b/docs/overview/whatisvpp/extensible.rst
@@ -0,0 +1,39 @@
+.. _extensible:
+
+=============================
+Extensible and Modular Design
+=============================
+
+* Pluggable, easy to understand & extend
+* Mature graph node architecture
+* Full control to reorganize the pipeline
+* Fast, plugins are equal citizens
+
+**Modular, Flexible, and Extensible**
+
+The FD.io VPP packet processing pipeline is decomposed into a ‘packet processing
+graph’.  This modular approach means that anyone can plug in new graph
+nodes. This makes VPP easily extensible and means that plugins can be
+customized for specific purposes. VPP is also configurable through its
+Low-Level API.
+
+.. figure:: /_images/VPP_custom_application_packet_processing_graph.280.jpg
+   :alt: Extensible, modular graph node architecture
+   
+   Extensible and modular graph node architecture. 
+
+At runtime, the FD.io VPP platform assembles a vector of packets from RX rings,
+typically up to 256 packets in a single vector. The packet processing graph is
+then applied, node by node (including plugins), to the entire packet vector. The
+received packets traverse the packet processing graph nodes together as a
+vector: the network processing represented by each graph node is applied to
+each packet in turn.  Graph nodes are small, modular, and loosely
+coupled. This makes it easy to introduce new graph nodes and rewire existing
+graph nodes.
+
+Plugins are `shared libraries <https://en.wikipedia.org/wiki/Library_(computing)>`_
+and are loaded at runtime by VPP. VPP finds plugins by searching the plugin path
+for libraries, and then dynamically loads each one in turn on startup.
+A plugin can introduce new graph nodes or rearrange the packet processing graph. 
+You can build a plugin completely independently of the FD.io VPP source tree,
+which means you can treat it as an independent component.
diff --git a/docs/overview/whatisvpp/fast.rst b/docs/overview/whatisvpp/fast.rst
new file mode 100644
index 0000000..b04c12f
--- /dev/null
+++ b/docs/overview/whatisvpp/fast.rst
@@ -0,0 +1,16 @@
+.. _fast:
+
+================================
+Fast, Scalable and Deterministic
+================================
+
+* `Continuous integration and system testing <https://wiki.fd.io/view/CSIT#Start_Here>`_
+
+  * Including continuous and extensive latency and throughput testing
+
+* Layer 2 Cross Connect (L2XC) typically achieves 15+ Mpps per core.
+* Tested to achieve **zero** packet drops and ~15µs latency.
+* Performance scales linearly with core/thread count
+* Supports millions of concurrent lookup table entries
+
+Please see :ref:`performance` for more information.
diff --git a/docs/overview/whatisvpp/index.rst b/docs/overview/whatisvpp/index.rst
new file mode 100644
index 0000000..f8cb25d
--- /dev/null
+++ b/docs/overview/whatisvpp/index.rst
@@ -0,0 +1,27 @@
+.. _whatisvpp:
+
+=========================================
+What is VPP?
+=========================================
+
+FD.io's Vector Packet Processing (VPP) technology is a :ref:`fast`,
+:ref:`packet-processing` stack that runs on commodity CPUs. It provides
+out-of-the-box production quality switch/router functionality and much, much
+more. At the same time, FD.io VPP is an :ref:`extensible` and
+:ref:`developer-friendly` framework, capable of bootstrapping the development
+of packet-processing applications. The benefits of FD.io VPP are its high
+performance, proven technology, modularity and flexibility, integrations, and
+rich feature set.
+
+FD.io VPP is vector packet processing software. To learn more about what that
+means, see the :ref:`what-is-vector-packet-processing` section.
+
+For more detailed information on FD.io VPP features, see the following sections:
+
+.. toctree::
+   :maxdepth: 1
+
+   dataplane.rst
+   fast.rst
+   developer.rst
+   extensible.rst
diff --git a/docs/overview/whatisvpp/what-is-vector-packet-processing.rst b/docs/overview/whatisvpp/what-is-vector-packet-processing.rst
new file mode 100644
index 0000000..994318e
--- /dev/null
+++ b/docs/overview/whatisvpp/what-is-vector-packet-processing.rst
@@ -0,0 +1,73 @@
+:orphan:
+
+.. _what-is-vector-packet-processing:
+
+=================================
+What is vector packet processing?
+=================================
+
+FD.io VPP is developed using vector packet processing concepts, as opposed to
+scalar packet processing. These concepts are explained in the following sections.
+
+Vector packet processing is a common approach among high-performance `Userspace
+<https://en.wikipedia.org/wiki/User_space>`_ packet processing applications such
+as those developed with FD.io VPP and `DPDK
+<https://en.wikipedia.org/wiki/Data_Plane_Development_Kit>`_. The scalar-based
+approach tends to be favoured by Operating System `Kernel
+<https://en.wikipedia.org/wiki/Kernel_(operating_system)>`_ Network Stacks and
+Userspace stacks that don't have strict performance requirements.
+
+**Scalar Packet Processing**
+
+A scalar packet processing network stack typically processes one packet at a
+time: an interrupt handling function takes a single packet from a Network
+Interface, and processes it through a set of functions: fooA calls fooB calls
+fooC and so on.
+
+.. code-block:: none 
+
+   +---> fooA(packet1) +---> fooB(packet1) +---> fooC(packet1)
+   +---> fooA(packet2) +---> fooB(packet2) +---> fooC(packet2)
+   ...
+   +---> fooA(packet3) +---> fooB(packet3) +---> fooC(packet3)
+
+
+Scalar packet processing is simple, but inefficient in these ways:
+
+* When the code path length exceeds the size of the Microprocessor's instruction
+  cache (I-cache), `thrashing
+  <https://en.wikipedia.org/wiki/Thrashing_(computer_science)>`_ occurs as the
+  Microprocessor is continually loading new instructions. In this model, each
+  packet incurs an identical set of I-cache misses.
+* The associated deep call stack will also add load-store-unit pressure as
+  stack-locals fall out of the Microprocessor's Layer 1 Data Cache (D-cache).
+
+**Vector Packet Processing**
+
+In contrast, a vector packet processing network stack processes multiple packets
+at a time, called 'vectors of packets' or simply a 'vector'. An interrupt
+handling function takes the vector of packets from a Network Interface, and
+processes the vector through a set of functions: fooA calls fooB calls fooC and
+so on.
+
+.. code-block:: none 
+
+   +---> fooA([packet1, +---> fooB([packet1, +---> fooC([packet1, +--->
+               packet2,             packet2,             packet2,
+               ...                  ...                  ...
+               packet256])          packet256])          packet256])
+
+This approach fixes: 
+
+* The I-cache thrashing problem described above, by amortizing the cost of
+  I-cache loads across multiple packets.
+
+* The inefficiencies associated with the deep call stack, by receiving vectors
+  of up to 256 packets at a time from the Network Interface and processing them
+  using a directed graph of nodes. The graph scheduler invokes one node dispatch
+  function at a time, restricting stack depth to a few stack frames.
+
+Further optimizations that this approach enables are pipelining and
+prefetching, which minimize read latency on table data and parallelize the
+packet loads needed to process packets.
+