jdenisco | 0923a23 | 2018-08-29 13:19:43 -0400 | [diff] [blame] | 1 | .. _graphwalks: |
| 2 | |
| 3 | Graph Walks |
| 4 | ^^^^^^^^^^^^ |
| 5 | |
| 6 | All FIB object types are allocated from a VPP memory pool [#f13]_. The objects are thus |
| 7 | susceptible to memory re-allocation, therefore the use of a bare "C" pointer to refer |
| 8 | to a child or parent is not possible. Instead there is the concept of a *fib_node_ptr_t* |
| 9 | which is a tuple of type,index. The type indicates what type of object it is |
| 10 | (and hence which pool to use) and the index is the index in that pool. This allows |
| 11 | for the safe retrieval of any object type. |
| 12 | |
| 13 | When a child resolves via a parent it does so knowing the type of that parent. The |
| 14 | child to parent relationship is thus fully known to the child, and hence a forward |
| 15 | walk of the graph (from child to parent) is trivial. However, a parent does not choose |
| 16 | its children, it does not even choose the type. All object types that form part of the |
| 17 | FIB control plane graph all inherit from a single base class14; *fib_node_t*. A *fib_node_t* |
| 18 | indentifies the object's index and its associated virtual function table provides the |
| 19 | parent a mechanism to Զisitՠthat object during the walk. The reason for a back-walk |
| 20 | is to inform all children that the state of the parent has changed in some way, and |
| 21 | that the child may itself need to update. |
| 22 | |
| 23 | To support the many to one, child to parent, relationship a parent must maintain a |
| 24 | list of its children. The requirements of this list are; |
| 25 | |
| 26 | - O(1) insertion and delete time. Several child-parent relationships are made/broken during route addition/deletion. |
| 27 | - Ordering. High priority children are at the front, low priority at the back (see section Fast Convergence) |
| 28 | - Insertion at arbitrary locations. |
| 29 | |
| 30 | To realise these requirements the child-list is a doubly linked-list, where each element |
| 31 | contains a *fib_node_ptr_t*. The VPP pool memory model applies to the list elements, so |
| 32 | they are also identified by an index. When a child is added to a list it is returned the |
| 33 | index of the element. Using this index the element can be removed in constant time. |
| 34 | The list supports 'push-front' and 'push-back' semantics for ordering. To walk the children |
| 35 | of a parent is then to iterate of this list. |
| 36 | |
| 37 | A back-walk of the graph is a depth first search where all children in all levels of the |
| 38 | hierarchy are visited. Such walks can therefore encounter all object instances in the |
| 39 | FIB control plane graph, numbering in the millions. A FIB control-plane graph is cyclic |
| 40 | in the presence of a recursion loop, so the walk implementation has mechanisms to detect |
| 41 | this and exit early. |
| 42 | |
| 43 | A back-walk can be either synchronous or asynchronous. A synchronous walk will visit the |
| 44 | entire section of the graph before control is returned to the caller, an asynchronous |
| 45 | walk will queue the walk to a background process, to run at a later time, and immediately |
| 46 | return to the caller. To implement asynchronous walks a *fib_walk_t* object it added to |
| 47 | the front of the parent's child list. As children are visited the *fib_walk_t* object |
| 48 | advances through the list. Since it is inserted in the list, when the walk suspends |
| 49 | and resumes, it can continue at the correct location. It is also safe with respect to |
| 50 | the deletion of children from the list. New children are added to the head of the list, |
| 51 | and so will not encounter the walk, but since they are new, they already have the up to |
| 52 | date state of the parent. |
| 53 | |
| 54 | A VLIB process 'fib-walk' runs to perform the asynchronous walks. VLIB has no priority |
| 55 | scheduling between respective processes, so the fib-walk process does work in small |
| 56 | increments so it does not block the main route download process. Since the main download |
| 57 | process effectively has priority numerous asynchronous back-walks can be started on the |
| 58 | same parent instance before the fib-walk process can run. FIB is a 'final state' application. |
| 59 | If a parent changes n times, it is not necessary for the children to also update n |
| 60 | times, instead it is only necessary that this child updates to the latest, or final, |
| 61 | state. Consequently when multiple walks on a parent (and hence potential updates to a |
| 62 | child) are queued, these walks can be merged into a single walk. |
| 63 | |
| 64 | Choosing between a synchronous and an asynchronous walk is therefore a trade-off between |
| 65 | time it takes to propagate a change in the parent to all of its children, versus the |
| 66 | time it takes to act on a single route update. For example, if a route update where to |
| 67 | affect millions of child recursive routes, then the rate at which such updates could be |
| 68 | processed would be dependent on the number of child recursive route Рwhich would not be |
| 69 | good. At the time of writing FIB2.0 uses synchronous walk in all locations except when |
| 70 | walking the children of a path-list, and it has more than 32 [#f15]_ children. This avoids the |
| 71 | case mentioned above. |
| 72 | |
| 73 | .. rubric:: Footnotes: |
| 74 | |
| 75 | .. [#f13] Fast memory allocation is crucial to fast route update times. |
| 76 | .. [#f14] VPP may be written in C and not C++ but inheritance is still possible. |
| 77 | .. [#f15] The value is arbitrary and yet to be tuned. |