Multi-Architecture Arbitrary Function Cookbook
==============================================

Optimizing arbitrary functions for multiple architectures is
straightforward, and closely follows the process used to produce
multi-architecture graph node dispatch functions.

As with multi-architecture graph nodes, we compile the source files
multiple times, generating several implementations of the original
function and one public selector function.

Details
-------

Decorate function definitions with CLIB_MARCH_FN macros. For example:

Change the original function prototype...

::

    u32 vlib_frame_alloc_to_node (vlib_main_t * vm, u32 to_node_index,
                                  u32 frame_flags)

...by recasting the function name and return type as the first two
arguments to the CLIB_MARCH_FN macro:

::

    CLIB_MARCH_FN (vlib_frame_alloc_to_node, u32, vlib_main_t * vm,
                   u32 to_node_index, u32 frame_flags)

In the actual vpp image, several versions of vlib_frame_alloc_to_node
will appear: vlib_frame_alloc_to_node_avx2,
vlib_frame_alloc_to_node_avx512, and so forth.
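To see how a single definition can yield several symbols, here is a
minimal sketch of the name-mangling idea. It is not the vppinfra
implementation of CLIB_MARCH_FN: the SKETCH_* macros, the explicit
variant argument, and the add_sat function are all assumptions made for
illustration, and the two variants are expanded in one file here, whereas
a real build gets them from separate compilation passes.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Hypothetical stand-in for CLIB_MARCH_FN: pastes a variant suffix onto
   the function name. The real macro lives in vppinfra and may differ. */
#define SKETCH_PASTE_(a, b) a##_##b
#define SKETCH_PASTE(a, b) SKETCH_PASTE_ (a, b)
#define SKETCH_MARCH_FN(fn, variant, rtype, ...) \
  rtype SKETCH_PASTE (fn, variant) (__VA_ARGS__)

/* Expands to: u32 add_sat_generic (u32 a, u32 b) */
SKETCH_MARCH_FN (add_sat, generic, u32, u32 a, u32 b)
{
  u32 s = a + b;                  /* unsigned addition wraps on overflow */
  return s < a ? UINT32_MAX : s;  /* saturate if it wrapped */
}

/* Expands to: u32 add_sat_avx2 (u32 a, u32 b). Same source text; in a
   real build only the compiler flags would differ. */
SKETCH_MARCH_FN (add_sat, avx2, u32, u32 a, u32 b)
{
  u32 s = a + b;
  return s < a ? UINT32_MAX : s;
}
```

Both add_sat_generic and add_sat_avx2 are ordinary functions after
preprocessing, which is why each variant shows up as a separate symbol
in the image.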

For each multi-architecture function, use the CLIB_MARCH_FN_SELECT
macro to help generate the one-and-only multi-architecture selector
function:

::

    #ifndef CLIB_MARCH_VARIANT
    u32
    vlib_frame_alloc_to_node (vlib_main_t * vm, u32 to_node_index,
                              u32 frame_flags)
    {
      return CLIB_MARCH_FN_SELECT (vlib_frame_alloc_to_node)
        (vm, to_node_index, frame_flags);
    }
    #endif /* CLIB_MARCH_VARIANT */

Once bound, the multi-architecture selector function is about as
expensive as an indirect function call, which is to say, not very
expensive.
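The bind-once behavior can be sketched as follows. This is an
illustration only: the variant table, the priority convention (a negative
priority is assumed to mean that the CPU cannot run the variant), and all
frame_alloc names are hypothetical, and the code does not reproduce what
CLIB_MARCH_FN_SELECT expands to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t u32;
typedef u32 (frame_alloc_fn_t) (u32 to_node_index, u32 frame_flags);

/* Hypothetical per-variant implementations. In a real build these would
   be the same source compiled with different architecture flags. */
static u32
frame_alloc_generic (u32 to_node_index, u32 frame_flags)
{
  return (to_node_index << 8) | (frame_flags & 0xff);
}

static u32
frame_alloc_avx2 (u32 to_node_index, u32 frame_flags)
{
  return (to_node_index << 8) | (frame_flags & 0xff);
}

static u32
frame_alloc_avx512 (u32 to_node_index, u32 frame_flags)
{
  return (to_node_index << 8) | (frame_flags & 0xff);
}

typedef struct
{
  int priority; /* assumed convention: < 0 means "CPU cannot run this" */
  frame_alloc_fn_t *fn;
} variant_t;

static const variant_t variants[] = {
  { 0, frame_alloc_generic },
  { 50, frame_alloc_avx2 },
  { -1, frame_alloc_avx512 }, /* pretend this CPU lacks AVX-512 */
};

static frame_alloc_fn_t *bound_fn; /* NULL until the first call */

/* Pick the usable variant with the highest priority, once, and cache it. */
static frame_alloc_fn_t *
frame_alloc_select (void)
{
  if (bound_fn == NULL)
    {
      int best = -1;
      for (size_t i = 0; i < sizeof (variants) / sizeof (variants[0]); i++)
        if (variants[i].priority > best)
          {
            best = variants[i].priority;
            bound_fn = variants[i].fn;
          }
      assert (bound_fn != NULL); /* the generic variant always qualifies */
    }
  return bound_fn;
}

/* Public selector: after binding, one pointer load and an indirect call. */
u32
frame_alloc (u32 to_node_index, u32 frame_flags)
{
  return frame_alloc_select () (to_node_index, frame_flags);
}
```

Only the first call pays for the table scan. The sketch ignores thread
safety; a real selector would need to bind before concurrent callers
arrive or tolerate a benign race.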

Modify CMakeLists.txt
---------------------

If the component in question already lists "MULTIARCH_SOURCES", simply
add the .c file containing the multi-architecture functions to that
list. Otherwise, add a MULTIARCH_SOURCES list as shown below. Note that
the file, "multiarch_code.c" in this example, must appear in *both*
SOURCES and MULTIARCH_SOURCES:

::

    add_vpp_plugin(myplugin
      SOURCES
      multiarch_code.c
      ...

      MULTIARCH_SOURCES
      multiarch_code.c
      ...
    )

A Word to the Wise
------------------

A file which liberally mixes functions worth compiling for multiple
architectures and functions which are not will end up full of
#ifndef CLIB_MARCH_VARIANT conditionals. This won't do a thing to make
the code look any better.

Depending on requirements, it may make sense to move functions to
(new) files to reduce complexity and/or improve legibility of the
resulting code.
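To make the trade-off concrete, a mixed file might look roughly like the
sketch below. The SKETCH_* macros and the sum_u32 and default_frame_size
functions are hypothetical; only the CLIB_MARCH_VARIANT guard comes from
the text above. The sketch assumes that each variant pass defines
CLIB_MARCH_VARIANT to a bare suffix such as avx2, and that the baseline
pass leaves it undefined.

```c
#include <stdint.h>

typedef uint32_t u32;

/* Hypothetical helpers (not the vppinfra macros): give the function a
   per-variant name in each compilation pass. */
#define SKETCH_PASTE_(a, b) a##_##b
#define SKETCH_PASTE(a, b) SKETCH_PASTE_ (a, b)
#ifdef CLIB_MARCH_VARIANT
#define SKETCH_SFX(fn) SKETCH_PASTE (fn, CLIB_MARCH_VARIANT)
#else
#define SKETCH_SFX(fn) SKETCH_PASTE (fn, generic)
#endif

/* Hot path: compiled in every pass, under a per-variant name. */
u32
SKETCH_SFX (sum_u32) (const u32 *v, u32 n)
{
  u32 sum = 0;
  for (u32 i = 0; i < n; i++)
    sum += v[i];
  return sum;
}

#ifndef CLIB_MARCH_VARIANT
/* Cold path: compiled only in the baseline pass. Without the guard,
   every variant object would define the same symbol. */
u32
default_frame_size (void)
{
  return 256;
}

/* Public entry point, also baseline-only. A real selector would choose
   among the variants; this one simply calls the generic version. */
u32
sum_u32 (const u32 *v, u32 n)
{
  return sum_u32_generic (v, n);
}
#endif /* CLIB_MARCH_VARIANT */
```

Each additional cold-path function needs to sit inside such a guard,
which is the clutter that moving those functions to a separate,
singly compiled file avoids.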