sort worker-thread init functions in advance

Otherwise, all N worker threads try to sort the list at the same time:
a good way to have a bad day.

This approach performs *far* better than maintaining order by adding a
spin-lock. By direct measurement w/ elog + g2: 11 threads execute the
per-thread init function list in 22us, vs. 50ms with a CLIB_PAUSE()
enabled spin-lock.
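
A minimal sketch of the sort-once idea, not the vlib code: one thread
sorts the list before any worker starts, and the workers then only
read it, so no lock is needed. All names here (demo_elt_t,
demo_sort_list, demo_worker, demo_run) and the integer "order" key are
hypothetical stand-ins; the real list element type and ordering rules
are not shown in this patch.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define N_ELTS 4
#define N_WORKERS 11

/* Hypothetical stand-in for an init-function list element. */
typedef struct demo_elt {
  struct demo_elt *next;
  int order;                    /* assumed sort key, for illustration */
} demo_elt_t;

/* Insertion-sort the list by ascending key.
 * Mutates the list, so it must run on one thread, before workers start. */
static void demo_sort_list(demo_elt_t **headp)
{
  assert(headp != NULL);
  demo_elt_t *sorted = NULL;
  demo_elt_t *e = *headp;
  while (e) {
    demo_elt_t *next = e->next;
    demo_elt_t **pp = &sorted;
    while (*pp && (*pp)->order <= e->order)
      pp = &(*pp)->next;
    e->next = *pp;
    *pp = e;
    e = next;
  }
  *headp = sorted;
}

typedef struct {
  demo_elt_t *head;             /* shared, already sorted, read-only */
  int seen[N_ELTS];             /* per-worker trace of keys visited */
  int n_seen;
} demo_worker_t;

/* Worker: walk the pre-sorted list without modifying it. */
static void *demo_worker(void *arg)
{
  demo_worker_t *w = arg;
  for (demo_elt_t *e = w->head; e && w->n_seen < N_ELTS; e = e->next)
    w->seen[w->n_seen++] = e->order;
  return NULL;
}

/* Returns 1 if every worker saw ascending keys, 0 if not,
 * -1 if a thread could not be created. */
static int demo_run(void)
{
  int keys[N_ELTS] = { 30, 10, 40, 20 };
  demo_elt_t elts[N_ELTS];
  demo_worker_t workers[N_WORKERS];
  pthread_t tids[N_WORKERS];
  int i, j, started = 0, ok = 1;

  for (i = 0; i < N_ELTS; i++) {
    elts[i].order = keys[i];
    elts[i].next = (i + 1 < N_ELTS) ? &elts[i + 1] : NULL;
  }
  demo_elt_t *head = &elts[0];

  demo_sort_list(&head);        /* sort once, in advance */

  for (i = 0; i < N_WORKERS; i++) {
    workers[i].head = head;
    workers[i].n_seen = 0;
    if (pthread_create(&tids[i], NULL, demo_worker, &workers[i]) != 0)
      break;
    started++;
  }
  for (i = 0; i < started; i++)
    pthread_join(tids[i], NULL);
  if (started != N_WORKERS)
    return -1;

  for (i = 0; i < N_WORKERS; i++) {
    if (workers[i].n_seen != N_ELTS)
      ok = 0;
    for (j = 1; j < workers[i].n_seen; j++)
      if (workers[i].seen[j - 1] > workers[i].seen[j])
        ok = 0;
  }
  return ok;
}
```

Build with -pthread. Because the list is immutable once the workers
exist, the walk needs no synchronization at all.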

Change-Id: I1745f2a213c0561260139a60114dcb981e0c64e5
Signed-off-by: Dave Barach <dave@barachs.net>
diff --git a/src/vlib/init.h b/src/vlib/init.h
index 6d27114..fc63801 100644
--- a/src/vlib/init.h
+++ b/src/vlib/init.h
@@ -328,7 +328,11 @@
 clib_error_t *vlib_call_init_exit_functions (struct vlib_main_t *vm,
 					     _vlib_init_function_list_elt_t **
 					     headp, int call_once);
-
+clib_error_t *vlib_call_init_exit_functions_no_sort (struct vlib_main_t *vm,
+						     _vlib_init_function_list_elt_t
+						     ** headp, int call_once);
+clib_error_t *vlib_sort_init_exit_functions (_vlib_init_function_list_elt_t
+					     **);
 #define foreach_vlib_module_reference		\
   _ (node_cli)					\
   _ (trace_cli)