vlib: Handle race in thread barrier processing

When CLIB_DEBUG is enabled, the vlib_foreach_main macro asserts that
the vlib_main it is currently looking at is safely parked at the
barrier, by checking that vlib_main->parked_at_barrier is not 0.
Unfortunately, the check is racy: workers first increment the
atomic counter to indicate that they have reached the barrier
and only _then_ set this_main->parked_at_barrier to 1. For the last
worker to suspend, this opens a race: the main thread is free
to execute and assert immediately after the atomic counter has been
incremented, before the worker gets to write to its own
parked_at_barrier.

Fix this by simply swapping the order of the two operations, so that
parked_at_barrier is set before the atomic counter is incremented.
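
For illustration only, the corrected ordering can be sketched as a
standalone program using C11 atomics and pthreads. This is not the VPP
code: the globals merely borrow names from the description above, and
N_WORKERS, worker_fn and run_barrier_demo are hypothetical. The
release/acquire orderings are an assumption of this sketch, not a
statement about what clib_atomic_fetch_add provides.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define N_WORKERS 4 /* hypothetical worker count for the sketch */

/* Hypothetical stand-ins for the state named in the commit message. */
static atomic_int workers_at_barrier;
static atomic_int wait_at_barrier;
static int parked_at_barrier[N_WORKERS];

static void *
worker_fn (void *arg)
{
  int i = *(int *) arg;

  /* Fixed order: set the per-thread flag first ... */
  parked_at_barrier[i] = 1;
  /* ... then announce arrival. The release increment publishes the
     flag write to any thread that observes the new counter value. */
  atomic_fetch_add_explicit (&workers_at_barrier, 1, memory_order_release);

  /* Spin until the main thread releases the barrier. */
  while (atomic_load_explicit (&wait_at_barrier, memory_order_acquire))
    ;
  return 0;
}

/* Returns how many workers were seen as parked once all had arrived. */
int
run_barrier_demo (void)
{
  pthread_t tid[N_WORKERS];
  int idx[N_WORKERS];
  int parked = 0;

  atomic_store (&workers_at_barrier, 0);
  atomic_store (&wait_at_barrier, 1);
  for (int i = 0; i < N_WORKERS; i++)
    {
      parked_at_barrier[i] = 0;
      idx[i] = i;
      pthread_create (&tid[i], 0, worker_fn, &idx[i]);
    }

  /* Main thread: wait until every worker has incremented the counter. */
  while (atomic_load_explicit (&workers_at_barrier, memory_order_acquire)
	 != N_WORKERS)
    ;

  /* Mirrors the debug assert: each worker must already be parked. */
  for (int i = 0; i < N_WORKERS; i++)
    {
      assert (parked_at_barrier[i] != 0);
      parked += parked_at_barrier[i];
    }

  /* Release the workers and wait for them to exit. */
  atomic_store_explicit (&wait_at_barrier, 0, memory_order_release);
  for (int i = 0; i < N_WORKERS; i++)
    pthread_join (tid[i], 0);
  return parked;
}
```

Build with -pthread and call run_barrier_demo() from a test harness.
With the two statements in worker_fn in the opposite order, the main
thread could pass its wait loop and reach the assert before the last
worker had written its flag.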

Type: fix

Signed-off-by: Alexander Kabaev <kan@FreeBSD.org>
Change-Id: Iae47abd6ca0be1c5413f5ecaefabc64cd7eac2ed
diff --git a/src/vlib/threads.h b/src/vlib/threads.h
index 79f44c8..312323c 100644
--- a/src/vlib/threads.h
+++ b/src/vlib/threads.h
@@ -416,12 +416,12 @@
 	  ed->thread_index = thread_index;
 	}
 
-      clib_atomic_fetch_add (vlib_worker_threads->workers_at_barrier, 1);
       if (CLIB_DEBUG > 0)
 	{
 	  vm = vlib_get_main ();
 	  vm->parked_at_barrier = 1;
 	}
+      clib_atomic_fetch_add (vlib_worker_threads->workers_at_barrier, 1);
       while (*vlib_worker_threads->wait_at_barrier)
 	;