svm_fifo rework to avoid contention on cursize

Problems Addressed:
- Contention on cursize between producer and consumer.
- Excessive number of modulo operations.

Changes:
- Synchronization between producer and consumer changed from cursize
  to head and tail indexes.
  Implication: reduces the usable size of the fifo by 1.
- Use C++11 atomics with weaker memory ordering to access head and
  tail, based on producer or consumer role.
- Head and tail indexes are unsigned 32 bit integers. Additions and
  subtractions on them are implicitly modulo 2^32.
- Add weaker memory ordering variants of max_enq, max_deq, is_empty
  and is_full, and use them appropriately at all call sites.
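
The scheme described above can be illustrated with a minimal
single-producer/single-consumer sketch. This is not the svm_fifo code:
ring_t, RING_SIZE and the ring_* helpers are hypothetical names, the
size is assumed to be a power of two so that a mask can stand in for
'%', and the GCC/Clang __atomic builtins are assumed to be available.
Each side reads its own index relaxed and the peer's index with
acquire, then publishes its own index with release.

```c
#include <stdint.h>

#define RING_SIZE 8u		/* hypothetical; must be a power of two */
#define RING_MASK (RING_SIZE - 1)

typedef struct
{
  uint32_t head;		/* written only by the consumer */
  uint32_t tail;		/* written only by the producer */
  uint8_t data[RING_SIZE];
} ring_t;

/* Bytes stored. The u32 subtraction wraps modulo 2^32 before masking,
 * so tail < head still gives the right answer. */
static inline uint32_t
ring_cursize (uint32_t head, uint32_t tail)
{
  return (tail - head) & RING_MASK;
}

/* Producer view: own index relaxed, consumer's index acquire.
 * One slot stays empty so that head == tail always means "empty". */
static inline uint32_t
ring_max_enqueue_prod (ring_t * r)
{
  uint32_t tail = __atomic_load_n (&r->tail, __ATOMIC_RELAXED);
  uint32_t head = __atomic_load_n (&r->head, __ATOMIC_ACQUIRE);
  return RING_MASK - ring_cursize (head, tail);
}

/* Consumer view: own index relaxed, producer's index acquire. */
static inline uint32_t
ring_max_dequeue_cons (ring_t * r)
{
  uint32_t head = __atomic_load_n (&r->head, __ATOMIC_RELAXED);
  uint32_t tail = __atomic_load_n (&r->tail, __ATOMIC_ACQUIRE);
  return ring_cursize (head, tail);
}

/* Copy up to len bytes in, then publish the new tail with release. */
static inline uint32_t
ring_enqueue (ring_t * r, const uint8_t * src, uint32_t len)
{
  uint32_t tail = __atomic_load_n (&r->tail, __ATOMIC_RELAXED);
  uint32_t room = ring_max_enqueue_prod (r);
  uint32_t n = len < room ? len : room;
  for (uint32_t i = 0; i < n; i++)
    r->data[(tail + i) & RING_MASK] = src[i];
  __atomic_store_n (&r->tail, (tail + n) & RING_MASK, __ATOMIC_RELEASE);
  return n;
}

/* Copy up to len bytes out, then publish the new head with release. */
static inline uint32_t
ring_dequeue (ring_t * r, uint8_t * dst, uint32_t len)
{
  uint32_t head = __atomic_load_n (&r->head, __ATOMIC_RELAXED);
  uint32_t avail = ring_max_dequeue_cons (r);
  uint32_t n = len < avail ? len : avail;
  for (uint32_t i = 0; i < n; i++)
    dst[i] = r->data[(head + i) & RING_MASK];
  __atomic_store_n (&r->head, (head + n) & RING_MASK, __ATOMIC_RELEASE);
  return n;
}
```

With RING_SIZE of 8, an empty ring accepts at most 7 bytes, which is
the "usable size reduced by 1" trade-off: no shared counter is needed
to tell a full ring from an empty one.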

Performance improvement (iperf3 via Hoststack):

iperf3 Server: Marvell ThunderX2 (AArch64) - iperf3 Client: Skylake (x86)
   ~6% (256 rxd/txd) - ~11% (2048 rxd/txd)

Change-Id: I1d484e000e437430fdd5a819657d1c6b62443018
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
diff --git a/src/vnet/session-apps/echo_client.c b/src/vnet/session-apps/echo_client.c
index fb7de48..39f464d 100644
--- a/src/vnet/session-apps/echo_client.c
+++ b/src/vnet/session-apps/echo_client.c
@@ -60,7 +60,7 @@
       if (ecm->no_copy)
 	{
 	  svm_fifo_t *f = s->data.tx_fifo;
-	  rv = clib_min (svm_fifo_max_enqueue (f), bytes_this_chunk);
+	  rv = clib_min (svm_fifo_max_enqueue_prod (f), bytes_this_chunk);
 	  svm_fifo_enqueue_nocopy (f, rv);
 	  session_send_io_evt_to_thread_custom (&f->master_session_index,
 						s->thread_index,
@@ -77,7 +77,7 @@
 	  session_dgram_hdr_t hdr;
 	  svm_fifo_t *f = s->data.tx_fifo;
 	  app_session_transport_t *at = &s->data.transport;
-	  u32 max_enqueue = svm_fifo_max_enqueue (f);
+	  u32 max_enqueue = svm_fifo_max_enqueue_prod (f);
 
 	  if (max_enqueue <= sizeof (session_dgram_hdr_t))
 	    return;
@@ -151,7 +151,7 @@
     }
   else
     {
-      n_read = svm_fifo_max_dequeue (rx_fifo);
+      n_read = svm_fifo_max_dequeue_cons (rx_fifo);
       svm_fifo_dequeue_drop (rx_fifo, n_read);
     }
 
@@ -480,7 +480,7 @@
   sp = pool_elt_at_index (ecm->sessions, s->rx_fifo->client_session_index);
   receive_data_chunk (ecm, sp);
 
-  if (svm_fifo_max_dequeue (s->rx_fifo))
+  if (svm_fifo_max_dequeue_cons (s->rx_fifo))
     {
       if (svm_fifo_set_event (s->rx_fifo))
 	session_send_io_evt_to_thread (s->rx_fifo, SESSION_IO_EVT_BUILTIN_RX);