svm_fifo rework to avoid contention on cursize

Problems Addressed:
- Contention on cursize between producer and consumer.
- Excessive number of modulo operations.

Changes:
- Synchronization between producer and consumer changed from cursize
  to head and tail indexes.
  Implication: reduces the usable size of the fifo by 1.
- Head and tail are accessed with C++11 atomics using weaker memory
  ordering, chosen according to producer or consumer role.
- Head and tail indexes are unsigned 32-bit integers. Additions and
  subtractions on them are implicitly modulo 2^32.
- Added weaker memory ordering variants of max_enq, max_deq, is_empty
  and is_full, and used them at all appropriate call sites.
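
Illustration of the changes above, as a minimal sketch rather than the
actual svm_fifo code. It assumes C11 <stdatomic.h> in place of the
project's own atomic wrappers, and every name in it (sketch_fifo_t,
sketch_cursize, sketch_enqueue, ...) is hypothetical. Each side loads
its own index relaxed and the peer's index with acquire, then publishes
its progress with a release store. One slot always stays empty so that
head == tail can only mean "empty", which is why the usable size drops
by 1.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for svm_fifo_t; field names are illustrative. */
typedef struct
{
  _Atomic uint32_t head;	/* advanced only by the consumer */
  _Atomic uint32_t tail;	/* advanced only by the producer */
  uint32_t size;		/* number of slots; size - 1 are usable */
  uint8_t *data;
} sketch_fifo_t;

/* Bytes currently stored. Unsigned arithmetic makes the wrapped case
 * (head > tail) come out right without an explicit modulo. */
static inline uint32_t
sketch_cursize (const sketch_fifo_t * f, uint32_t head, uint32_t tail)
{
  return head <= tail ? tail - head : f->size + tail - head;
}

/* Producer view: own index relaxed, peer index acquire. */
static inline uint32_t
sketch_max_enqueue_prod (sketch_fifo_t * f)
{
  uint32_t tail = atomic_load_explicit (&f->tail, memory_order_relaxed);
  uint32_t head = atomic_load_explicit (&f->head, memory_order_acquire);
  return f->size - 1 - sketch_cursize (f, head, tail);
}

/* Consumer view: own index relaxed, peer index acquire. */
static inline uint32_t
sketch_max_dequeue_cons (sketch_fifo_t * f)
{
  uint32_t head = atomic_load_explicit (&f->head, memory_order_relaxed);
  uint32_t tail = atomic_load_explicit (&f->tail, memory_order_acquire);
  return sketch_cursize (f, head, tail);
}

/* Copy up to len bytes in; returns the number of bytes enqueued. */
static uint32_t
sketch_enqueue (sketch_fifo_t * f, const uint8_t * src, uint32_t len)
{
  uint32_t tail = atomic_load_explicit (&f->tail, memory_order_relaxed);
  uint32_t n = sketch_max_enqueue_prod (f);
  if (len < n)
    n = len;
  uint32_t first = f->size - tail;	/* room before the buffer end */
  if (first > n)
    first = n;
  memcpy (f->data + tail, src, first);
  memcpy (f->data, src + first, n - first);
  tail += n;
  if (tail >= f->size)
    tail -= f->size;
  /* Release: the data written above is visible before the new tail. */
  atomic_store_explicit (&f->tail, tail, memory_order_release);
  return n;
}

/* Copy up to len bytes out; returns the number of bytes dequeued. */
static uint32_t
sketch_dequeue (sketch_fifo_t * f, uint8_t * dst, uint32_t len)
{
  uint32_t head = atomic_load_explicit (&f->head, memory_order_relaxed);
  uint32_t n = sketch_max_dequeue_cons (f);
  if (len < n)
    n = len;
  uint32_t first = f->size - head;	/* bytes before the buffer end */
  if (first > n)
    first = n;
  memcpy (dst, f->data + head, first);
  memcpy (dst + first, f->data, n - first);
  head += n;
  if (head >= f->size)
    head -= f->size;
  /* Release: the reads above complete before the slots are freed. */
  atomic_store_explicit (&f->head, head, memory_order_release);
  return n;
}
```

With this layout neither side writes a shared counter: the producer
stores only tail and the consumer stores only head, so the two no
longer contend on a single cursize field.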

Performance improvement (iperf3 via Hoststack):

iperf3 Server: Marvell ThunderX2(AArch64) - iperf3 Client: Skylake(x86)
   ~6%(256 rxd/txd) - ~11%(2048 rxd/txd)

Change-Id: I1d484e000e437430fdd5a819657d1c6b62443018
Signed-off-by: Sirshak Das <sirshak.das@arm.com>
Reviewed-by: Honnappa Nagarahalli <honnappa.nagarahalli@arm.com>
diff --git a/src/vnet/session/session.h b/src/vnet/session/session.h
index ed42e54..997d153 100644
--- a/src/vnet/session/session.h
+++ b/src/vnet/session/session.h
@@ -378,14 +378,14 @@
 transport_max_rx_enqueue (transport_connection_t * tc)
 {
   session_t *s = session_get (tc->s_index, tc->thread_index);
-  return svm_fifo_max_enqueue (s->rx_fifo);
+  return svm_fifo_max_enqueue_prod (s->rx_fifo);
 }
 
 always_inline u32
 transport_max_tx_dequeue (transport_connection_t * tc)
 {
   session_t *s = session_get (tc->s_index, tc->thread_index);
-  return svm_fifo_max_dequeue (s->tx_fifo);
+  return svm_fifo_max_dequeue_cons (s->tx_fifo);
 }
 
 always_inline u32