ip checksum multiarch support, cleanup

When computing TCP/UDP checksums over large amounts of data, for
example when NIC hardware checksum offload is not available, it is
worth providing arch-dependent code, if only to compile it with -O3.

Fix the calculation when the data is fully unaligned, i.e. starts on
an odd byte boundary.
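
For illustration only, here is a minimal sketch of one common way to
handle an odd starting address. It is not the code in this patch: the
names fold16, csum_reference and csum_odd_aware are hypothetical. It
relies on the ones' complement property that byte-swapping a folded
sum is the same as multiplying it by 256 mod 65535, so the bytes after
the first one can be summed from an even address and swapped back.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a wide accumulator into 16 bits with end-around carry. */
static uint16_t
fold16 (uint64_t sum)
{
  while (sum >> 16)
    sum = (sum & 0xffff) + (sum >> 16);
  return (uint16_t) sum;
}

/* Reference: sum big-endian 16-bit words starting at the first data
   byte; a trailing odd byte is padded with zero on the right.
   Returns the folded sum, NOT its complement. */
static uint16_t
csum_reference (const uint8_t *p, size_t n)
{
  uint64_t sum = 0;
  size_t i;

  for (i = 0; i + 1 < n; i += 2)
    sum += ((uint32_t) p[i] << 8) | p[i + 1];
  if (i < n)
    sum += (uint32_t) p[i] << 8;
  return fold16 (sum);
}

/* Hypothetical odd-address-aware variant. When p is odd, take the
   first byte by itself, sum the remainder (which now starts on an
   even address), byte-swap that folded sum, and add the first byte
   back in as the high byte of the first word. A real implementation
   would use wide aligned loads for the remainder; this sketch reuses
   the byte-wise reference loop to stay portable. */
static uint16_t
csum_odd_aware (const uint8_t *p, size_t n)
{
  uint16_t rest, swapped;

  if (n == 0 || ((uintptr_t) p & 1) == 0)
    return csum_reference (p, n);

  rest = csum_reference (p + 1, n - 1);
  swapped = (uint16_t) ((rest << 8) | (rest >> 8));
  return fold16 ((uint64_t) swapped + ((uint32_t) p[0] << 8));
}
```

Comparing csum_odd_aware against csum_reference over a buffer at
several starting offsets and lengths is one way to exercise the kind
of alignment case described here.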

Add a buffer alignment test vector.

Change-Id: I7644e2276ac6cbc3f575bf61746a6ffedbbb6150
Signed-off-by: Dave Barach <dave@barachs.net>
diff --git a/src/vnet/ip/ip_packet.h b/src/vnet/ip/ip_packet.h
index d3f3de7..3c532f1 100644
--- a/src/vnet/ip/ip_packet.h
+++ b/src/vnet/ip/ip_packet.h
@@ -156,9 +156,13 @@
   return c;
 }
 
-/* Copy data and checksum at the same time. */
-ip_csum_t ip_csum_and_memcpy (ip_csum_t sum, void *dst, void *src,
-			      uword n_bytes);
+extern ip_csum_t (*vnet_incremental_checksum_fp) (ip_csum_t, void *, uword);
+
+always_inline ip_csum_t
+ip_incremental_checksum (ip_csum_t sum, void *_data, uword n_bytes)
+{
+  return (*vnet_incremental_checksum_fp) (sum, _data, n_bytes);
+}
 
 always_inline u16
 ip_csum_and_memcpy_fold (ip_csum_t sum, void *dst)