test(e2e): Add return to sender app based test

Change-Id: Idb460369c4210a614dfba961efa3a8df8f5ee51a
Signed-off-by: E. Scott Daniels <daniels@research.att.com>

commit: f7d44570f8de6e15f768e8e2d9b6061cd0bff11f
author: E. Scott Daniels <daniels@research.att.com> Thu May 16 17:04:34 2019 +0000
committer: E. Scott Daniels <daniels@research.att.com> Thu May 16 17:05:46 2019 +0000
tree: 0cb14ad71bf6594af73061cc91682d70f6229c96
parent: a012cf63dfdad3656c995cb06c316fd208c63b98
diff --git a/test/app_test/NOTES.txt b/test/app_test/NOTES.txt
new file mode 100644
index 0000000..a3b43e9
--- /dev/null
+++ b/test/app_test/NOTES.txt

@@ -0,0 +1,118 @@
+
+In general, seeing a "PASS" from the sender(s) and receiver(s) for each execution
+is a good indication that all was successful.  Receivers will fail if the
+simple checksum calculated for the payload and trace data doesn't match. Senders
+will fail if a returned message doesn't have its matching tag (meaning it was
+returned to the wrong sender).  Both will report an error on a timeout: either no
+route information arrived, or the receiver did not receive the expected number of messages.
+
+Receivers send an 'ack' for message type 5, so for some tests the number of ack
+messages sent will not be the same as the number of messages received. Senders
+loop through message types 0-9 inclusive, unless otherwise directed on the
+command line (e.g. the rts test sends nothing but message type 5 messages so that
+all messages are ack'd).   
+
+Receivers will generate a final histogram of message types received. For example
+
+<RCVR> mtype histogram:      0      0      0      0      0 100000      0      0      0      0      0 
+
+is generated for the rts test -- all messages are type 5 and thus all other message
+type bins should be 0.
+
+By default, senders send 10 messages at a rate of about 1/sec.  Receivers give up
+after 20 seconds, so even though the rate and number of messages sent can be
+adjusted from the command line, the tests will fail if the chosen combination
+needs more than 20 seconds to send all of the messages.
+
+Specific examples
+The output is chopped to the last few lines.
+
+Return to sender test with 20 senders sending 5K messages each:
+	ksh run_rts_test.ksh -s 20 -d 180 -n 5000
+
+
+	<SNDR> [PASS] sent=5000  rcvd=4999  rts-ok=4999 failures=0 retries=4
+	<RCVR> mtype histogram:      0      0      0      0      0 100000      0      0      0      0      0 
+	<RCVR> [PASS] 100000 messages;  good=100000  acked=99983 bad=0  bad-trace=0 bad-sub_id=0
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=4
+	<SNDR> [PASS] sent=5000  rcvd=4998  rts-ok=4998 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=4998  rts-ok=4998 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=4
+	<SNDR> [PASS] sent=5000  rcvd=4998  rts-ok=4998 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=4
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=4
+	<SNDR> [PASS] sent=5000  rcvd=4999  rts-ok=4999 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=4999  rts-ok=4999 failures=0 retries=4
+	<SNDR> [PASS] sent=5000  rcvd=4999  rts-ok=4999 failures=0 retries=5
+	<SNDR> [PASS] sent=5000  rcvd=4999  rts-ok=4999 failures=0 retries=1
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=4
+	<SNDR> [PASS] sent=5000  rcvd=4997  rts-ok=4997 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=4999  rts-ok=4999 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=2
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=3
+	<SNDR> [PASS] sent=5000  rcvd=5000  rts-ok=5000 failures=0 retries=1
+	<SNDR> [PASS] sent=5000  rcvd=4998  rts-ok=4998 failures=0 retries=2
+	[PASS] sender rc=0  receiver rc=0
+
+Important notes
+	+ The receiver will only retry acks for a finite number of tries before
+	  giving up, thus the total acks sent may still be less than messages
+	  received. As a cross validation, the total acks sent by the receiver
+	  should match the sum of the rcvd counts over all senders.
+
+	+ The rcvd and rts-ok counts for each sender should match. If they don't,
+	  the sender should mark the overall state as a failure as this indicates
+	  that a return to sender message was returned to the wrong place.
+
+
+
+Multiple Receiver test
+Test run with 10 receivers and one sender sending 10K messages. The histograms
+and status messages were reorganised for easier reading here.
+
+	ksh run_multi_test.ksh  -r 10 -d 180 -n 10000
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+	<RCVR> mtype histogram:   1000   1000   1000   1000   1000   1000   1000   1000   1000   1000      0 
+
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<SNDR> [PASS] sent=10000  rcvd=10000  rts-ok=10000 failures=0 retries=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	<RCVR> [PASS] 10000 messages;  good=10000  acked=1000 bad=0  bad-trace=0 bad-sub_id=0
+	[PASS] sender rc=0  receiver rc=0
+
+Important notes:
+	+ histograms should show messages for all types, except type 10, which is never sent.
+
+	+ each receiver should ack only 1/10th of the messages it receives (type 5 only),
+	  modulo the receiver giving up on an ack retry; as before, the sum of the receivers'
+	  acked counts should match the sender's rcvd count.
+
+	+ sender should fail if the rcvd count does not match the rts-ok count, indicating
+	  that a return to sender message was sent to the wrong place (very unlikely here as
+	  there is only one sender).
+
+
+
+Retries
+The retries counter for a sender is the number of times that a retry send loop had to be
+entered in order to successfully send a message. The sender will never give up on a send
+attempt, but retrying will increase the latency of that message. A count of fewer than 10
+retries per 10000 messages is good, but it also depends on the rate that the sender is
+attempting. The higher the rate, the more likely the need to retry, and thus the higher
+this counter will be.
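
The cross validation described in the notes (the total acks sent by the receivers
should equal the sum of the rcvd counts over all senders) can be checked mechanically.
What follows is a minimal sketch and is not part of this commit: the function names
sum_field and check_ack_totals are hypothetical, and the sketch assumes that the test
output was captured to a file and that the <SNDR> and <RCVR> summary lines keep the
key=value layout shown in the examples in the notes.

```shell
# Hypothetical helpers (not provided by the test scripts).

# sum_field TAG KEY
# Read test output on stdin and print the sum of KEY=<number> over every
# line that contains TAG (e.g. TAG="<SNDR>", KEY="rcvd").
sum_field() {
	awk -v tag="$1" -v key="$2" '
		index($0, tag) {
			for (i = 1; i <= NF; i++) {
				# match only fields that begin with "KEY="
				if (index($i, key "=") == 1) {
					total += substr($i, length(key) + 2)
				}
			}
		}
		END { print total + 0 }
	'
}

# check_ack_totals FILE
# Compare the receivers' acked total with the senders' rcvd total.
# Returns 0 when they match, 1 otherwise.
check_ack_totals() {
	rcvd=$(sum_field "<SNDR>" rcvd <"$1")
	acked=$(sum_field "<RCVR>" acked <"$1")
	if [ "$rcvd" -eq "$acked" ]; then
		echo "[PASS] acked=$acked matches the senders' rcvd sum"
	else
		echo "[FAIL] acked=$acked but the senders' rcvd sum is $rcvd"
		return 1
	fi
}
```

In the return to sender example from the notes, the 20 sender rcvd counts sum to 99983,
which equals the receiver's acked=99983. If that output had been saved to a file (the
name rts.log is an assumption), the check could be run as: check_ack_totals rts.log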