Blame - docs/specs/parallelism_improvement.rst - onap/multicloud/framework

blob: 69f0fb8cf1be3deb6384cd8294b0cfa5c5ce5b46 [file] [log] [blame]

Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	1	..
				2	This work is licensed under a Creative Commons Attribution 4.0
				3	International License.
				4
				5	===============================================
				6	Parallelism improvement of Multi Cloud Services
				7	===============================================
				8
				9
				10	Problem Description
				11	===================
				12
				13	Multi-Cloud currently runs Django using Django's built-in web server.
				14	According to the Django documentation [Django_Document]_, this mode should not
				15	be used in production: it has not gone through security audits or performance
				16	tests and is intended for development only. In a test on a local computer,
				17	this mode handled only ONE API request at a time, which cannot meet the
				18	performance requirement.
				19
				20	.. [Django_Document] https://docs.djangoproject.com/en/dev/ref/django-admin/#runserver
				21
				22	Although security and scalability might improve as a side effect of
				23	resolving the performance issue, this spec focuses only on how to improve
				24	the parallelism (performance) of the current Multi-Cloud API framework.
				25
				26	Possible Solutions
				27	==================
				28
				29	Solution 1
				30	----------
				31
				32	Django is a mature framework with its own way to improve parallelism.
				33	Instead of running Django's built-in web server, a Django app can be deployed
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	34	on a dedicated web server. Django’s primary deployment platform is
				35	WSGI [django_deploy]_,
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	36	the Python standard for web servers and applications.
				37
				38	.. [django_deploy] https://docs.djangoproject.com/en/2.0/howto/deployment/wsgi/
				39
				40
				41	On the other hand, Django is very large, and it is a black box to anyone
				42	without good knowledge of it. Adding features on top of Django may be
				43	time-consuming. For example, the unit tests [unit_test]_ of Multi-Cloud can't
				44	use a regular Python test library because of Django; they have to be based on
				45	Django's test framework. When we want to improve the parallelism of Multi-Cloud
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	46	services, we need to find out how Django can implement it, instead of using
				47	some common method.
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	48
				49	.. [unit_test] https://gerrit.onap.org/r/#/c/8909/
				50
				51	Besides, Django's code pattern is too much like web UI code, and the most
				52	famous use cases of Django are web UIs. The current Multi-Cloud code puts much
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	53	logic in files named `views.py`, although there is actually no view to expose,
				54	which is confusing.
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	55
				56	The benefit of this solution is that most of the current code needs no change.
				57
				58	Solution 2
				59	----------
				60
				61	Given that Django has the shortcomings described above, this solution proposes
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	62	to use an alternative framework. Eventlet [Eventlet]_ with Pecan [Pecan]_ would be
				63	the ideal web framework in this case, because it is lightweight, lean and widely
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	64	used.
				65
				66	.. [Eventlet] http://eventlet.net/doc/modules/wsgi.html
				67
				68	.. [Pecan] https://pecan.readthedocs.io/en/latest/
				69
				70	For example, most OpenStack projects use such a framework. It is so
				71	thin that it leaves flexibility for future architecture design.
				72
				73	However, it requires changing the existing code that exposes the API.
				74
				75
				76	Performance Test Comparison
				77	===========================
				78
				79	Test Environment
				80	----------------
				81
				82	Apache Benchmark (`ab`) is used as the test tool. It is shipped with Ubuntu; if
				83	it is not installed, run `sudo apt install -y apache2-utils`.
				84
				85	Two virtual machines running Ubuntu 16.04 are hosted on a multi-core hardware
				86	server. One VM runs Apache Benchmark and has 1 CPU core and 8 GB of memory.
				87	The other VM runs Multi-Cloud and has 4 CPU cores and 6 GB of memory.
				88
				89	Test Command
				90	~~~~~~~~~~~~
				91
				92	`ab -n <num of total requests> -c <concurrency level> http://<IP:port>/api/multicloud/v0/vim_types`
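
The command above can also be assembled programmatically when several runs are scripted. The following is a minimal sketch and not part of Multi-Cloud: the helper name `build_ab_command` and the address `127.0.0.1:9001` are assumptions made for illustration only.

```python
API_PATH = "/api/multicloud/v0/vim_types"


def build_ab_command(total_requests, concurrency, host_port, path=API_PATH):
    """Return the argument list for one Apache Benchmark run.

    host_port is the "<IP:port>" placeholder from the test command.
    """
    if total_requests < 1 or concurrency < 1:
        raise ValueError("request count and concurrency must be positive")
    # ab refuses a concurrency level greater than the total number of requests.
    if concurrency > total_requests:
        raise ValueError("concurrency must not exceed total requests")
    url = "http://%s%s" % (host_port, path)
    return ["ab", "-n", str(total_requests), "-c", str(concurrency), url]


# Hypothetical address; replace it with the real Multi-Cloud endpoint.
command = build_ab_command(10000, 100, "127.0.0.1:9001")
print(" ".join(command))
```

The returned list can be passed to `subprocess.run` on a machine where `ab` is installed.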
				93
				94	Test result
				95	-----------
				96
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	97	Note that the data may vary between test runs, but the overall results
				98	are similar to those below.
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	99
				100	100 requests, concurrency level 1
				101	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				102
				103	Command: `ab -n 100 -c 1 http://<IP:port>/api/multicloud/v0/vim_types`
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	104	Result::
				105
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	106	  Django runserver: 0.512 seconds in total, all requests succeeded.
				107	  Django+uwsgi: 0.671 seconds in total, all requests succeeded.
				108	  Pecan+eventlet: 0.149 seconds in total, all requests succeeded.
				109
				110	10000 requests, concurrency level 100
				111	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				112
				113	Command: `ab -n 10000 -c 100 http://<IP:port>/api/multicloud/v0/vim_types`
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	114	Result::
				115
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	116	  Django runserver: 85.326 seconds in total, all requests succeeded.
				117	  Django+uwsgi: 3.808 seconds in total, all requests succeeded.
				118	  Pecan+eventlet: 3.181 seconds in total, all requests succeeded.
				119
				120	100000 requests, concurrency level 1000
				121	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				122
Bin Sun	6923064	2018-03-23 10:17:12 +0800	[diff] [blame]	123	Command: `ab -n 100000 -c 1000 http://<IP:port>/api/multicloud/v0/vim_types`
Ethan Lynn	b3e79cc	2018-06-05 17:26:55 +0800	[diff] [blame]	124	Result::
				125
Hong Hui Xiao	4ec7716	2018-01-17 10:15:26 +0800	[diff] [blame]	126	  Django runserver: Apache Benchmark quit because it reported a timeout after
				127	  running a random portion of all requests.
				128	  Django+uwsgi: 37.316 seconds in total, about 32% of requests failed. Some
				129	  errors indicated that too many TCP sockets were open.
				130	  Pecan+eventlet: 35.315 seconds in total, all requests succeeded.
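
To compare the runs on a common scale, the total times can be converted into throughput (requests per second). The sketch below uses only the figures reported above for the two runs in which every server completed all requests; the names `REPORTED_SECONDS` and `requests_per_second` are introduced here for illustration.

```python
# Total wall-clock seconds reported above, keyed by (requests, concurrency).
REPORTED_SECONDS = {
    (100, 1): {
        "Django runserver": 0.512,
        "Django+uwsgi": 0.671,
        "Pecan+eventlet": 0.149,
    },
    (10000, 100): {
        "Django runserver": 85.326,
        "Django+uwsgi": 3.808,
        "Pecan+eventlet": 3.181,
    },
}


def requests_per_second(total_requests, total_seconds):
    """Average throughput of one benchmark run."""
    return total_requests / total_seconds


for (total, concurrency), timings in sorted(REPORTED_SECONDS.items()):
    print("%d requests, concurrency level %d" % (total, concurrency))
    for server, seconds in timings.items():
        rate = requests_per_second(total, seconds)
        print("  %-17s %8.1f requests/s" % (server, rate))
```

At concurrency level 100 this gives roughly 117 requests per second for Django runserver, against about 2,600 for Django+uwsgi and about 3,100 for Pecan+eventlet.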
				131
				132	Proposed Change
				133	===============
				134
				135	Given the test results above, this spec proposes to use Solution 2. Because of
				136	the work on Elastic API exposure [jira_workitem]_, Multi-Cloud will
				137	provide a new way to expose its API. That is, the existing API-exposing
				138	code needs to be rewritten in [jira_workitem]_ anyway, so the disadvantage of
				139	Solution 2 no longer applies.
				140
				141	.. [jira_workitem] https://jira.onap.org/browse/MULTICLOUD-152
				142
				143	To give this spec a clear scope, VoLTE is the use case that will be used
				144	to test it. All functionality that VoLTE needs should be
				145	implemented in this spec and [jira_workitem]_.
				146
				147	Backward compatibility
				148	----------------------
				149
				150	This spec will NOT change the current API. It will NOT replace the current
				151	API framework in R2, nor switch to the new API framework in R2. Instead,
				152	this spec will provide a configuration option, named `web_framework`, to make
				153	sure that use cases and functionality are not broken. The default value of the
				154	option will be `django`, which will still run the current Django API
				155	framework. An alternative value is `pecan`, which will run the API framework
				156	proposed in this spec. Users who don't care about the change won't be
				157	affected.
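
The selection logic for such an option can be sketched as follows. This is an illustration under assumptions, not the actual Multi-Cloud configuration code: the INI layout, the `[DEFAULT]` section and the function name `select_web_framework` are hypothetical. Only the option name `web_framework` and its values `django` and `pecan` come from this spec.

```python
import configparser

SUPPORTED_FRAMEWORKS = ("django", "pecan")
DEFAULT_FRAMEWORK = "django"  # keeps the current behaviour when unset


def select_web_framework(config_text):
    """Return the framework named by `web_framework` in INI-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Fall back to Django when the option is absent.
    value = parser.get("DEFAULT", "web_framework", fallback=DEFAULT_FRAMEWORK)
    value = value.strip().lower()
    if value not in SUPPORTED_FRAMEWORKS:
        raise ValueError("unsupported web_framework: %r" % value)
    return value


print(select_web_framework(""))
print(select_web_framework("[DEFAULT]\nweb_framework = pecan\n"))
```

Rejecting unknown values at startup avoids silently running a framework the operator did not ask for.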
				158
				159	WSGI Server
				160	-----------
				161
				162	Whichever API framework is used, a WSGI server needs to be provided.
				163	This spec will use the Eventlet WSGI server. The API framework will run as an
				164	application in the WSGI server.
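
The split between server and application can be illustrated with the standard library alone. The sketch below deliberately uses `wsgiref.simple_server` as a stand-in for the Eventlet WSGI server so that it runs without extra dependencies; `api_app` and `fetch_once` are hypothetical names, and the application is a placeholder rather than the Pecan application proposed here.

```python
import http.client
import json
import threading
from wsgiref.simple_server import WSGIRequestHandler, make_server


def api_app(environ, start_response):
    """Placeholder WSGI application: echoes the requested path as JSON."""
    body = json.dumps({"path": environ.get("PATH_INFO", "")}).encode("utf-8")
    headers = [("Content-Type", "application/json"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    return [body]


class QuietHandler(WSGIRequestHandler):
    def log_message(self, format, *args):
        pass  # suppress per-request logging in this sketch


def fetch_once(app, path):
    """Serve exactly one request for `app` and return (status, JSON body)."""
    server = make_server("127.0.0.1", 0, app, handler_class=QuietHandler)
    server.timeout = 5  # do not wait forever if no request arrives
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port,
                                          timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        status, data = response.status, response.read()
        conn.close()
        return status, json.loads(data.decode("utf-8"))
    finally:
        worker.join()
        server.server_close()


print(fetch_once(api_app, "/api/multicloud/v0/vim_types"))
```

Because the application is just a callable, the same `api_app` could be handed to a different WSGI server without modification, which is the property this section relies on.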
				165
				166	Multi processes framework
				167	-------------------------
				168
				169	This spec proposes to run the Multi-Cloud API server in multi-process mode.
				170	Multiple processes provide parallel API handlers, so when multiple API
				171	requests come to Multi-Cloud, they can be handled simultaneously. In addition,
				172	separate processes effectively isolate API requests from each other, so
				173	that one API request will not affect another.
				174
				175	Managing multiple processes can be overwhelmingly difficult and sometimes
				176	dangerous. A mature library can be used to reduce the related work here, for
				177	example oslo.service [oslo_service]_. Since oslo has been used by all OpenStack
				178	projects for many releases, and the oslo project is actively updated, it can be
				179	seen as a stable library.
				180
				181	.. [oslo_service] https://github.com/openstack/oslo.service
				182
				183	Number of processes
				184	~~~~~~~~~~~~~~~~~~~
				185
				186	To make the best use of multi-core CPUs, the number of processes will be set
				187	to the number of CPU cores by default.
				188
				189	Shared socket file
				190	~~~~~~~~~~~~~~~~~~
				191
				192	To make the processes work together behind a single port number,
				193	they need to share a socket file. To achieve this, a bootstrap
				194	process will be started and will initialize the socket file. The other
				195	processes can then be forked from this bootstrap process.
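
The bootstrap-and-fork scheme can be sketched with the standard library as follows. This is a simplified illustration for POSIX systems, not the oslo.service implementation: all function names are hypothetical, and each worker handles a fixed number of requests (`max_requests`) only so that the demonstration terminates, whereas a real worker would loop until stopped.

```python
import os
import signal
import socket


def default_worker_count():
    """One worker per CPU core, as proposed above."""
    return os.cpu_count() or 1


def create_listen_socket(host="127.0.0.1", port=0, backlog=128):
    """Bootstrap step: open the listening socket before any fork."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen(backlog)
    return sock


def serve_requests(sock, max_requests):
    """Worker loop: accept connections on the inherited socket."""
    for _ in range(max_requests):
        conn, _addr = sock.accept()
        with conn:
            conn.recv(1024)  # read (and ignore) the request
            body = ("handled by pid %d" % os.getpid()).encode("ascii")
            conn.sendall(b"HTTP/1.0 200 OK\r\nContent-Length: %d\r\n\r\n%s"
                         % (len(body), body))


def spawn_workers(sock, workers, max_requests):
    """Fork workers that all share the bootstrap process's socket."""
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:  # child process
            try:
                serve_requests(sock, max_requests)
            finally:
                os._exit(0)
        pids.append(pid)
    return pids


def run_demo(workers=2):
    """Send one request per worker through the single shared port."""
    sock = create_listen_socket()
    port = sock.getsockname()[1]
    pids = spawn_workers(sock, workers, max_requests=1)
    replies = []
    try:
        for _ in range(workers):
            with socket.create_connection(("127.0.0.1", port),
                                          timeout=5) as client:
                client.sendall(b"GET / HTTP/1.0\r\n\r\n")
                chunks = []
                while True:
                    data = client.recv(4096)
                    if not data:
                        break
                    chunks.append(data)
                replies.append(b"".join(chunks))
    finally:
        for pid in pids:
            try:
                os.kill(pid, signal.SIGTERM)  # stop workers still waiting
            except ProcessLookupError:
                pass
            os.waitpid(pid, 0)
        sock.close()
    return replies
```

Calling `run_demo(workers=default_worker_count())` starts one worker per core. Every worker accepts on the same inherited socket, so clients see a single port while the kernel distributes incoming connections among the processes.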
				196
				197	Work Items
				198	==========
				199
				200	#. Add WSGI server.
				201	#. Run Pecan application in WSGI server.
				202	#. Add multiple processes support.
				203	#. Update deploy script to support new API framework.
				204