..
 This work is licensed under a Creative Commons Attribution 4.0
 International License.

===============================================
Parallelism improvement of Multi Cloud Services
===============================================


Problem Description
===================

Multi-Cloud currently serves its API with Django's built-in web server.
According to the Django documentation [Django_Document]_, this mode has not
gone through security audits or performance tests and should only be used in
development, never in production. In a test on a local machine, this mode
handled only one API request at a time, which cannot meet the performance
requirement.

.. [Django_Document] https://docs.djangoproject.com/en/dev/ref/django-admin/#runserver

Although security and scalability might improve as a side effect of resolving
the performance issue, this spec focuses only on improving the parallelism
(performance) of the current Multi-Cloud API framework.

Possible Solutions
==================

Solution 1
----------

Django is a mature framework and has its own way to improve parallelism.
Instead of running Django's built-in web server, the Django application can be
deployed in a dedicated web server. Django's primary deployment platform is
WSGI [django_deploy]_, the Python standard for web servers and applications.

.. [django_deploy] https://docs.djangoproject.com/en/2.0/howto/deployment/wsgi/


On the other hand, Django is very large, and it is a black box to anyone who
does not know it well. Adding features on top of Django may be time-consuming.
For example, the unit tests [unit_test]_ of Multi-Cloud cannot use a regular
Python test library because of Django; they have to be based on Django's test
framework. Likewise, to improve the parallelism of Multi-Cloud services, we
need to find out how Django can implement it instead of using a common method.

.. [unit_test] https://gerrit.onap.org/r/#/c/8909/

Besides, Django's code pattern is geared toward web pages, and its best-known
use cases are web UIs. The current Multi-Cloud code puts much of its logic in
files named `views.py`, although there is no view to expose, which is
confusing.

The benefit of this solution is that most of the current code needs no change.

Solution 2
----------

Given the shortcomings of Django described above, this solution proposes to
use an alternative framework. Eventlet [Eventlet]_ with Pecan [Pecan]_ would
be the ideal web framework in this case, because it is lightweight, lean and
widely used.

.. [Eventlet] http://eventlet.net/doc/modules/wsgi.html

.. [Pecan] https://pecan.readthedocs.io/en/latest/

For example, most OpenStack projects use this combination. The framework is so
thin that it leaves flexibility for future architecture design.

However, the existing API-exposing code needs to change.


Performance Test Comparison
===========================

Test Environment
----------------

Apache Benchmark (`ab`) is used as the test tool. It is shipped with Ubuntu; if
you don't find it, run `sudo apt install -y apache2-utils`.

Two virtual machines running Ubuntu 16.04 are used, both hosted on a
multi-core hardware server. One VM runs Apache Benchmark and has 1 CPU core
and 8G of memory. The other VM runs Multi-Cloud and has 4 CPU cores and 6G of
memory.

Test Command
~~~~~~~~~~~~

`ab -n <num of total requests> -c <concurrency level> http://<IP:port>/api/multicloud/v0/vim_types`

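The test runs in this spec differ only in the total request count and the
concurrency level. As a minimal sketch, the helper below builds the `ab`
argument list for each pair. The function name `ab_command` and the endpoint
`127.0.0.1:9001` are placeholders chosen for illustration, not values from
this spec.

```python
import shlex

# Hypothetical endpoint; replace with the real <IP:port> of the Multi-Cloud VM.
ENDPOINT = "127.0.0.1:9001"

# (total requests, concurrency level) pairs used in this spec.
TEST_MATRIX = [(100, 1), (10000, 100), (100000, 1000)]


def ab_command(total_requests, concurrency, endpoint=ENDPOINT):
    """Return the Apache Benchmark argument list for one test run."""
    url = "http://{}/api/multicloud/v0/vim_types".format(endpoint)
    return ["ab", "-n", str(total_requests), "-c", str(concurrency), url]


if __name__ == "__main__":
    for total, level in TEST_MATRIX:
        # Print a copy-pasteable command line; pass the list to
        # subprocess.run() instead to actually execute it.
        print(shlex.join(ab_command(total, level)))
```

Returning a list rather than a single string keeps the command safe to hand
to `subprocess.run` without shell quoting.
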
Test result
-----------

Note that the data may vary between test runs, but the overall result is
similar to the figures below.

100 requests, concurrency level 1
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Command: `ab -n 100 -c 1 http://<IP:port>/api/multicloud/v0/vim_types`

Result::

 Django runserver: takes 0.512 seconds in total, all requests succeed.
 Django+uwsgi: takes 0.671 seconds in total, all requests succeed.
 Pecan+eventlet: takes 0.149 seconds in total, all requests succeed.

10000 requests, concurrency level 100
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Command: `ab -n 10000 -c 100 http://<IP:port>/api/multicloud/v0/vim_types`

Result::

 Django runserver: takes 85.326 seconds in total, all requests succeed.
 Django+uwsgi: takes 3.808 seconds in total, all requests succeed.
 Pecan+eventlet: takes 3.181 seconds in total, all requests succeed.

100000 requests, concurrency level 1000
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Command: `ab -n 100000 -c 1000 http://<IP:port>/api/multicloud/v0/vim_types`

Result::

 Django runserver: Apache Benchmark quits, reporting a timeout after running
 an arbitrary portion of the requests.
 Django+uwsgi: takes 37.316 seconds in total, about 32% of requests fail. Some
 errors report that too many TCP sockets are open.
 Pecan+eventlet: takes 35.315 seconds in total, all requests succeed.

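The timings above are easier to compare as throughput, that is, the number of
requests divided by the total time. The sketch below does that arithmetic for
the runs in which every request succeeded. The names `RESULTS`, `throughput`
and `throughput_table` are illustrative; the numbers are copied from the
results above.

```python
# Total time in seconds per run, copied from the results above. Runs that
# timed out or had failing requests are left out.
RESULTS = {
    100: {"Django runserver": 0.512, "Django+uwsgi": 0.671, "Pecan+eventlet": 0.149},
    10000: {"Django runserver": 85.326, "Django+uwsgi": 3.808, "Pecan+eventlet": 3.181},
    100000: {"Pecan+eventlet": 35.315},
}


def throughput(requests, seconds):
    """Average number of completed requests per second for one run."""
    return requests / seconds


def throughput_table(results):
    """Map each request count to {server: requests per second}."""
    return {
        requests: {
            server: throughput(requests, seconds)
            for server, seconds in timings.items()
        }
        for requests, timings in results.items()
    }


if __name__ == "__main__":
    for requests, row in throughput_table(RESULTS).items():
        for server, rate in row.items():
            print("{:>6} requests  {:<17} {:8.1f} req/s".format(requests, server, rate))
```

For the 10000-request run this gives roughly 117 requests per second for
Django runserver against roughly 3144 for Pecan+eventlet, a factor of about
27.
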
Proposed Change
===============

Given the test results above, this spec proposes to use Solution 2. As part of
the Elastic API exposure work [jira_workitem]_, Multi-Cloud will provide a new
way to expose its API. That is, the existing API-exposing code needs to be
rewritten under [jira_workitem]_ anyway, so the disadvantage of Solution 2 no
longer applies.

.. [jira_workitem] https://jira.onap.org/browse/MULTICLOUD-152

To define a clear scope for this spec, VoLTE is the use case that will be used
to test it. All functionality that VoLTE needs should be implemented by this
spec and [jira_workitem]_.

Backward compatibility
----------------------

This spec will NOT change the current API. It will NOT replace the current API
framework in R2, nor switch to the new API framework in R2. Instead, it will
provide a configuration option named `web_framework` to make sure that
existing use cases and functionality are not broken. The default value of the
option will be `django`, which keeps running the current Django API framework.
The alternative value is `pecan`, which runs the API framework proposed in
this spec. Users who do not care about the change will therefore not be
affected.

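A minimal sketch of how the `web_framework` option could be resolved is shown
below. It assumes the configuration is available as a plain dictionary; the
spec does not say where the option is stored, and the function name
`select_framework` is hypothetical.

```python
# Values of the `web_framework` option named in this spec.
DEFAULT_FRAMEWORK = "django"
SUPPORTED_FRAMEWORKS = ("django", "pecan")


def select_framework(config):
    """Return the framework to launch, defaulting to the current Django one."""
    value = config.get("web_framework", DEFAULT_FRAMEWORK).strip().lower()
    if value not in SUPPORTED_FRAMEWORKS:
        raise ValueError(
            "web_framework must be one of {}, got {!r}".format(
                SUPPORTED_FRAMEWORKS, value
            )
        )
    return value
```

Because a missing option resolves to `django`, deployments that never set it
keep the current behaviour. Rejecting unknown values is a design choice of
this sketch, not a requirement stated in the spec.
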
WSGI Server
-----------

No matter which API framework is used, a WSGI server needs to be provided.
This spec will use the Eventlet WSGI server. The API framework will run as an
application in the WSGI server.

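The separation between WSGI server and application can be illustrated with the
standard library alone. In the sketch below, `wsgiref` stands in for the
Eventlet WSGI server (with Eventlet, `eventlet.wsgi.server` would do the
hosting instead). The `application` callable, its JSON body and the port are
assumptions made for illustration.

```python
import json
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A framework-agnostic WSGI application that echoes the request path."""
    body = json.dumps({"path": environ.get("PATH_INFO", "/")}).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(app, host="127.0.0.1", port=8000):
    """Host any WSGI application; the server does not care which framework
    produced the callable."""
    with make_server(host, port, app) as httpd:
        httpd.serve_forever()
```

Calling `serve(application)` starts the server. Any WSGI callable, including
one produced by a web framework, could be passed in its place without changing
`serve`.
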
Multi processes framework
-------------------------

This spec proposes to run the Multi-Cloud API server in multi-process mode.
Multiple processes provide parallel API handlers, so when several API requests
reach Multi-Cloud, they can be handled simultaneously. In addition, separate
processes effectively isolate API requests from each other, so that one API
request will not affect another.

Managing multiple processes can be overwhelmingly difficult and sometimes
dangerous. A mature library, for example oslo.service [oslo_service]_, could
be used to reduce the related work. Since oslo has been used by all OpenStack
projects for many releases and the oslo project is actively updated, it can be
regarded as a stable library.

.. [oslo_service] https://github.com/openstack/oslo.service

Number of processes
~~~~~~~~~~~~~~~~~~~

To make the best use of multi-core CPUs, the number of processes will be set
to the number of CPU cores by default.

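The default described above could be computed as in the following sketch. The
function name `default_worker_count` and its optional override argument are
hypothetical; the spec only states the default.

```python
import os


def default_worker_count(configured=None):
    """Return the configured process count, or one process per CPU core."""
    if configured is not None:
        if configured < 1:
            raise ValueError("number of processes must be at least 1")
        return configured
    # os.cpu_count() may return None when the count cannot be determined.
    return os.cpu_count() or 1
```

On the 4-core Multi-Cloud VM of the test environment this would start four
API processes unless a different count is passed in.
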
Shared socket file
~~~~~~~~~~~~~~~~~~

To make multiple processes work together behind a single port number, the
processes need to share a socket file. To achieve this, a bootstrap process
will be started to initialize the socket file, and the other processes will be
forked from this bootstrap process.

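The bootstrap-then-fork pattern described above can be sketched with bare
sockets. This is only an illustration under stated assumptions: it requires a
POSIX system where `os.fork` is available, all function names are
hypothetical, and a library such as oslo.service would normally handle the
process management.

```python
import os
import socket


def bootstrap_listener(host="127.0.0.1", port=0):
    """Bootstrap step: create, bind and listen on the socket exactly once."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))  # port 0 lets the OS pick a free port
    sock.listen(128)
    return sock


def serve_connections(sock, max_requests):
    """Worker body: accept on the inherited socket and reply with our PID."""
    for _ in range(max_requests):
        conn, _addr = sock.accept()
        with conn:
            conn.recv(1024)
            conn.sendall(str(os.getpid()).encode("ascii"))


def fork_workers(sock, count, max_requests=1):
    """Fork `count` workers that all inherit the same listening socket."""
    pids = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:  # child process
            try:
                serve_connections(sock, max_requests)
            finally:
                os._exit(0)  # never fall back into the parent's code path
        pids.append(pid)
    return pids


def ask_for_worker_pid(port, host="127.0.0.1"):
    """Client helper: connect to the shared port and read the worker's PID."""
    with socket.create_connection((host, port), timeout=5) as client:
        client.sendall(b"ping")
        return int(client.recv(64))
```

Every worker accepts connections on the one socket created by the bootstrap
process, so clients see a single port while the kernel distributes incoming
connections among the workers. The parent should reap the workers with
`os.waitpid` when they exit.
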
Work Items
==========

#. Add a WSGI server.
#. Run the Pecan application in the WSGI server.
#. Add multi-process support.
#. Update the deploy script to support the new API framework.