.. This work is licensed under a Creative Commons Attribution 4.0
   International License. http://creativecommons.org/licenses/by/4.0

.. _docs_Datalake_Handler_MS:

Architecture
------------


Background
~~~~~~~~~~
A large amount of data flows among ONAP components, mostly via DMaaP and web services.
For example, all events and feeds collected by DCAE collectors go through DMaaP.
DMaaP is backed by Kafka, a publish-subscribe system
in which data is not meant to be permanent and is deleted after a certain retention period.
Kafka is not a database, which means that the data it holds is not meant to be queried.
Although some components may store processed results in their local databases, most of the raw data will eventually be lost.
We should provide a systematic way to store this raw data, and even the processed results,
so that it can serve as the source for data analytics and machine learning, providing insight into network operation.

Relations with Other ONAP Components
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The architecture below depicts the DataLake MS as a part of ONAP. Only the relevant interactions and components are shown.

.. image:: ./arch.PNG

Note that not all data storage systems in the picture are supported. In R6, the following storage systems are supported:

- MongoDB
- Couchbase
- Elasticsearch and Kibana
- HDFS

Depending on demand, new systems may be added to the supported list. In the following, we use the term database for any of these storage systems,
even though HDFS is a file system (with simple settings, however, it can be treated as a database, e.g. via Hive).

Note that once the data is stored in the databases, other ONAP components and systems query the data directly from the databases,
without interacting with DataLake Handler.

Description
~~~~~~~~~~~
DataLake Handler's main function is to monitor and persist the data flowing through DMaaP and to provide a query API for other components or external services. The databases are outside of ONAP scope,
since the data is expected to be huge, and a database may be a complicated cluster consisting of thousands of nodes.

Admin UI
~~~~~~~~
A system administrator uses DataLake Admin UI to:

- Configure external database connections, such as host, port, and login.
- Configure which Topics to monitor and which databases store the data for each Topic.
- Pre-configure dashboards and templates for third-party tools.

This UI tool is used to manage all the DataLake settings stored in PostgreSQL. Here is the database schema:

.. image:: ./dbschema.PNG

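As a rough illustration of the kind of settings listed above, the following sketch models database connections and Topic-to-database mappings as plain Python data classes. It is not the actual schema, which is the one shown in the image above, and the real DataLake components are not written in Python. All names here (``DbConnection``, ``TopicSetting``, ``databases_for``) and all example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class DbConnection:
    """Hypothetical record for one external database connection."""
    name: str
    host: str
    port: int
    login: str


@dataclass
class TopicSetting:
    """Hypothetical record: one monitored Topic and the databases that store it."""
    topic: str
    enabled: bool = True
    db_names: List[str] = field(default_factory=list)


def databases_for(topic: str, topics: List[TopicSetting],
                  dbs: Dict[str, DbConnection]) -> List[DbConnection]:
    """Return the connections that should receive data for an enabled Topic."""
    for setting in topics:
        if setting.topic == topic and setting.enabled:
            return [dbs[name] for name in setting.db_names if name in dbs]
    return []


# Example values are made up for illustration only.
dbs = {
    "mongo": DbConnection("mongo", "mongo.example.org", 27017, "dl_user"),
    "es": DbConnection("es", "es.example.org", 9200, "dl_user"),
}
topics = [
    TopicSetting("topic-a", db_names=["mongo", "es"]),
    TopicSetting("topic-b", enabled=False, db_names=["mongo"]),
]
targets = databases_for("topic-a", topics, dbs)
```

In this sketch a disabled or unknown Topic maps to no databases, so nothing would be stored for it.
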
Feeder
~~~~~~
Architecture

.. image:: ./feeder-arch.PNG

Features

- Read data directly from Kafka for performance.
- Support for pluggable databases. To add a new database, we only need to implement its corresponding service.
- Support a REST API for inter-component communication. Besides managing DataLake settings in PostgreSQL, Admin UI also uses this API to start and stop Feeder and to query Feeder status and statistics.
- Use PostgreSQL to store settings.
- Support data processing features. Before data is persisted, it can be massaged in Feeder. Currently two features are implemented: Correlate Cleared Message (in org.onap.datalake.feeder.service.db.ElasticsearchService) and Flatten JSON Array (in org.onap.datalake.feeder.service.StoreService).
- Connections to Kafka and the databases are secured.

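The pluggable-database idea in the feature list above can be sketched as follows. This is a minimal Python illustration, not the Feeder's actual Java code: the ``DbService`` interface, the ``InMemoryDbService`` class, the ``REGISTRY`` mapping, and the ``store`` function are all hypothetical names introduced here. The only point is that supporting a new database amounts to adding one class that implements a common interface.

```python
import json
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class DbService(ABC):
    """Hypothetical common interface that every database service implements."""

    @abstractmethod
    def save(self, topic: str, documents: List[dict]) -> int:
        """Persist documents for a Topic and return how many were stored."""


# Registry mapping a database type name to its service class (an assumption).
REGISTRY: Dict[str, Type[DbService]] = {}


def register(db_type: str):
    """Class decorator that plugs a service class into the registry."""
    def wrap(cls: Type[DbService]) -> Type[DbService]:
        REGISTRY[db_type] = cls
        return cls
    return wrap


@register("memory")
class InMemoryDbService(DbService):
    """Stand-in for a real database service; keeps documents in a dict."""

    def __init__(self) -> None:
        self.rows: Dict[str, List[dict]] = {}

    def save(self, topic: str, documents: List[dict]) -> int:
        self.rows.setdefault(topic, []).extend(documents)
        return len(documents)


def store(topic: str, messages: List[str], services: List[DbService]) -> int:
    """Parse raw JSON messages once, then hand them to every configured service."""
    documents = [json.loads(m) for m in messages]
    return sum(service.save(topic, documents) for service in services)


service = REGISTRY["memory"]()
stored = store("topic-a", ['{"id": 1}', '{"id": 2}'], [service])
```

Under these assumptions, the dispatch code in ``store`` never changes when a database is added; only a new registered class is needed.
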
DES
~~~
Architecture

.. image:: ./des-arch.PNG

Features

- Provide a data query API for other components to consume.
- Integrate with Presto to query data via SQL templates.

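To make the SQL template idea concrete, here is a small Python sketch of filling a template with request parameters before the query is handed to a query engine. The actual template syntax used by DES is not described here, so the ``${name}`` placeholders, the ``events`` table, its columns, and the helper names are all assumptions. Sending the query to Presto is deliberately left out.

```python
from string import Template

# Hypothetical template; the table and column names are made up.
QUERY_TEMPLATE = Template(
    "SELECT * FROM events WHERE source = ${source} AND severity >= ${min_severity}"
)


def to_sql_literal(value) -> str:
    """Render a Python value as a SQL literal, escaping quotes in strings."""
    if isinstance(value, bool):
        raise TypeError("booleans are not supported in this sketch")
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    raise TypeError("unsupported parameter type: " + type(value).__name__)


def render(template: Template, params: dict) -> str:
    """Fill every placeholder; a missing parameter raises KeyError."""
    return template.substitute({k: to_sql_literal(v) for k, v in params.items()})


sql = render(QUERY_TEMPLATE, {"source": "vnf-1", "min_severity": 3})
```

Escaping literals by hand is shown only to keep the sketch self-contained. A production service would normally rely on the parameter binding of its database client instead.
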
Links
~~~~~
- DataLake Development Environment Setup: https://wiki.onap.org/display/DW/DataLake+Development+Environment+Setup
- DES description and deployment steps: https://wiki.onap.org/display/DW/DES
- Source Code: https://gerrit.onap.org/r/gitweb?p=dcaegen2/services.git;a=tree;f=components/datalake-handler;hb=HEAD