Implement MongoDBArtifactStore #4963

Merged
36 changes: 36 additions & 0 deletions ansible/README.md
@@ -196,6 +196,42 @@ ansible-playbook -i environments/$ENVIRONMENT routemgmt.yml
- To use the API Gateway, you'll need to run `apigateway.yml` and `routemgmt.yml`.
- Use `ansible-playbook -i environments/$ENVIRONMENT openwhisk.yml` to avoid wiping the data store. This is useful to start OpenWhisk after restarting your Operating System.

### Deploying Using MongoDB

You can choose MongoDB instead of CouchDB as the database backend to store entities and activations.
Member

Seems this PR only includes ArtifactStore, should we remove "activations" from the sentence?

Contributor Author

but an ArtifactStore can also store activations

Member

I think it would be better to mention activations when we introduce the related changes for ActivationStore (ArtifactStore[Activation])?

Contributor Author

ok


- Deploy a MongoDB server (optional, for testing and development only; use an external MongoDB server in production)

```
ansible-playbook -i environments/<environment> mongodb.yml -e mongodb_data_volume="/tmp/mongo-data"
```

- Then execute

```
cd <openwhisk_home>
./gradlew distDocker
cd ansible
ansible-playbook -i environments/<environment> initMongoDB.yml -e mongodb_connect_string="mongodb://172.17.0.1:27017"
Member

What is the relation with db_local.ini?

Getting this when deploying this change.

TASK [prepare db_local.ini] *********************************************************************************************************************************************************
Monday 24 May 2021  15:07:04 +0900 (0:00:00.883)       0:00:02.373 ************
fatal: [localhost -> localhost]: FAILED! => {"changed": false, "msg": "AnsibleUndefinedVariable: 'dict object' has no attribute 'db'"}

AnsibleUndefinedVariable: 'dict object' has no attribute 'db'

PLAY RECAP **************************************************************************************************************************************************************************
localhost                  : ok=2    changed=0    unreachable=0    failed=1

Member

It seems pymongo needs to be installed.
If so, that should be documented somewhere.

TASK [create necessary auth keys] ******************************************************************************************************************************************************************************************************************************************************************************************************************************************
Monday 24 May 2021  15:57:29 +0900 (0:00:01.872)       0:00:02.229 ************
failed: [ansible] (item=guest) => {"changed": false, "item": "guest", "msg": "the python pymongo module is required"}
failed: [ansible] (item=whisk.system) => {"changed": false, "item": "whisk.system", "msg": "the python pymongo module is required"}

All items completed

PLAY RECAP *****************************************************************************************************************************************************************************************************************************************************************************************************************************************************************
ansible                    : ok=1    changed=0    unreachable=0    failed=1

Member

Not quite sure the reason yet but I am unable to deploy this change with this line.

scheduler:
  dataManagementService:
    retryInterval: "{{ scheduler_dataManagementService_retryInterval | default(1 second) }}"
TASK [kafka : add kafka default env vars] **********************************************************************************************************************************************************************************************************************************************************************************************************************************
Monday 24 May 2021  16:07:39 +0900 (0:00:00.054)       0:00:13.372 ************
fatal: [kafka0]: FAILED! => {"msg": "An unhandled exception occurred while templating '{% set ret = [] %}{% for host in groups['zookeepers'] %}{{ ret.append( hostvars[host].ansible_host + ':' + ((zookeeper.port+loop.index-1)|string) ) }}{% endfor %}{{ ret | join(',') }}'. Error was a <class 'ansible.errors.AnsibleError'>, original message: template error while templating string: expected token ',', got 'second'. String: {{ scheduler_dataManagementService_retryInterval | default(1 second) }}"}

An unhandled exception occurred while templating '{% set ret = [] %}{% for
host in groups['zookeepers'] %}{{ ret.append( hostvars[host].ansible_host + ':'
+ ((zookeeper.port+loop.index-1)|string) ) }}{% endfor %}{{ ret | join(',') }}'.
Error was a <class 'ansible.errors.AnsibleError'>, original message: template
error while templating string: expected token ',', got 'second'. String: {{
scheduler_dataManagementService_retryInterval | default(1 second) }}

Contributor Author

@jiangpengcheng jiangpengcheng May 24, 2021

I will remove db_local.ini relation
seems group_vars/all needs a db_local.ini

Member

AFAIK, db_local.ini is being used to specify the target DB, can we apply the same for MongoDB as well?

Contributor Author

For Ansible 2.5.2 we need to put quotes around the string value, like default('1 second')

Member

oh ok, then that should be updated as we guide users to use ansible=2.5.2.
https://github.com/apache/openwhisk/tree/master/ansible

Contributor Author

@jiangpengcheng jiangpengcheng May 25, 2021

ok, I will create a new PR to fix this
it's already fixed

Member

We could move to a more modern version of ansible (in separate PRs of course). We don't have to stay on 2.5.2 forever.

Contributor

@ddragosd ddragosd Jun 11, 2021

> We could move to a more modern version of ansible (in separate PRs of course). We don't have to stay on 2.5.2 forever.

I'm installing OW on a clean new machine and I'm updating the versions that worked.

ansible-playbook -i environments/<environment> apigateway.yml -e mongodb_connect_string="mongodb://172.17.0.1:27017"
ansible-playbook -i environments/<environment> openwhisk.yml -e mongodb_connect_string="mongodb://172.17.0.1:27017" -e database_backend="MongoDB"

# installs a catalog of public packages and actions
ansible-playbook -i environments/<environment> postdeploy.yml

# to use the API gateway
ansible-playbook -i environments/<environment> apigateway.yml
ansible-playbook -i environments/<environment> routemgmt.yml
```

The available Ansible parameters are:
```
mongodb:
  connect_string: "{{ mongodb_connect_string }}"
  database: "{{ mongodb_database | default('whisks') }}"
  data_volume: "{{ mongodb_data_volume | default('mongo-data') }}"
```
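
Before running the playbooks, it can help to confirm that the value passed as `mongodb_connect_string` names the host and port you expect. The following is a minimal sketch using only the Python standard library. The helper name `describe_connect_string` is hypothetical and not part of OpenWhisk, and falling back to port 27017 assumes MongoDB's default port when the URI omits one.

```python
from urllib.parse import urlsplit

DEFAULT_MONGODB_PORT = 27017  # MongoDB's default port, assumed when the URI omits one


def describe_connect_string(connect_string):
    """Return (host, port) for a simple single-host mongodb:// URI.

    Hypothetical helper for illustration only; it does not handle
    replica-set URIs that list several comma-separated hosts.
    """
    parts = urlsplit(connect_string)
    if parts.scheme != "mongodb":
        raise ValueError("expected a mongodb:// URI, got %r" % connect_string)
    if not parts.hostname:
        raise ValueError("no host found in %r" % connect_string)
    return parts.hostname, parts.port or DEFAULT_MONGODB_PORT


if __name__ == "__main__":
    # The address used in the README commands above.
    print(describe_connect_string("mongodb://172.17.0.1:27017"))
```

A check like this only validates the shape of the URI; it does not prove that a MongoDB server is actually listening at that address.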

### Using ElasticSearch to Store Activations

You can use ElasticSearch (ES) to store activations separately while other entities remain stored in CouchDB. There is an Ansible playbook to set up a simple ES cluster for testing and development purposes.
11 changes: 11 additions & 0 deletions ansible/group_vars/all
@@ -256,6 +256,7 @@ db:
  port: "{{ db_port | default(lookup('ini', 'db_port section=db_creds file={{ playbook_dir }}/db_local.ini')) }}"
  host: "{{ db_host | default(lookup('ini', 'db_host section=db_creds file={{ playbook_dir }}/db_local.ini')) }}"
  persist_path: "{{ db_persist_path | default(false) }}"
  backend: "{{ database_backend | default('CouchDB') }}"
  instances: "{{ groups['db'] | length }}"
  authkeys:
  - guest
@@ -295,6 +296,10 @@ db:
      admin:
        username: "{{ elastic_username | default('admin') }}"
        password: "{{ elastic_password | default('admin') }}"
  mongodb:
    connect_string: "{{ mongodb_connect_string | default('mongodb://172.17.0.1:27017') }}"
    database: "{{ mongodb_database | default('whisks') }}"
    data_volume: "{{ mongodb_data_volume | default('mongo-data') }}"

apigateway:
  port:
@@ -322,6 +327,12 @@ elasticsearch_connect_string: "{% set ret = [] %}\
  {{ ret.append( hostvars[host].ansible_host + ':' + ((db.elasticsearch.port+loop.index-1)|string) ) }}\
  {% endfor %}\
  {{ ret | join(',') }}"
mongodb:
  version: 4.4.0
  commonEnv:
    CONFIG_whisk_mongodb_uri: "{{ db.mongodb.connect_string }}"
    CONFIG_whisk_mongodb_database: "{{ db.mongodb.database }}"
    CONFIG_whisk_spi_ArtifactStoreProvider: "org.apache.openwhisk.core.database.mongodb.MongoDBArtifactStoreProvider"
Member

Do we need to follow the convention just like the others?

spi: "{{ userLogs_spi | default('org.apache.openwhisk.core.containerpool.logging.DockerToActivationLogStoreProvider') }}"

Contributor Author

I think we can follow elasticsearch activation store style?

- name: setup elasticsearch activation store env
  set_fact:
    elastic_env:
      "CONFIG_whisk_activationStore_elasticsearch_protocol": "{{ db.elasticsearch.protocol}}"
      "CONFIG_whisk_activationStore_elasticsearch_hosts": "{{ elasticsearch_connect_string }}"
      "CONFIG_whisk_activationStore_elasticsearch_indexPattern": "{{ db.elasticsearch.index_pattern }}"
      "CONFIG_whisk_activationStore_elasticsearch_username": "{{ db.elasticsearch.auth.admin.username }}"
      "CONFIG_whisk_activationStore_elasticsearch_password": "{{ db.elasticsearch.auth.admin.password }}"
      "CONFIG_whisk_spi_ActivationStoreProvider": "org.apache.openwhisk.core.database.elasticsearch.ElasticSearchActivationStoreProvider"
  when: db.activation_store.backend == "ElasticSearch"
- name: merge elasticsearch activation store env
  set_fact:
    env: "{{ env | combine(elastic_env) }}"
  when: db.activation_store.backend == "ElasticSearch"
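
The `set_fact` and `combine` pattern quoted above could be applied to the MongoDB settings in the same way. The sketch below mimics it in plain Python to show the intended result: the MongoDB variables are merged into a component's environment only when the backend is `MongoDB`. The function name `build_env` and the sample base environment are assumptions for illustration; the variable names and the provider class are taken from the `group_vars/all` diff.

```python
def build_env(base_env, backend, connect_string, database):
    """Merge the MongoDB settings into base_env only when backend is 'MongoDB'.

    Hypothetical helper: it plays the role of Ansible's `combine` filter
    guarded by a `when:` condition on the configured backend.
    """
    env = dict(base_env)  # copy so the caller's dict is left untouched
    if backend == "MongoDB":
        mongodb_env = {
            "CONFIG_whisk_mongodb_uri": connect_string,
            "CONFIG_whisk_mongodb_database": database,
            "CONFIG_whisk_spi_ArtifactStoreProvider":
                "org.apache.openwhisk.core.database.mongodb.MongoDBArtifactStoreProvider",
        }
        env.update(mongodb_env)
    return env


if __name__ == "__main__":
    base = {"PORT": "8080"}  # made-up sample value
    print(build_env(base, "MongoDB", "mongodb://172.17.0.1:27017", "whisks"))
    print(build_env(base, "CouchDB", "mongodb://172.17.0.1:27017", "whisks"))
```

With this shape, a CouchDB deployment never sees the MongoDB variables, which is the main difference from placing them unconditionally in a shared `commonEnv` block.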


docker:
# The user to install docker for. Defaults to the ansible user if not set. This will be the user who is able to run
39 changes: 39 additions & 0 deletions ansible/initMongoDB.yml
@@ -0,0 +1,39 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
---
# This playbook will initialize the immortal DBs in the database account.
# This step is usually done only once per deployment.

- hosts: ansible
  tasks:
  - name: create necessary auth keys
    mongodb:
      connect_string: "{{ db.mongodb.connect_string }}"
      database: "{{ db.mongodb.database }}"
      collection: "whiskauth"
      doc:
        _id: "{{ item }}"
        subject: "{{ item }}"
        namespaces:
          - name: "{{ item }}"
            uuid: "{{ key.split(':')[0] }}"
            key: "{{ key.split(':')[1] }}"
      mode: "doc"
      force_update: True
    vars:
      key: "{{ lookup('file', 'files/auth.{{ item }}') }}"
    with_items: "{{ db.authkeys }}"
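
To make the shape of the inserted document concrete, the sketch below builds the same `whiskauth` document in plain Python from a `uuid:key` string, mirroring the `key.split(':')` expressions in the task above. The function name `build_auth_doc` and the sample credential are made up for illustration, and stripping surrounding whitespace from the file contents is an assumption about how the key file is read.

```python
def build_auth_doc(subject, auth_key):
    """Build the whiskauth document for one subject from a 'uuid:key' string.

    Hypothetical helper for illustration. Unlike split(':')[1] in the playbook,
    partition keeps everything after the first colon as the key.
    """
    uuid, sep, key = auth_key.strip().partition(":")
    if not sep or not uuid or not key:
        raise ValueError("auth key must have the form 'uuid:key'")
    return {
        "_id": subject,
        "subject": subject,
        "namespaces": [
            {"name": subject, "uuid": uuid, "key": key},
        ],
    }


if __name__ == "__main__":
    # Made-up credential, not a real OpenWhisk key.
    print(build_auth_doc("guest", "00000000-aaaa-bbbb-cccc-000000000000:secret\n"))
```

Because the subject doubles as `_id`, re-running the playbook with `force_update: True` replaces the existing document for that subject rather than adding a second one.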
283 changes: 283 additions & 0 deletions ansible/library/mongodb.py
@@ -0,0 +1,283 @@
#!/usr/bin/python

#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#

from __future__ import absolute_import, division, print_function
__metaclass__ = type


DOCUMENTATION = '''
---
module: mongodb
short_description: A module that supports some simple operations on MongoDB.
description:
    - Supports adding users, inserting documents, and creating indexes in MongoDB
Member

If this tool does something similar to wskadmin, should we put this under tools?
https://github.com/apache/openwhisk/blob/68120f2170dc9f9b53361ab0cb51c4e9458dbe29/tools/admin/wskadmin

Contributor Author

this is like an ansible library

options:
    connect_string:
        description:
            - The URI of the MongoDB server
        required: true
    database:
        description:
            - The name of the database you want to manipulate
        required: true
    user:
        description:
            - The name of the user to add or remove, required when using 'user' mode
        required: false
        default: null
    password:
        description:
            - The password to use for the user, required when using 'user' mode
        required: false
        default: null
    roles:
        description:
            - The roles of the user as a list of dicts, each dict requires the two fields 'db' and 'role', required when using 'user' mode
        required: false
        default: null
    collection:
        required: false
        description:
            - The name of the collection you want to manipulate, required when using 'doc' or 'index' mode
    doc:
        required: false
        description:
            - The document you want to insert into MongoDB, required when using 'doc' mode
    indexes:
        required: false
        description:
            - The indexes you want to create in MongoDB as a list of dicts (see the examples for usage), required when using 'index' mode
    force_update:
        required: false
        description:
            - Whether to replace/update an existing user or doc instead of raising DuplicateKeyError, default is false
    mode:
        required: false
        default: user
        choices: ['user', 'doc', 'index']
        description:
            - Use 'user' mode to add a user, 'doc' mode to insert a document, 'index' mode to create indexes

requirements: [ "pymongo" ]
author:
    - "Jinag PengCheng"
'''

EXAMPLES = '''
# add user
- mongodb:
    connect_string: mongodb://localhost:27017
    database: admin
    user: test
    password: 123456
    roles:
      - db: test_database
        role: read
    force_update: true

# add doc
- mongodb:
    connect_string: mongodb://localhost:27017
    mode: doc
    database: admin
    collection: main
    doc:
      id: "id/document"
      title: "the name of document"
      content: "which doesn't matter"
    force_update: true

# add indexes
- mongodb:
    connect_string: mongodb://localhost:27017
    mode: index
    database: admin
    collection: main
    indexes:
      - index:
          - field: updated_at
            direction: 1
          - field: name
            direction: -1
        name: test-index
        unique: true
'''

import traceback

from ansible.module_utils.basic import AnsibleModule
from ansible.module_utils._text import to_native

try:
    from pymongo import ASCENDING, DESCENDING, GEO2D, GEOHAYSTACK, GEOSPHERE, HASHED, TEXT
    from pymongo import IndexModel
    from pymongo import MongoClient
    from pymongo.errors import DuplicateKeyError
except ImportError:
    pass


# =========================================
# MongoDB module specific support methods.
#

class UnknownIndexPlugin(Exception):
    pass


def check_params(params, mode, module):
    missed_params = []
    for key in OPERATIONS[mode]['required']:
        if params[key] is None:
            missed_params.append(key)

    if missed_params:
        module.fail_json(msg="missing required arguments: %s" % (",".join(missed_params)))


def _recreate_user(module, db, user, password, roles):
    try:
        db.command("dropUser", user)
        db.command("createUser", user, pwd=password, roles=roles)
    except Exception as e:
        module.fail_json(msg='Unable to create user: %s' % to_native(e), exception=traceback.format_exc())


def user(module, client, db_name, **kwargs):
    roles = kwargs['roles']
    if roles is None:
        roles = []
    db = client[db_name]

    try:
        db.command("createUser", kwargs['user'], pwd=kwargs['password'], roles=roles)
    except DuplicateKeyError as e:
        if kwargs['force_update']:
            _recreate_user(module, db, kwargs['user'], kwargs['password'], roles)
        else:
            module.fail_json(msg='Unable to create user: %s' % to_native(e), exception=traceback.format_exc())
    except Exception as e:
        module.fail_json(msg='Unable to create user: %s' % to_native(e), exception=traceback.format_exc())

    module.exit_json(changed=True, user=kwargs['user'])


def doc(module, client, db_name, **kwargs):
    coll = client[db_name][kwargs['collection']]
    try:
        coll.insert_one(kwargs['doc'])
    except DuplicateKeyError as e:
        if kwargs['force_update']:
            try:
                coll.replace_one({'_id': kwargs['doc']['_id']}, kwargs['doc'])
            except Exception as e:
                module.fail_json(msg='Unable to insert doc: %s' % to_native(e), exception=traceback.format_exc())
        else:
            module.fail_json(msg='Unable to insert doc: %s' % to_native(e), exception=traceback.format_exc())
    except Exception as e:
        module.fail_json(msg='Unable to insert doc: %s' % to_native(e), exception=traceback.format_exc())

    kwargs['doc']['_id'] = str(kwargs['doc']['_id'])
    module.exit_json(changed=True, doc=kwargs['doc'])


def _clean_index_direction(direction):
    if direction in ["1", "-1"]:
        direction = int(direction)

    if direction not in [ASCENDING, DESCENDING, GEO2D, GEOHAYSTACK, GEOSPHERE, HASHED, TEXT]:
        raise UnknownIndexPlugin("Unable to create indexes: Unknown index plugin: %s" % direction)
    return direction


def _clean_index_options(options):
    res = {}
    supported_options = set(['name', 'unique', 'background', 'sparse', 'bucketSize', 'min', 'max', 'expireAfterSeconds'])
    for key in set(options.keys()).intersection(supported_options):
        res[key] = options[key]
        if key in ['min', 'max', 'bucketSize', 'expireAfterSeconds']:
            res[key] = int(res[key])

    return res


def parse_indexes(idx):
    keys = [(k['field'], _clean_index_direction(k['direction'])) for k in idx.pop('index')]
    options = _clean_index_options(idx)
    return IndexModel(keys, **options)


def index(module, client, db_name, **kwargs):
    # Build a real list: on Python 3 map() returns an iterator, and
    # create_indexes() expects a list of IndexModel instances.
    parsed_indexes = [parse_indexes(idx) for idx in kwargs['indexes']]
    try:
        coll = client[db_name][kwargs['collection']]
        coll.create_indexes(parsed_indexes)
    except Exception as e:
        module.fail_json(msg='Unable to create indexes: %s' % to_native(e), exception=traceback.format_exc())

    module.exit_json(changed=True, indexes=kwargs['indexes'])


OPERATIONS = {
    'user': {'function': user, 'params': ['user', 'password', 'roles', 'force_update'], 'required': ['user', 'password']},
    'doc': {'function': doc, 'params': ['doc', 'collection', 'force_update'], 'required': ['doc', 'collection']},
    'index': {'function': index, 'params': ['indexes', 'collection'], 'required': ['indexes', 'collection']}
}


# =========================================
# Module execution.
#

def main():
    module = AnsibleModule(
        argument_spec=dict(
            connect_string=dict(required=True),
            database=dict(required=True, aliases=['db']),
            mode=dict(default='user', choices=['user', 'doc', 'index']),
            user=dict(default=None),
            password=dict(default=None, no_log=True),
            roles=dict(default=None, type='list'),
            collection=dict(default=None),
            doc=dict(default=None, type='dict'),
            force_update=dict(default=False, type='bool'),
            indexes=dict(default=None, type='list'),
        )
    )

    mode = module.params['mode']

    db_name = module.params['database']

    params = {key: module.params[key] for key in OPERATIONS[mode]['params']}
    check_params(params, mode, module)

    try:
        client = MongoClient(module.params['connect_string'])
    except NameError:
        module.fail_json(msg='the python pymongo module is required')
    except Exception as e:
        module.fail_json(msg='unable to connect to database: %s' % to_native(e), exception=traceback.format_exc())

    OPERATIONS[mode]['function'](module, client, db_name, **params)


if __name__ == '__main__':
    main()
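
To see what the `index` mode would hand to `IndexModel` for a given `indexes` entry, the following standalone sketch reimplements the key and option cleaning without importing pymongo. The constant values are hard-coded on the assumption that they match pymongo's index constants, and the names `clean_direction` and `split_index_spec` are hypothetical. Unlike `parse_indexes`, the sketch does not mutate its input.

```python
# Assumed to mirror pymongo's ASCENDING, DESCENDING, GEO2D, GEOHAYSTACK,
# GEOSPHERE, HASHED and TEXT constants; hard-coded so no pymongo is needed.
INDEX_TYPES = [1, -1, "2d", "geoHaystack", "2dsphere", "hashed", "text"]
SUPPORTED_OPTIONS = {"name", "unique", "background", "sparse",
                     "bucketSize", "min", "max", "expireAfterSeconds"}
INT_OPTIONS = {"min", "max", "bucketSize", "expireAfterSeconds"}


def clean_direction(direction):
    """Convert the strings '1' and '-1' to ints and reject unknown index types."""
    if direction in ("1", "-1"):
        direction = int(direction)
    if direction not in INDEX_TYPES:
        raise ValueError("unknown index type: %r" % (direction,))
    return direction


def split_index_spec(spec):
    """Return (keys, options) for one entry of the module's 'indexes' list."""
    keys = [(k["field"], clean_direction(k["direction"])) for k in spec["index"]]
    options = {}
    for name, value in spec.items():
        if name in SUPPORTED_OPTIONS:
            # Numeric options may arrive as strings from YAML or the CLI.
            options[name] = int(value) if name in INT_OPTIONS else value
    return keys, options


if __name__ == "__main__":
    # Same entry as the "add indexes" example in EXAMPLES.
    spec = {
        "index": [
            {"field": "updated_at", "direction": 1},
            {"field": "name", "direction": -1},
        ],
        "name": "test-index",
        "unique": True,
    }
    print(split_index_spec(spec))
```

Options outside the supported set are silently dropped, which matches the behaviour of `_clean_index_options` in the module.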