-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mongo7 upgrade #140
base: develop
Are you sure you want to change the base?
Mongo7 upgrade #140
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #140 +/- ##
===========================================
+ Coverage 80.14% 80.81% +0.67%
===========================================
Files 8 8
Lines 2695 2810 +115
===========================================
+ Hits 2160 2271 +111
- Misses 535 539 +4 ☔ View full report in Codecov by Sentry. |
sys.exit(1) | ||
|
||
endpoints = [] | ||
catalog_mock_auth = ['auth_admin.json', 'auth_invalid.json', 'auth_missing.json', 'auth_non_admin.json'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to put a comment here explaining that the order is being preserved here in order to ensure the routes capture the requests in the correct order
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if a comment was added or not, but additional info about "Mock server servers matches routes in a specific order, this is the current order required for tests to pass"
lib/biokbase/catalog/db.py
Outdated
self._mongo_client_initialized = False | ||
self.mongo = None | ||
self.db = None | ||
self.index_created = defaultdict(int) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a bool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used 1 for True and 0 for False.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, so why not use a defaultdict(bool) instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR does the following:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pr description is empty, and would benefit from a copy of the changelog/release notes . The version and release notes should be updated as well
@@ -27,11 +27,6 @@ LIB_DIR = lib | |||
|
|||
PATH := kb_sdk/bin:$(PATH) | |||
|
|||
default: init | |||
|
|||
init: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this init section need to be removed or can we keep it in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or was this deprecated in favor of docker compose, and was only needed for tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All submodules have been removed, so there is no need to pull them. If you want to keep init section, what should be included in the init?
@@ -102,8 +97,8 @@ setup-tests: | |||
mkdir -p $(TESTLIB)/biokbase | |||
mkdir -p $(TESTDIR)/nms | |||
rsync -av lib/biokbase/* $(TESTLIB)/biokbase/. --exclude *.bak-* | |||
rsync -av kbapi_common/lib/biokbase/* $(TESTLIB)/biokbase/. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't need this anymore, can we delete it instead of commenting it out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
services: | ||
mongo: | ||
image: mongo:${{matrix.mongo-version}} | ||
ports: | ||
- 27017:27017 | ||
- 27018:27017 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep it at 27017?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
port 27017 is already being used by the NMS Mongo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would change the NMS port in that case. If one or the other is going to have a non standard port it should be the supporting system, not the one we're working on
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need 2 seperate mongos?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One presumably starts up with the docker compose to support NMS, the other starts up as part of the catalog tests. That seems good to me, the catalog shouldn't know or care about any NMS dependencies
|
||
- name: Setup Python | ||
uses: actions/setup-python@v5 | ||
with: | ||
python-version: ${{matrix.python-version}} | ||
|
||
- name: Install Docker Compose |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not necessary. You can use docker compose
as it is included in docker nowadays
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Try docker compose
without the hyphen
sys.exit(1) | ||
|
||
endpoints = [] | ||
catalog_mock_auth = ['auth_admin.json', 'auth_invalid.json', 'auth_missing.json', 'auth_non_admin.json'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if a comment was added or not, but additional info about "Mock server servers matches routes in a specific order, this is the current order required for tests to pass"
# host where mongo lives, e.g. localhost:27017, for tests there should be not authentication required | ||
mongodb-host = localhost:27017 | ||
# host where mongo lives, e.g. localhost:27018, for tests there should be not authentication required | ||
mongodb-host = localhost:27018 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we keep this at 27017?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
port 27017 is already being used by the NMS Mongo
unique=True, sparse=False) | ||
|
||
elif collection_name == MongoCatalogDBI._LOCAL_FUNCTIONS: | ||
# Make sure we have a unique index on module_name_lc and git_commit_hash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to refactor this giant elif
block using a dictionary to map collection names to their respective index creation functions. Here is an example of how you can refactor it:
def _create_indexes(self, collection_name):
"""Create indexes for the given collection lazily."""
collection = self.db[collection_name]
index_creation_map = {
MongoCatalogDBI._MODULE_VERSIONS: [
('module_name_lc', False),
('git_commit_hash', False),
[('module_name_lc', ASCENDING), ('git_commit_hash', ASCENDING), True]
],
MongoCatalogDBI._LOCAL_FUNCTIONS: [
'function_id',
[('module_name_lc', ASCENDING), ('function_id', ASCENDING), ('git_commit_hash', ASCENDING), True],
'module_name_lc',
'git_commit_hash'
],
# Add other collections and their indexes similarly
}
indexes = index_creation_map.get(collection_name, [])
for index in indexes:
if isinstance(index, tuple):
collection.create_index(index[0], sparse=index[1])
elif isinstance(index, list):
collection.create_index(index, unique=index[-1])
else:
collection.create_index(index)
This approach uses a dictionary to map collection names to a list of indexes, which reduces the need for multiple elif
statements. Adjust the index_creation_map
dictionary as needed to cover all your collections and their respective indexes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could potentially use an index spec to simplify as well: https://github.com/kbase/cdm-task-service/blob/main/cdmtaskservice/mongo.py#L70
return self._get_collection(MongoCatalogDBI._SECURE_CONFIG_PARAMS) | ||
|
||
def _create_indexes(self, collection_name): | ||
"""Create indexes for the given collection lazily.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to move out these into smaller functions into a file or class like db_indexes.py or something as this is an extremely long function. If possible, function's shouldn't be longer than the entire height of your screen.
@@ -1274,6 +1360,12 @@ def get_secure_config_params(self, module_name): | |||
# another server is already starting an update, we can skip or abort | |||
def check_db_schema(self): | |||
|
|||
if self._db_schema_checked: | |||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is missing coverage
@@ -1455,20 +1551,23 @@ def prepare_version_doc_for_db_2_to_3_update(self, version, module): | |||
def update_db_3_to_4(self): | |||
|
|||
# make sure we don't have any indecies on the collections | |||
self.volume_mounts.drop_indexes() | |||
self.client_groups.drop_indexes() | |||
volume_mounts_collection = self.db[MongoCatalogDBI._VOLUME_MOUNTS] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this entire function can be deleted, as we have already upgraded the indexes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK we're the only ones using the catalog, all these db version changes happened years ago, and so we can just put a note in the release notes that db version upgrades are no longer supported and delete all the upgrade methods
@@ -0,0 +1,47 @@ | |||
version: '3.4' | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we re-name this file as "docker-compose?"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is a compose file for the nms rather than the catalog, calling it as such seems appropriate to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is a docker compose for testing purposes. By naming it "docker-compose" it allows you to start it up without specifying the path to a file. If we had multiple docker composes, I think it makes sense to use a non standard name, but since we only have one docker-compose file, we should use the standard name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just for starting the NMS service for CI/local testing to run unit tests. It makes more sense to keep its name as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with -f <file>
in this case but I don't feel that strongly about it
This also upgrades pymongo I think these can be added to the README.md |
@@ -77,6 +81,10 @@ jobs: | |||
sed -i "s#^nms-admin-token.*#nms-admin-token=$KBASE_CI_TOKEN#" test/test.cfg | |||
sed -i "s#^method-spec-admin-users.*#method-spec-admin-users=$ADMIN_USER#" test/test.cfg | |||
|
|||
# setup env variables in docker-compose_nms.yml file | |||
echo "ADMIN_USER=$ADMIN_USER" | |||
echo "ADMIN_USER=$ADMIN_USER" >> $GITHUB_ENV |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment about why this env var is set up and where it's used? That should help the next person who has to work on this service understand what's going on
COPY requirements.txt requirements.txt | ||
RUN source activate root && \ | ||
pip install -r requirements.txt | ||
RUN pip install -r requirements.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should convert to pipenv / poetry / uv, but that's a later task
NMS_PID=$! | ||
|
||
echo 'Starting Mock Auth API...' | ||
docker run -d --rm -v ${PWD}/mock_auth:/config -p 7777:5000 --name mock-auth mockservices/mock_json_service | ||
docker run -d --rm -v ${PWD}/mock_auth:/config -v ${PWD}/mock_auth/server.py:/server/server.py -p 7777:5000 --name mock-auth mockservices/mock_json_service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't something we need to deal with now, but there are plenty of 3rd party service mocking options out there, I'm not sure why we rolled our own. Also auth2 has test mode which we could use
|
||
echo 'Waiting for NMS to start...' | ||
sleep 25 | ||
curl -d '{"id":"1","params":[],"method":"NarrativeMethodStore.ver","version":"1.1"}' http://localhost:7125 | ||
curl -d '{"id":"1","params":[],"method":"NarrativeMethodStore.ver","version":"1.1"}' http://localhost:7125/rpc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow these tests must not've passed for a long time...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file needs some explanation at the top. Why are we replacing the file in the mock server container? Whoever comes along next to work on the catalog will need to understand the reasoning behind this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What changes have been made in this file compared to the original in the mock server container?
self.mongo_psswd = mongo_psswd | ||
self.mongo_authMechanism = mongo_authMechanism | ||
|
||
# MongoDB client and database initialization deferred |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add an explanation as to why this is necessary?
self.mongo_psswd = mongo_psswd | ||
self.mongo_authMechanism = mongo_authMechanism | ||
|
||
# MongoDB client and database initialization deferred |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How lazy is lazy? Is it not until a request is made, and therefore the service can be running in a guaranteed to fail state? If so, it seems like this class should at least test connectivity and auth at init time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC the db class instance is created pre fork, and then the fork happens and it's no longer working. If that's the case, and if it's possible, what I would do is
- in the init, check connectivity, check db schema, create indexes, etc., and close the client.
- On the first call, create a mongo client knowing that everything is set up correctly, and set up the instance variables for the collections. Then you don't need all the property functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, also you probably need a thread lock around the client creation step, or multiple calls could result in multiple client creations
"""Initialize MongoDB client and collections lazily.""" | ||
if not self._mongo_client_initialized: | ||
try: | ||
if self.mongo_user and self.mongo_psswd: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a note here that this case is only tested manually?
I assume it has been tested manually, yeah?
No description provided.