Tweak node pool usage #984

Closed · wants to merge 3 commits
Conversation

@leej3 (Contributor) commented Jan 5, 2022

Allocates some stray clearml resources to the clearml node pool and targets the forwardauth deployment resource to the general node pool.
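For context, a minimal sketch of the kind of values.yaml change this implies. The top-level keys and the `app: general` label are illustrative assumptions; only the `app: clearml` label actually appears in this PR's diff:

```yaml
# Sketch only: pin clearml services to the clearml node pool and the
# forwardauth deployment to the general pool via nodeSelector.
# The key names and the "app: general" label are assumptions.
clearml:
  nodeSelector:
    app: clearml      # label expected on the clearml node pool (see diff below)
forwardauth:
  nodeSelector:
    app: general      # assumed label for the general node pool
```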

@leej3 leej3 requested a review from viniciusdc January 5, 2022 17:24
@viniciusdc (Contributor) left a comment

LGTM, I will present this in the next meeting with the others, and we can merge once they have had a look as well. Thanks @leej3 for the contributions!

@leej3 (Contributor, Author) commented Jan 6, 2022

Sounds good, thanks. One thing to consider is whether to target some of the clearml pods using affinity, as we have done in the case of forward-auth. On our deployment the clearml nodes use a lot of resources, and my understanding is that the clearml server needs quite a lot of them (or at least we haven't explored what the minimum is). It may be more efficient to run all auxiliary pods somewhere other than the clearml node pool.
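For reference, a node-affinity stanza of the kind mentioned above might look like this. It is a sketch, not the actual forward-auth manifest; only the `app=clearml` label is taken from this PR's diff:

```yaml
# Sketch: pod-level node affinity steering a pod onto nodes labeled
# app=clearml. The real forward-auth change may differ in detail.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: app
              operator: In
              values:
                - clearml
```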

The review then turned to this hunk from the chart's values.yaml:

```diff
@@ -226,6 +226,8 @@ redis: # configuration from https://github.com/bitnami/charts/blob/master/bitnam
 master:
   name: "{{ .Release.Name }}-redis-master"
   port: 6379
+  nodeSelector:
+    app: clearml
```
A Member commented:

How is the node being labeled "app: clearml"?

A Member commented:

Or is this an assumption that the node will have this label?

@leej3 (Contributor, Author) replied:

> How is the node being labeled "app: clearml"?

Good point. It's a default value set in the variable.tf file that propagates to the chart's values.yaml (see here).

> Or is this an assumption that the node will have this label?

It is a requirement. We set this label manually for our deployment. We are not sure whether this is automated when QHub deploys clearml on the cloud providers (as described here).
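In other words, the chart's nodeSelector entry only matches nodes that already carry the label; where QHub does not apply it automatically, it has to be added by hand (e.g. with `kubectl label node <node-name> app=clearml`). A sketch of the matching values.yaml fragment:

```yaml
# Sketch: this selector schedules pods only onto nodes that already
# have the app=clearml label; the label itself is applied outside
# the chart (manually or by the provisioning tooling).
nodeSelector:
  app: clearml
```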

@leej3 (Contributor, Author) replied:

That drew our attention to the missing redis/mongodb dynamic value setting; fixed now.

@leej3 (Contributor, Author) commented Jan 7, 2022

Additionally, instead of targeting all clearml pods to the general pool, we want to be able to target the services to different node pools separately: the agent should run on a large node, while all other services (including kube-system components) should be targeted to a general pool with smaller nodes. @viniciusdc and I will take a look and submit this separately; a sketch of the idea follows.
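A hedged sketch of what that per-service split could look like in the chart values. The service keys and the `app: general` label are assumptions, not the eventual change:

```yaml
# Sketch: reserve the large clearml pool for the resource-hungry agent
# and push the lighter services to the general pool. Keys and labels
# here are illustrative assumptions.
agentservices:
  nodeSelector:
    app: clearml      # large nodes reserved for the clearml agent
apiserver:
  nodeSelector:
    app: general      # assumed label for the smaller general pool
webserver:
  nodeSelector:
    app: general
```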

@leej3 leej3 mentioned this pull request Jan 17, 2022
@magsol magsol added this to the Release v0.4.1 milestone Apr 26, 2022
@costrouc (Member) commented:
@viniciusdc could you update this PR when you test clearml? I think this may already be fixed.

@viniciusdc (Contributor) commented May 5, 2022

Based on a comparison between the main.tf files from this PR and the one currently in QHub, this has not been fixed yet. We need to bring these changes up to the 0.4.0 standards, and then we can merge (after a new review).

As this will be part of #1217, I will move this to the same milestone.

@viniciusdc (Contributor) commented:
I will be closing this due to the changes made for v0.4.0 and #1292. These changes will be included once a PR for #1281 is merged. Thanks, @leej3, for pointing out the issue.

@viniciusdc viniciusdc closed this May 24, 2022
@iameskild iameskild removed this from the Release v0.4.2 milestone Jun 2, 2022
@trallard trallard deleted the tweak_node_pool_usage branch October 7, 2022 16:41
Labels: area: terraform 💾, needs: follow-up 📫