Feat: makes AWSbatch the default executor #161

Merged: 22 commits, Mar 22, 2024

dapineyro
Collaborator

@dapineyro dapineyro commented Mar 19, 2024

Overview

This PR makes AWS Batch the default executor, so there is no need to pass --batch to the cloudos job run command.

Jira ticket

DEL-17662 & LP-6350

Changes

  • Adds a new --ignite flag to select the ignite executor, if available. Using it prints a warning that the command may fail.
  • Makes the --batch flag unnecessary. Using it does not break the command, which maintains backwards compatibility.
  • Changes the queue selection when the user does not specify --job-queue.
    • Before: cloudos-cli selected the last available queue.
    • Now: cloudos-cli selects the workspace "default" queue, if available. If not, it selects the last available queue, as before.
  • Bumps version to v2.7.0.
  • Updates CHANGELOG.md.
  • Updates CI tests so they no longer use the --batch flag.
  • Updates README.md and adds a GitHub Actions badge.
  • Updates --help messages.
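
The new queue selection rule described in the changes above can be sketched roughly as follows. This is only an illustration, not the code in this PR: the function name `pick_default_queue`, the `name` and `is_default` keys, and the `spot_queue` entry are all hypothetical, and the real CloudOS API payload may look different.

```python
from typing import Optional


def pick_default_queue(queues: list[dict]) -> Optional[str]:
    """Return the queue name to use when --job-queue is not given.

    `queues` is a hypothetical list of dicts with a "name" key and an
    optional boolean "is_default" key (assumed shape, not the real payload).
    """
    if not queues:
        return None
    # New behaviour: prefer the queue the workspace marks as its default.
    for queue in queues:
        if queue.get("is_default"):
            return queue["name"]
    # Fallback, same as before this change: take the last available queue.
    return queues[-1]["name"]


queues = [
    {"name": "end_to_end"},
    {"name": "On_demand_standard_stable_3TB", "is_default": True},
    {"name": "spot_queue"},  # hypothetical queue name
]
print(pick_default_queue(queues))  # On_demand_standard_stable_3TB
```

If no queue is flagged as default, the sketch returns the last entry, which matches the behaviour described under "Before".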

Tests

Setting up variables:

MY_API_KEY="xxxxx"
CLOUDOS="https://cloudos.lifebit.ai"
WORKSPACE_ID="xxxxx"
PROJECT_NAME="API jobs"
WORKFLOW_NAME="rnatoy"
JOB_PARAMS="cloudos/examples/rnatoy.config"

Running docker image:

docker run --rm -it quay.io/lifebitaiorg/cloudos-cli:v2.7.0

CloudOS job using the old --batch flag

cloudos job run \
    --cloudos-url $CLOUDOS \
    --apikey $MY_API_KEY \
    --workspace-id $WORKSPACE_ID \
    --project-name "$PROJECT_NAME" \
    --workflow-name $WORKFLOW_NAME \
    --job-config $JOB_PARAMS \
    --resumable \
    --spot \
    --batch \
    --job-name "test_1_cloudos_cli"
CloudOS python package: a package for interacting with CloudOS.

Version: 2.7.0

CloudOS job functionality: run and check jobs in CloudOS.

Executing run...
	No job_queue was specified, using the CloudOS default queue: On_demand_standard_stable_3TB.
	Job successfully launched to CloudOS, please check the following link: https://cloudos.lifebit.ai/app/jobs/65fb3a6d39195838201939c6
	Your assigned job id is: 65fb3a6d39195838201939c6

	Your current job status is: initializing
	To further check your job status you can either go to https://cloudos.lifebit.ai/app/jobs/65fb3a6d39195838201939c6 or use the following command:
	cloudos job status \
		--apikey $MY_API_KEY \
		--cloudos-url https://cloudos.lifebit.ai \
		--job-id 65fb3a6d39195838201939c6

CloudOS job not using any flag (new default)

cloudos job run \
    --cloudos-url $CLOUDOS \
    --apikey $MY_API_KEY \
    --workspace-id $WORKSPACE_ID \
    --project-name "$PROJECT_NAME" \
    --workflow-name $WORKFLOW_NAME \
    --job-config $JOB_PARAMS \
    --resumable \
    --spot \
    --job-name "test_2_cloudos_cli"
CloudOS python package: a package for interacting with CloudOS.

Version: 2.7.0

CloudOS job functionality: run and check jobs in CloudOS.

Executing run...
	No job_queue was specified, using the CloudOS default queue: On_demand_standard_stable_3TB.
	Job successfully launched to CloudOS, please check the following link: https://cloudos.lifebit.ai/app/jobs/65fb3b37391958382019648d
	Your assigned job id is: 65fb3b37391958382019648d

	Your current job status is: initializing
	To further check your job status you can either go to https://cloudos.lifebit.ai/app/jobs/65fb3b37391958382019648d or use the following command:
	cloudos job status \
		--apikey $MY_API_KEY \
		--cloudos-url https://cloudos.lifebit.ai \
		--job-id 65fb3b37391958382019648d

CloudOS job using an existing queue

cloudos job run \
    --cloudos-url $CLOUDOS \
    --apikey $MY_API_KEY \
    --workspace-id $WORKSPACE_ID \
    --project-name "$PROJECT_NAME" \
    --workflow-name $WORKFLOW_NAME \
    --job-config $JOB_PARAMS \
    --resumable \
    --spot \
    --job-queue "end_to_end" \
    --job-name "test_3_cloudos_cli"
CloudOS python package: a package for interacting with CloudOS.

Version: 2.7.0

CloudOS job functionality: run and check jobs in CloudOS.

Executing run...
	Job successfully launched to CloudOS, please check the following link: https://cloudos.lifebit.ai/app/jobs/65fb3c2b391958382019b9c9
	Your assigned job id is: 65fb3c2b391958382019b9c9

	Your current job status is: initializing
	To further check your job status you can either go to https://cloudos.lifebit.ai/app/jobs/65fb3c2b391958382019b9c9 or use the following command:
	cloudos job status \
		--apikey $MY_API_KEY \
		--cloudos-url https://cloudos.lifebit.ai \
		--job-id 65fb3c2b391958382019b9c9

CloudOS job using a non-existing queue

cloudos job run \
    --cloudos-url $CLOUDOS \
    --apikey $MY_API_KEY \
    --workspace-id $WORKSPACE_ID \
    --project-name "$PROJECT_NAME" \
    --workflow-name $WORKFLOW_NAME \
    --job-config $JOB_PARAMS \
    --resumable \
    --spot \
    --job-queue "a_non_existing_queue" \
    --job-name "test_4_cloudos_cli"
CloudOS python package: a package for interacting with CloudOS.

Version: 2.7.0

CloudOS job functionality: run and check jobs in CloudOS.

Executing run...
	Queue 'a_non_existing_queue' you specified was not found, using the CloudOS default queue instead: On_demand_standard_stable_3TB.
	Job successfully launched to CloudOS, please check the following link: https://cloudos.lifebit.ai/app/jobs/65fb3cdc391958382019dcb5
	Your assigned job id is: 65fb3cdc391958382019dcb5

	Your current job status is: initializing
	To further check your job status you can either go to https://cloudos.lifebit.ai/app/jobs/65fb3cdc391958382019dcb5 or use the following command:
	cloudos job status \
		--apikey $MY_API_KEY \
		--cloudos-url https://cloudos.lifebit.ai \
		--job-id 65fb3cdc391958382019dcb5
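
Taken together, the queue tests so far show three outcomes: no queue given, an existing queue, and a queue that is not found. A rough sketch of that behaviour is below. The function `resolve_queue` and its arguments are hypothetical and only mirror the messages printed in the logs above; the actual implementation may be organised differently.

```python
from typing import Optional


def resolve_queue(
    requested: Optional[str], available: list[str], default: str
) -> tuple[str, Optional[str]]:
    """Return (queue_name, message); message is None when nothing is printed.

    Hypothetical helper: `available` holds the workspace queue names and
    `default` is the queue used as fallback.
    """
    if requested is None:
        # Matches the message seen when --job-queue is omitted.
        return default, (
            "No job_queue was specified, using the CloudOS default queue: "
            f"{default}."
        )
    if requested in available:
        # An existing queue is used as-is, with no extra message.
        return requested, None
    # Unknown queue: fall back to the default instead of failing.
    return default, (
        f"Queue '{requested}' you specified was not found, "
        f"using the CloudOS default queue instead: {default}."
    )


available = ["end_to_end", "On_demand_standard_stable_3TB"]
default = "On_demand_standard_stable_3TB"
for requested in (None, "end_to_end", "a_non_existing_queue"):
    name, message = resolve_queue(requested, available, default)
    print(name, "|", message)
```

Note that a missing queue is not treated as an error here: the job is still launched on the default queue, as in the test above.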

CloudOS job trying to use --ignite in a non-ignite workspace

cloudos job run \
    --cloudos-url $CLOUDOS \
    --apikey $MY_API_KEY \
    --workspace-id $WORKSPACE_ID \
    --project-name "$PROJECT_NAME" \
    --workflow-name $WORKFLOW_NAME \
    --job-config $JOB_PARAMS \
    --resumable \
    --spot \
    --ignite \
    --job-name "test_5_cloudos_cli"
CloudOS python package: a package for interacting with CloudOS.

Version: 2.7.0

CloudOS job functionality: run and check jobs in CloudOS.

Executing run...

[Warning] You have specified ignite executor. Please, note that ignite is being removed from CloudOS, so the command may fail. Check ignite availability in your CloudOS

Traceback (most recent call last):
  File "/opt/conda/bin/cloudos", line 33, in <module>
    sys.exit(load_entry_point('cloudos', 'console_scripts', 'cloudos')())
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
         ^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/conda/lib/python3.12/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cloudos/__main__.py", line 313, in run
    j_id = j.send_job(job_config=job_config,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/cloudos/jobs/job.py", line 534, in send_job
    raise BadRequestException(r)
cloudos.utils.errors.BadRequestException: Server returned status 400. Reason: Bad Request

NOTE: I could not find a workspace with ignite support. However, as I don't have access to all existing workspaces, we will keep the --ignite option until we can confirm that no ignite workspace remains and that we will not support it again in the future.
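
For reference, the intended interaction between --batch and --ignite can be sketched as below. The function `resolve_executor` and its return values are hypothetical, and letting --ignite take precedence when both flags are passed is an assumption that this PR does not state.

```python
IGNITE_WARNING = (
    "[Warning] You have specified ignite executor. Please, note that ignite "
    "is being removed from CloudOS, so the command may fail. Check ignite "
    "availability in your CloudOS"
)


def resolve_executor(batch: bool = False, ignite: bool = False) -> str:
    """Return the executor name for a job run (hypothetical helper).

    `batch` is accepted only for backwards compatibility and has no effect,
    because AWS Batch is already the default.
    """
    if ignite:
        # Assumption: --ignite wins if both flags are given.
        print(IGNITE_WARNING)
        return "ignite"
    return "batch"


print(resolve_executor())            # batch
print(resolve_executor(batch=True))  # batch
```

Passing `ignite=True` prints the warning and returns "ignite". The request can still be rejected by the server afterwards, as in the 400 response above.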

@dapineyro dapineyro marked this pull request as ready for review March 20, 2024 18:02
@dapineyro dapineyro toggled this pull request between draft and ready for review seven more times, last marking it ready for review March 20, 2024 19:57
Contributor

@danielboloc danielboloc left a comment

LGTM! Great new feature @dapineyro! And it's great that it still maintains backwards compatibility, so we don't have to change all our CI tests.

@dapineyro dapineyro merged commit d39754e into main Mar 22, 2024
8 checks passed
@dapineyro dapineyro deleted the release-branch-2.7.0 branch March 22, 2024 14:56