@caleb-kaiser caleb-kaiser commented Nov 23, 2020

closes #


checklist:

  • run make test and make lint
  • test manually (i.e. build/push all images, restart operator, and re-deploy APIs)

@RobertLucian RobertLucian left a comment

This looks awesome to me!

I haven't been able to run the example because the deploy command always times out (I followed the instructions in the README). The timeout occurs on the connection between the CLI and the operator. The deploy command takes this long because the predictor.models.dir path contains many models, which have to be validated one at a time. The error I get when I deploy is:

(base) robert@cortex-development:mmc-python-translator$ cxd deploy

using cortex-dev-2 environment

error: Post http://localhost:8888/deploy: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

unable to connect to your cluster in the cortex-dev-2 environment (operator endpoint: http://localhost:8888)

if you don't have a cluster running:
    → if you'd like to create a cluster, run `cortex cluster up --configure-env cortex-dev-2`
    → otherwise you can ignore this message, and prevent it in the future with `cortex env delete cortex-dev-2`

if you have a cluster running:
    → run `cortex cluster info --configure-env cortex-dev-2` to update your environment (include `--config <cluster.yaml>` if you have a cluster configuration file)
    → if you set `operator_load_balancer_scheme: internal` in your cluster configuration file, your CLI must run from within a VPC that has access to your cluster's VPC (see https://docs.cortex.dev/v/master/aws/vpc-peering)
Post http://localhost:8888/deploy: net/http: request canceled (Client.Timeout exceeded while awaiting headers)

unable to connect to your cluster in the cortex-dev-2 environment (operator endpoint: http://localhost:8888)

if you don't have a cluster running:
    → if you'd like to create a cluster, run `cortex cluster up --configure-env cortex-dev-2`
    → otherwise you can ignore this message, and prevent it in the future with `cortex env delete cortex-dev-2`

if you have a cluster running:
    → run `cortex cluster info --configure-env cortex-dev-2` to update your environment (include `--config <cluster.yaml>` if you have a cluster configuration file)
    → if you set `operator_load_balancer_scheme: internal` in your cluster configuration file, your CLI must run from within a VPC that has access to your cluster's VPC (see https://docs.cortex.dev/v/master/aws/vpc-peering)

github.com/cortexlabs/cortex/cli/cluster.ErrorFailedToConnectOperator
        /home/robert/projects/github/cortex/cli/cluster/errors.go:68
github.com/cortexlabs/cortex/cli/cluster.makeOperatorRequest
        /home/robert/projects/github/cortex/cli/cluster/lib_http_client.go:203
github.com/cortexlabs/cortex/cli/cluster.HTTPUpload
        /home/robert/projects/github/cortex/cli/cluster/lib_http_client.go:133
github.com/cortexlabs/cortex/cli/cluster.Deploy
        /home/robert/projects/github/cortex/cli/cluster/deploy.go:37
github.com/cortexlabs/cortex/cli/cmd.glob..func8
        /home/robert/projects/github/cortex/cli/cmd/deploy.go:99
github.com/spf13/cobra.(*Command).execute
        /home/robert/.miniconda3/envs/go/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:846
github.com/spf13/cobra.(*Command).ExecuteC
        /home/robert/.miniconda3/envs/go/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:950
github.com/spf13/cobra.(*Command).Execute
        /home/robert/.miniconda3/envs/go/go/pkg/mod/github.com/spf13/cobra@v1.0.0/command.go:887
github.com/cortexlabs/cortex/cli/cmd.Execute
        /home/robert/projects/github/cortex/cli/cmd/root.go:181
main.main
        /home/robert/projects/github/cortex/cli/main.go:24
runtime.main
        /home/robert/.miniconda3/envs/go/go/src/runtime/proc.go:203
runtime.goexit
        /home/robert/.miniconda3/envs/go/go/src/runtime/asm_amd64.s:1357

To be clear, this isn't a problem with the example itself; it's a separate issue that needs to be addressed by #1530.
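
For anyone trying to picture the failure mode described above, here is a minimal sketch of it in Python. It is not Cortex code (the real CLI is written in Go and hits `net/http`'s `Client.Timeout`); the function names, per-model cost, and timeout values are all hypothetical and chosen only to illustrate how validating models one at a time can outlast a fixed client-side timeout.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def validate_model(name, seconds_per_model):
    """Stand-in for validating a single model (the cost is an assumption)."""
    time.sleep(seconds_per_model)
    return name


def handle_deploy(model_names, seconds_per_model):
    """Validate every model sequentially before answering the request."""
    return [validate_model(name, seconds_per_model) for name in model_names]


def deploy_with_timeout(model_names, seconds_per_model, client_timeout):
    """Return "deployed" if the handler answers in time, else "timed out"."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(handle_deploy, model_names, seconds_per_model)
    try:
        # The caller gives up after client_timeout seconds, even though
        # the handler keeps validating in the background.
        future.result(timeout=client_timeout)
        return "deployed"
    except FutureTimeout:
        return "timed out"
    finally:
        pool.shutdown(wait=False)


if __name__ == "__main__":
    few = ["model-%d" % i for i in range(2)]
    many = ["model-%d" % i for i in range(50)]
    print(deploy_with_timeout(few, 0.01, 0.5))
    print(deploy_with_timeout(many, 0.02, 0.2))
```

With these made-up numbers, the two-model request finishes well inside the timeout, while the fifty-model request needs about one second of sequential validation and is abandoned by the caller first. Total validation time grows linearly with the number of models, so a fixed timeout is eventually exceeded.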

# Conflicts:
#	test/apis/model-caching/python/translator/README.md
#	test/apis/model-caching/python/translator/cluster.yaml
#	test/apis/model-caching/python/translator/cortex.yaml
#	test/apis/model-caching/python/translator/predictor.py
#	test/apis/model-caching/python/translator/requirements.txt
#	test/apis/model-caching/python/translator/sample.json
@RobertLucian RobertLucian reopened this Jan 21, 2021
@RobertLucian RobertLucian merged commit f69e982 into master Feb 3, 2021
@RobertLucian RobertLucian deleted the translator-example branch February 3, 2021 22:48

2 participants