Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API tries to download local model from S3 (for TensorFlow, MMC) #1598

Closed
RobertLucian opened this issue Nov 24, 2020 · 0 comments · Fixed by #1669
Closed

API tries to download local model from S3 (for TensorFlow, MMC) #1598

RobertLucian opened this issue Nov 24, 2020 · 0 comments · Fixed by #1669
Assignees
Labels
bug Something isn't working
Milestone

Comments

@RobertLucian
Copy link
Member

Version

>= 0.22

Description

For the following set of conditions, a local model cannot be loaded:

  • For TensorFlow predictor type (Realtime kind)
  • MMC is enabled
  • Local model is specified

Steps to reproduce

Add a local model to an example such as https://github.com/cortexlabs/cortex/tree/master/examples/model-caching/tensorflow/multi-model-classifier and then run a prediction on the said model.

Expected behavior

Should actually load the model from disk.

Actual behavior

It attempts to download the said local model from the S3 upstream.

Stack traces

(error output from cortex logs <api name>)

2020-11-24 23:21:25.743322:cortex:pid-229:INFO:200 OK POST /
2020-11-24 23:21:34.540678:cortex:pid-229:INFO:grabbing access to model local-resnet50 of version 1
2020-11-24 23:21:34.540978:cortex:pid-229:INFO:model local-resnet50 of version 1 is available
2020-11-24 23:21:34.541159:cortex:pid-229:INFO:checking the local-resnet50 1 status
2020-11-24 23:21:34.541252:cortex:pid-229:INFO:model local-resnet50 of version 1 is not loaded (with status not-available or different timestamp)
2020-11-24 23:21:34.541384:cortex:pid-229:INFO:downloading model local-resnet50 of version 1 from the S3 upstream
2020-11-24 23:21:34.541471:cortex:pid-229:INFO:downloading from bucket //mnt/model/local-resnet50/1, model local-resnet50 of version 1, temporarily to /tmp/cron and then finally to /mnt/model
2020-11-24 23:21:34.552484:cortex:pid-229:INFO:500 Internal Server Error POST /
2020-11-24 23:21:34.552773:cortex:pid-229:ERROR:Exception in ASGI application
Traceback (most recent call last):
  File "/opt/conda/envs/env/lib/python3.6/site-packages/uvicorn/protocols/http/httptools_impl.py", line 390, in run_asgi
    result = await app(self.scope, self.receive, self.send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/uvicorn/middleware/proxy_headers.py", line 45, in __call__
    return await self.app(scope, receive, send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/fastapi/applications.py", line 181, in __call__
    await super().__call__(scope, receive, send)  # pragma: no cover
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/applications.py", line 111, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/errors.py", line 181, in __call__
    raise exc from None
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/errors.py", line 159, in __call__
    await self.app(scope, receive, _send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 25, in __call__
    response = await self.dispatch_func(request, self.call_next)
  File "/src/cortex/serve/serve.py", line 176, in parse_payload
    return await call_next(request)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 45, in call_next
    task.result()
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 38, in coro
    await self.app(scope, receive, send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 25, in __call__
    response = await self.dispatch_func(request, self.call_next)
  File "/src/cortex/serve/serve.py", line 133, in register_request
    response = await call_next(request)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 45, in call_next
    task.result()
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/middleware/base.py", line 38, in coro
    await self.app(scope, receive, send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/exceptions.py", line 82, in __call__
    raise exc from None
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/exceptions.py", line 71, in __call__
    await self.app(scope, receive, sender)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/routing.py", line 566, in __call__
    await route.handle(scope, receive, send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/routing.py", line 227, in handle
    await self.app(scope, receive, send)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/routing.py", line 41, in app
    response = await func(request)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/fastapi/routing.py", line 183, in app
    dependant=dependant, values=values, is_coroutine=is_coroutine
  File "/opt/conda/envs/env/lib/python3.6/site-packages/fastapi/routing.py", line 135, in run_endpoint_function
    return await run_in_threadpool(dependant.call, **values)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/starlette/concurrency.py", line 34, in run_in_threadpool
    return await loop.run_in_executor(None, func, *args)
  File "/opt/conda/envs/env/lib/python3.6/concurrent/futures/thread.py", line 56, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/src/cortex/serve/serve.py", line 185, in predict
    prediction = predictor_impl.predict(**kwargs)
  File "/mnt/project/predictor.py", line 44, in predict
    predicted_label = self.predict_image_classifier(model_name, payload["url"])
  File "/mnt/project/predictor.py", line 57, in predict_image_classifier
    results = self.client.predict(img, model)[self.config[model]["output_key"]]
  File "/src/cortex/lib/client/tensorflow.py", line 126, in predict
    return self._run_inference(model_input, model_name, model_version)
  File "/src/cortex/lib/client/tensorflow.py", line 287, in _run_inference
    upstream_model["path"],
  File "/src/cortex/lib/model/model.py", line 352, in download_model
    self._model_dir,
  File "/src/cortex/lib/api/predictor.py", line 529, in model_downloader
    sub_paths, ts = s3_client.search(model_path)
  File "/src/cortex/lib/storage/s3.py", line 151, in search
    for key, ts in self._get_matching_s3_keys_generator(prefix, suffix):
  File "/src/cortex/lib/storage/s3.py", line 106, in _get_matching_s3_keys_generator
    for obj in self._get_matching_s3_objects_generator(prefix, suffix):
  File "/src/cortex/lib/storage/s3.py", line 89, in _get_matching_s3_objects_generator
    resp = self.s3.list_objects_v2(**kwargs)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/client.py", line 337, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/client.py", line 629, in _make_api_call
    api_params, operation_model, context=request_context)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/client.py", line 675, in _convert_to_request_dict
    api_params, operation_model, context)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/client.py", line 707, in _emit_api_params
    params=api_params, model=operation_model, context=context)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
    return self._emitter.emit(aliased_event_name, **kwargs)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
    return self._emit(event_name, kwargs)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
    response = handler(**kwargs)
  File "/opt/conda/envs/env/lib/python3.6/site-packages/botocore/handlers.py", line 205, in validate_bucket_name
    raise ParamValidationError(report=error_msg)
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid bucket name "": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
@RobertLucian RobertLucian added the bug Something isn't working label Nov 24, 2020
@RobertLucian RobertLucian self-assigned this Dec 8, 2020
@deliahu deliahu added this to the v0.24 milestone Dec 8, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants