fix failed to create model archive #1508
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
@@ -70,6 +70,11 @@ public static ModelArchive downloadModel(

        if (new File(url).isDirectory()) {
            return load(url, new File(url), false);
        } else if (modelLocation.exists()) {
@lxning How would we handle the case where a directory contains multiple mar files? Are we relying on the user to have only one mar file in the /xxx/model_store/modelXXX directory? If so, we need to make sure this is documented clearly.
Do we have any documentation on it?
I understood this PR as no longer needing a model_store and just letting users link to various model files directly.
In the model_store dir, the existing code ignores mar files in subdirectories (i.e. /xxx/model_store/modelXXX).
@lxning Can you add more details as to why this change is being made? What changed that this stopped working?
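To make the question in this thread concrete, here is a minimal Python sketch of the two situations being discussed: mar files sitting directly in the model store versus mar files inside a per-model subdirectory. It is only an illustration, not TorchServe code; the function names `top_level_mar_files` and `mar_files_in_subdirs` are hypothetical.

```python
from pathlib import Path


def top_level_mar_files(model_store):
    """Return the sorted names of .mar files located directly in model_store.

    Files inside subdirectories are deliberately not visited, mirroring the
    behaviour described above for the existing code.
    """
    root = Path(model_store)
    return sorted(
        p.name for p in root.iterdir() if p.is_file() and p.suffix == ".mar"
    )


def mar_files_in_subdirs(model_store):
    """Map each immediate subdirectory name to the sorted .mar files it contains.

    A subdirectory with more than one entry is the ambiguous case raised in
    the review: it is unclear which archive should be loaded.
    """
    root = Path(model_store)
    result = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        mars = sorted(
            f.name for f in sub.iterdir() if f.is_file() and f.suffix == ".mar"
        )
        if mars:
            result[sub.name] = mars
    return result
```

A check like `len(mars) > 1` on the second function's output would be one way to surface the multiple-mar-file case to the user, if that is the behaviour the maintainers settle on.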
    "modelServerVersion": "1.0",
    "implementationVersion": "1.0",
    "specificationVersion": "1.0"
}
Do we need to store a model.pt in version control? Is it just an empty file?
Here, noop is an example, and model.pt is an empty file. The manifest requires it in order to pass validation.
import time


class NoopService(object):
I'm not sure I follow why a NoopService is needed; it seems very similar to the base handler. Could you please explain the design of this PR at a high level in the GitHub issue?
The example noop_no_archive_no_version is copied from the original noop_no_archive. I changed it slightly (i.e. removed the model version from manifest.json) so that it can unit test the default manifest bug fix.
Description
Please include a summary of the feature or issue being fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.
Fixes #(issue)
Type of change
#1498
Please delete options that are not relevant.
Feature/Issue validation/testing
Please describe the tests [UT/IT] that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.
Test A
Test B
Logs
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
vmargs=-Xmx4g -XX:+ExitOnOutOfMemoryError -XX:+HeapDumpOnOutOfMemoryError
prefer_direct_buffer=True
default_response_timeout=300
unregister_model_timeout=300
install_py_dep_per_model=true
default_service_handler=./model_store/noop_no_archive_no_version/service.py:handle
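The `default_service_handler` value above appears to combine a file path and a function name separated by a colon. The following Python sketch shows one way to read such a properties file and split that value; it assumes the `path:function` reading, and the helper names `parse_properties` and `split_handler` are hypothetical, not part of TorchServe.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values such as vmargs stay intact.
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def split_handler(spec):
    """Split 'path/to/service.py:function' into (path, function).

    Returns (spec, None) when no colon is present. The last colon is used,
    on the assumption that the function name itself never contains one.
    """
    path, sep, func = spec.rpartition(":")
    if not sep:
        return spec, None
    return path, func
```

With the configuration shown here, `split_handler` would yield the path `./model_store/noop_no_archive_no_version/service.py` and the function name `handle`.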
tree model_store
model_store
└── noop_no_archive_no_version
├── model.pt
└── service.py
torchserve --ncs --start --model-store model_store --models noop_no_archive_no_version --ts-config config.properties
server log:
Config file: config.properties
Inference address: http://0.0.0.0:8080
Management address: http://0.0.0.0:8081
Metrics address: http://127.0.0.1:8082
Model Store: /Volumes/workplace/python_env/serve/model_store
Initial Models: noop_no_archive_no_version
Log dir: /Volumes/workplace/python_env/serve/logs
Metrics dir: /Volumes/workplace/python_env/serve/logs
Netty threads: 32
Netty client threads: 0
Default workers per model: 12
Blacklist Regex: N/A
Maximum Response Size: 6553500
Maximum Request Size: 6553500
Limit Maximum Image Pixels: true
Prefer direct buffer: True
Allowed Urls: [file://.|http(s)?://.]
Custom python dependency for model allowed: true
Metrics report format: prometheus
Enable metrics API: true
Workflow Store: /Volumes/workplace/python_env/serve/model_store
Model config: N/A
2022-03-14T18:58:43,154 [INFO ] main org.pytorch.serve.servingsdk.impl.PluginsManager - Loading snapshot serializer plugin...
2022-03-14T18:58:43,186 [INFO ] main org.pytorch.serve.ModelServer - Loading initial models: noop_no_archive_no_version
2022-03-14T18:58:43,190 [WARN ] main org.pytorch.serve.archive.model.ModelArchive - Model archive version is not defined. Please upgrade to torch-model-archiver 0.2.0 or higher
2022-03-14T18:58:43,190 [WARN ] main org.pytorch.serve.archive.model.ModelArchive - Model archive createdOn is not defined. Please upgrade to torch-model-archiver 0.2.0 or higher
2022-03-14T18:58:43,193 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Adding new version 1.0 for model noop_no_archive_no_version
2022-03-14T18:58:43,193 [DEBUG] main org.pytorch.serve.wlm.ModelVersionedRefs - Setting default version to 1.0 for model noop_no_archive_no_version
2022-03-14T18:58:43,194 [INFO ] main org.pytorch.serve.wlm.ModelManager - Model noop_no_archive_no_version loaded.
2022-03-14T18:58:43,194 [DEBUG] main org.pytorch.serve.wlm.ModelManager - updateModel: noop_no_archive_no_version, count: 12
2022-03-14T18:58:43,216 [DEBUG] W-9002-noop_no_archive_no_version_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/Users/lninga/opt/anaconda3/envs/py38/bin/python, /Users/lninga/opt/anaconda3/envs/py38/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /var/folders/w6/s5gp9htn2pb9z87lwp6fzjg9hv4nys/T//.ts.sock.9002]
2022-03-14T18:58:43,216 [DEBUG] W-9005-noop_no_archive_no_version_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/Users/lninga/opt/anaconda3/envs/py38/bin/python, /Users/lninga/opt/anaconda3/envs/py38/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /var/folders/w6/s5gp9htn2pb9z87lwp6fzjg9hv4nys/T//.ts.sock.9005]
2022-03-14T18:58:43,216 [DEBUG] W-9006-noop_no_archive_no_version_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/Users/lninga/opt/anaconda3/envs/py38/bin/python, /Users/lninga/opt/anaconda3/envs/py38/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /var/folders/w6/s5gp9htn2pb9z87lwp6fzjg9hv4nys/T//.ts.sock.9006]
2022-03-14T18:58:43,216 [DEBUG] W-9008-noop_no_archive_no_version_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/Users/lninga/opt/anaconda3/envs/py38/bin/python, /Users/lninga/opt/anaconda3/envs/py38/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /var/folders/w6/s5gp9htn2pb9z87lwp6fzjg9hv4nys/T//.ts.sock.9008]
2022-03-14T18:58:43,216 [DEBUG] W-9001-noop_no_archive_no_version_1.0 org.pytorch.serve.wlm.WorkerLifeCycle - Worker cmdline: [/Users/lninga/opt/anaconda3/envs/py38/bin/python, /Users/lninga/opt/anaconda3/envs/py38/lib/python3.8/site-packages/ts/model_service_worker.py, --sock-type, unix, --sock-name, /var/folders/w6/s5gp9htn2pb9z87lwp6fzjg9hv4nys/T//.ts.sock.9001]
Checklist: