Fix interpretation of filename from model archive URL #2416

Merged
namannandan merged 1 commit into pytorch:master from naman-mar-url-filename-fix on Jun 15, 2023

Conversation

namannandan
Collaborator

Description

The current implementation keeps the query string when it extracts the model archive filename from an S3 presigned URL:

S3 presigned URL: https://test-account.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=token&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230614T182131Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=credential&X-Amz-Signature=signature

Interpreted model archive filename from URL: resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=token&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230614T182131Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=credential&X-Amz-Signature=signature

Expected model archive filename from URL: resnet-18.mar

This causes the download to fail.

Fixes #1293
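
The difference between the interpreted and the expected filename can be reproduced with the standard library alone. The following is a minimal sketch, not the code from this PR: the class `UrlFilenameSketch` and both method names are hypothetical, `naiveName` only stands in for taking the text after the last `/` of the raw string, and `nameFromPath` assumes that parsing the URL and using only its path component is an acceptable way to drop the query string.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical illustration class; not part of TorchServe.
final class UrlFilenameSketch {
    private UrlFilenameSketch() {}

    // Stand-in for the old behaviour: take everything after the last '/'
    // of the raw string, so any query string stays attached to the name.
    static String naiveName(String url) {
        return url.substring(url.lastIndexOf('/') + 1);
    }

    // Parse the URL first and look only at its path component, so the
    // query string and fragment never reach the filename.
    static String nameFromPath(String url) {
        String path = null;
        try {
            path = new URI(url).getPath(); // null for opaque URIs
        } catch (URISyntaxException e) {
            // Not a valid URI; fall through to the manual fallback below.
        }
        if (path == null) {
            // Fallback: cut the string at the first '?' or '#'.
            path = url.split("[?#]", 2)[0];
        }
        return path.substring(path.lastIndexOf('/') + 1);
    }

    public static void main(String[] args) {
        String url = "https://test-account.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar"
                + "?response-content-disposition=inline&X-Amz-Signature=signature";
        System.out.println("naive:      " + naiveName(url));
        System.out.println("path-based: " + nameFromPath(url));
    }
}
```

URLs without a query string produce the same result from both methods, so a path-based lookup should not change behaviour for plain archive URLs.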

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Feature/Issue validation/testing

  • Unit tests to cover the new change (runs in CI)
  • Regression tests in CI

Checklist:

  • Did you have fun?
  • Have you added tests that prove your fix is effective or that this feature works?
  • Has code been commented, particularly in hard-to-understand areas?
  • Have you made corresponding changes to the documentation?

@namannandan namannandan force-pushed the naman-mar-url-filename-fix branch from 6553560 to 7b1141e Compare June 15, 2023 00:37

codecov bot commented Jun 15, 2023

Codecov Report

Merging #2416 (7a1b155) into master (c2cdcfb) will not change coverage.
The diff coverage is n/a.

❗ Current head 7a1b155 differs from pull request most recent head 7b1141e. Consider uploading reports for the commit 7b1141e to get more accurate results

@@           Coverage Diff           @@
##           master    #2416   +/-   ##
=======================================
  Coverage   72.01%   72.01%           
=======================================
  Files          78       78           
  Lines        3648     3648           
  Branches       58       58           
=======================================
  Hits         2627     2627           
  Misses       1017     1017           
  Partials        4        4           


@namannandan namannandan marked this pull request as ready for review June 15, 2023 01:41
Collaborator

@lxning lxning left a comment

@namannandan the pre-signed URL is provided by the user. According to #1293, the URL does include "https://xxxx".

{
  "code": 400,
  "type": "DownloadArchiveException",
  "message": "Failed to download archive from: https://log-analyzer-torchserve-mar.s3.amazonaws.com/test_service/stage/uw1/21_10_26_00_23/anomaly_detection_1635207765.7685282.mar?AWSAccessKeyId=****&Signature=****=&x-amz-security-token=****&Expires=1635283752"
}

Please provide an end-to-end test result.

@@ -55,7 +54,7 @@ public static ModelArchive downloadModel(
 throw new ModelNotFoundException("empty url");
 }

-String marFileName = FilenameUtils.getName(url);
+String marFileName = ArchiveUtils.getFilenameFromUrl(url);

FilenameUtils.getName is used to extract the mar filename from the URL; it is the correct function. The original URL is passed to the function ArchiveUtils.downloadArchive.

The issue most likely happens in downloadArchive.

Collaborator Author

@namannandan namannandan Jun 15, 2023

Yes, that is correct. The call path is ModelArchive.downloadModel -> ArchiveUtils.downloadArchive -> HttpUtils.copyURLToFile -> FileUtils.copyURLToFile.
org.apache.commons.io.FileUtils.copyURLToFile throws an IOException because of the destination we pass to it in the case of an S3 presigned URL.

For example, in the current implementation, given an S3 presigned URL, we call org.apache.commons.io.FileUtils.copyURLToFile with the following arguments:

source: https://test-account.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=token&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230614T182131Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=credential&X-Amz-Signature=signature

destination: resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=token&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230614T182131Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=credential&X-Amz-Signature=signature

and this fails with an IOException.

For testing, I hardcoded the destination in HttpUtils.copyURLToFile to resnet-18.mar and the download succeeded.

Therefore, I've fixed the implementation in ModelArchive.java so that it correctly identifies the filename in a model archive URL that carries additional parameters after the filename. That filename is eventually passed to org.apache.commons.io.FileUtils.copyURLToFile as the destination.

With the fix in this PR, the arguments are passed correctly:
source: https://test-account.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=token&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230614T182131Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=credential&X-Amz-Signature=signature

destination: resnet-18.mar
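
The source/destination split described above can be sketched with the JDK only. This is an illustration under stated assumptions, not the TorchServe code path: the class `DownloadSketch`, its methods, and the `model_store` directory name are hypothetical, and `Files.copy` on `URL.openStream()` is used here as a stand-in for `FileUtils.copyURLToFile`. The point it shows is that the full presigned URL is kept as the source while only the path component decides the destination.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration class; not part of TorchServe.
final class DownloadSketch {
    private DownloadSketch() {}

    // Destination: the last segment of the URL path, inside modelStore.
    // The query string (signature, token, ...) is never part of it.
    static Path destinationFor(String url, Path modelStore) {
        String path = URI.create(url).getPath();
        if (path == null) {
            throw new IllegalArgumentException("URL has no path component");
        }
        String name = path.substring(path.lastIndexOf('/') + 1);
        if (name.isEmpty()) {
            throw new IllegalArgumentException("URL path does not end in a filename");
        }
        return modelStore.resolve(name);
    }

    // Source: the full URL, query string included, because the presigned
    // parameters are what authorise the request.
    static Path download(String url, Path modelStore) throws IOException {
        Path destination = destinationFor(url, modelStore);
        try (InputStream in = URI.create(url).toURL().openStream()) {
            Files.copy(in, destination, StandardCopyOption.REPLACE_EXISTING);
        }
        return destination;
    }
}
```

For the presigned URL quoted above, `destinationFor(url, Paths.get("model_store"))` resolves to `model_store/resnet-18.mar`, while `download` still opens the unmodified URL.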

@namannandan
Collaborator Author

@lxning, here is the end-to-end test result.

With the current implementation:

$ curl -G --data-urlencode 'url=https://namannan-dev.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEN3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCXVzLXdlc3QtMSJGMEQCIHQjHza1%2FN03h%2FvqrpnZIKlbOO0v7BxUVX8miNGB%2FiADAiA6d6za7juEq1t6c2IOsVNUd3YkukkozRmnkfMaCf1DnirlAgg2EAEaDDg1MDQ2NDAzNzE3MSIMytNIGBxmQmZENsrgKsICg%2FBFZOLTjNiwVUagqNCmEu2lt1lBvdSiBC3M7cHXYKy9rzdZ2odMxvefmSWPpjUQ4mWk6r19l2%2B%2FF%2BWzYvQXLgjheSdteChMksz40as%2FYdSnEJ8lG4WKP7%2FVJkqoAy8fe8H9N7Y%2FXvTCya7CbMUjzJabtOkqrIRXF0ZIEvAg9cC7SMJYHfi5nEfWxuUqhzzTeNMVa6CuUWlIi%2F5geaiFwJG4OG8whk5JfXx03GY7agF%2FMv%2FmYXTIDjhil1pgJ7tFzVCw1nGle2%2FqvDbU1iVGq3ENR224N2rIOt8oevZBznzNQ1IHPeQjeg7L2XoSIn8WEM%2BA%2BGx2tO9kV%2FkfPhZY5P223d9Dw549ikNg5KrEfLprxWRmULbhmUxY4OdVyyBQOcM%2Bc9Wvq53IBeDZKU65BuZaaY0LwS2XWSw7H%2BTvWkBq7TD3%2B62kBjqIAq7TEmb%2BPX2fOAUv5cDLkQeGx5RiG2qBgVQuIS6Rgg5SFCEKIb57RLtYjvgi0JZTqISi%2BfC1Ud0Fcv6HERXUguPyN59w9Weai8lXvxvQRg8QQNGvcLdA4KpTtUDCk6SywrTYxkeq%2BYiGI7gTmk9SgnJgBKo3TOUQ3ysTn3e2gInIbZOzN9X1eF608bvI%2Bwp%2BLl%2FoES0txgx2ADO%2FJDPgN5wOyqvAl47tHJxDX9BWRxj0IkOd8nzcEP%2B7w%2B8TOu%2BTzjOHQQh9DXmN5NEYgqy3jmSBka1JdRrhmgvpug4h3Ah4tx65dmLFH9h0KxGatlBz7CnSyS32Unsgb7YCZFv9WPVkBFchvEl%2B5g%3D%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230615T211458Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=ASIA4MA43LEZ5KHSO5GS%2F20230615%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=a9ba95973c6238a10f8855330e4a586de417a833a2ed7119c5b195aa981aed31' -X POST "http://localhost:8081/models"
{
  "code": 400,
  "type": "DownloadArchiveException",
  "message": "Failed to download archive from: https://namannan-dev.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEN3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCXVzLXdlc3QtMSJGMEQCIHQjHza1%2FN03h%2FvqrpnZIKlbOO0v7BxUVX8miNGB%2FiADAiA6d6za7juEq1t6c2IOsVNUd3YkukkozRmnkfMaCf1DnirlAgg2EAEaDDg1MDQ2NDAzNzE3MSIMytNIGBxmQmZENsrgKsICg%2FBFZOLTjNiwVUagqNCmEu2lt1lBvdSiBC3M7cHXYKy9rzdZ2odMxvefmSWPpjUQ4mWk6r19l2%2B%2FF%2BWzYvQXLgjheSdteChMksz40as%2FYdSnEJ8lG4WKP7%2FVJkqoAy8fe8H9N7Y%2FXvTCya7CbMUjzJabtOkqrIRXF0ZIEvAg9cC7SMJYHfi5nEfWxuUqhzzTeNMVa6CuUWlIi%2F5geaiFwJG4OG8whk5JfXx03GY7agF%2FMv%2FmYXTIDjhil1pgJ7tFzVCw1nGle2%2FqvDbU1iVGq3ENR224N2rIOt8oevZBznzNQ1IHPeQjeg7L2XoSIn8WEM%2BA%2BGx2tO9kV%2FkfPhZY5P223d9Dw549ikNg5KrEfLprxWRmULbhmUxY4OdVyyBQOcM%2Bc9Wvq53IBeDZKU65BuZaaY0LwS2XWSw7H%2BTvWkBq7TD3%2B62kBjqIAq7TEmb%2BPX2fOAUv5cDLkQeGx5RiG2qBgVQuIS6Rgg5SFCEKIb57RLtYjvgi0JZTqISi%2BfC1Ud0Fcv6HERXUguPyN59w9Weai8lXvxvQRg8QQNGvcLdA4KpTtUDCk6SywrTYxkeq%2BYiGI7gTmk9SgnJgBKo3TOUQ3ysTn3e2gInIbZOzN9X1eF608bvI%2Bwp%2BLl%2FoES0txgx2ADO%2FJDPgN5wOyqvAl47tHJxDX9BWRxj0IkOd8nzcEP%2B7w%2B8TOu%2BTzjOHQQh9DXmN5NEYgqy3jmSBka1JdRrhmgvpug4h3Ah4tx65dmLFH9h0KxGatlBz7CnSyS32Unsgb7YCZFv9WPVkBFchvEl%2B5g%3D%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230615T211458Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=ASIA4MA43LEZ5KHSO5GS%2F20230615%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=a9ba95973c6238a10f8855330e4a586de417a833a2ed7119c5b195aa981aed31"
}

With the fix in this PR:

$ curl -G --data-urlencode 'url=https://namannan-dev.s3.us-west-2.amazonaws.com/mar_files/resnet-18.mar?response-content-disposition=inline&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEN3%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCXVzLXdlc3QtMSJGMEQCIHQjHza1%2FN03h%2FvqrpnZIKlbOO0v7BxUVX8miNGB%2FiADAiA6d6za7juEq1t6c2IOsVNUd3YkukkozRmnkfMaCf1DnirlAgg2EAEaDDg1MDQ2NDAzNzE3MSIMytNIGBxmQmZENsrgKsICg%2FBFZOLTjNiwVUagqNCmEu2lt1lBvdSiBC3M7cHXYKy9rzdZ2odMxvefmSWPpjUQ4mWk6r19l2%2B%2FF%2BWzYvQXLgjheSdteChMksz40as%2FYdSnEJ8lG4WKP7%2FVJkqoAy8fe8H9N7Y%2FXvTCya7CbMUjzJabtOkqrIRXF0ZIEvAg9cC7SMJYHfi5nEfWxuUqhzzTeNMVa6CuUWlIi%2F5geaiFwJG4OG8whk5JfXx03GY7agF%2FMv%2FmYXTIDjhil1pgJ7tFzVCw1nGle2%2FqvDbU1iVGq3ENR224N2rIOt8oevZBznzNQ1IHPeQjeg7L2XoSIn8WEM%2BA%2BGx2tO9kV%2FkfPhZY5P223d9Dw549ikNg5KrEfLprxWRmULbhmUxY4OdVyyBQOcM%2Bc9Wvq53IBeDZKU65BuZaaY0LwS2XWSw7H%2BTvWkBq7TD3%2B62kBjqIAq7TEmb%2BPX2fOAUv5cDLkQeGx5RiG2qBgVQuIS6Rgg5SFCEKIb57RLtYjvgi0JZTqISi%2BfC1Ud0Fcv6HERXUguPyN59w9Weai8lXvxvQRg8QQNGvcLdA4KpTtUDCk6SywrTYxkeq%2BYiGI7gTmk9SgnJgBKo3TOUQ3ysTn3e2gInIbZOzN9X1eF608bvI%2Bwp%2BLl%2FoES0txgx2ADO%2FJDPgN5wOyqvAl47tHJxDX9BWRxj0IkOd8nzcEP%2B7w%2B8TOu%2BTzjOHQQh9DXmN5NEYgqy3jmSBka1JdRrhmgvpug4h3Ah4tx65dmLFH9h0KxGatlBz7CnSyS32Unsgb7YCZFv9WPVkBFchvEl%2B5g%3D%3D&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20230615T211458Z&X-Amz-SignedHeaders=host&X-Amz-Expires=43200&X-Amz-Credential=ASIA4MA43LEZ5KHSO5GS%2F20230615%2Fus-west-2%2Fs3%2Faws4_request&X-Amz-Signature=a9ba95973c6238a10f8855330e4a586de417a833a2ed7119c5b195aa981aed31' -X POST "http://localhost:8081/models"
{
  "status": "Model \"resnet-18\" Version: 1.0 registered with 0 initial workers. Use scale workers API to add workers for the model."
}
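
For anyone reproducing the registration call from Java rather than curl, a rough equivalent might look like the sketch below. It assumes Java 11 or later for `java.net.http`, takes the management address `http://localhost:8081` and the `/models` endpoint from the curl command above, and uses a hypothetical class name, `RegisterModelSketch`. Like `curl -G --data-urlencode`, it percent-encodes the whole presigned URL into the `url` query parameter so that its own `&`-separated parameters are not split off.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of TorchServe.
final class RegisterModelSketch {
    private RegisterModelSketch() {}

    // Build <managementBase>/models?url=<percent-encoded archive URL>.
    static URI registerUri(String managementBase, String archiveUrl) {
        String encoded = URLEncoder.encode(archiveUrl, StandardCharsets.UTF_8);
        return URI.create(managementBase + "/models?url=" + encoded);
    }

    // POST with an empty body; the archive URL travels in the query string.
    static HttpRequest registerRequest(String managementBase, String archiveUrl) {
        return HttpRequest.newBuilder(registerUri(managementBase, archiveUrl))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: RegisterModelSketch <presigned-archive-url>");
            return;
        }
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                registerRequest("http://localhost:8081", args[0]),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

Encoding the already percent-encoded presigned URL a second time is intentional: the server decodes the `url` parameter once and recovers the original presigned URL unchanged.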

@namannandan namannandan requested a review from lxning June 15, 2023 21:52
@namannandan namannandan merged commit 679b33d into pytorch:master Jun 15, 2023
Development

Successfully merging this pull request may close these issues.

Model Management API doesn't work with S3 presigned URL
3 participants