aws-s3-assets: Object keys don't preserve the full extension name for Python project #30257
Reproducible using the code below:

import pathlib

import aws_cdk as cdk
from aws_cdk import (
    Stack,
    aws_s3_assets,
    aws_lambda
)
from constructs import Construct

HERE = pathlib.Path(__file__).parent


class Issue30257Stack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        self.models_s3_artifact = aws_s3_assets.Asset(
            self,
            "Models",
            path=str(HERE),
            bundling=cdk.BundlingOptions(
                image=cdk.DockerImage.from_registry("debian"),
                command=[
                    "bash", "-c", "tar -czvf /asset-output/dummy.tar.gz --files-from /dev/null"
                ],
                output_type=cdk.BundlingOutput.ARCHIVED
            )
        )

        lambdaFunction = aws_lambda.Function(
            self, "TestLambdaFunction",
            runtime=aws_lambda.Runtime.NODEJS_18_X,
            code=aws_lambda.Code.from_asset("lambda"),
            handler="hello.handler",
            environment=dict(
                S3_BUCKET_NAME=self.models_s3_artifact.s3_bucket_name,
                S3_OBJECT_KEY=self.models_s3_artifact.s3_object_key,
                S3_OBJECT_URL=self.models_s3_artifact.s3_object_url)
        )

        cdk.CfnOutput(self, "TarURI", value=self.models_s3_artifact.s3_object_url)

Output of …
Also reproducible using the TypeScript code below (create …):

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import * as s3assets from 'aws-cdk-lib/aws-s3-assets';
import * as lambda from 'aws-cdk-lib/aws-lambda';
import * as path from 'path';

export class TypescriptStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);

    const lambdaAsset = new s3assets.Asset(this, "LambdaAsset", {
      path: path.join(path.resolve(__dirname, '..'), 'assets/dummy.tar.gz')
    });

    new lambda.Function(this, "myLambdaFunction", {
      code: lambda.Code.fromAsset('lambda'),
      runtime: lambda.Runtime.NODEJS_18_X,
      handler: "index.lambda_handler",
      environment: {
        'S3_BUCKET_NAME': lambdaAsset.s3BucketName,
        'S3_OBJECT_KEY': lambdaAsset.s3ObjectKey,
        'S3_OBJECT_URL': lambdaAsset.s3ObjectUrl
      }
    });

    new cdk.CfnOutput(this, "AssetURI", {
      value: lambdaAsset.s3ObjectUrl
    });
  }
}

The related issue #12699 was fixed for CDK v1 (which is deprecated now). Perhaps the same fix needs to be ported to CDK v2.
OK, I read this in the SageMaker doc:

"And that model has to be uploaded to an Amazon S3 customer-managed bucket."

Is this your use case? If so, I don't think you should use S3 Assets like that. You should use an S3 bucket deployment this way:

import {
  aws_s3 as s3,
  aws_s3_deployment as s3d,
} from 'aws-cdk-lib';

export class DummyStack extends Stack {
  constructor(scope: Construct, id: string, props: StackProps) {
    super(scope, id, props);

    const sageMakerModelBucket = new s3.Bucket(this, 'SageMakerModelBucket', {
      removalPolicy: RemovalPolicy.DESTROY,
    })

    const deployed = new s3d.BucketDeployment(this, 'SageMakerModelDeployment', {
      destinationBucket: sageMakerModelBucket,
      sources: [s3d.Source.asset(path.join(__dirname, '../assets'))],
      extract: true,
    })

    new CfnOutput(this, 'SageMakerModelBucketOutput', { value: `s3://${sageMakerModelBucket.bucketName}` });
  };
}
Given a local assets/dummy.tar.gz:

% ls -al assets/dummy.tar.gz

You should see the bucketName in the output. Try:

% aws s3 ls s3://dummy-stack-sagemakermodelbucket6aea4c01-xxx
2024-05-18 08:31:30       1610 dummy.tar.gz

That's it! Generally, aws-s3-assets is for creating the CDK assets needed by a CDK app, for example Lambda function code or anything else that requires staging assets in the CDK file-assets bucket created by cdk bootstrap, in the format of …

I hope this clarifies. Let me know if it works for you.
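Since the original report is a Python project, a rough Python equivalent of the BucketDeployment approach above could look like the following. This is a minimal sketch, assuming aws-cdk-lib v2; the construct IDs and the assets directory path are illustrative, not taken from the thread.

from pathlib import Path

from aws_cdk import (
    CfnOutput,
    RemovalPolicy,
    Stack,
    aws_s3 as s3,
    aws_s3_deployment as s3d,
)
from constructs import Construct

ASSETS_DIR = Path(__file__).parent / "assets"  # directory containing dummy.tar.gz


class DummyStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        model_bucket = s3.Bucket(
            self, "SageMakerModelBucket",
            removal_policy=RemovalPolicy.DESTROY,
        )

        # BucketDeployment copies the contents of the assets directory; with
        # extract=True the original file names (e.g. dummy.tar.gz) are kept.
        s3d.BucketDeployment(
            self, "SageMakerModelDeployment",
            destination_bucket=model_bucket,
            sources=[s3d.Source.asset(str(ASSETS_DIR))],
            extract=True,
        )

        CfnOutput(self, "SageMakerModelBucketOutput", value=f"s3://{model_bucket.bucket_name}")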
Hi @pahud, I'm deploying custom PyTorch inference logic as part of a SageMaker endpoint. It's the standard flow as described here, where I've created custom inference logic, and I have something similar to what's described in the doc. Note that I haven't compressed everything into model.tar.gz yet, and I was hoping I could use aws_s3_assets.Asset to that end (since this is only for packaging and deploying the endpoint, I don't want to keep running tar -czvf ... commands manually, and the Asset's bundling functionality fits that perfectly):
I've developed the model locally and inference now works locally (I have the pre-trained weights model.pth and a working inference.py with the custom inference logic). To deploy the endpoint one must create the tarball, upload it to S3, and create the endpoint.
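Not an authoritative workaround, but to avoid running tar by hand while keeping the full model.tar.gz name, one option might be to do the archiving inside the bundling step of the deployment source rather than in aws_s3_assets.Asset. A sketch under the assumption that the model files (model.pth, inference.py, ...) live in a local model/ directory; the directory name, container image, and construct IDs are illustrative:

import aws_cdk as cdk
from aws_cdk import Stack, aws_s3 as s3, aws_s3_deployment as s3d
from constructs import Construct


class ModelDeployStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        model_bucket = s3.Bucket(self, "ModelBucket")

        s3d.BucketDeployment(
            self, "ModelDeployment",
            destination_bucket=model_bucket,
            sources=[
                # The bundling container runs tar at synth time and writes
                # model.tar.gz into a non-archived output directory; with
                # extract=True the deployed object should keep its full name.
                s3d.Source.asset(
                    "model",  # hypothetical directory with model.pth, inference.py, ...
                    bundling=cdk.BundlingOptions(
                        image=cdk.DockerImage.from_registry("debian"),
                        command=[
                            "bash", "-c",
                            "tar -czf /asset-output/model.tar.gz -C /asset-input .",
                        ],
                        output_type=cdk.BundlingOutput.NOT_ARCHIVED,
                    ),
                )
            ],
            extract=True,
        )

        cdk.CfnOutput(
            self, "ModelDataUrl",
            value=f"s3://{model_bucket.bucket_name}/model.tar.gz",
        )

As I understand it, NOT_ARCHIVED makes CDK zip the bundling output directory itself, and BucketDeployment then unzips it into the destination bucket, so the object key is the file name rather than a truncated asset hash. I have not verified this end to end, so treat it as a sketch rather than a confirmed fix.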
Describe the bug
Same as #12699 but for a Python project.
Neither a plain upload nor bundling the asset with a command in a Docker container preserves the full extension (HASH.tar.gz -> HASH.gz).
The SageMaker endpoint expects a "tar.gz" file.
Expected Behavior
I expect the S3 object key to end with .tar.gz.
Current Behavior
When using aws-s3-assets to bundle the asset HASH.tar.gz by executing a command in a Docker container and uploading it to S3, the key is renamed to HASH.gz.
The same happens when passing HASH.tar.gz directly: it is uploaded to S3 but the key is renamed to HASH.gz.
Reproduction Steps
Reproduction steps for Docker (the same behavior occurs when omitting bundling and passing path="path/my.tar.gz"):
Code: see the Python example above.
Locally, in cdk.out, the tarball has the correct extension, but in S3 and in the CloudFormation output it is wrong:
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.142.1
Framework Version
No response
Node.js Version
v18.17.1
OS
MacOSX
Language
Python
Language Version
Python 3.9.12
Other information
No response