Support multipart blob download #5715
Signed-off-by: wayner0628 <a901639@gmail.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5715      +/-   ##
==========================================
+ Coverage   36.21%   36.25%   +0.03%
==========================================
  Files        1303     1303
  Lines      109644   109713     +69
==========================================
+ Hits        39710    39774     +64
+ Misses      65810    65802      -8
- Partials     4124     4137     +13
Thanks @wayner0628, I think this is good. I want @eapolinario or @EngHabu to take a quick look at this as well, though, since this PR changes a pretty core interface.
flytestdlib/storage/storage.go
Outdated
@@ -78,6 +78,9 @@ type RawStore interface {
	// Head gets metadata about the reference. This should generally be a light weight operation.
	Head(ctx context.Context, reference DataReference) (Metadata, error)

	// GetItems retrieves the paths of all items from the Blob store or an error
	GetItems(ctx context.Context, reference DataReference) ([]string, error)
Would this be more accurately named ListItems? Also, what is retrieved? The path relative to the input reference? Can we add a comment?
flytestdlib/storage/mem_store.go
Outdated
@@ -54,6 +55,23 @@ func (s *InMemoryStore) Head(ctx context.Context, reference DataReference) (Meta
	}, nil
}

func (s *InMemoryStore) GetItems(ctx context.Context, reference DataReference) ([]string, error) {
	var items []string
	prefix := string(reference) + "/"
Will reference ever already end with a /?
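One way to make the question above moot is to normalize the prefix before matching. The following is a minimal sketch, not the flytestdlib implementation: `listUnder` and the plain `map[string][]byte` store are hypothetical stand-ins for the in-memory store, and the keys are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// listUnder returns the keys stored under reference, treating reference as a
// directory. A trailing "/" is trimmed before one is appended, so
// "mem://dir" and "mem://dir/" produce the same prefix.
// Hypothetical helper for illustration; not the flytestdlib implementation.
func listUnder(store map[string][]byte, reference string) []string {
	prefix := strings.TrimSuffix(reference, "/") + "/"
	var items []string
	for key := range store {
		if strings.HasPrefix(key, prefix) {
			items = append(items, key)
		}
	}
	sort.Strings(items) // map iteration order is random; sort for stable output
	return items
}

func main() {
	store := map[string][]byte{
		"mem://dir/a.txt":     []byte("a"),
		"mem://dir/sub/b.txt": []byte("b"),
		"mem://dirty/c.txt":   []byte("c"),
	}
	fmt.Println(listUnder(store, "mem://dir"))
	fmt.Println(listUnder(store, "mem://dir/"))
}
```

Both calls list the same two keys, and `mem://dirty/c.txt` is excluded because the prefix always ends in exactly one "/".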
Hi @wayner0628,
Can you test cases like the ones in this PR?
flyteorg/flytekit#2258
More specifically, this case:
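from flytekit import ContainerTask, kwtypes
from flytekit.types.directory import FlyteDirectory

# Raw container task that takes a FlyteDirectory as input and writes one as output.
flyte_dir_io = ContainerTask(
    name="flyte_dir_io",
    input_data_dir="/var/inputs",
    output_data_dir="/var/outputs",
    inputs=kwtypes(inputs=FlyteDirectory),
    outputs=kwtypes(out=FlyteDirectory),
    image="futureoutlier/rawcontainer:0320",
    command=[
        "python",
        "write_flytedir.py",
        "{{.inputs.inputs}}",
        "/var/outputs/out",
    ],
)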
If possible, please provide a screenshot. Thank you.
There is also this PR, https://github.com/flyteorg/flyte/pull/5674/files, which I think we should merge first. The change to the core API should probably be done separately.
@wayner0628 #5741 was just merged, adding a list API to the storage client. Mind using the new interface to do this?
@wild-endeavor No problem, I'll update this PR to align with the new interface.
Tips for developing copilot with the single binary:
- config
plugins:
  logs:
    dynamic-log-links:
      - comet-ml-execution-id:
          displayName: Comet
          templateUris: "{{ .taskConfig.host }}/{{ .taskConfig.workspace }}/{{ .taskConfig.project_name }}/{{ .executionName }}{{ .nodeId }}{{ .taskRetryAttempt }}{{ .taskConfig.link_suffix }}"
      - comet-ml-custom-id:
          displayName: Comet
          templateUris: "{{ .taskConfig.host }}/{{ .taskConfig.workspace }}/{{ .taskConfig.project_name }}/{{ .taskConfig.experiment_key }}"
    kubernetes-enabled: true
    kubernetes-template-uri: http://localhost:30080/kubernetes-dashboard/#/log/{{.namespace }}/{{ .podName }}/pod?namespace={{ .namespace }}
    cloudwatch-enabled: false
    stackdriver-enabled: false
  k8s:
    default-env-vars:
      - FLYTE_AWS_ENDPOINT: "http://flyte-sandbox-minio.flyte:9000"
      - FLYTE_AWS_ACCESS_KEY_ID: minio
      - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
      - MLFLOW_TRACKING_URI: postgresql+psycopg2://postgres:@postgres.flyte.svc.cluster.local:5432/flyteadmin
    co-pilot:
      image: "localhost:30000/copilot-flytefile:0603"
- How to build the copilot image?
  Use Dockerfile.flytecopilot to build it.
…ltipart-blob Signed-off-by: wayner0628 <a901639@gmail.com>
Hi @Future-Outlier and @wild-endeavor, I’ve been encountering an issue while running a Flytekit test case. The error I'm seeing is as follows:
The error occurs with:
flytectl demo start --dev
POD_NAMESPACE=flyte ./flyte start --config flyte-single-binary-local.yaml
pyflyte run --remote raw_container.py calculate_ellipse_area_shell --a 1.1 --b 1.2
Environment Details:
I build, tag, and push the modified Docker image when testing this PR. For the Flytesnacks example I did not use the modified image, yet it still failed. This has been blocking me for a couple of weeks now. I’ll continue investigating, but any help or guidance you could provide would be greatly appreciated! Thank you in advance.
Can you show me your config file?
It's the original one that I used to run Flytesnacks.
You have to add the co-pilot image:
k8s:
  default-env-vars:
    - FLYTE_AWS_ENDPOINT: "http://flyte-sandbox-minio.flyte:9000"
    - FLYTE_AWS_ACCESS_KEY_ID: minio
    - FLYTE_AWS_SECRET_ACCESS_KEY: miniostorage
    - MLFLOW_TRACKING_URI: postgresql+psycopg2://postgres:@postgres.flyte.svc.cluster.local:5432/flyteadmin
  co-pilot:
    image: "cr.flyte.org/flyteorg/flytecopilot:v1.13.1"
@Future-Outlier, I'll try it later, thank you.
@Future-Outlier, I added the copilot image.
@wayner0628 Show me your Python code and your whole k8s config.
Tracking issue
#3632
Why are the changes needed?
Supporting multipart blob downloads allows us to completely copy the specified directory into the input path.
What changes were proposed in this pull request?
- List API to collect items under the container before download
- List API for memory storage
How was this patch tested?
Unit tests, specifically in download_test.go.
Setup process
Screenshots
Check all the applicable boxes
Related PRs
flyteorg/flytekit#2258
Docs link
NA