Add S3 and Cloud management Diagrams to our overall architecture #48

Open
2 tasks
10d9e opened this issue Jun 8, 2023 · 10 comments

@10d9e
Contributor

10d9e commented Jun 8, 2023

We would like to have S3 integration added to our overall architecture diagrams. The diagrams should demonstrate how Amazon/Azure/GCP object storage (S3 and its equivalents) fits into the Delta stack.

cc: @schreck23 @jimmylee

Tasks

@alvin-reyes
Collaborator

alvin-reyes commented Jun 9, 2023

Process Flow

This is how I would design an ingestion layer that uses a separate adapter for each cloud storage provider:
image
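
To make the adapter idea concrete, here is a minimal Python sketch of an ingestion layer that dispatches to per-provider adapters. Everything in it is an assumption made for illustration: `StorageAdapter`, `InMemoryAdapter`, and `IngestionLayer` are hypothetical names, not part of Delta, and the in-memory adapter stands in for a real provider SDK.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator


class StorageAdapter(ABC):
    """Hypothetical interface that each cloud provider adapter would implement."""

    @abstractmethod
    def list_keys(self, container: str) -> Iterator[str]:
        """Yield the object keys in a bucket/container."""

    @abstractmethod
    def read(self, container: str, key: str) -> bytes:
        """Return the bytes of one object."""


class InMemoryAdapter(StorageAdapter):
    """Stand-in for a real provider SDK so the sketch runs without credentials."""

    def __init__(self, data: Dict[str, Dict[str, bytes]]):
        self._data = data

    def list_keys(self, container: str) -> Iterator[str]:
        return iter(sorted(self._data.get(container, {})))

    def read(self, container: str, key: str) -> bytes:
        return self._data[container][key]


class IngestionLayer:
    """Routes an ingestion request to whichever adapter is registered for a provider."""

    def __init__(self) -> None:
        self._adapters: Dict[str, StorageAdapter] = {}

    def register(self, provider: str, adapter: StorageAdapter) -> None:
        self._adapters[provider] = adapter

    def ingest(self, provider: str, container: str) -> Dict[str, bytes]:
        adapter = self._adapters[provider]  # KeyError if the provider is unknown
        return {key: adapter.read(container, key) for key in adapter.list_keys(container)}
```

A real S3, Azure, or GCP adapter would wrap the provider's SDK behind the same two methods, so the ingestion layer itself would not change when a provider is added.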

@schreck23

This won't let us adhere fully to the S3 standard. The flow is more complicated than this if we are truly going to support S3, especially as our customer list diversifies. The S3 protocol is very detailed and requires auxiliary storage to maintain customer-specific data beyond the core objects.

@schreck23

If S3 is our goal, this will require detailed planning, especially when we get into S3 policy management and verb support. Even calls like head_object are important, and this flow needs to account for handling that metadata as well as ACL/bucket mapping and other things. How standardized do we need to be? Even 50% compliance is a significant lift.
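
As a rough illustration of why auxiliary storage is needed, the sketch below answers a `head_object`-style call purely from stored metadata, after a bucket lookup and an access check. It is a simplified assumption, not the S3 specification: `MetadataStore`, `Bucket`, and `ObjectMeta` are hypothetical names, the ACL is reduced to an owner plus a set of readers, and real S3 policies and error semantics are far richer.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class ObjectMeta:
    """Per-object metadata a HEAD request needs; no object bytes are stored here."""
    size: int
    etag: str
    content_type: str = "application/octet-stream"
    user_metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Bucket:
    """Bucket mapping with a deliberately simplified ACL (owner plus readers)."""
    owner: str
    readers: Set[str] = field(default_factory=set)


class MetadataStore:
    """Hypothetical auxiliary store holding bucket mappings and object metadata."""

    def __init__(self) -> None:
        self.buckets: Dict[str, Bucket] = {}
        self.objects: Dict[Tuple[str, str], ObjectMeta] = {}

    def head_object(self, caller: str, bucket: str, key: str) -> Tuple[int, Optional[ObjectMeta]]:
        """Return an HTTP-style status and, on success, the object's metadata."""
        b = self.buckets.get(bucket)
        if b is None:
            return 404, None  # unknown bucket
        if caller != b.owner and caller not in b.readers:
            return 403, None  # caller not permitted by the simplified ACL
        meta = self.objects.get((bucket, key))
        if meta is None:
            return 404, None  # unknown key
        return 200, meta
```

Even this toy version needs three pieces of state besides the objects themselves (bucket ownership, access lists, and per-object metadata), which is the planning concern raised above.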

@alvin-reyes
Collaborator

Do we want a more extensive diagram? My thought was just a simplified, high-level diagram that conceptualizes the integration.

image

@alimbuyuguen

lmk when this is good to go and I can start chugging <3

@schreck23

I'll put something together to highlight my thoughts then we can delegate what we wish to support/attack straightaway in v1.

@schreck23

The S3 problem is not pulling from S3; it's providing an S3-compliant endpoint where we can receive data directly. I am nearly done with an S3-compliant connector for Ptolemy, but the problem we are discussing here is a completely different animal.

@alvin-reyes
Collaborator

Thanks. Interesting, so this adapter is meant to replace the client library that they are using to upload from S3, with the same endpoints and the same POST body expectations, so the dev UX will be the same.
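
One way to picture "same endpoints" is a router that maps path-style S3 requests (`/{bucket}/{key}`) to the corresponding S3 operation names. The Python sketch below is an assumption for discussion only: `route` is a hypothetical helper, it covers just the basic bucket and object verbs, and it ignores virtual-hosted-style addressing, query-string sub-resources, and request signing.

```python
from typing import Optional, Tuple
from urllib.parse import unquote, urlsplit

# Basic verb-to-operation tables for path-style requests (deliberately incomplete).
OBJECT_OPS = {"PUT": "PutObject", "GET": "GetObject", "HEAD": "HeadObject", "DELETE": "DeleteObject"}
BUCKET_OPS = {"PUT": "CreateBucket", "GET": "ListObjects", "HEAD": "HeadBucket", "DELETE": "DeleteBucket"}


def route(method: str, url: str) -> Tuple[str, str, Optional[str]]:
    """Map an HTTP method and path-style URL to (operation, bucket, key)."""
    path = urlsplit(url).path.lstrip("/")
    if not path:
        raise ValueError("service-level requests are not handled in this sketch")
    bucket, _, key = path.partition("/")
    table = OBJECT_OPS if key else BUCKET_OPS
    try:
        operation = table[method.upper()]
    except KeyError:
        raise ValueError(f"unsupported method for this sketch: {method}") from None
    return operation, unquote(bucket), unquote(key) if key else None
```

For example, `route("PUT", "/my-bucket/dir/file.txt")` resolves to the `PutObject` operation on bucket `my-bucket` with key `dir/file.txt`, which is the request an existing S3 client would already be sending.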

@10d9e
Contributor Author

10d9e commented Jun 14, 2023

per @alimbuyuguen - the latest image:

image

@schreck23

There is one small issue I just caught in the diagram: it should say that metadata is going to the DB. We are not persisting objects there.
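
The split described here can be sketched as a write path where object bytes go to a blob store and only a metadata row goes to the database. This is an illustrative assumption, not the actual Delta design: the `object_meta` table, `open_metadata_db`, and `put_object` are hypothetical, a plain dict stands in for the blob store, and the ETag is computed as an MD5 hex digest, which matches S3 only for simple single-part uploads.

```python
import hashlib
import sqlite3
from typing import Dict, Tuple


def open_metadata_db() -> sqlite3.Connection:
    """Create an in-memory metadata DB; note there is no column for object bytes."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE object_meta ("
        " bucket TEXT NOT NULL,"
        " object_key TEXT NOT NULL,"
        " size INTEGER NOT NULL,"
        " etag TEXT NOT NULL,"
        " PRIMARY KEY (bucket, object_key))"
    )
    return db


def put_object(
    db: sqlite3.Connection,
    blob_store: Dict[Tuple[str, str], bytes],
    bucket: str,
    key: str,
    body: bytes,
) -> str:
    """Store the bytes in the blob store and record only metadata in the DB."""
    etag = hashlib.md5(body).hexdigest()
    blob_store[(bucket, key)] = body  # object bytes live outside the DB
    db.execute(
        "INSERT OR REPLACE INTO object_meta VALUES (?, ?, ?, ?)",
        (bucket, key, len(body), etag),
    )
    db.commit()
    return etag
```

A HEAD-style lookup can then be answered from `object_meta` alone, without touching the blob store.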


4 participants