Eliminate collections package, remove /files endpoint #1444

Closed
acud opened this issue Mar 17, 2021 · 4 comments · Fixed by #1501 or #1525

@acud
Member

acud commented Mar 17, 2021

Task

The collections package needs to be deprecated, as it proxies every manifest on Swarm. It is a remnant of a period when we were still pivoting on the approach we wanted to use for manifests.

Right now, when a user uploads a directory or a file, the file(s) are processed: their arbitrary-length content is chunked and saved, yielding the content-addressed hash for the content. For browsers to display a file correctly, we need to know its type, so metadata is required, along with file names, directories, etc. This metadata is used with the mantaray manifest, which is used in the /dirs endpoint (but not in /files, where only a metadata entry from the collections package is used).

Arising from this change there are 2 problems:

  • Existing content cannot be accessed
  • /files endpoint becomes unusable, since we need to add support for metadata in order to display the file correctly

The solutions:

  • For existing content, an external (open source) tool must be written. This lets us remove the code from the codebase while still allowing users to get the correct content-addressed hash to use after this change is merged. The tool should mimic the behavior of the existing manifest type that will be removed: given an existing root hash, it should return the correct hash, which already exists in the system (but currently takes a few more retrievals to reach). (cc @agazso @vojtechsimetka, this might be a good opportunity for synergy)
  • The /files endpoint should be removed, and either the /bzz or the /dirs endpoint should support single-file uploads. This can be handled cleanly. Right now, /dirs needs an x-tar content type to upload a directory. We can catch all "other" content types and fall back to a "single-file mode", which generates a manifest with just a single entry holding the metadata and the content-addressed hash needed for retrieval.
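
A minimal sketch of the content-type fallback described above, assuming the "x-tar" content type means the `application/x-tar` media type. The names `UploadMode` and `uploadModeFor` are hypothetical and are not taken from the Bee codebase.

```go
package main

import (
	"fmt"
	"mime"
)

// UploadMode names the two upload paths discussed in the issue.
// The type and its values are hypothetical.
type UploadMode string

const (
	ModeDirectory  UploadMode = "directory"
	ModeSingleFile UploadMode = "single-file"
)

// tarMediaType is an assumption: the media type the issue refers to as "x-tar".
const tarMediaType = "application/x-tar"

// uploadModeFor picks the upload path from a Content-Type header value.
// Anything that is not a tar archive, including an empty or malformed
// header, falls back to single-file mode.
func uploadModeFor(contentType string) UploadMode {
	// ParseMediaType lowercases the media type and strips parameters.
	mediaType, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return ModeSingleFile
	}
	if mediaType == tarMediaType {
		return ModeDirectory
	}
	return ModeSingleFile
}

func main() {
	for _, ct := range []string{"application/x-tar", "image/png", ""} {
		fmt.Printf("%q -> %s\n", ct, uploadModeFor(ct))
	}
}
```

Treating every non-tar type as a single file keeps the check to one comparison, at the cost of accepting requests with a missing or malformed content type; a real handler might prefer to reject those.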

Acceptance criteria

  • /files API is removed
  • /bzz must accept single-file uploads
  • integration tests on beekeeper must be adjusted
  • when a single file is uploaded through the /bzz endpoint, a manifest is created with a single entry containing the reference and metadata; the path should be empty
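
The last criterion can be illustrated with a small data-structure sketch. The `Entry` and `Manifest` types, the `newSingleFileManifest` helper and the metadata keys are all hypothetical stand-ins for illustration; this is not the mantaray API.

```go
package main

import "fmt"

// Entry is a hypothetical manifest entry: a content reference plus metadata.
type Entry struct {
	Reference string
	Metadata  map[string]string
}

// Manifest maps a path to its entry. It is a stand-in for illustration only.
type Manifest map[string]Entry

// newSingleFileManifest builds a manifest with exactly one entry,
// stored under the empty path. The metadata keys are assumptions.
func newSingleFileManifest(reference, filename, contentType string) Manifest {
	return Manifest{
		"": {
			Reference: reference,
			Metadata: map[string]string{
				"Filename":     filename,
				"Content-Type": contentType,
			},
		},
	}
}

func main() {
	m := newSingleFileManifest("<content-reference>", "photo.png", "image/png")
	entry := m[""]
	fmt.Println(len(m), entry.Reference, entry.Metadata["Content-Type"])
}
```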
@Eknir
Contributor

Eknir commented Mar 23, 2021

Suggestion: write the OpenAPI spec first, then communicate it to Rinke / the JS team.

@agazso
Member

agazso commented Mar 23, 2021

Regarding the proposed solutions:

  1. Given a /files/<file-reference> link (or just the file-reference part), would this tool be able to download the file?
  2. Currently the /dirs endpoint expects a tar file when uploading. It's also possible to upload a tar file with a single file in it and set the swarm-index-document to that file. That's what we currently do in the swarm-cli when uploading to a feed, because lookups on the /bzz endpoint wouldn't otherwise work with feed lookups and file references.
    I don't know if using the /dirs endpoint for uploading a single file is a good idea. It already feels a bit off that you have to use /dirs for uploading but /bzz for downloading. I wonder whether it would be possible to use /bzz for upload instead?

@aloknerurkar
Contributor

@acud Correct me if I am wrong:

  1. The tool will just return the new root hashes to use in place of the old ones. The user then has to query the /bzz API with the new hash to download the file.
  2. The plan is to use the /bzz endpoint for file upload.

@aloknerurkar
Contributor

Completed sub-tasks:

  • Removed collections package
  • Fixed all unit tests
  • Added new file upload handler

In progress:

  • Meet with JS team to finalize API spec
  • Changes to API after discussion
