Draft for file archival and retrieval NIP #1098

Draft
wants to merge 1 commit into master


@danieldaquino danieldaquino commented Mar 3, 2024

This is a draft for a NIP that describes how immutable files can be referenced, located, and retrieved in a way that is decentralized yet flexible, extensible, and easy to implement.

I think it would be useful not only to decentralize media storage, but also to unlock other Nostr use cases that require dealing with file archives (or any immutable binary data).

I tried to explain everything in the NIP, but please let me know if there is something incomplete or unclear in the spec, or if you have any other feedback! (This is the first time I have drafted a NIP.)

Thank you very much!
Daniel

@danieldaquino
Author

@fiatjaf, @jb55, I had some thoughts on decentralized media storage and other file-related applications, so I decided to write this NIP draft to get some feedback and hopefully give something to the community.

It would mean a lot to me to hear your thoughts on this draft whenever you have time to skim over it! Thanks!

@jb55
Contributor

jb55 commented Mar 4, 2024

Looks really good!

If we're not using TLVs for gateways we can just use URIs and hex ids instead of nblob:

nostr+file:hexid?gateways=

I never liked the TLV thing, so maybe we can simplify the spec by avoiding nblob here and just use URIs?

@ZenenTreadwell

Hi! I've been working on Nostr in my own bubble for a while, and something that I implemented was a reference to the id of an existing nip-94 post.

For example, my user metadata had a 32-byte hex string as its 'avatar' entry, and my client would look for the matching file metadata and decide how to resolve that request. It leaves the option of querying a URL, but it also opens up the possibility of loading the blurhash first, pulling that particular file over torrent, or even querying an IPFS endpoint.

The only issue I had with it is the potential ambiguity of leaving it as a regular hexstring - I think that bech32-encoding it would be better but I never got around to implementing that.

@jb55
Contributor

jb55 commented Mar 4, 2024

> reference to the id of an existing nip-94 post

this seems way more complicated tbh. a post should not be needed for referencing files on remote gateways. then you have to do 2 lookups.

@vitorpamplona
Collaborator

Can the spec be integrated with NIP-96 servers? Then most image servers we use today would already respond to nblob and we could use nblobs instead of URLs in kind 1s, getting rid of the DNS dependency and the domain parsing stuff to migrate to new servers in nip-96.

+1 for your use of inline metadata like #521 did. It's easy for users to switch gateways if they need to change the server when resharing a link.

@ZenenTreadwell

> > reference to the id of an existing nip-94 post
>
> this seems way more complicated tbh. a post should not be needed for referencing files on remote gateways. then you have to do 2 lookups.

I see your point, as someone getting a post with an attached image in this form would have to look up the referenced event and then resolve it from there.

I would argue that it's more complex, but not necessarily more complicated. Provided you store your NIP-94 events in a hash table, it's an O(1) lookup. Provided users load and store data as it gets sent in (the file metadata would be published alongside the post itself), I see it as a dictionary lookup into pre-loaded information about the file you're looking for.
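A minimal sketch of that dictionary lookup, assuming events arrive as already-parsed dicts. The names `file_metadata_by_id`, `remember_event`, and `resolve_file_url` are hypothetical and not part of any NIP:

```python
# Hypothetical in-memory store of NIP-94 (kind 1063) events, keyed by event id.
file_metadata_by_id = {}


def remember_event(event):
    """Index a file metadata event as it arrives from a relay."""
    if event.get("kind") == 1063:
        file_metadata_by_id[event["id"]] = event


def resolve_file_url(event_id):
    """Return the "url" tag of the referenced event, or None if it is unknown."""
    event = file_metadata_by_id.get(event_id)  # average-case O(1) dict lookup
    if event is None:
        return None
    for tag in event.get("tags", []):
        if len(tag) >= 2 and tag[0] == "url":
            return tag[1]
    return None
```

If the metadata event has not been seen yet, `resolve_file_url` returns `None`, and the client would still need a second round trip to a relay, which is the cost raised above.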

I think that the idea that @danieldaquino has drafted is great - archiving file data and resolving it locally whenever possible is a good idea. I also think that using the event kind that has already been proposed for file metadata would be a good thing to integrate at a core level.

If you're concerned about having two lookups, I could implement an endpoint that allows you to directly query for the hexstring on the relay server and get the relevant file back directly, either from a local archive or by resolving the url server-side and stashing it.

@ZenenTreadwell

> Can the spec be integrated with NIP-96 servers? Then most image servers we use today would already respond to nblob and we could use nblobs instead of URLs in kind 1s, getting rid of the DNS dependency and the domain parsing stuff to migrate to new servers in nip-96.

This seems prescient. I think that it might be as simple as maintaining a map of all the file uploads and their bech32 identifiers. If the identifier is a hit, return it. Otherwise, resolve it and then return it. It is probably better to reference the file directly by its hash compared to the id of the file metadata event, but I still think that we should keep NIP-94 in mind if we're doing anything related to file upload & sharing.
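The "hit, otherwise resolve and then return" behaviour could look roughly like this. It is only a sketch: the `BlobCache` class and the `fetch_remote` callable are hypothetical, and the cache is keyed by the SHA-256 hex digest rather than a bech32 identifier to keep the example short:

```python
import hashlib


class BlobCache:
    """Serve blobs from a local map, resolving and stashing them on a miss."""

    def __init__(self, fetch_remote):
        # fetch_remote is an assumed callable: sha256 hex string -> bytes
        self._fetch_remote = fetch_remote
        self._blobs = {}

    def get(self, sha256_hex):
        blob = self._blobs.get(sha256_hex)
        if blob is not None:
            return blob  # cache hit: return the local copy
        blob = self._fetch_remote(sha256_hex)  # cache miss: resolve it
        # Files are immutable and addressed by hash, so the result can be verified.
        if hashlib.sha256(blob).hexdigest() != sha256_hex:
            raise ValueError("fetched data does not match the requested hash")
        self._blobs[sha256_hex] = blob
        return blob
```

Because the identifier is the file hash itself, the server can check whatever it resolved before stashing it, which would not be possible when referencing the id of a metadata event.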

@jb55
Contributor

jb55 commented Mar 4, 2024

we should call this blobstr (blobs and stuff transmitted by relays) because it doesn't have any hard dependency on nostr :)

@fiatjaf
Member

fiatjaf commented Mar 4, 2024

> we should call this blobstr (blobs and stuff transmitted by relays) because it doesn't have any hard dependency on nostr :)

That's what I was going to say. This looks great but it shouldn't even be a NIP. It should be its own thing.

@fiatjaf
Member

fiatjaf commented Mar 4, 2024

What do you think of adding another, optional, layer of indirection in there, for "routing servers" that just store where each thing is, but not the files themselves, and that you can query to get a list of servers?

I can already feel the DHT lovers agonizing when they hear about this.
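As a rough illustration of what such a routing server would keep, here is an in-memory sketch. The `RoutingIndex` class and its `announce` and `lookup` methods are hypothetical names, not something the draft defines:

```python
from collections import defaultdict


class RoutingIndex:
    """Remember which servers claim to hold a blob, without storing the blob."""

    def __init__(self):
        self._servers_by_hash = defaultdict(set)

    def announce(self, sha256_hex, server_url):
        """Record that server_url claims to serve the blob with this hash."""
        self._servers_by_hash[sha256_hex].add(server_url)

    def lookup(self, sha256_hex):
        """Return the known servers for a hash (empty list if none are known)."""
        return sorted(self._servers_by_hash.get(sha256_hex, ()))
```

A client would call `lookup` first and then try the returned servers in turn, verifying the downloaded bytes against the hash.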

@ZenenTreadwell

What do you think of SOLAR, which is Storage and Other Layers Added to Relays?

Codebase is still mostly a pile of spaghetti, but if there's interest then I can make cleaning it up into a priority.

@jb55
Contributor

jb55 commented Mar 4, 2024

> What do you think of adding another, optional, layer of indirection in there, for "routing servers" that just store where each thing is, but not the files themselves, and that you can query to get a list of servers?
>
> I can already feel the DHT lovers agonizing when they hear about this.

so a blob tracker? starting to sound like if bittorrent and nostr had a child.

@ZenenTreadwell

I'm currently in conversation with some people in a major private tracker community. They're going to need to swap systems in the coming years because of some fundamental changes brought on by the move towards IPv6. I think I'm going to direct my dev capacity towards laying out the infrastructure for SOLAR as a distributed torrent tracking system.

@danieldaquino
Author

> Looks really good!
>
> If we're not using TLVs for gateways we can just use URIs and hex ids instead of nblob:
>
> nostr+file:hexid?gateways=
>
> I never liked the TLV thing, so maybe we can simplify the spec by avoiding nblob here and just use URIs?

@jb55, thanks, I like your suggestion! The protocol portion of the URI can provide the necessary context that the hash refers to a blob in this spec.

When it comes to the encoding itself, I do prefer bech32 to pure hex because it includes error correction and Base32 is made to avoid human typos and mistakes. I imagine these URIs might be something people copy-paste around or type in somewhere, so it would be nice to have a way to check for address errors.

So perhaps something that looks like this?

nostr+file:1q9maw3n56tnvgqy2xaqzwgvjys5mptvh6hpffhwrpduc0r89pmr0q5k9p4t?filename=example.py&gateways=https%3A%2F%2Fgateway.example.com%2Chttps%3A%2F%2Fgateway-example-2.com

or even

nblob:1q9maw3n56tnvgqy2xaqzwgvjys5mptvh6hpffhwrpduc0r89pmr0q5k9p4t?filename=example.py&gateways=https%3A%2F%2Fgateway.example.com%2Chttps%3A%2F%2Fgateway-example-2.com
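To show how little parsing such a URI would need, here is a sketch using only the Python standard library. The function name `parse_blob_uri` is hypothetical, and the comma-separated `gateways` parameter is assumed from the examples above rather than taken from a finalized spec:

```python
from urllib.parse import parse_qs


def parse_blob_uri(uri):
    """Split a blob URI into scheme, identifier, filename, and gateway list."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme not in ("nostr+file", "nblob"):
        raise ValueError("not a blob URI")
    blob_id, _, query = rest.partition("?")
    params = parse_qs(query)  # also undoes the percent-encoding
    filename = params.get("filename", [None])[0]
    # Assumption: gateways are a single comma-separated, percent-encoded value.
    gateways = [g for g in params.get("gateways", [""])[0].split(",") if g]
    return {"scheme": scheme, "id": blob_id, "filename": filename, "gateways": gateways}
```

For the first example above, this yields the identifier, `example.py` as the filename, and the two gateway URLs as a list.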

@danieldaquino
Author

danieldaquino commented Mar 8, 2024

> Can the spec be integrated with NIP-96 servers? Then most image servers we use today would already respond to nblob and we could use nblobs instead of URLs in kind 1s, getting rid of the DNS dependency and the domain parsing stuff to migrate to new servers in nip-96.

Thanks @vitorpamplona for your feedback!

If I understood your question correctly, I believe the answer is yes. If a server implemented NIP-96, it would probably not take much work to adopt this NIP as well.

That's because one of the things required by NIP-96 is for a server to make available the route https://your-file-server.example/custom-api-path/<sha256-file-hash>(.ext). To implement that route they probably need to have infrastructure for serving and storing files, checking their hashes, and figuring out the mime type from the file extension.

Therefore, adopting this NIP would probably be a matter of registering a new route, a bit of logic to convert bech32 into a hex format, and then connecting to their existing NIP-96 logic.
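The bech32-to-hex step could look roughly like the sketch below, which follows the bech32 checksum from BIP-173. The `nblob` prefix, the choice of the raw SHA-256 digest as payload, and the function names are assumptions for illustration; the draft may settle on different details:

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = (0x3B6A57B2, 0x26508E6D, 0x1EA119FA, 0x3D4233DD, 0x2A1462B3)


def _polymod(values):
    """BIP-173 checksum polynomial over 5-bit values."""
    chk = 1
    for value in values:
        top = chk >> 25
        chk = ((chk & 0x1FFFFFF) << 5) ^ value
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def _hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def _convert_bits(data, from_bits, to_bits, pad):
    """Regroup a sequence of from_bits-wide integers into to_bits-wide ones."""
    acc = bits = 0
    out = []
    max_value = (1 << to_bits) - 1
    for value in data:
        acc = (acc << from_bits) | value
        bits += from_bits
        while bits >= to_bits:
            bits -= to_bits
            out.append((acc >> bits) & max_value)
    if pad and bits:
        out.append((acc << (to_bits - bits)) & max_value)
    elif not pad and (bits >= from_bits or (acc << (to_bits - bits)) & max_value):
        raise ValueError("invalid padding")
    return out


def encode_blob_id(sha256_hex, hrp="nblob"):
    """Encode a SHA-256 hex digest as a bech32 string (assumed prefix: nblob)."""
    data = _convert_bits(bytes.fromhex(sha256_hex), 8, 5, pad=True)
    polymod = _polymod(_hrp_expand(hrp) + data + [0] * 6) ^ 1
    checksum = [(polymod >> 5 * (5 - i)) & 31 for i in range(6)]
    return hrp + "1" + "".join(CHARSET[d] for d in data + checksum)


def decode_blob_id(blob_id, hrp="nblob"):
    """Verify the checksum and return the SHA-256 digest as lowercase hex."""
    prefix, sep, payload = blob_id.rpartition("1")
    if not sep or prefix != hrp or len(payload) < 6:
        raise ValueError("unexpected prefix or length")
    if any(c not in CHARSET for c in payload):
        raise ValueError("invalid character")
    values = [CHARSET.index(c) for c in payload]
    if _polymod(_hrp_expand(hrp) + values) != 1:
        raise ValueError("checksum mismatch")
    return bytes(_convert_bits(values[:-6], 5, 8, pad=False)).hex()
```

A server with an existing NIP-96 route could then pass the result of `decode_blob_id` to the handler that already serves `<sha256-file-hash>` paths, and reject mistyped identifiers before touching storage.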

@danieldaquino
Author

Thank you @ZenenTreadwell for your feedback!

> I also think that using the event kind that has already been proposed for file metadata would be a good thing to integrate at a core level.

Regarding your point about NIP-94 and storing file metadata on a Nostr event or on a URI, I believe both methods can co-exist, and I can see each method being more suitable for particular use cases:

  • Embedding file metadata on a URI (such as in this NIP) might be simpler for cases where a human user wants to reference an image on a NIP-01 post.
  • In contrast, file metadata specified on a Nostr event might be more suitable for other uses, such as (for example) gateways following this NIP who want to share info about files and sync themselves with other gateways via Nostr.

@danieldaquino
Author

> > we should call this blobstr (blobs and stuff transmitted by relays) because it doesn't have any hard dependency on nostr :)
>
> That's what I was going to say. This looks great but it shouldn't even be a NIP. It should be its own thing.

@fiatjaf @jb55, I was debating this with myself as well 😄

@danieldaquino
Author

> > What do you think of adding another, optional, layer of indirection in there, for "routing servers" that just store where each thing is, but not the files themselves, and that you can query to get a list of servers?
> > I can already feel the DHT lovers agonizing when they hear about this.
>
> so a blob tracker? starting to sound like if bittorrent and nostr had a child.

I believe NIP-94 could be one way to accomplish this

Example:

{
  "kind": 1063,
  "tags": [
    ["url", "https://example.com/.well-known/nostr/nipXX/nblob1q9maw3n56tnvgqy2xaqzwgvjys5mptvh6hpffhwrpduc0r89pmr0q5k9p4t"],
    ["m", "image/jpeg"],
    ["x", "06fa8871..."],
    ["ox", "06fa8871..."]
  ],
  "content": "Some example file",
  (...)
}

There might be other ways that reduce hash repetition, but it is definitely interoperable enough that it could work even with no modifications to either of the two specs.

@alltheseas
Contributor

@mleku what's your cold eyes review of blobstr?

@jb55 jb55 mentioned this pull request May 13, 2024