end user cannot read from relay (relays reject large contact lists ) #41

Open
alltheseas opened this issue May 24, 2024 · 37 comments

@alltheseas
Collaborator

alltheseas commented May 24, 2024

client initiating action

nostrudel

client receiving or observing action (or lack thereof)

nostrudel

relays

relay.mostr.pub
nostr.mom
momostr.pink
nostr.bitcoiner.social
nostr.fmt.wiz.biz
relay.nostr.band

timestamp

image

https://njump.me/nevent1qqs2z0f52dn67j3xc79cpxhzapdfhct95dmluhn84rz5zs28fww3z5spz4mhxue69uhhyetvv9ujumt0wd68ytnsw43qzyrhwden5te0dehhxarj9emkjmn9qy8hwumn8ghj7mn0wd68ytnddaksz9rhwden5te0wfjkccte9ejxzmt4wvhxjmcmsjtzk

reported by @0xtrr

what happens

invalid: event too large: 69764

with 947 follows
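
For a sense of scale, the arithmetic behind that error can be sketched as follows. This is a rough estimate, not how any particular relay measures events: it assumes each follow is stored as a bare `["p", "<64-hex pubkey>"]` tag with no relay hint or petname, an empty `content`, and compact JSON. The helper names and the placeholder id, pubkey and sig values are made up for illustration.

```python
import json
import secrets


def fake_contact_list(n_follows: int) -> dict:
    """Build an unsigned, kind 3 shaped event with n random follows (illustrative only)."""
    return {
        "id": "0" * 64,        # placeholder, not a real event id
        "pubkey": "0" * 64,    # placeholder author key
        "created_at": 1716508800,
        "kind": 3,
        "tags": [["p", secrets.token_hex(32)] for _ in range(n_follows)],
        "content": "",
        "sig": "0" * 128,      # placeholder signature
    }


def serialized_size(event: dict) -> int:
    """Size in bytes of the event as compact UTF-8 JSON."""
    return len(json.dumps(event, separators=(",", ":")).encode("utf-8"))


def bytes_per_follow() -> int:
    """Marginal cost of one more bare "p" tag, including its separating comma."""
    return serialized_size(fake_contact_list(11)) - serialized_size(fake_contact_list(10))


def max_follows(limit_bytes: int) -> int:
    """Largest follow count whose serialized event still fits in limit_bytes."""
    per_follow = bytes_per_follow()
    # Fixed overhead: everything in the event that is not a follow tag.
    overhead = serialized_size(fake_contact_list(1)) - per_follow
    return max(0, (limit_bytes - overhead) // per_follow)
```

Under these assumptions each follow costs 73 bytes, so a limit of roughly 64 KiB is exhausted a little under 900 follows, which is in the same range as the 947-follow list rejected here at 69764 bytes. Lists that carry relay hints or petnames would hit the limit sooner.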

suggestion

How might relays deal with large contact lists?

@alltheseas
Collaborator Author

alltheseas commented May 24, 2024

@0xtrr suggests relays can expose this limitation via their NIP-11 relay information document

https://github.com/nostr-protocol/nips/blob/master/11.md#server-limitations

You can find this document by sending an HTTP request to the relay URL with the "Accept" header set to "application/nostr+json". My relay uses strfry, which unfortunately doesn't expose that information. You can test it by sending this curl request:

curl 'https://nostr.oxtr.dev' --header 'Accept: application/nostr+json'
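
The same check can be sketched in Python, so a client could warn before publishing. This is a minimal sketch, assuming the relay serves a NIP-11 document with the optional `limitation` object; the field names `max_message_length`, `max_event_tags` and `max_content_length` come from NIP-11, while the function names are made up for illustration. A relay that omits `limitation` simply yields no limits, so the check cannot flag anything for it.

```python
import json
import urllib.request


def fetch_relay_info(relay_url: str, timeout: float = 10.0) -> dict:
    """GET the NIP-11 relay information document (use the http(s) URL, not ws(s))."""
    request = urllib.request.Request(
        relay_url, headers={"Accept": "application/nostr+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def advertised_limits(info: dict) -> dict:
    """Pick out the size-related fields of the optional `limitation` object."""
    limitation = info.get("limitation") or {}
    keys = ("max_message_length", "max_event_tags", "max_content_length")
    return {key: limitation[key] for key in keys if key in limitation}


def may_be_too_large(event_json: str, info: dict) -> bool:
    """True only when the relay advertises a message limit that this event exceeds."""
    max_len = advertised_limits(info).get("max_message_length")
    return max_len is not None and len(event_json.encode("utf-8")) > max_len
```

For example, `may_be_too_large(serialized_event, fetch_relay_info("https://nostr.oxtr.dev"))` would return `False` for any relay that does not publish a limit, which is exactly the gap being discussed.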

@alltheseas alltheseas changed the title relays reject large contact lists relays reject large contact lists (end user cannot read from relay) May 24, 2024
@alltheseas alltheseas changed the title relays reject large contact lists (end user cannot read from relay) end user cannot read from relay (relays reject large contact lists ) May 24, 2024
@alexgleason
Collaborator

See also: nostr-protocol/nips#1179

I am very passionate about this issue. We just need to convince @fiatjaf to bend the fabric of space and time.

@alltheseas
Collaborator Author

alltheseas commented May 25, 2024

Suggestion: delta addition by @water783

https://njump.me/note1hagrttg5g3urdadu6m8y0wg7emps5wz2yu46ewdrwclyeu08fadqnrm88a

The relay should no longer dump. We need to add an “add to follow” event, and then the relay should automatically add the user to the follow list, instead of updating the entire list at once.

@fiatjaf
Collaborator

fiatjaf commented May 25, 2024

This has been discussed a million times, the consensus is that some people think this is the worst problem to ever have happened to humanity while others think it's not a big deal and that the cure may end up being worse than the disease.

@fiatjaf
Collaborator

fiatjaf commented May 25, 2024

Why do you have to store your huge contact list in all these relays anyway?

@fiatjaf
Collaborator

fiatjaf commented May 25, 2024

To be honest contact lists should have never existed anyway -- or relays should not index individual tags in them.

@alexgleason
Collaborator

@fiatjaf This isn't even about indexing. This is just the event being so huge the relay rejects it.

@alltheseas
Collaborator Author

alltheseas commented May 25, 2024

@danieldaquino do you have ideas on a protocol level solution?

@water783

I suggest that relays support downloading large events via HTTP/HTTPS, with resumable downloads, instead of using ws/wss requests.

@alltheseas
Collaborator Author

@vitorpamplona mentions the issue is greater than contact lists

Anything that can have a bigger blob, like large blog posts, HTML pages (from hostr.cc), documents, encrypted spreadsheets, private lists, long DMs, large zap split setups, group membership events, etc.

We will get to a point where most events will simply be rejected. So yeah, we need to figure out a more accurate way for relays to expose what they support. NIP-11 is nowhere near enough.

@fiatjaf
Collaborator

fiatjaf commented May 25, 2024

> This is just the event being so huge the relay rejects it.

@alexgleason

For some reason relays don't want to store a bunch of data for free for anyone. This is perfectly normal and expected.

If you break the contact list in multiple events they will take more space in the end. That may bypass the current naïve limits these relays are imposing, but ultimately, if relays really want to save disk space, they will have to block these too, and much more aggressively.


Now, in reality, relays are probably blocking these contact lists because they don't know these are real contact lists, they are not aware that contact lists can be huge, they're just seeing a big event and blocking it thinking it's some base64 video file.

The solution here is not to try to be sneaky and bypass current limits creating a potentially worse future for everybody, the solution is to improve technology so relays can properly handle contact lists that they want to store while also protecting themselves from base64 videos.

I mentioned indexing because that's another reason relays might not want to store these lists, because they have a lot of tags. And that's another thing that should be fixed by improving relay codebases.

Honestly, with a dozen lines of code this wouldn't be a problem.

@fiatjaf
Collaborator

fiatjaf commented May 25, 2024

With that said, I like the idea of multiple events myself at a conceptual level, it is certainly more "correct" and more elegant. I'm worried about its ultimate practicality gains, but I think it should be tried.

@alltheseas
Collaborator Author

> Honestly, with a dozen lines of code this wouldn't be a problem.

To what? Relays, e.g. strfry?

@fiatjaf
Collaborator

fiatjaf commented May 26, 2024

> To what? Relays, e.g. strfry?

Yes.

@alltheseas
Collaborator Author

> > To what? Relays, e.g. strfry?
>
> Yes.

Since strfry guy is gone, does it make sense to post a bounty for these changes?

@fiatjaf
Collaborator

fiatjaf commented May 27, 2024

He is not gone, and I don't think it makes sense to ever post bounties anymore. Also, we don't know exactly what these changes would be. What I was trying to say is that relays need space to make more fine-grained decisions about what they're going to store, but the exact best interface for that is up for experimentation. You could go very far using strfry plugins though, which are very customizable and already in use by many relays out there.
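
As one illustration of such a fine-grained decision, a write-policy plugin could apply a different size ceiling to kind 3 events than to everything else. The sketch below assumes the line-delimited JSON interface described in strfry's plugin documentation (one request object per line on stdin with an `event` field, one response per line on stdout with `id`, `action` and `msg`). The two thresholds and the function names are arbitrary choices made for this example, not recommendations.

```python
import json
import sys

# Hypothetical thresholds, chosen only for illustration.
MAX_CONTACT_LIST_BYTES = 512 * 1024
MAX_OTHER_BYTES = 64 * 1024


def decide(request: dict) -> dict:
    """Accept or reject one incoming event based on its kind and serialized size."""
    event = request["event"]
    size = len(json.dumps(event, separators=(",", ":")).encode("utf-8"))
    # Contact lists (kind 3) get a higher ceiling than all other kinds.
    limit = MAX_CONTACT_LIST_BYTES if event.get("kind") == 3 else MAX_OTHER_BYTES
    if size > limit:
        return {
            "id": event["id"],
            "action": "reject",
            "msg": f"invalid: event too large: {size}",
        }
    return {"id": event["id"], "action": "accept", "msg": ""}


def run(lines=sys.stdin, out=sys.stdout) -> None:
    """Plugin loop: read one JSON request per line, write one JSON decision per line."""
    for line in lines:
        if not line.strip():
            continue
        out.write(json.dumps(decide(json.loads(line))) + "\n")
        out.flush()
```

A plugin script would end with a call to `run()`. A policy like this can only be stricter than the relay's global size limit, so the operator would presumably have to raise that limit first and let the plugin enforce the tighter ceiling for non-contact-list events.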

@dskvr
Collaborator

dskvr commented May 27, 2024

> or relays should not index individual tags in them.

Would make counting followers a bit more laborious.

Note: Fat-fingered a hotkey to close the issue, whoops.

@dskvr dskvr closed this as completed May 27, 2024
@dskvr dskvr reopened this May 27, 2024
@fabianfabian
Collaborator

I don't see the point of publishing large follow lists in public and I hope relays will keep rejecting them.

@alltheseas
Collaborator Author

> I don't see the point of publishing large follow lists in public and I hope relays will keep rejecting them.

What solution do you propose?

@fabianfabian
Collaborator

fabianfabian commented May 27, 2024

> > I don't see the point of publishing large follow lists in public and I hope relays will keep rejecting them.
>
> What solution do you propose?

On the client:

  • somehow encourage users to clean up their follow list
  • auto clean up? offer to remove inactive accounts

In general:

  • if public, it should be useful for others
  • if private, then it should be encrypted and maybe even on a private or paid relay so you can make it as big as you want
  • find out the cause, how or why did the follow list get so big? maybe it shouldn't be a follow list but a different kind of list

@fiatjaf
Collaborator

fiatjaf commented May 27, 2024

> Would make counting followers a bit more laborious.

Good.

@fiatjaf
Collaborator

fiatjaf commented May 27, 2024

I agree with @fabianfabian pretty much.

@jb55
Collaborator

jb55 commented May 27, 2024

> I don't see the point of publishing large follow lists in public and I hope relays will keep rejecting them.

really? it's pretty common on twitter for accounts with 10k+ follows. people have the expectation that they can follow as many people as they want. We can say "too bad", but it feels like we are just giving up and working around a bad design to begin with.

I like the solution of replaceable events for updating a follow status for a given pubkey. then you can sync all of these statuses incrementally.

The only problem after that is reducing the subscription size for a large number of follows. outbox helps this a lot, but in the degenerate case where a large number of people share the same relay it might make sense to make an optimization that simply does: ["REQ","timeline",{"timeline": "pubkey..."}], but this is an ugly custom hack for a particular use case. in nostrdb I am working on wasm queries: ["REQ","timeline",{"script": "nscript..."}] which could build a timeline query from a contact list on the relay, but I doubt this will be adopted in general, maybe only for AUTH paid relays.

In damus some of our users were experiencing stalling with rejected queries that were too large, so we split our timeline query into batches, but this feels like a hack to get around relay query size limits. Batching may be the only realistic solution that works today.

> if private, then it should be encrypted and maybe even on a private or paid relay so you can make it as big as you want

even if you can make it as big as you want, it's still ugly to have to upload megabytes of data every time you follow or unfollow someone.
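
The batching workaround mentioned in this comment can be sketched roughly as below. This is not Damus's actual code: the batch size, the subscription id scheme and the filter values are all assumptions made for illustration, and the only protocol piece relied on is the NIP-01 `["REQ", <subscription id>, <filter>]` message with an `authors` filter.

```python
import json
from typing import Iterator, Sequence


def batched_reqs(
    sub_prefix: str,
    authors: Sequence[str],
    batch_size: int = 250,       # hypothetical; tune to what relays accept
    kinds: Sequence[int] = (1,),
    limit: int = 100,
) -> Iterator[str]:
    """Yield one serialized REQ message per chunk of followed pubkeys."""
    for start in range(0, len(authors), batch_size):
        chunk = list(authors[start:start + batch_size])
        filter_ = {"authors": chunk, "kinds": list(kinds), "limit": limit}
        # Each batch needs its own subscription id so results do not replace each other.
        sub_id = f"{sub_prefix}-{start // batch_size}"
        yield json.dumps(["REQ", sub_id, filter_], separators=(",", ":"))
```

With 947 follows and a batch size of 250 this produces four smaller subscriptions instead of one large one. The trade-off is more open subscriptions per relay, which relays may also cap.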

@water783

Agree with @jb55 . We have many scenarios where we need large lists (we currently don't have that many users, but we might in the future), such as 10k+ follows or 10k+ group members.

It would be great if we could provide incremental updates to the list, such as offering "add follow" and "remove follow" instead of uploading an entire list. For large lists, we can offer segmented downloads or HTTP/HTTPS resume capability for downloads.
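
To make the incremental idea concrete, here is a minimal sketch of how a client or relay might fold "add follow" and "remove follow" updates into a follow set. No such event kinds are defined in this thread, so the `FollowDelta` shape and its field names are purely hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FollowDelta:
    """Hypothetical incremental update; not an existing Nostr event kind."""
    created_at: int
    action: str   # "add" or "remove"
    pubkey: str


def apply_deltas(current: Iterable[str], deltas: Iterable[FollowDelta]) -> set[str]:
    """Replay deltas in timestamp order on top of the current follow set."""
    follows = set(current)
    # sorted() is stable, so deltas with equal timestamps keep their input order.
    for delta in sorted(deltas, key=lambda d: d.created_at):
        if delta.action == "add":
            follows.add(delta.pubkey)
        elif delta.action == "remove":
            follows.discard(delta.pubkey)
        else:
            raise ValueError(f"unknown action: {delta.action!r}")
    return follows
```

Each update is then a few hundred bytes regardless of list size. The cost moves elsewhere: anyone who wants the full list has to fetch and replay every delta, or rely on periodic snapshots.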

@fiatjaf
Collaborator

fiatjaf commented May 27, 2024

Why does no one ever consider the possibility of storing follow lists locally, or using Google cloud or something like that? And then publishing those to relays only selectively and/or every once in a while?

I understand the argument for having your contact lists available everywhere around Nostr, and that's a good feature, but to think of that as the only good goal is probably a symptom of us all being early adopters; there can still be multiple ways of doing things, and I know there are some people who would rather not make their follows public.

EDIT: I believe Gossip does this, and Nostur used to do it too.

@fiatjaf
Collaborator

fiatjaf commented May 27, 2024

Now I'm thinking that maybe we could have a protocol for updating lists in relays, like

["APPEND", "<event-id-to-modify>", ["p", "<pubkey>", "whatever"], "<new-id>", "<new-sig>"]

Has anyone ever suggested that?
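
A rough sketch of what a relay would have to do with such a message: append the tag to its stored copy, recompute the event id the way NIP-01 defines it (SHA-256 over the serialized array `[0, pubkey, created_at, kind, tags, content]`), and compare it with the id the client sent. `APPEND` itself is only the idea floated here, so `apply_append` is hypothetical. The sketch keeps `created_at` unchanged because the message as proposed carries no timestamp, it does not verify the signature (that needs a secp256k1 Schnorr library), and Python's `json.dumps` is assumed to match NIP-01 serialization for ordinary content.

```python
import hashlib
import json


def event_id(pubkey: str, created_at: int, kind: int, tags: list, content: str) -> str:
    """NIP-01 style event id: sha256 of the compact JSON array, hex encoded."""
    payload = json.dumps(
        [0, pubkey, created_at, kind, tags, content],
        separators=(",", ":"),
        ensure_ascii=False,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def apply_append(event: dict, new_tag: list, new_id: str, new_sig: str) -> dict:
    """Return a copy of `event` with `new_tag` appended, if `new_id` checks out.

    Hypothetical handler for the proposed APPEND message; signature
    verification is deliberately left out of this sketch.
    """
    tags = [*event["tags"], new_tag]
    expected = event_id(
        event["pubkey"], event["created_at"], event["kind"], tags, event["content"]
    )
    if expected != new_id:
        raise ValueError("new id does not match the event after appending the tag")
    return {**event, "tags": tags, "id": new_id, "sig": new_sig}
```

The client sends only one tag plus a new id and signature, but it still has to hold the full list locally to compute that id, and the relay has to re-serialize and hash the whole event on every append.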

@vitorpamplona

> Has anyone ever suggested that?

Gotta update created_at as well.

Why "complicate" relays with custom APPEND/REMOVE commands? Relay codebase is quite simple and we still have only rudimentary codebases. I fear that adding many more custom commands to complete the set of all possible changes to events will just complicate things further and block them from working on the things that truly need work in the relay codebase: SYNC, Indexing, and multi-party CDN-like global distribution.

@fiatjaf
Collaborator

fiatjaf commented May 27, 2024

Well, this would be strictly optional. Clients would fall back to just sending the full updated event.

I don't like it either, but being optional I think it might be a better idea than changing the way we do contact lists. But I don't know yet, the idea just came to my mind and might be garbage.

Also relay developers are too lazy. Writing a client is 100x harder than writing a relay. We gotta give them something else to do.

@fabianfabian
Collaborator

> really? it's pretty common on twitter for accounts with 10k+ follows. people have the expectation that they can follow as many people as they want. We can say "too bad", but it feels like we are just giving up and working around a bad design to begin with.

My main reason against large follow lists is that I prefer to preserve how easy it is today for clients to build a Web of Trust with just kind 3s.

People can still follow 10k+ if they want but then I'd rather have it not pollute the current kind 3 space, better keep it local, private or a different kind, or else it will become difficult for clients to deal with and we will end up having to use a few centralized caching services.

@jb55
Collaborator

jb55 commented May 27, 2024 via email

@vitorpamplona

I am all in for killing all types of lists on nostr in favor of nostr-protocol/nips#784, including for contact lists like on nostr-protocol/nips#761

That being said, we are going to have to deal with large lists. They are not optional. Large lists will always exist.

Lists and Event Sets are not replacing one another. Some use cases will use lists, others will use sets. Large lists are fundamental to many use cases. Relays will have to figure out if they are going to support those applications or not.

Our job should be to create an interoperable way to let relays clearly define what they support such that Clients can warn users when their relays don't.

@alltheseas
Collaborator Author

https://njump.me/nevent1qqsdu704zqdf3rrwrsw202zz4km3udlyymd34p4enest80jrspuzcdgpzpmhxue69uhkummnw3ezumrpdejqz8rhwden5te0dehhxarj9ekh2arfdeuhwctvd3jhgtnrdakszenhwden5te0ve5kcar9wghxummnw3ezuamfdejj7mnsw43rzun6vuunv7n2v9mxwct5wdur2cmgxfm8var3x3shgct5d3un2unkv3mhzem2wqc82arcwu6r27n90fh8v7t3vejxk7rkv5lkyun0v9jxxctnws7hgun4v5q32amnwvaz7tmjv4kxz7fwd4hhxarj9ec82csm0cvmg

image

@Semisol I did a restore with hist nostr land. Some relays give the error code 87518. I imagine it’s because my follow list is too large (>1000). Does this mean that the events I broadcast to these relays are rejected too, or it’s just the follow list that is rejected?

when I was removing inactive npubs with noogle filter, at some point the follow list went from ~1300 to ~1600. I thought I had a user error so I restarted manually. But the same thing happened in Damus. At some point I was at ~1200 and it went back to ~1600. I’m not sure if this is a nostrability issue or just related to the error code I saw when restoring (so related to the size of the list). Steps to reproduce is just removing enough inactive npubs from the follow list.

@alltheseas
Collaborator Author

@staab how do you approach large contact lists in coracle? Noticed you also have two thousand follows

@staab
Collaborator

staab commented Jun 4, 2024

I use the outbox model, which means different sets of authors are sent to different relays. I usually end up with about 50% of my follows being requested from the top relay, and only because the top relays are treated as fallbacks for pubkeys with long tail relays (to avoid too many concurrent connections).
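
A very rough sketch of that idea, not Coracle's actual selection logic: group followed pubkeys by the relays they write to, query each author on their most widely shared relay, and route authors whose relays are all long-tail through a small set of fallback relays. The `min_authors` threshold, the function name and the relay URLs in the usage note are assumptions for illustration.

```python
from collections import defaultdict


def plan_outbox(
    write_relays: dict[str, list[str]],
    fallback_relays: list[str],
    min_authors: int = 2,
) -> dict[str, list[str]]:
    """Map relay URL -> pubkeys to request there (illustrative heuristic only)."""
    # Count how many followed authors write to each relay.
    authors_by_relay: dict[str, list[str]] = defaultdict(list)
    for pubkey, relays in write_relays.items():
        for relay in relays:
            authors_by_relay[relay].append(pubkey)

    plan: dict[str, list[str]] = defaultdict(list)
    for pubkey, relays in write_relays.items():
        # Only relays shared by enough follows are worth their own connection.
        popular = [r for r in relays if len(authors_by_relay[r]) >= min_authors]
        if popular:
            best = max(popular, key=lambda r: len(authors_by_relay[r]))
            plan[best].append(pubkey)
        else:
            # Long-tail author: ask the fallback relays instead of opening
            # one more connection just for this pubkey.
            for relay in fallback_relays:
                plan[relay].append(pubkey)
    return dict(plan)
```

For instance, with `{"alice": ["wss://big.example", "wss://tiny-a.example"], "bob": ["wss://big.example"], "carol": ["wss://tiny-c.example"]}` and `wss://big.example` as the only fallback, all three authors end up requested from `wss://big.example`, which mirrors how a top relay can end up serving a large share of the follows.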

@alltheseas
Collaborator Author

Issue persists

image

it's the same thing as last time but the issue isn't that I cannot read from relays, it's that I cannot publish my follow list (kind 3 event) to my relays because they enforce a maximum note size limit. I cleaned out my follow list last time but now it's back after following new npubs.

@0xtrr

https://njump.me/note1qqqpdjm7kf2pf4gvc32r2s7x5ms2ycetv8cm9sg0unker6qc6l3q72ynhu

@0xtrr

0xtrr commented Jul 17, 2024

> Issue persists
>
> image
>
> it's the same thing as last time but the issue isn't that I cannot read from relays, it's that I cannot publish my follow list (kind 3 event) to my relays because they enforce a maximum note size limit. I cleaned out my follow list last time but now it's back after following new npubs.
>
> @0xtrr
>
> https://njump.me/note1qqqpdjm7kf2pf4gvc32r2s7x5ms2ycetv8cm9sg0unker6qc6l3q72ynhu

This is simply a matter of individual relay policy. The maximum size you see in the screenshot is most likely the default value set in the strfry config. I don't think most relay implementations offer special handling for kind 3 events, so if you increase the maximum size, it's increased for all events.

There are legit reasons for restricting event sizes, and the relays that restrict it in this screenshot are all free, so I don't expect them to allow much. I still think 1000 follows is far too low a limit.

@alltheseas
Collaborator Author

Arkinox advises

Looks like 65535 bytes is probably the limit then


10 participants