NIP-23 Cleanup #145

Open
staab opened this issue Jan 4, 2023 · 9 comments

@staab
Member

staab commented Jan 4, 2023

Once #32 is merged, I think a little cleanup is in order:

  • Kind 2 should be deprecated
  • NIP 2 should be modified to specify that kind 3 should include duplicate tags based on the target pubkey's latest relay list. Or some other fix, but a single relay url seems to be a weak point here.

See my comment here for a review of the current state of relay discovery.

I think kind 2 should be deprecated because as of NIP 23 it does the same thing as kind 10001, but less reliably. With kind 10001, it's easy to invalidate past relay recommendations (in case a relay goes down, goes rogue, or the user changes his mind), while kind 2 relies on kind 5 deletions being propagated, which is explicitly discouraged. Unless the semantics of a kind 2 recommendation are intended to communicate something different from a kind 10001? But I can't see why it would.
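The difference in invalidation can be sketched roughly as follows. This is only an illustration: events are plain dicts, and the `"r"` tag layout for kind 10001 is an assumption made for the example, not something specified in this thread. The helper names are hypothetical.

```python
def current_relay_list(events, pubkey, kind=10001):
    """Replaceable semantics: only the newest event of this kind from this pubkey counts."""
    candidates = [e for e in events if e["kind"] == kind and e["pubkey"] == pubkey]
    if not candidates:
        return []
    newest = max(candidates, key=lambda e: e["created_at"])
    # Assumed layout: one ["r", <relay url>] tag per relay.
    return [t[1] for t in newest["tags"] if t[0] == "r"]


def current_recommendations(events, pubkey):
    """Kind 2 semantics: a recommendation stands until a kind 5 deletion for it is seen."""
    deleted = {
        t[1]
        for e in events
        if e["kind"] == 5 and e["pubkey"] == pubkey
        for t in e["tags"]
        if t[0] == "e"
    }
    return sorted(
        {
            e["content"]
            for e in events
            if e["kind"] == 2 and e["pubkey"] == pubkey and e["id"] not in deleted
        }
    )


# What a client sees when the kind 5 deletion never reached its relays.
seen = [
    {"id": "1", "kind": 2, "pubkey": "pk", "created_at": 1, "tags": [], "content": "wss://old.example"},
    {"id": "2", "kind": 2, "pubkey": "pk", "created_at": 2, "tags": [], "content": "wss://new.example"},
    {"id": "3", "kind": 10001, "pubkey": "pk", "created_at": 1, "tags": [["r", "wss://old.example"]], "content": ""},
    {"id": "4", "kind": 10001, "pubkey": "pk", "created_at": 2, "tags": [["r", "wss://new.example"]], "content": ""},
]
deletion = {"id": "5", "kind": 5, "pubkey": "pk", "created_at": 3, "tags": [["e", "1"]], "content": ""}
```

With `seen` alone, the kind 10001 view already drops the old relay, while the kind 2 view keeps recommending it until `deletion` also arrives.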

Also, kind 3 should be modified to somehow include multiple recommended relays for a user. This could take the form of multiple tags for a single user, or multiple relays at the end of the tag. These recommended relays should be copied from the user's kind 10001, so that 1. the relays recommended for a user are the ones they themselves have selected (rather than ones that incidentally held events from that user), and 2. so that there is redundancy in case a relay becomes invalid for finding a user.

In terms of concrete implementation, here's how I would go from a follow list for an identified pubkey (user a) to most reliably pulling the event stream from an unidentified pubkey (user b) in that list.

# 1. Start with my home relays and user a's profile info
my_relays = ["wss://a.com"]
user_a = {
  pubkey: "<user_a_pubkey>",
  relays: ["wss://a.com", "wss://b.com"],
  # Entries from user a's kind 3 follow list; both point at user b, each with a different relay
  petnames: [
    ["p", "<user_b_pubkey>", "wss://b.com"],
    ["p", "<user_b_pubkey>", "wss://c.com"],
  ],
}

# 2. Identify the pubkey I want to know more about
user_b_pubkey = user_a.petnames[0][1]

# 3. Get all relays user b might be found at
user_b_potential_relays = user_a.petnames.filter(t => t[1] === user_b_pubkey).map(last)

# 4. Use them to find user b's canonical list of relays. Let's say this has been updated to ["wss://c.com", "wss://d.com"]
user_b_canonical_relays = first(query({kinds: [10001], authors: [user_b_pubkey]}, user_b_potential_relays))

# 5. Query those relays to find the user's profile info/events
pubkey_profile = query({kinds: [0, 1, 3], authors: [user_b_pubkey]}, user_b_canonical_relays)

Of course, nothing is bulletproof, but including only one relay in kind 3 could result in only wss://b.com being queried in step 4. If relay b has banned user b, that query would fail. Similarly, using kind 2 instead of 10001 might result in extra relays being returned in step 4. This isn't really a problem; you would just be pointlessly querying an additional relay.
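To make the failure mode concrete, here is a minimal runnable sketch of steps 3 to 5 against in-memory stand-ins for relays. Everything in it is hypothetical: the `RELAYS` dict, the `query` helper, and the `relays` field on the kind 10001 event are assumptions for illustration, not a real client library or the proposed event format.

```python
# Hypothetical in-memory "relays": url -> events stored there.
RELAYS = {
    "wss://a.com": [
        # user a's kind 3 follow list, carrying two relay hints for user b
        {"kind": 3, "pubkey": "user_a", "created_at": 10,
         "tags": [["p", "user_b", "wss://b.com"], ["p", "user_b", "wss://c.com"]]},
    ],
    "wss://b.com": [],  # relay b has banned user b, so it holds none of b's events
    "wss://c.com": [
        # assumed shape: the relay list is stored in a "relays" field
        {"kind": 10001, "pubkey": "user_b", "created_at": 20,
         "relays": ["wss://c.com", "wss://d.com"]},
    ],
    "wss://d.com": [
        {"kind": 0, "pubkey": "user_b", "created_at": 5, "content": "{}"},
    ],
}


def query(kinds, author, relay_urls):
    """Return matching events from the given relays, newest first."""
    found = [
        ev
        for url in relay_urls
        for ev in RELAYS.get(url, [])
        if ev["kind"] in kinds and ev["pubkey"] == author
    ]
    return sorted(found, key=lambda ev: ev["created_at"], reverse=True)


def relay_hints(follow_list, pubkey):
    """Step 3: every relay the follow list mentions for this pubkey."""
    return [
        t[2]
        for t in follow_list["tags"]
        if len(t) > 2 and t[0] == "p" and t[1] == pubkey
    ]


def canonical_relays(hints, pubkey):
    """Step 4: the newest kind 10001 found on the hinted relays, or [] if none."""
    lists = query([10001], pubkey, hints)
    return lists[0]["relays"] if lists else []


follow_list = query([3], "user_a", ["wss://a.com"])[0]
hints = relay_hints(follow_list, "user_b")
profile = query([0, 1, 3], "user_b", canonical_relays(hints, "user_b"))  # step 5
```

With both hints the walk reaches relay c and then d. With only the `wss://b.com` hint, `canonical_relays` returns an empty list and the trail goes cold.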

Pardon my wall of text, I'm currently thinking this through so I can actually implement it. But in summary, you can see above a tenuous path through relays a -> b -> c to find user b's canonical relays. Ensuring discoverability of c despite b's censorship improves the ability of clients to connect the dots.

@ghost

ghost commented Jan 5, 2023

> I think kind 2 should be deprecated because as of NIP 23 it does the same thing as kind 10001, but less reliably. With kind 10001, it's easy to invalidate past relay recommendations (in case a relay goes down, goes rogue, or the user changes his mind), while kind 2 relies on kind 5 deletions being propagated, which is explicitly discouraged. Unless the semantics of a kind 2 recommendation are intended to communicate something different from a kind 10001? But I can't see why it would.

Kind 2 is different from relay lists defined in NIP 23 as they are just recommendations. I think they can coexist to improve bootstrapping, censorship resistance, UX etc.

A proof of concept that uses kind 2 events and we could use kind 10001 in it as well:

Creating a DNS seeder for nostr using kind 2 events

Anyone can do DNS lookup for TXT records of this domain to get best possible relays for bootstrapping without relying on websites like https://nostr.watch. Also they can run their own DNS seeders with similar or different logic.

@staab
Member Author

staab commented Jan 5, 2023

Ok, that all sounds fine, I'm still not sure I see the semantic difference between the two, but if other people find kind 2 useful I'm happy to leave it in.

@mikedilger
Contributor

I agree with @staab here that the p tags in kind-3 events don't supply enough information about which relays the person you are following posts to.

I'm writing code for this right now, and gossip is damn hard to bootstrap because kind-3 data usually doesn't tell it where to find the people you are following. You can (1) look at the content, parse its non-standard content as the relays that you use, and HOPE that the person you follow posts to one of the relays you read from (IMHO this will become less reliable over time as nostr grows), or (2) use the recommended_relay_url parameter (my preference), but that is a single point of failure. Assuming you'll be able to find them at whatever relays you read from is, over time, going to be a failed assumption.

So that we don't break how tags work, I suggest extending p tags like this:

["p", pubkey, recommended_relay_url, petname, relay_url_they_post_to, another_relay_they_post_to, ...]

And then including these longer "p" tags in kind-3 events.

The meaning of the existing recommended_relay_url doesn't have to change (although I'm not sure what it is in this context), or it could just be the first relay that this person posts to.
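A rough sketch of how a client might read such an extended tag while still accepting today's shorter ones. The `Follow` type and both helper names are hypothetical, and treating empty strings as "not set" is an assumption made for the example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Follow:
    pubkey: str
    recommended_relay: Optional[str] = None
    petname: Optional[str] = None
    posting_relays: List[str] = field(default_factory=list)


def parse_p_tag(tag):
    """Parse ["p", pubkey, recommended_relay_url, petname, relay, relay, ...]."""
    if len(tag) < 2 or tag[0] != "p":
        raise ValueError("not a p tag")
    return Follow(
        pubkey=tag[1],
        # Positions 2 and 3 keep their existing meaning; empty strings count as unset.
        recommended_relay=(tag[2] or None) if len(tag) > 2 else None,
        petname=(tag[3] or None) if len(tag) > 3 else None,
        # Everything after the petname is a relay the person posts to.
        posting_relays=[url for url in tag[4:] if url],
    )


def candidate_relays(follow):
    """Recommended relay first, then posting relays, without duplicates."""
    urls = [follow.recommended_relay] if follow.recommended_relay else []
    urls += follow.posting_relays
    return list(dict.fromkeys(urls))  # dict preserves insertion order


extended = parse_p_tag(
    ["p", "abc", "wss://r.example", "alice", "wss://x.example", "wss://r.example"]
)
legacy = parse_p_tag(["p", "abc", "wss://r.example"])
```

Because the extra relays are appended at the end, a parser that only reads the first four positions keeps working unchanged.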

@eskema
Collaborator

eskema commented Jan 8, 2023

In my view, the relays in kind-3 are user specific: the user chooses which relay to put for each pubkey in their contact list. I don't see the point in having more there. One should be enough, and it can be updated. If none is present, then use whatever means to find out, like checking where other people think that pubkey exists or existed. I don't think creating endless tags will help.

@mikedilger
Contributor

Ok @eskema maybe you are right, but if so you'll be able to solve my problem for me without such tags. Here is my problem:

  1. Existing nostr user decides to try out the gossip client
  2. They enter their key (public or private) and pick at least 1 relay where they know they have their contact list posted.
  3. They go to the list of people they follow and press "import contact list"
  4. A bunch of people show up, but none of them have metadata, and none of their events are in the feed, because that contact list didn't provide information about where to find such people.

I don't want to rely on the happenstance that the people you follow are on the same relays you are. So I need to get person-relay pair information from somewhere. When you are just starting up a client, it has not collected such data from event tags because it has zero events so far.

@dskvr
Contributor

dskvr commented Jan 8, 2023

The lines of code that could be erased or completely avoided if kind 2 AND kind 3 were replaceable kinds may soon be immeasurable

@eskema
Collaborator

eskema commented Jan 8, 2023

the client needs to be able to function with the sparse data that it has.. it needs to be able to display any empty profile because users are not required to add any metadata. they can live without ever publishing a kind-0 or kind-3. The relay list the user has defined (or chose, if none then a default bootstrap list) should be the starting point to fetch data.

> Ok @eskema maybe you are right, but if so you'll be able to solve my problem for me without such tags. Here is my problem:
>
>   1. Existing nostr user decides to try out the gossip client
>   2. They enter their key (public or private) and pick at least 1 relay where they know they have their contact list posted.
>   3. They go to the list of people they follow and press "import contact list"
>   4. A bunch of people show up, but none of them have metadata, and none of their events are in the feed, because that contact list didn't provide information about where to find such people.

so at this point, what you should do is query your configured relays for the profiles, contacts and posts of the pubkeys in the contacts list. stuff may not show up for certain pubkeys, but you'll probably find a bunch of data from others. Every time data arrives, it potentially may bring new relays in the form of kind-2, kind-3 content or in kind-1 e/p tags. maybe sweep those and try to build a relay score based on how much it is used by pubkeys on your contact list. try to fetch data from those.
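One possible reading of this relay-scoring idea, as a minimal sketch: count, for each relay URL, how many distinct followed pubkeys mention it. The function names are hypothetical, events are plain dicts, and the assumption that kind-3 content is a JSON object keyed by relay URL reflects the non-standard convention mentioned earlier in the thread rather than any spec.

```python
import json
from collections import defaultdict


def relay_hints(event):
    """Yield relay URLs mentioned by one event."""
    kind = event["kind"]
    if kind == 2 and event.get("content"):
        yield event["content"]  # kind 2: the content is the recommended relay URL
    elif kind == 3:
        try:
            relays = json.loads(event.get("content") or "{}")
        except ValueError:
            relays = {}
        if isinstance(relays, dict):
            yield from relays  # assumed: keys are relay URLs
    elif kind == 1:
        for tag in event.get("tags", []):
            if len(tag) > 2 and tag[0] in ("e", "p") and tag[2]:
                yield tag[2]  # relay hint in the third position of e/p tags


def score_relays(events, followed):
    """Rank relays by how many distinct followed pubkeys mention them."""
    users = defaultdict(set)
    for ev in events:
        if ev["pubkey"] in followed:
            for url in relay_hints(ev):
                users[url].add(ev["pubkey"])
    return sorted(
        ((url, len(pubkeys)) for url, pubkeys in users.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


events = [
    {"kind": 3, "pubkey": "alice", "tags": [],
     "content": '{"wss://one.example": {"read": true, "write": true}}'},
    {"kind": 1, "pubkey": "bob", "content": "hi",
     "tags": [["e", "id1", "wss://one.example"], ["p", "pk1", "wss://two.example"]]},
    {"kind": 2, "pubkey": "carol", "tags": [], "content": "wss://two.example"},
    {"kind": 1, "pubkey": "carol", "content": "yo", "tags": [["e", "id2", "wss://one.example"]]},
    {"kind": 2, "pubkey": "stranger", "tags": [], "content": "wss://three.example"},
]
ranking = score_relays(events, {"alice", "bob", "carol"})
```

Hints from pubkeys outside the contact list are ignored here, so `wss://three.example` never enters the ranking.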

> I don't want to rely on the happenstance that the people you follow are on the same relays you are. So I need to get person-relay pair information from somewhere. When you are just starting up a client, it has not collected such data from event tags because it has zero events so far.

I never had a case where absolutely no data was returned. If that happens, then simply provide a notice prompting the user to try other relays or something... then, when some data arrives, you can update your kind-3 with relays and petnames you think make sense. This is a bit chaotic but it ends up working for me. I don't know if it will in the future; we'll see when we get there.

p.s. it would also help if you maybe forward some stuff you think you need on startup to your preferred relays, so you can try to rebroadcast kind-0 and kind-3 from your contacts to a single relay or multiple, so the data is there when you next ask for it.

@mikedilger
Contributor

> the client needs to be able to function with the sparse data that it has.. it needs to be able to display any empty profile because users are not required to add any metadata. they can live without ever publishing a kind-0 or kind-3.

I totally agree with that.

And I read your whole comment and I understand a way to get the clients 'bootstrapped' by pulling from the configured relays and finding some events, and by letting users pick relays for people, and other jiggering around. And I will in fact put this kind of code into my client because if I don't my client will perform much more poorly than other clients given the current state of event distribution on the relays.

> The relay list the user has defined (or chose, if none then a default bootstrap list) should be the starting point to fetch data.

I disagree with this. And I posit that this common notion, shared by all the current clients (except gossip AFAIK) is the largest risk to the nostr ecosystem, the most likely thing that could cause nostr to fail. I don't want to be overly dramatic, but I really do see it this way, I see this as an existential risk to the continued adoption of nostr, a massive scaling brick-wall that will drive nostr into an overloaded bloated unworkable system that will cause people to leave in droves before we can change the paradigm over to one that has a hope of scaling.

nostr is a protocol and clients and relays can do anything they damn like, and they will, but while they continue to do these things that I consider ill-advised, I will continue to stand here on my street corner preaching about the end of nostr.

Imagine 10,000 relays, and you post to 3 of them. Someone wants to follow you and they connect to 3 relays, none of which they share with you. And imagine nobody is copying events between relays. In that scenario, which is the ideal future for a scalable nostr which doesn't flood events into every fucking corner of every disk, they are not going to be able to follow you by pulling from their relays. They will have to pull from your relays.
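A back-of-the-envelope check of this scenario. It assumes, purely for illustration, that both people pick their relays uniformly at random, which the comment does not claim and real users certainly do not do. The function name is hypothetical.

```python
from math import comb


def overlap_probability(total_relays, mine, theirs):
    """Chance of sharing at least one relay when both sets are picked uniformly at random."""
    # Complement: all of their picks land among the relays I did not choose.
    no_overlap = comb(total_relays - mine, theirs) / comb(total_relays, theirs)
    return 1 - no_overlap


p = overlap_probability(10_000, 3, 3)
```

Under that assumption `p` comes out at roughly 0.09%, so pulling only from your own relays would almost never find the other person's events unless someone copies events between relays.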

Consider RSS. Nobody expects to be able to pull Richard Dawkins's RSS feed from Steve Bellovin's website.

The only reason clients work today is because somebody is mirroring events between relays in a massive way which is already overloading them. That won't scale much longer.

> The lines of code that could be erased or completely avoided if kind 2 AND kind 3 were replaceable kinds may soon be immeasurable

I thought they were. Where did I read that? I think they should be.

@dskvr
Contributor

dskvr commented Jan 13, 2023

> I thought they were. Where did I read that? I think they should be.

I have been unable to locate evidence to the contrary, either in documentation or in practice. If you find something, please share.
