NIP-66 Relay Discovery and Liveness Monitoring (Draft 7) #230
Simplify language in personas Fix error in schema example Fix Iv6
I get the idea and I think it's a good one. First, shouldn't tag names be globally unique across all event kinds? People may search for events with certain tags without sub-specifying an event kind. So using 'r' for read collides with using 'r' for reference. Second, I'm not sure these tags all need to be searchable. Third, should normalized URLs have a trailing slash? Have we settled on this elsewhere? My code, for example, normalizes without a trailing slash, but I don't know if that is good or not. I may have more comments later; I didn't try to be complete here.
I'm not aware of any such limitation, that would mean only 52 one-letter tags for all of nostr indefinitely.
Agreed
Normalized URLs have a trailing slash, yes. It's a should, not a must.
Ok but what if I search for all events with a "p" tag that has my pubkey, without specifying an event kind? I might get events where "p" was defined to mean "pedestrian" or "positive" or "playground" or anything else. That would kinda suck. EDIT: Ok this is a bad example since public keys are so unique.
My attempts at critiques follow below. Hopefully you can successfully defend your NIP idea against me :)

Seems like this can be generated client side. E.g., a client which somehow gathers the "top" relays (perhaps by fetching relay list events from NIP-65), and then opens websockets to those relays and generates response time and the NIP-11 documents etc. Then the client can display these results back to the user. Most of this info (SSL, IP4, IP6, topics, ability to read/write, is down, location) seems like it could be determined by a purpose-built web client and/or served by the relay (i.e., the relay owner set the topics/location in some NIP-11 extension?)

As I am typing this, I am realizing that a client-side approach will perhaps be computationally expensive, and a lot of this info does not need to be constantly re-computed (like the server "location"). And it would theoretically be nice to do a query by server location and receive a list of relay URLs...

But still... I do not think that the information here will actually be that useful for people deciding which relays to use. Rather, people will be drawn to relays which "perform" well and are related to their "social graph."

Also, agree with @mikedilger that the 26 queryable tags should not be used with such abandon. Will people really query for events based on SSL status, IP4 and IP6? I guess I lean a bit on the conservative side when adding queryable tags.
@barkyq Yes, I've been doing this with nostr.watch since around November. The problem is that a large amount of logic is required to aggregate this data, logic that not every client needs to write. As there are more relays, more computing power is required to process them, and it becomes a memory hog client-side. I'm already running these calculations, and would love to share the results in an open format. I would love it even more if there were other publishers posting in a similar schema that I could consume, to promote the propagation of the best data possible.
Yes
Correct.
It is not for that purpose. It is for clients to utilize and to aid in the discovery of relays with useful datapoints, as expressed in the "Personas" and "Use Cases" section.
I have yet to find literature to confirm that the protocol is limited to 26 queryable tags into infinity. Please correct me and provide citations otherwise. I'm also having trouble finding a use-case where it makes sense to subscribe without a kind, unless you were subscribing to a purpose-built relay that supports only specific kinds or you were post-processing results, both of which seem to make the argument moot. From NIP-12
Key takeaways...
There is no mention of being universally unique.
I believe there is consensus on this already as expressed in previous comments. To clarify, I added all tags as queryable as a starting point. Edit This means the following:
It's understood that indexing non-unique values like "true" or "false" could prove taxing on relays, but it needs to be stated that there is a significant loss of usefulness with the changes. Alternatively something like
What about this for the tags? |
I added two new tags to the NIP.
These values are presently available through several services, but their data is not open. It really should be. Additionally, the following statement is no longer relevant:
As generating these values client-side is not feasible (and as previously demonstrated, generating many of these values at scale which is needed in many situations, is not feasible client-side, either) Updated example event:
Assigned |
Yes I guess the main argument against queryable tags is that the relay needs to process them much more than non queryable tags. I do agree that each Agreed that it is useful to query on the If a relay Another possibility is to also use the NIP-40 |
in general, it does first sound great. however protocol extensions like these that allow arbitrary free form data are "abused" and always eventually become a point of protocol dialect bifurcation. |
At some point in the near future this will be impossible, therefore using unintuitive single-letter indices in the present to retain "aesthetics" might be more damaging than just working within the constraints of the protocol. But tbh it's no longer relevant to this discussion, since we have all been in agreement since the beginning to reduce the indices and they were reduced to one. Maybe we should open a new discussion around this; it seems that the majority of people's desires around NIP-12 do not actually align with the wording of NIP-12. Maybe the wording needs to be amended with guidance.
Yes it is. It's very easy to check if a relay is online if you already know the relay, but not if you want to find all relays that are online very quickly, and without adding additional business logic to your client. Connecting to hundreds of relays at once in many cases is not advisable, so you'll likely add batching, slots or queuing logic, so you're only connecting to Most clients right now are optimized for "social" cases where they are connecting to ~10 relays at a time, not 500, 1000, 2000, 5000... etc.
I considered this as well, it reduces footprint, but consider the following: It complicates filtering, as you would need to know a publisher's update frequency, and then use However, adding language that says "publishers should not update the event unless the relay is online" may have value, as elaborated on below (re: "last seen"). Good idea. Should be mentioned however, this then must omit the tag "online" as you suggested, as otherwise, the event will still read
No need, this is a parameterized replaceable event and is replaced by the next event. The footprint of 1000 30303's is somewhere around 1mb. From my logs...
And this is calculated using a schema that is more bloated than the one proposed here. A relay may be offline, but its NIP-11 may still be online, providing their contact information and/or nostr pubkey. Other data points for offline relays would also be useful. What is the impact of this relay being offline? How many events were stored there? How many users were pushing to it? etc... However, an expiration would be useful for relays that were last seen a long time ago, and are likely gone forever. Thanks for suggesting this.
Depends on how you are utilizing these events. If you are parsing I like this creative solution, but I feel as though it's trying to save space in a circumstance where very little space is being consumed to begin with at the compromise of usefulness.
Could you elaborate? Are you referring to the |
yes, as provided in the changeset:
with this as a rule of thumb, when designing protocols, places that people can tack on arbitrary junk data will always have people tacking on arbitrary junk data. this tends to be very hairy if not kept in check with constraints. i think that is something to keep in mind here. |
Yes, agreed. But that is why I think publishers should only publish events for relays which are online. E.g., I get a list of relays by querying kind 30303, and I implicitly assume the recently published events are for online relays. If I try to connect to one of the relays in a
Yes, understood. But still,
This is solved by using the
Agreed, but quite niche, and should not be something that normal "relay searching clients" are concerned with. This could be handled by a dedicated backend storing expired events, as detailed above |
Yes, this is a good point. The purpose of this was left vague because I didn't want to bloat this spec. I intended to extend these myself with datasets where
*Edit: It actually may be possible to make it must. |
Can you extend 30166 kind event, so users of https://relays.xport.top for example be able to share their results with the network?
@vir2alexport The suggested format completely breaks the You could instead add counts using NIP-32 For example
or you could just add your own tag like
or
Previously, count tags were included in NIP-66, but because there's no way to specify every possible count, I have omitted them from the NIP. Half-defined tags are difficult to interpret in a NIP. I have been unable to identify a good way to standardize these kinds of values that does not bloat the NIP to unreadable levels. I am and have been open to ideas.
@dskvr I see, thank you. So this is good for liveness monitoring rather than discovery in my case. |
This is good for both liveness and discovery. |
- `rtt-read` The relay's read **round-trip time** in milliseconds.
- `rtt-write` The relay's write **round-trip time** in milliseconds.
How are these measured?
read: time.begin, subscribe { kinds: [1], limit: 1 }, subscription fulfilled or EOSE received (whichever first), time.end
write: time.begin, send event, ok (success[a,b]/fail) received, time.end
Notes:
- Both of these values are more useful when represented as an SMA from time-series.
- Both of these values can loosely inform relay behaviors when cross-referencing other relay attributes.
- These values cannot be universally applied since many/most relays treat all users differently.
- Write tests are particularly questionable, there's really no good way to do them (unless a user is authenticated, but this presents issues for relays)
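The read and write measurements described above can be sketched roughly as follows. This is only an illustration, not the nostr.watch implementation: `send`, `recv` and `clock` are hypothetical callables standing in for a websocket connection and a timer, the function names are made up for this example, and no timeout handling is included (a real monitor would need one). The moving-average helper illustrates the SMA note.

```python
import json
import time
from statistics import fmean
from typing import Callable, List


def measure_rtt_read(send: Callable[[str], None],
                     recv: Callable[[], str],
                     clock: Callable[[], float] = time.monotonic,
                     sub_id: str = "rtt-read") -> int:
    """Milliseconds from sending REQ until the first EVENT or EOSE for our subscription."""
    start = clock()
    send(json.dumps(["REQ", sub_id, {"kinds": [1], "limit": 1}]))
    while True:
        msg = json.loads(recv())
        # Ignore unrelated messages (NOTICE, AUTH, other subscriptions).
        if msg[0] in ("EVENT", "EOSE") and msg[1] == sub_id:
            return round((clock() - start) * 1000)


def measure_rtt_write(send: Callable[[str], None],
                      recv: Callable[[], str],
                      event: dict,
                      clock: Callable[[], float] = time.monotonic) -> int:
    """Milliseconds from sending EVENT until the matching OK arrives (accepted or rejected)."""
    start = clock()
    send(json.dumps(["EVENT", event]))
    while True:
        msg = json.loads(recv())
        if msg[0] == "OK" and msg[1] == event["id"]:
            return round((clock() - start) * 1000)


def simple_moving_average(samples: List[float], window: int) -> List[float]:
    """SMA over a time series of RTT samples; one value per full window."""
    return [fmean(samples[i - window + 1:i + 1])
            for i in range(window - 1, len(samples))]
```

Passing the transport in as callables keeps the timing logic independent of any particular websocket library, which also makes it easy to exercise with canned relay responses.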
After some thought, read and write should probably be removed from this NIP. For these kinds of tests to be meaningful in the present-day nostr climate requires more context.
I wrote an auditor, similar to mike's relaytester, and it seems clear that read checks will be useful, but with very specific parameters to highlight strengths and weaknesses between relays.
How to propagate or aggregate this data is still unclear, but it is very clear (and has been for a while tbh) that the simple read and write checks of yester-year provide little to no value anymore. I even completely disregard them in the new nostr.watch NIP-66 client.
I'm planning to start using this to recommend relays to users in Flotilla. @dskvr and I chatted this morning about adding a publisher profile to NIP 89, but on re-reading this PR I see that monitors basically does this. I think that's better, NIP 89 is already a little overloaded, and I couldn't find a tidy way to fit publishers in. So I think this NIP is great as-is with one suggestion: it would be nice to be able to find monitor profiles, so a recommendation for kind 10166s to be aggressively replicated across the network like kind 0/10002 are would be good.
can a relay publish an event with its contact pubkey and announce its liveness on other relays? then we can query most recent relays who announced themselves and new relays will be recognized as well on the network.
@kehiy relays can self-publish and this case was always considered valid. However, depending on the intended use case, this may be less useful. If the intention is to find relays, you'll come up with a woefully incomplete dataset and implementation complexity rises. If the intended purpose is to discover a relay's meta by an operator pubkey you already know, then that is a good case for self-published events. nostr1.com already monitors their customers' relays via their

You didn't ask, but I've wanted to publish the following for a while after countless discussions on this topic.

When there are several crawlers, their incentive is to provide the best data. Relay monitor datasets will always be more complete than self-published data, and their incentives promote honesty. Some real-life examples would be relays that advertise NIP-50 via NIP-11, but either don't adhere to NIP-50 or almost entirely reject NIP-01. Most of them are case sensitive and somewhat useless because of this. Most of them are inadequate not by design but because there is no feedback loop for their operators to even realize there is a problem.

The comparison I like to use is early internet directories, such as DMOZ and Yahoo Directories. This approach of self-publishing incentivizes self-promotion, which is inherently antithetical to honesty and completeness. The resulting data on these directories was incomplete and difficult to maintain as websites went offline. These added complexities are a core reason why crawlers won over self-published directories. Crawlers had incentives around distilling data to provide users with the best possible data. Crawlers inherently accumulated liveness data and so their data was much cleaner and more up to date. The aforementioned dynamic eventually trended against users in the 2010s when certain platforms achieved a monopoly; however, the open-index nature of NIP-66 should mitigate monopolies and consequently algorithmic biases.
The aggregate NIP-66 dataset is potentially a foundation and qualifier of other, more data-rich datasets. Those unfamiliar with crawling at scale are often surprised by how important liveness is at scale. Monitors are incentivized to test a relay against its stated capabilities, whereas a relay operator is not incentivized, or often enough not equipped, to publish their own inadequacies. The most common argument against relay monitors and NIP-66 is not wanting to trust a monitor to truthfully broadcast data about a relay it does not run itself. However, there are several clear arguments against this assertion.
Sidenote: |
so, considering this, isn't it logical to add it to the standard? |
NIP-66 saves the world. |
I thought it was still in there, but upon review it was removed during the last wave of simplification. It is still there as it falls under the umbrella of ad-hoc monitoring, but it should be explicitly stated in the NIP. Thanks for pointing it out. |
draft7
Rendered NIP
tl;dr
A flexible parameterized replaceable event featuring implicit, subjective liveness detection.
Implementations
NIP-66 optimized relays 1 2 3 4
NIP-66 demo, draft7
Kinds
30166 - Relay Discovery
10166 - Monitor Discovery
Use Cases
Gossip/Outbox/Inbox: Sort and prioritize coalesced and grouped relay lists on various dimensions of data, such as proximity and round-trip time, for a more optimal outbox experience.
Geographic Relay Discovery: Identify relays situated near a specific geographic location or within a particular country, facilitating localized network interactions.
NIP Support Filtering: Search for relays based on their support for specific Nostr Implementation Possibilities (NIPs), ensuring compatibility with desired protocol features.
Accessibility Search: Locate relays that are free to use, that have spam protection via payment and/or require NIP-42 authentication.
Real-Time Status Monitoring: Utilize a status client to display up-to-date statuses of various relays, providing insights into their current operational state. For example, within a social or dedicated client, or within a client's relay list.
Relay Network Analysis: Analyze connections and patterns between relays using various attributes, aiding in understanding network topology and security assessments.
Error Detection in Relay Lists/Sets: Clients can detect and rectify problematic entries in relay lists; for example, notify a user that a relay on their list has been offline for n months.
Performance Benchmarking: Sort relays based on performance metrics like round-trip times and uptime, aiding in the selection of the most efficient relays for specific needs.
Language and Content Filtering: Identify relays catering to specific languages or content types, enabling users to engage in a more targeted and relevant social networking experience.
History
January 2023 and proposed
February 2023. "Too many one-letter indexable tags"
March 2023 "Everything should be expirable"
December 2023, "Needs to be indexed", heavy reliance on NIP-32
February 2024, "Just use one-letter indexable tags" (lol) and split Discovery and parse cases.
July 2024, remove Relay Meta, offload case onto existing nips with voluntary publishing of rich metadata in monitor-defined shapes.
Usage
Use the filters below to inspect events from the nostr.watch Amsterdam monitor.
relay:
filter:
Filter against search relays (NIP-50)
Filter against community relays (NIP-29)
Filter against free relays
Filter against paid relays
Filter against relays without auth
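As a rough sketch, the five filters named above might look like the following when built programmatically. The tag names are assumptions based on my reading of the draft (`N` for supported NIPs, `R` for requirements with a `!` prefix for negation) and should be checked against the rendered NIP; `MONITOR_PUBKEY` is a placeholder, not a real key, and `relay_discovery_filter` is a hypothetical helper.

```python
import json

# Placeholder only: substitute the hex pubkey of a monitor you trust.
MONITOR_PUBKEY = "<monitor-pubkey-hex>"


def relay_discovery_filter(monitor: str, **tag_values: list) -> dict:
    """Build a filter for kind 30166 events from one monitor.

    Keyword arguments become single-letter tag queries, e.g. N=["50"] -> "#N": ["50"].
    """
    nostr_filter = {"kinds": [30166], "authors": [monitor]}
    for tag, values in tag_values.items():
        nostr_filter[f"#{tag}"] = list(values)
    return nostr_filter


# Assumed tag semantics: "N" = supported NIP number, "R" = requirement,
# where a leading "!" means the requirement is absent.
FILTERS = {
    "search relays (NIP-50)": relay_discovery_filter(MONITOR_PUBKEY, N=["50"]),
    "community relays (NIP-29)": relay_discovery_filter(MONITOR_PUBKEY, N=["29"]),
    "free relays": relay_discovery_filter(MONITOR_PUBKEY, R=["!payment"]),
    "paid relays": relay_discovery_filter(MONITOR_PUBKEY, R=["payment"]),
    "relays without auth": relay_discovery_filter(MONITOR_PUBKEY, R=["!auth"]),
}

if __name__ == "__main__":
    for label, nostr_filter in FILTERS.items():
        # Each filter would be sent to a relay as ["REQ", <subscription id>, <filter>].
        print(label, json.dumps(["REQ", "nip66-demo", nostr_filter]))
```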
These examples hard-code an established NIP-66 monitor in the `authors` filter for demonstration purposes. A more robust implementation may dynamically find monitors by finding `10166` events and selecting them based on various criteria.
Another implementation may only care about self-published `30166` events from a relay's operator (`satellite` relays self-publish `30166` events for example)
An even more robust implementation may encourage their users to submit reports, and utilize a user's web of trust to find relays.
There is a wide range of implementation possibilities, from simple to complex.
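One possible shape for the dynamic approach, selecting monitors from `10166` events instead of hard-coding one, is sketched below. The selection criteria are arbitrary examples, and the `frequency` and `c` (check type) tag names are assumptions about the monitor announcement format that should be verified against the rendered NIP; all function names are hypothetical.

```python
from typing import Iterable, List, Optional, Set

# Filter for discovering monitor announcements.
MONITOR_DISCOVERY_FILTER = {"kinds": [10166]}


def tag_values(event: dict, name: str) -> List[str]:
    """Return the first value of every tag called `name` on an event."""
    return [t[1] for t in event.get("tags", []) if len(t) > 1 and t[0] == name]


def select_monitors(events: Iterable[dict],
                    required_checks: Iterable[str] = ("open",),
                    max_frequency: int = 3600,
                    trusted: Optional[Set[str]] = None) -> List[str]:
    """Pick monitor pubkeys from 10166 events using example criteria.

    Assumed tags: "frequency" (seconds between publications) and "c" (check type).
    """
    selected = {}  # dict preserves insertion order and de-duplicates pubkeys
    for ev in events:
        freq = tag_values(ev, "frequency")
        if not freq or not freq[0].isdigit() or int(freq[0]) > max_frequency:
            continue  # missing, malformed, or too infrequent
        if not set(required_checks) <= set(tag_values(ev, "c")):
            continue  # monitor does not run every check we care about
        if trusted is not None and ev["pubkey"] not in trusted:
            continue  # e.g. restrict to pubkeys in the user's web of trust
        selected[ev["pubkey"]] = None
    return list(selected)


def relay_discovery_filter_for(monitors: Iterable[str]) -> dict:
    """Filter for 30166 events published by the selected monitors."""
    return {"kinds": [30166], "authors": list(monitors)}
```

The optional `trusted` set is where a web-of-trust constraint could plug in; how that set is derived is left open here.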