
NIP-66 Relay Discovery and Liveness Monitoring (Draft 7) #230

Open · wants to merge 35 commits into master
Conversation

@dskvr (Contributor) commented Feb 7, 2023

draft7

Rendered NIP

tl;dr
A flexible parameterized replaceable event featuring implicit, subjective liveness detection.

Implementations

Kinds
30166 - Relay Discovery
10166 - Monitor Discovery

Use Cases

  1. Gossip/Outbox/Inbox: Sort and prioritize coalesced and grouped relay lists on various dimensions of data, such as proximity and round-trip time, for a more optimal outbox experience.

  2. Geographic Relay Discovery: Identify relays situated near a specific geographic location or within a particular country, facilitating localized network interactions.

  3. NIP Support Filtering: Search for relays based on their support for specific Nostr Implementation Possibilities (NIPs), ensuring compatibility with desired protocol features.

  4. Accessibility Search: Locate relays that are free to use, that have spam protection via payment and/or require NIP-42 authentication.

  5. Real-Time Status Monitoring: Utilize a status client to display up-to-date statuses of various relays, providing insights into their current operational state. For example, within a social or dedicated client, or within a client's relay list.

  6. Relay Network Analysis: Analyze connections and patterns between relays using various attributes, aiding in understanding network topology and security assessments.

  7. Error Detection in Relay Lists/Sets: Clients can detect and rectify problematic entries in relay lists; for example, notify a user that a relay on their list has been offline for n months.

  8. Performance Benchmarking: Sort relays based on performance metrics like round-trip times and uptime, aiding in the selection of the most efficient relays for specific needs.

  9. Language and Content Filtering: Identify relays catering to specific languages or content types, enabling users to engage in a more targeted and relevant social networking experience.

History

  • draft1 implemented in January 2023 and proposed February 2023. "Too many one-letter indexable tags"
  • draft2 experimented March 2023 "Everything should be expirable"
  • draft3 experimented, never proposed
  • draft4 experimented, never proposed
  • draft5 proposed December 2023, "Needs to be indexed", heavy reliance on NIP-32
  • draft6 proposed February 2024, "Just use one-letter indexable tags" (lol) and split Discovery and parse cases.
  • draft7 proposed July 2024, removed Relay Meta and offloaded those cases onto existing NIPs, with voluntary publishing of rich metadata in monitor-defined shapes.

Usage

Use the filters below to inspect events from the nostr.watch Amsterdam monitor.

relay: relay.nostr.watch

filter: {
  "authors": ["9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923"],
  "kinds": [30166],
  "since": Math.round(Date.now()/1000)-(60*60*2)
}

Filter against search relays (NIP-50)

filter: {
  "authors": ["9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923"],
  "kinds": [30166],
  "#N": ["50"],
  "since": Math.round(Date.now()/1000)-(60*60*2)
}

Filter against community relays (NIP-29)

filter: {
  "authors": ["9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923"],
  "kinds": [30166],
  "#N": ["29"],
  "since": Math.round(Date.now()/1000)-(60*60*2)
}

Filter against free relays

filter: {
  "authors": ["9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923"],
  "kinds": [30166],
  "#R": ["!payment"],
  "since": Math.round(Date.now()/1000)-(60*60*2)
}

Filter against paid relays

filter: {
  "authors": ["9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923"],
  "kinds": [30166],
  "#R": ["payment"],
  "since": Math.round(Date.now()/1000)-(60*60*2)
}

Filter against relays without auth

filter: {
  "authors": ["9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923"],
  "kinds": [30166],
  "#R": ["!auth"],
  "since": Math.round(Date.now()/1000)-(60*60*2)
}
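The filters above differ only in their tag query and all share the same authors/kinds/since scaffolding, so a small helper can generate them. A sketch — the function name and option names are illustrative, not part of the NIP:

```javascript
// Hypothetical helper that builds NIP-66 kind 30166 discovery filters
// like the examples above. MONITOR is the nostr.watch Amsterdam monitor
// hard-coded in those examples.
const MONITOR =
  "9bbbb845e5b6c831c29789900769843ab43bb5047abe697870cb50b6fc9bf923";

function buildRelayDiscoveryFilter({
  authors = [MONITOR],
  windowSecs = 60 * 60 * 2, // same 2-hour window as the examples
  tagQueries = {},          // e.g. { N: ["50"] } becomes { "#N": ["50"] }
} = {}) {
  const filter = {
    authors,
    kinds: [30166],
    since: Math.round(Date.now() / 1000) - windowSecs,
  };
  for (const [name, values] of Object.entries(tagQueries)) {
    filter[`#${name}`] = values;
  }
  return filter;
}

// Search relays (NIP-50):
const searchRelays = buildRelayDiscoveryFilter({ tagQueries: { N: ["50"] } });
// Free relays:
const freeRelays = buildRelayDiscoveryFilter({ tagQueries: { R: ["!payment"] } });
```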

These examples hard-code an established NIP-66 monitor in the authors filter for demonstration purposes. A more robust implementation might discover monitors dynamically by fetching kind 10166 events and selecting monitors based on various criteria.

Another implementation may only care about self-published kind 30166 events from a relay's operator (satellite relays self-publish 30166 events, for example).

An even more robust implementation may encourage their users to submit reports, and utilize a user's web of trust to find relays.

There is a wide range of implementation possibilities, from simple to complex.
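The dynamic-monitor approach can be sketched as two pure helper functions — one building the kind 10166 filter, one turning the returned monitor events into a scoped kind 30166 filter. The names are hypothetical and the selection criteria are left to the implementation:

```javascript
// Step 1: a filter for monitor announcements (kind 10166).
function monitorsFilter(limit = 100) {
  return { kinds: [10166], limit };
}

// Step 2: given kind 10166 events returned by a relay, build a kind 30166
// relay-discovery filter scoped to those monitors' pubkeys.
function relayDiscoveryFilterFrom(monitorEvents, windowSecs = 60 * 60 * 2) {
  const authors = [...new Set(monitorEvents.map((e) => e.pubkey))];
  return {
    authors,
    kinds: [30166],
    since: Math.round(Date.now() / 1000) - windowSecs,
  };
}
```

A real client would filter or rank the monitor events (by web of trust, frequency, coverage, etc.) before extracting pubkeys.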

@dskvr changed the title from NIP-59 Relay status and meta to NIP-XX Relay status and meta on Feb 7, 2023
@dskvr marked this pull request as draft February 7, 2023 15:27
@mikedilger (Contributor) commented:

I get the idea and I think it's a good one.

First, shouldn't tag names be globally unique across all event kinds? People may search for events with certain tags without sub-specifying an event kind. So using 'r' for read collides with using 'r' for reference.

Second, not sure these tags all need to be searchable.

Third, should normalized URLs have a trailing slash? Have we settled on this elsewhere? My code for example normalizes without a trailing slash, but I don't know if that is good or not.

I may have more comments later, I didn't try to be complete here.

@dskvr (Contributor, Author) commented Feb 8, 2023

First, shouldn't tag names be globally unique across all event kinds

I'm not aware of any such limitation, that would mean only 52 one-letter tags for all of nostr indefinitely.

Second, not sure these tags all need to be searchable.

Agreed

Third, should normalized URLs have a trailing slash?

Normalized URLs have a trailing slash, yes. It's a should not a must

@mikedilger (Contributor) commented Feb 8, 2023

Ok but what if I search for all events with a "p" tag that has my pubkey, without specifying an event kind? I might get events where "p" was defined to mean "pedestrian" or "positive" or "playground" or anything else. That would kinda suck.

EDIT: Ok this is a bad example since public keys are so unique.

@barkyq (Contributor) commented Feb 10, 2023

My attempts at critiques follow below. Hopefully you can successfully defend your NIP idea against me :)

Seems like this can be generated client side. E.g., a client which somehow gathers the "top" relays (perhaps by fetching relay list events from NIP-65), and then opens websockets to those relays and generates response time and the NIP-11 documents etc. Then the client can display these results back to the user.

Most of this info (SSL, IP4, IP6, topics, ability to read/write, is down, location) seems like it could be determined by a purpose built web client and/or served by the relay (i.e., the relay owner set the topics/location in some NIP-11 extension?)

As I am typing this, I am realizing that a client-side approach will perhaps be computationally expensive, and a lot of this info does not need to be constantly re-computed (like the server "location"). And it would theoretically be nice to do a query by server location and receive a list of relay URLs...

But still... I do not think that the information here will actually be that useful for people deciding which relays to use. Rather people will be drawn to relays which "perform" well and are related to their "social graph."

Also, agree with @mikedilger that the 26 queryable tags should not be used with such abandon. Will people really query for events based on SSL status, IP4 and IP6? I guess I lean a bit on the conservative side when adding query-able tags.

@dskvr (Contributor, Author) commented Feb 10, 2023

Seems like this can be generated client side. E.g., a client which somehow gathers the "top" relays (perhaps by fetching relay list events from NIP-65), and then opens websockets to those relays and generates response time and the NIP-11 documents etc. Then the client can display these results back to the user.

@barkyq Yes, I've been doing this with nostr.watch since around November. The problem is that there is a large amount of logic required to aggregate this data, logic that not every client needs to write. As there are more relays, more computing power is required to process them, and it becomes a memory hog client-side. I'm already running these calculations, and would love to share the results in an open format. I would love it even more if there were other publishers posting in a similar schema that I could consume, to promote the propagation of the best data possible.

Most of this info (SSL, IP4, IP6, topics, ability to read/write, is down, location) seems like it could be determined by a purpose built web client and/or served by the relay (i.e., the relay owner set the topics/location in some NIP-11 extension?)

Yes

As I am typing this, I am realizing that a client-side approach will perhaps be computationally expensive, and a lot of this info does not need to be constantly re-computed (like the server "location"). And it would theoretically be nice to do a query by server location and receive a list of relay URLs...

Correct.

But still... I do not think that the information here will actually be that useful for people deciding which relays to use. Rather people will be drawn to relays which "perform" well and are related to their "social graph."

It is not for that purpose. It is for clients to utilize and to aid in the discovery of relays with useful datapoints, as expressed in the "Personas" and "Use Cases" section.

Also, agree with @mikedilger that the 26 queryable tags should not be used with such abandon.

I have yet to find literature to confirm that the protocol is limited to 26 queryable tags into infinity. Please correct me and provide citations otherwise. I'm also having trouble finding a use case where it makes sense to subscribe without a kind, unless you were subscribing to a purpose-built relay that supports only specific kinds, or you were post-processing results, both of which seem to make the argument moot.

From NIP-12

Relays may support subscriptions over arbitrary tags. NIP-01 requires relays to respond to queries for e and p tags. This NIP allows any single-letter tag present in an event to be queried.

The <filters> object described in NIP-01 is expanded to contain arbitrary keys with a # prefix. Any single-letter key in a filter beginning with # is a tag query, and MUST have a value of an array of strings. The filter condition matches if the event has a tag with the same name, and there is at least one tag value in common with the filter and event. The tag name is the letter without the #, and the tag value is the second element. Subsequent elements are ignored for the purposes of tag queries.

Key takeaways...

arbitrary tags and This NIP allows any single-letter tag present in an event to be queried. and The filter condition matches if the event has a tag with the same name, and there is at least one tag value in common with the filter and event.

There is no mention of being universally unique.

...Will people really query for events based on SSL status, IP4 and IP6? I guess I lean a bit on the conservative side when adding query-able tags.

I believe there is consensus on this already as expressed in previous comments. To clarify, I added all tags as queryable as a starting point.

Edit: g is now the only new index in the schema

This means the following:

  • online relays cannot be filtered; requires post-processing.
  • ssl compatible relays cannot be filtered; requires post-processing

It's understood that indexing non-unique values like "true" or "false" could prove taxing on relays, but it needs to be stated that there is a significant loss of usefulness with these changes.

Alternatively, something like ['t', 'relay:online'] and ['t', 'relay:ssl'] could be used (and probably will be used in the absence of useful indices, whether or not we agree with it. It would be the natural progression.)

@fiatjaf (Member) commented Feb 10, 2023

"tags": [
    ["d","wss://some.relay/"],
    ["t","nostrica"],
    ["t","bitcoin"],
    ["g","ww8p1r4t8","Amsterdam","NL","EU","Earth"],
    ["ip","1.1.1.1"],
    ["ip","2001:db8:3333:4444:5555:6666:7777:8888"],
    ["open","true"],
    ["read","true"],
    ["write","false"],
    ["ssl","true"]
  ]

What about this for the tags?

@dskvr (Contributor, Author) commented Feb 11, 2023

I added two new tags to the NIP.

  • users - Total number of users on the relay
  • events - Total number of events on the relay

These values are presently available through several services, but their data is not open. It really should be.

Additionally, the following statement is no longer relevant:

Seems like this can be generated client side.

As generating these values client-side is not feasible (and, as previously demonstrated, generating many of these values at scale, which is needed in many situations, is not feasible client-side either).

Updated example event:

{
  "id": "<eventid>",
  "pubkey": "<pubkey>",
  "created_at": "<created_at>",
  "kind": 30303,
  "sig": "<signature>",
  "content": "{}",
  "tags": [
    ["d","wss://some.relay/"],
    ["t","nostrica"],
    ["t","bitcoin"],
    ["g","ww8p1r4t8","Amsterdam","NL", "EU", "Earth"],
    ["ip","1.1.1.1"],
    ["ip","2001:db8:3333:4444:5555:6666:7777:8888"],
    ["open","true"],
    ["read","true"],
    ["write","false"],
    ["ssl","true"],
    ["events","502341"],
    ["users","37482"]
  ]
}
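Consuming an event like this mostly means reading tag values out of the tags array. A minimal pair of accessors (illustrative names, not part of the NIP) could look like:

```javascript
// Return the first value of a named tag, or undefined if absent.
function tagValue(event, name) {
  const tag = event.tags.find((t) => t[0] === name);
  return tag ? tag[1] : undefined;
}

// Return all values for repeatable tags like "t" or "ip".
function tagValues(event, name) {
  return event.tags.filter((t) => t[0] === name).map((t) => t[1]);
}
```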

@dskvr marked this pull request as ready for review February 11, 2023 13:33
@dskvr changed the title from NIP-XX Relay status and meta to NIP-66 Relay status and meta on Feb 11, 2023
@dskvr (Contributor, Author) commented Feb 11, 2023

Assigned NIP-66, which as far as I can tell, is not taken. Will update the filename/squash if/when this NIP gets any traction.

@barkyq (Contributor) commented Feb 11, 2023

Yes, I guess the main argument against queryable tags is that the relay needs to process them much more than non-queryable tags. I do agree that each kind is allowed to have its own interpretations of the queryable tags, although it would be aesthetically pleasing if the interpretations were consistent across event kinds.

Agreed that it is useful to query on the online status. One option is to say something like "Publishers SHOULD only publish updates for relays which are online" (so all events have an implicit ["online", "true"] tag), and then the created_at filter becomes a sort of proxy for which relays were recently online. This seems like it would be useful in practice for people trying to discover relays, although I can see why some would not like it, because people like to get notified when something goes offline. In any case, it is very easy for clients to decide if a given relay is online or not... simply try to establish a websocket connection.

If a relay wss://foo.bar goes offline, the client can make a subscription to the publishers kind 30303 feed filtering by d tag wss://foo.bar. When an event comes back with a new created_at timestamp, the client knows the relay wss://foo.bar is online again, at least according to the publisher.

Another possibility is to also use the NIP-40 expiration tag, making it slightly longer than the republish frequency. Maybe a bit too convoluted. This would reduce storage size though, and also signal to users when they can expect an update. Dedicated endpoints could store the events for long term and display the "expired" events as the "offline" relays.
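The NIP-40 idea above reduces to a simple predicate: treat a relay as "online according to this publisher" while its latest event is unexpired, falling back to an age threshold when no expiration tag is present. A sketch — the function name and fallback window are illustrative; the "expiration" tag name follows NIP-40:

```javascript
// Is this publisher event still "fresh" per its NIP-40 expiration tag?
// If no expiration tag is present, fall back to a caller-supplied max age.
function isLikelyOnline(event, nowSecs, fallbackMaxAgeSecs = 2 * 60 * 60) {
  const exp = event.tags.find((t) => t[0] === "expiration");
  if (exp) return nowSecs < Number(exp[1]);
  return nowSecs - event.created_at < fallbackMaxAgeSecs;
}
```

This is also why the expiration tag doubles as a proxy for the publisher's update frequency: no knowledge of the publisher's schedule is needed to evaluate the predicate.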

@majestrate (Contributor) commented Feb 11, 2023

I get the idea and I think it's a good one.

in general, it does sound great at first. however, protocol extensions like these, which allow arbitrary free-form data, get "abused" and always eventually become a point of protocol dialect bifurcation.

@dskvr (Contributor, Author) commented Feb 12, 2023

@barkyq

Although it would be aesthetically-pleasing if the interpretations were consistent across events kinds

At some point in the near future this will be impossible; therefore, using unintuitive single-letter indices in the present to retain "aesthetics" might be more damaging than just working within the constraints of the protocol. But tbh it's no longer relevant to this discussion, since we have all been in agreement since the beginning to reduce the indices, and they were reduced to one.

Maybe we should open a new discussion around this, it seems that the majority of people's desires around NIP-12 do not actually align with the wording of NIP-12. Maybe the wording needs to be amended with guidances.

In any case, it is very easy for clients to decide if a given relay is online or not... simply try to establish a websocket connection.

Yes it is. It's very easy to check if a relay is online if you already know the relay, but not if you want to find all relays that are online very quickly, and without adding additional business logic to your client.

Connecting to hundreds of relays at once is in many cases not advisable, so you'll likely add batching, slots or queuing logic so that you're only connecting to n relays at a time. These approaches work well, but if you want more information than just "online" status, it will take minutes, not seconds. Multiply this by x users. Not only does it strain relays more (x users vs y publisher daemons), it increases data transfer to clients, many of which may be limited by mobile data plans. Also, not all relays connect quickly, and it's often the largest relays that do not connect quickly, so to ensure that results are accurate a generous timeout is generally required.

Most clients right now are optimized to "social" cases where they are connecting to ~10 relays at a time, not 500, 1000, 2000, 5000... etc.
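The batching/slot logic described above is essentially a bounded worker pool. A sketch with the probe function injected so it stays transport-agnostic — names are illustrative, and any real implementation would also need per-connection timeouts:

```javascript
// Probe many relays with at most `slots` checks in flight at once.
// `probe` is an async (url) => result function supplied by the caller.
async function probeAll(urls, probe, slots = 10) {
  const results = new Map();
  const queue = [...urls];
  async function worker() {
    while (queue.length) {
      const url = queue.shift();
      try {
        results.set(url, await probe(url));
      } catch (err) {
        results.set(url, { error: String(err) });
      }
    }
  }
  // Spawn up to `slots` workers; each drains the shared queue.
  await Promise.all(
    Array.from({ length: Math.min(slots, urls.length) }, worker)
  );
  return results;
}
```

Even with such a pool, probing thousands of relays client-side remains slow and data-heavy, which is the argument for consuming monitor-published events instead.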

One option is to say something like "Publishers SHOULD only publish updates for relays which are online" (so all events have an implicit ["online", "true"] tag) and then the created_at filter becomes a sort of proxy for which relays which were recently online.

I considered this as well; it reduces footprint, but consider the following: it complicates filtering, as you would need to know a publisher's update frequency and then use since to determine "who is online." Additionally, update frequencies are not likely perfectly consistent, so false positives/negatives are more likely here.

However, adding language that says "publishers should not update the event unless the relay is online" may have value, as elaborated on below (re: "last seen"). Good idea. It should be mentioned, however, that the event must then omit the "online" tag as you suggested, as otherwise the event will still read ['online', 'true'].

Another possibility is to also use the NIP-40 expiration tag, making it slightly longer than the republish frequency. Maybe a bit too convoluted.

No need, this is a parameterized replaceable event and is replaced by the next event. The footprint of 1000 30303's is somewhere around 1mb. From my logs...

AVERAGE SIZE OF ONE 30303 0.0010957717895507812mb 1.1220703125kb

And this is calculated using a schema that is more bloated than the one proposed here.

A relay may be offline, but its NIP-11 document may still be online, providing contact information and/or a nostr pubkey. Other data points for offline relays would also be useful. What is the impact of this relay being offline? How many events were stored there? How many users were pushing to it? etc...

However, an expiration would be useful for relays that were last seen a long time ago, and are likely gone forever. Thanks for suggesting this.

If a relay wss://foo.bar goes offline, the client can make a subscription to the publishers kind 30303 feed filtering by d tag wss://foo.bar. When an event comes back with a new created_at timestamp, the client knows the relay wss://foo.bar is online again, at least according to the publisher.

Depends on how you are utilizing these events. If you are parsing kind:2 or kind:3 or NIP-65 and then subscribing to these events and filtering with #d, then yes. Right now, social clients are the most prevalent use case. This may not always be true. There are many developer personas other than "social client developer."

I like this creative solution, but I feel as though it's trying to save space in a circumstance where very little space is being consumed to begin with at the compromise of usefulness.

@majestrate

however protocol extensions like these that allow arbitrary free form data are "abused" and always eventually become a point of protocol dialect bifurcation.

Could you elaborate? Are you referring to the event.content field? I feel as though this statement is extremely broad.

@majestrate (Contributor) commented Feb 12, 2023

Could you elaborate? Are you referring to the event.content field? I feel as though this statement is extremely broad.

yes, as provided in the changeset:

The .content of these events may be empty. .content may contain stringified JSON. The parsed JSON has a flexible schema, all members are optional. The parsed .content JSON should be extended by NIPs.

with this SHOULD here, it has a risk of becoming a dumping ground for arbitrary data.

as a rule of thumb, when designing protocols, places that people can tack on arbitrary junk data will always have people tacking on arbitrary junk data. this tends to be very hairy if not kept in check with constraints. i think that is something to keep in mind here.

@barkyq (Contributor) commented Feb 12, 2023

[...] It's very easy to check if a relay is online if you already know the relay, but not if you want to find all relays that are online very quickly, and without adding additional business logic to your client.

Yes, agreed. But that is why I think publishers should only publish events for relays which are online. E.g., I get a list of relays by querying kind 30303, and I implicitly assume the recently published events are for online relays. If I try to connect to one of the relays in a d tag, and it is offline, I just drop it and continue to the next kind 30303. Not a big deal. I only connect to the relays that I would actually want to connect to, i.e., I don't need to bother "checking" every relay returned by kind 30303.

[...] No need, this is a parameterized replaceable event and is replaced by the next event. The footprint of 1000 30303's is somewhere around 1mb...

Yes, understood. But still, expiration tag has a value beyond reducing storage size. If I see created_at = 1600000000 and expiration = 1600003600, then I know that the publisher anticipates to update this relay within 3600 seconds. I think this is useful information (and it also does reduce storage requirements for relays receiving these kind 30303). As mentioned above, dedicated backends for storing this kind 30303 could store expired events to perhaps show users "historical" relays which are offline. Returning expired events to a REQ would be slightly breaking the rules of the expiration tag. But a dedicated web backend would be allowed to show them to users of the website.

[...] It complicates filtering, as you would need to know a publisher's update frequency, and then use since to determine "who is online." Additionally, update frequencies are not likely perfectly consistent, and so false positives/negatives here are more likely.

This is solved by using the expiration tag, because it is a proxy for the publisher's update frequency. Do not filter on since, unless you specifically want that.

[...] Other data points for offline relays would also be useful. What is the impact of this relay being offline? How many events were stored there? How many users were pushing to it? etc...

Agreed, but quite niche, and should not be something that normal "relay searching clients" are concerned with. This could be handled by a dedicated backend storing expired events, as detailed above.

@dskvr (Contributor, Author) commented Feb 12, 2023

as a rule of thumb, when designing protocols, places that people can tack on arbitrary junk data will always have people tacking on arbitrary junk data. this tends to be very hairy if not kept in check with constraints. i think that is something to keep in mind here.

Yes, this is a good point. The purpose of this was left vague because I didn't want to bloat this spec. I intended to extend these myself with datasets where tags are not appropriate because there's data hierarchy. I'll think about a solution to this.

with this SHOULD here, it has a risk of becoming a dumping ground for arbitrary data.

event.content fields are not validated by relays, so should cannot be must. *

*Edit: It actually may be possible to make it must.

@dskvr marked this pull request as draft July 26, 2024 21:05
@dskvr changed the title from NIP-66 Relay Discovery and Liveness Monitoring System to NIP-66 Relay Discovery and Liveness Monitoring System (Draft 7) on Jul 26, 2024
@dskvr force-pushed the nip-59-relay-status branch 6 times, most recently from de9392e to 7efe3f3 on July 27, 2024 10:12
@dskvr force-pushed the nip-59-relay-status branch from 16bcbb2 to ceabecc on July 30, 2024 11:14
@dskvr marked this pull request as ready for review August 29, 2024 10:31
@dskvr changed the title from NIP-66 Relay Discovery and Liveness Monitoring System (Draft 7) to NIP-66 Relay Discovery and Liveness Monitoring (Draft 7) on Aug 29, 2024
@vir2alexport commented Sep 9, 2024

Can you extend the kind 30166 event so that users of https://relays.xport.top, for example, would be able to share their results with the network?

The q tag stands for quality and is inversely proportional (its value is the number of connections):

[
	...
	["q", "<number-of-connections>"],
	...
]

content

{
	<relay-1-addr>: [
		<rtt-open>,
		<rtt-read>,
		<rtt-write>,
		<kind1-events-per-hour-counter>
	],
	...
	<relay-n-addr>: [
		<rtt-open>,
		<rtt-read>,
		<rtt-write>,
		<kind1-events-per-hour-counter>
	]
}

@dskvr (Contributor, Author) commented Sep 9, 2024

@vir2alexport The suggested format completely breaks the 30166 kind format and how NIP-66 functions in a wider context. Each relay gets its own event, which is how relays remain discoverable. If multiple relays are included in each event, not only would events be too large, resulting in rejection by relays in many situations, but it would not be possible to discover relays in a useful way. Additionally, there are many more counts that can be provided other than "events-per-hour".

You could instead add counts using NIP-32 l tags

For example

['l', '12345', 'count.events-per-hour']

or you could just add your own tag like

['count.events-per-hour', '12345']

or

['count', '12345', 'events-per-hour'] (this was the original NIP-66 format)

Previously, count tags were included in NIP-66, but because there's no way to specify every possible count, I have omitted them from the NIP. Half-defined tags are difficult to interpret in a NIP. I have been unable to identify a good way to standardize these kinds of values that does not bloat the NIP to unreadable levels. I am, and have been, open to ideas.
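The three tag shapes mentioned above can be sketched as tiny builder functions. These are illustrative only; none of these shapes is normative in draft 7:

```javascript
// NIP-32 label-style count tag:
const nip32Count = (value, name) => ["l", String(value), `count.${name}`];

// Ad-hoc named tag:
const namedCount = (value, name) => [`count.${name}`, String(value)];

// The original NIP-66 draft shape:
const legacyCount = (value, name) => ["count", String(value), name];
```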

@vir2alexport commented:

@dskvr I see, thank you. So this is good for liveness monitoring rather than discovery in my case.

@dskvr (Contributor, Author) commented Sep 9, 2024

This is good for both liveness and discovery.

Comment on lines +60 to +61
- `rtt-read` The relay's read **round-trip time** in milliseconds.
- `rtt-write` The relay's write **round-trip time** in milliseconds.
A reviewer (Member) asked:

How are these measured?

@dskvr (Contributor, Author) replied Sep 12, 2024:

read: time.begin → subscribe { kinds: [1], limit: 1 } → subscription fulfilled or EOSE received (whichever comes first) → time.end
write: time.begin → send event → OK (success/fail) received → time.end

Notes:

  1. Both of these values are more useful when represented as an SMA from time-series.
  2. Both of these values can loosely inform relay behaviors when cross-referencing other relay attributes.
  3. These values cannot be universally applied, since many/most relays treat different users differently.
  4. Write tests are particularly questionable; there's really no good way to do them (unless a user is authenticated, but this presents its own issues for relays)
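The read measurement described above, with the clock and transport injected so the timing logic itself is testable. A sketch under stated assumptions: `awaitFirstEventOrEose` is a hypothetical async function that a real implementation would back with a WebSocket subscription, racing fulfillment against EOSE:

```javascript
// Measure rtt-read: time from opening the subscription to the first
// matching event or EOSE, whichever arrives first.
async function measureRttRead(awaitFirstEventOrEose, now = () => Date.now()) {
  const begin = now();
  await awaitFirstEventOrEose({ kinds: [1], limit: 1 });
  return now() - begin; // milliseconds
}
```

As noted above, a single sample is noisy; publishing a moving average over a time series is more representative.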

@dskvr (Contributor, Author) replied:

After some thought, read and write should probably be removed from this NIP. For these kinds of tests to be meaningful in the present-day nostr climate requires more context.

I wrote an auditor, similar to mike's relaytester, and it seems clear that read checks will be useful, but with very specific parameters to highlight strengths and weaknesses between relays.

How to propagate or aggregate this data is still unclear, but it is very clear (and has been for a while, tbh) that the simple read and write checks of yesteryear provide little to no value anymore. I even disregard them completely in the new nostr.watch NIP-66 client.

@staab (Member) left a comment:

I'm planning to start using this to recommend relays to users in Flotilla. @dskvr and I chatted this morning about adding a publisher profile to NIP 89, but on re-reading this PR I see that monitors basically do this. I think that's better: NIP 89 is already a little overloaded, and I couldn't find a tidy way to fit publishers in. So I think this NIP is great as-is, with one suggestion: it would be nice to be able to find monitor profiles, so a recommendation that kind 10166 events be aggressively replicated across the network, like kind 0/10002 are, would be good.

@kehiy (Contributor) left a comment:

Can a relay publish an event with its contact pubkey and announce its liveness on other relays? Then we can query the most recent relays that announced themselves, and new relays will be recognized on the network as well.

@dskvr (Contributor, Author) commented Jan 2, 2025

@kehiy relays can self-publish and this case was always considered valid. However, depending on the intended use case, this may be less useful.

If the intention is to find relays, you'll end up with a woefully incomplete dataset and implementation complexity rises. If the intended purpose is to discover a relay's meta by an operator pubkey you already know, then that is a good case for self-published events.

nostr1.com already monitors their customers' relays via their monitorlizard relay monitors, using their own software written in Go. It has statically set relays, and would be ideal for monitoring your own relays or your own relay list.


You didn't ask, but I've wanted to publish the following for a while after countless discussions on this topic.

When there are several crawlers, their incentive is to provide the best data. Relay monitor datasets will always be more complete than self-published data, and their incentives promote honesty. Some real-life examples would be relays that advertise NIP-50 via NIP-11 but either don't adhere to NIP-50 or almost entirely reject NIP-01. Most of them are case-sensitive and somewhat useless because of this. Most of them are inadequate not by design, but because there is no feedback loop for their operators to even realize there is a problem.

The comparison I like to use is early internet directories, such as DMOZ and Yahoo Directories. That approach of self-publishing incentivizes self-promotion, which is inherently antithetical to honesty and completeness. The resulting data in these directories was incomplete and difficult to maintain as websites went offline. These added complexities are a core reason why crawlers won out over self-published directories. Crawlers had incentives around distilling data to provide users with the best possible results, and because they inherently accumulated liveness data, their data was much cleaner and more up to date. That dynamic eventually trended against users in the 2010s when certain platforms achieved a monopoly; however, the open-index nature of NIP-66 should mitigate monopolies and consequently algorithmic biases. The aggregate NIP-66 dataset is potentially a foundation and qualifier for other, more data-rich datasets. Those unfamiliar with crawling at scale are often surprised how important liveness is at scale.

Monitors are incentivized to test a relay against their stated capabilities, whereas a relay operator is not incentivized or often enough not equipped to publish their own inadequacies.

The most common argument against relay monitors and NIP-66 is not wanting to trust a monitor to truthfully broadcast data about a relay it does not run itself. However, there are several clear arguments against this assertion.

  1. Considering that relay monitors are just pubkeys, it's obvious they can be in other pubkeys' web of trust. Therefore an implementer doesn't need to trust anything; they can defer that decision to pubkeys, with reasonable defaults.
  2. Nostr already relies on various points of trust. We trust relays to store our data. We trust relays to fulfill filter requests honestly. We trust clients to fairly serve data and verify events. That trust is counter-balanced by the inherent fluidity of nostr and the ability to change relays and clients on a whim. That ethos is carried over in this NIP.
  3. The case of "bad monitors" is exactly why NIP-66 was written. Two years ago the userbase relied on nostr.watch almost exclusively for relay data; I never liked this and thought it was a huge weakness. Over 100 repos implement the centralized nostr.watch API. Hence, as written, any number of monitors can compete with nostr.watch and provide better or higher-resolution data. Or clients could run their own monitor and rely on their own data using a unified implementation pattern. This dynamic essentially creates an open index where incentives include honesty and competition, resulting in an environment where censorship or result biases are inherently difficult if not impossible.
  4. Bonus: NIP-66 also alleviates probing spam; if every user of every client probed relays directly, the cost of running a relay would go up substantially, not to mention client implementation complexity.
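Point 1 above can be made concrete in a few lines. Assuming a client has already derived a web-of-trust pubkey set (e.g. from follow lists), deferring the monitor choice might look like this — all names here are hypothetical, not from the NIP:

```python
# Illustrative web-of-trust deferral for choosing relay monitors,
# per point 1. `wot` would come from the user's follow graph in a
# real client; `defaults` are the client's fallback monitor pubkeys.

def select_monitors(candidates: list[str], wot: set[str],
                    defaults: list[str]) -> list[str]:
    """Prefer monitors whose pubkeys are in the user's web of trust;
    fall back to client defaults when none are."""
    trusted = [pk for pk in candidates if pk in wot]
    return trusted if trusted else list(defaults)
```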

Sidenote:
The new nostr.watch is a NIP-66 nostr client (as was the old one, for that matter, with some caveats). While defaults will serve data produced by nostr.watch so that I can ensure a level of consistency from my domain for onboarding purposes, users will have full control over who serves their data. This is no different from any other nostr client. Other monitors could just as easily run the client with different defaults, or write their own client if it were ever compromised in any way or nostr.watch became "evil." This is by design.

@kehiy
Contributor

kehiy commented Jan 2, 2025

@kehiy relays can self-publish and this case was always considered valid. However, depending on the intended use case, this may be less useful.

If the intention is to find relays, you'll end up with a woefully incomplete dataset, and implementation complexity rises. If the intended purpose is to discover a relay's metadata by an operator pubkey you already know, then that is a good case for self-published events.

nostr1.com already monitors their customers' relays via their monitorlizard relay monitors, using their own software written in Go. It has statically set relays and would be ideal for monitoring your own relays or your own relay list.

so, considering this, isn't it logical to add it to the standard?

@1l0

1l0 commented Jan 2, 2025

NIP-66 saves the world.

@dskvr
Contributor Author

dskvr commented Jan 3, 2025

so, considering this, isn't it logical to add it to the standard?

I thought it was still in there, but upon review it was removed during the last wave of simplification. It is still supported implicitly, as it falls under the umbrella of ad-hoc monitoring, but it should be explicitly stated in the NIP. Thanks for pointing it out.
