Querying events by tags presence #683
This is #523. It's a presence filter specifically.
I think this would solve many problems and we should have it. But there are challenges in relays actually implementing it: #523 (comment) We would have to define which specific tags get the index. We can't do all of them without a full table scan. |
@alexgleason Relays already have to deal with indexing array of values like array of |
You would have to basically double the amount of space used by tag indexes. Maybe more since you can't use a partial index if you want both presence and absence filters. It's probably worth doing it at least for "e" tags since we have a strong use-case in #523 that is a very common one. Other tags would need strong arguments in favor of doing it, I think. Still, I think that shouldn't necessarily block this NIP from proceeding. We can nail down the API. |
If this NIP-100 gets merged but implementing it isn't a requirement, in practice it may be as if it never existed, because no client would be able to rely on it when not all relays implement it. Probably when fetching the feed, clients would continue requesting all notes, be they root or not, because potentially many relays are involved. @fiatjaf any chance you mark this NIP, if merged, and others like NIP-12 and NIP-20, as "Required" on the README.md NIP list? Meaning it would be as if they were inside NIP-01, like minimum NIPs that must be supported to consider the relay/client nostr-compatible. Also, as an exception, put them near the top just below NIP-01 despite the numbers.
Me stating a thing is required doesn't cause everybody to immediately implement it. Also, as @alexgleason said in the other issue, it's very costly for relays to implement this. I think we absolutely do not need it. |
I think that if your client depends on this you're implementing something wrong. If you're following a person and want to read what they write you want all their kind1s. Regardless of whether you'll display everything in the UI or in different views according to the tags, you still should download everything, store locally and display when appropriate. |
Disagree or else I wouldn't be supporting this but don't know what DB you guys are considering.
Although some clients are doing this, I don't consider it efficient. But if doing everything client-side is the recommended way, I think it's OK to close this PR and the other issue.
I like this.
Not exactly. There are use cases where this is a real need. If you want Global without replies, for instance, it doesn't make sense to download everything and then filter replies out. If you are doing a map of Nostr posts with a GeoHash, it doesn't make sense to download everything and then discard everything that doesn't include a |
I can understand this, but does anyone really want this? Sounds like some skewed preferences here. "Global without replies". Global is not a thing, and replies are not different from normal notes, technically. Should they have a different kind?
This I don't think is a valid use case (I mean, whatever, it is valid, but what I'm saying is that it doesn't fit Nostr, not all things fit Nostr if we want Nostr to remain simple). Either you are already fetching posts from people that you want, storing these locally somehow, and then you are displaying those that have Maybe we should be making more kinds for different types of events and relying less on tags for indexing. Since tags are so flexible it's easy to think they should be used for everything, but if we start doing that and relying on that this will not end well. |
what is not efficient? To store events that you want locally? You think it's more efficient to load them from relays over and over multiple times every day? |
Can we come back to this, please? Has any relay tried to implement it? @mikedilger since you have just coded a relay, what do you think about this filter? |
"Feed" events, for example. This event set gets stale so often that when the user re-opens the app they aren't interested anymore on the previously received events. That's why i believe these events should live in memory instead of in a persistent local db. My unreleased client's "feed" is made of root events (no edit removed ugly syntax examples. Would be great to have it but i know it won't happen =]~ |
It seems like the solution to everything not in NIP-01 is DVMs. |
I regret posting that. |
Maybe we should indeed align our expectations with what the core protocol should solve and what a Layer 2 design (DVMs) is expected to do. If we want to keep relay development simple, we should "outsource" everything to layer 2. Or maybe we just create a new network of relays working on the same events but with more interesting filtering options. Clients can then choose which network they want/need to integrate with.
I don't think filters are getting any more changes at this point. I need more than just presence/absence tags, anyway. I need joins. |
@Semisol had some interest in building a new type of relay with a new filtering language. I am not sure if he ended up doing anything. But we could just do a relay with regular read-only SQL as an entry point. |
Or maybe this idea of filters and subscriptions themselves should be turned into replaceable events. I can imagine a client signing an event with a Nostr filter (or an SQL query) instead of using the |
Can you elaborate on joins? What's your use case?
Ok, I just went back and read #523 and this issue again, and I have a few more things to say:
First, I have not encountered the need to do presence or absence (or tag count!) queries. But I don't think it is unreasonable.
Second, this idea that relays will need to do complex indexing is wrong. Relays should not index for these kinds of queries at all. Neither should they do hard scans of every event. Relays should (1) require that such filters also contain other fields that already narrow down the event set to something reasonable, or else reject the filter as a scraper, (2) load all the events ignoring the new presence/absence filter specifications, and (3) post-filter all matching events with these new fields. Sure, you loaded more events than you needed and then stripped them back... but that is far less resource-consuming than sending them over the network and having the client strip them back. Basically it just pushes that filter operation to the relay to save on network bandwidth.
That being said, if a crafty relay developer wants to index these to boast about hyper-fast performance, that's fine, but we don't need to design for that case. And seriously, if someone sends a "give me all events that don't have a geo tag" filter, were you really going to send them 99.9% of all the events in your database? I don't think so.
I don't like modifying the "#e" to be a non-array (e.g. having a 'null' option). I prefer this PR's method of adding a new field. Clients SHOULD check NIP-11 before using the new field. But also the rule for relays ought to be "if you see a filter field you do not recognize, that is an error". I don't know if that was codified elsewhere, but I think it should be.
I don't follow the need to count the number of "e" tags, especially if we are moving to "q" tags.
I think this PR is pretty close as is. I'll add it to my relay if there is the momentum to do it (not too easy for me, as I have meticulous memory layouts and detailed parsing to update).
EDIT: I don't think this NIP becomes required or part of the core of nostr. It will be okay if most relays don't implement it. Clients will have to deal with errors from relays that don't accept the new filter field. BUT we probably do have to push through a small required change, which is to make those errors machine-readable (new prefix) and specify that relays must reject filters with fields they do not recognize (I didn't check the current NIPs; maybe that is already there).
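The relay-side approach described in this comment (reject unrecognized fields, require a narrowing field, load while ignoring the presence fields, then post-filter) might look roughly like the sketch below. It is only an illustration under assumptions: the field names `has_tags` and `lacks_tags` are hypothetical and not taken from the PR, the choice of `ids`/`authors` as the "narrowing" fields is a guess at what "something reasonable" could mean, and the in-memory list stands in for a real event store.

```python
# Base fields this sketch understands; a real relay supports more (see NIP-01).
BASE_FIELDS = {"ids", "authors", "kinds"}
# Hypothetical names for the presence/absence fields (not from the PR).
PRESENCE_FIELDS = {"has_tags", "lacks_tags"}
# Assumption: only these fields count as narrowing the event set enough.
NARROWING_FIELDS = {"ids", "authors"}


class FilterRejected(Exception):
    """Raised when the relay refuses a filter."""


def validate(flt):
    # Rule suggested in the thread: an unrecognized field is an error.
    for key in flt:
        if key not in BASE_FIELDS and key not in PRESENCE_FIELDS:
            raise FilterRejected(f"unrecognized filter field: {key}")
    # Step 1: presence/absence filters must come with a narrowing field.
    if flt.keys() & PRESENCE_FIELDS and not flt.keys() & NARROWING_FIELDS:
        raise FilterRejected("presence filter needs a narrowing field")


def matches_base(event, flt):
    # Step 2: ordinary matching that ignores the presence fields entirely.
    if "ids" in flt and event["id"] not in flt["ids"]:
        return False
    if "authors" in flt and event["pubkey"] not in flt["authors"]:
        return False
    if "kinds" in flt and event["kind"] not in flt["kinds"]:
        return False
    return True


def post_filter(events, flt):
    # Step 3: strip the loaded events back using tag presence/absence.
    has = set(flt.get("has_tags", []))
    lacks = set(flt.get("lacks_tags", []))
    kept = []
    for event in events:
        names = {tag[0] for tag in event["tags"] if tag}
        if has <= names and not (lacks & names):
            kept.append(event)
    return kept


def query(store, flt):
    validate(flt)
    loaded = [event for event in store if matches_base(event, flt)]
    return post_filter(loaded, flt)
```

With this shape, a filter such as `{"authors": [...], "lacks_tags": ["e"]}` costs one normal indexed lookup plus a linear pass over the already-narrowed result, and no extra index is needed.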
Interesting take. A caveat is it may mess with "limit" filter, like if a client asks for
That's the part I disagree with. If incompatible relays simply ignore the unknown filter field and apply just the ones they understand, the client can still apply the extra filter client-side. The client would still have the option of checking NIP-11 to skip incompatible relays if it prefers not to re-filter client-side.
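A minimal sketch of that client-side fallback, assuming the relay's NIP-11 document has already been fetched and parsed into a dict. `supported_nips` is the NIP-11 field listing implemented NIPs; the number 100 is just the proposal number mentioned earlier in this thread, and the helper names are hypothetical.

```python
def supports_nip(relay_info, nip):
    """relay_info is a parsed NIP-11 relay information document."""
    return nip in relay_info.get("supported_nips", [])


def lacks_tag(event, name):
    # True when no tag on the event has `name` as its first element.
    return all(not tag or tag[0] != name for tag in event["tags"])


def root_notes(events, relay_info, nip=100):
    # A relay that advertises the NIP is trusted to have filtered already.
    if supports_nip(relay_info, nip):
        return list(events)
    # Otherwise assume the relay ignored the unknown field and re-filter here.
    return [event for event in events if lacks_tag(event, "e")]
```

This only works if incompatible relays ignore the unknown field rather than rejecting the whole filter, which is exactly the point under dispute here.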
Oh right.
The problem I'm worried about is if a client specifies a new filter field the relay doesn't understand in order to prune the search to something reasonable, but the relay skips that new filter and dumps massive events on the client. |
Instead of adding new filter properties, we can include an extension to NIP-50's { "search": "has:#e" } This fixes everything IMO. |
Special syntax for searches is super annoying because it mixes data with code. What if someone wants to search for a note that includes "has:#e"? Why not just add a new filter property, e.g. |
It's already part of NIP-50 https://github.com/nostr-protocol/nips/blob/master/50.md#extensions To search for a note with "has:#e" in the text, you'd do this: { "search": "\"has:#e\"" } The functionality in question (filter by tag presence/absence) IS a search functionality. It makes most sense for search relays to implement it. Also the |
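To make the quoting rule concrete, here is a rough sketch of how a search relay could separate `has:#e`-style tokens from ordinary search terms. The `has:#<tag>` syntax is the proposal from this thread, not something defined by NIP-50 itself, and the tokenizer below is an assumption about how such a query might be split, not how any relay actually does it.

```python
import re

# A token is either a double-quoted phrase or a run of non-space characters.
TOKEN = re.compile(r'"([^"]*)"|(\S+)')
PREFIX = "has:#"


def split_search(query):
    """Return (plain_terms, required_tag_names) for a search string."""
    terms, has_tags = [], []
    for match in TOKEN.finditer(query):
        quoted, bare = match.group(1), match.group(2)
        if quoted is not None:
            # Quoted text is always literal, even if it looks like an extension.
            terms.append(quoted)
        elif bare.startswith(PREFIX) and len(bare) > len(PREFIX):
            has_tags.append(bare[len(PREFIX):])
        else:
            terms.append(bare)
    return terms, has_tags
```

For example, `split_search('has:#e bitcoin')` treats `e` as a required tag and `bitcoin` as a term, while `split_search('"has:#e"')` treats the whole phrase as literal text.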
Because after a lot of discussion and many months, I realized it's not going to happen. And it probably shouldn't happen. |
I did not realize that. Lame.
Yes, and I've spent way too much time dealing with user inputs that include special characters. As far as I'm aware, with postgres at least you have to do the escaping in your application code, which is painful and error-prone. There's no reason we need to make the same mistakes as the past. |
Check also #1105 to see another application of NIP-50 extensions. It makes sense to do advanced filtering there. |
I'm not saying these aren't useful, but cramming them in a plain text field is a mistake. Instead of new keys you could add an |
I believe the intended way is to pass the user's search input directly into postgres/fts5. I'm not sure how viable that is. |
fts5 is new to me, I was using postgres' built-in tsvector/tsquery stuff, which didn't play nice with raw user input. |
@staab Doesn't seem that bad in the grand scheme of things. And any parsing errors etc have basically no consequence. https://chat.openai.com/share/9e5f4a6f-b0b9-4644-ae14-2995ac71ee38 |
It's not the worst thing, and since it's already happened, I've already lost the argument. I just wish nostr developers (and developers in general) would stop making everyone write parsers.
Agree. Custom parsers suck. |