Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add query documentation #80

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

KonradStanski
Copy link
Contributor

This adds sample query documentation with examples

Copy link

@OpenCode OpenCode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

docs/query.md Outdated
| `regex:` | | Regex pattern | Matches content using a regular expression. | `regex:/foo.*bar/` |
| `repo:` | `r:` | Text (string or regex) | Filters repositories by name. | `repo:"github.com/user/project"` |
| `sym:` | | Text | Searches for symbol names. | `sym:"MyFunction"` |
| `branch:` | `b:` | Text | Searches within a specific branch. | `branch:main` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we replace the branch: row with revision: (alias rev:), and change the description to something along the lines of "Searches within a specific branch or tag?

Note that revision isn't a zoekt field, and is instead a alias to branch that is made in the backend. The reasoning is that revision better communicates that both branches & tags are supported. It's also how we are communicating things in the schema.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

docs/query.md Outdated
| `repo:` | `r:` | Text (string or regex) | Filters repositories by name. | `repo:"github.com/user/project"` |
| `sym:` | | Text | Searches for symbol names. | `sym:"MyFunction"` |
| `branch:` | `b:` | Text | Searches within a specific branch. | `branch:main` |
| `type:` | `t:` | `filematch`, `filename`, `file`, or `repo` | Limits result types. | `type:filematch` |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've never actually seen this type prefix before - It looks like it was something Sourcegraph introduced?

I tried filematch, but that didn't seem to work.. Looking at the zoekt source, it seems like this is only used in query_proto.go, which to my understanding is also something Sourcegraph introduced since they are using GRPC and protobufs to drive their search experiences.

file, filename looks like it allows you to display the results as a list of filepaths, even when searching the content of files (example).

Queries with type repo seem like they are evaluated first. If there is a match, all files from that repo are returned. On it's own, this isn't really that useful (example), but when paired with another sub-query, can be (example).

All in all, it might make sense to omit this prefix from the docs for now until we are certain we "officially" want to support it. Curious to hear your thoughts though

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

docs/query.md Outdated
@@ -0,0 +1,265 @@
# Zoekt Query Language Guide
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We might want to title to be something like "Search Query Language Guide" and modify the intro a bit to something along the lines of:

This guide explains the search query language used by Sourcebot, which is a derivative of the Zoekt query language with some minor differences. Search queries allow for combining multiple filters and expressions using logical operators, negations, and grouping. Here's how to craft queries effectively.

Reasoning: there are some differences between the prefixes we support vs. what Zoekt supports (e.g., revision:. See my other comment), so want to make sure it's clear to the reader that there isn't always a 1:1 between queries used in Sourcebot vs. queries used in regular Zoekt.


A query is made up of expressions. An **expression** can be:
- A negation (e.g., `-`),
- A field (e.g., `repo:`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: thoughts on calling this concept a "prefix"?

docs/query.md Outdated
### EBNF Summary

```ebnf
query = expression , { "or" , expression } ;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: does this encode the fact that expressions can also be separated by spaces?

Maybe we want something like this:

query = expression, { separator, expression } ;
seperator = " " | "or"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll omit this for now, then write a lezer grammar for zoekt query language and integrate it into the search bar.

docs/query.md Outdated
- **Text Fields**:
Text fields (`content:`, `repo:`, etc.) accept:
- Strings: `"my text"`
- Regular expressions: `/my.*regex/`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: everything is always evaluated as a regular expression, so wrapping things in "/" is not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack


## Syntax Overview

A query is made up of expressions. An **expression** can be:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably worth mentioning that all expressions are evaluated as regular expressions (except for certain exceptions like if a expression is wrapped in "").

| `type:` | `t:` | `filematch`, `filename`, `file`, or `repo` | Limits result types. | `type:filematch` |

---

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have a "Exact Matching" section that explains how the " " operator works to match exactly what you are looking for, instead of using regular expressions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

docs/query.md Outdated

2. **Exclude archived repositories and match a regex**:
```plaintext
archived:no regex:/error.*handler/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same nit on / not being needed for regular expressions. Also the regex: prefix is not needed since everything is matched as regex anyways

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need / or regex if you want to do a regex with spaces in it.

@KonradStanski
Copy link
Contributor Author

Please squash these if merging, the history is messed up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants