Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search in pages #36

Open
makew0rld opened this issue Jul 2, 2020 · 19 comments
Open

Search in pages #36

makew0rld opened this issue Jul 2, 2020 · 19 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request help wanted Extra attention is needed roadmap A user-facing feature on the roadmap. UI Deals with the visual user interface

Comments

@makew0rld
Copy link
Owner

makew0rld commented Jul 2, 2020

Ctrl-F should bring up a bottom bar prompt to enter a search term. On Enter a ["search"] region will be prepended to any bit of text that matches the search, with a [""] after it. Link selecting will have to be disabled, maybe with some global per-tab mode setting that switches between link, off, and search. Tab and Backtab can be used to navigate through all the found text fragments. Esc should switch the global per-tab mode back to off, and maybe use a regex to remove all the region open and close tags for the search.

@makew0rld makew0rld added enhancement New feature or request UI Deals with the visual user interface roadmap A user-facing feature on the roadmap. labels Jul 2, 2020
@makew0rld makew0rld added this to the v1.2.0 milestone Jul 2, 2020
@makew0rld
Copy link
Owner Author

makew0rld commented Jul 2, 2020

The searching for the bit of text should ignore any style or region tags, so that may require a regex as well. This is a example where having access to the raw content, as discussed in #33, would be useful. It might be worth waiting for that before implementing this.

If the raw content is available, maybe a search term could be part of the render process, as an arg passed to RenderGemini. So it would just be rerendered.

@makew0rld makew0rld modified the milestone: v1.3.0 Jul 2, 2020
@makew0rld
Copy link
Owner Author

makew0rld commented Jul 4, 2020

Having to add a per-tab mode setting is last straw of the code being somewhat disorganized. Using a tab struct as mentioned in the NOTES.md would be helpful here.

@makew0rld
Copy link
Owner Author

The tab struct work is in progress on the tab-struct branch.

@makew0rld
Copy link
Owner Author

That work was completed.

@makew0rld makew0rld modified the milestones: v1.3.0, v1.4.0 Jul 10, 2020
@makew0rld makew0rld removed this from the v1.4.0 milestone Jul 28, 2020
@lachrimae
Copy link

I'm investigating how to implement search in pages at the moment. Most of the difficulty seems to me to lie in the fact that it's difficult to reason about the state of the application in the relevant blocks of code. You can see, for instance, with this tool, that the number of if/else statements (they call it cyclomatic complexity) of the function Init() in amfora/display/display.go is very high. I have some ideas to simplify this that I want to run by you before I put a lot of work in.

The point of this refactor is to make the organization of the code correspond to the state of the application. Code that should run under different application state should be in separate functions. When someone presses Tab, we want the resulting behaviour to depend on the application state, specifically, whether the page is in search mode or in normal mode. Right now, application state is managed inside Init() by some deeply-nested if/else and case statements. I want to test up-front for the mode of the tab and the page, and call separate functions according to the results of those tests. (In this block of code, the tab state is tracked with the variable tabs[curTab].mode, and the page state is tracked with tabs[curTab].page.Mode.)

In particular, I propose to

  1. name the anonymous function provided as argument to bottomBar.SetDoneFunc and move it outside of Init();
  2. refactor the anonymous function provided as argument to App.SetInputCapture. I want to turn everything after line 240 into easy-to-read blocks like those on lines 226-240. The lines 241 and onwards are logically complex and difficult to read. The logical tests primarily rely on the values of three variables: event.Key(), tabs[curTab].page.Mode and tabs[curTab].mode. I think what would be easiest to maintain would be to create 6 separate functions, depending on the various values of tabs[curTab].page.Mode (which takes 2 values) and tabs[curTab].mode (which takes 3 values). These functions would contain case statements depending on event.Key() only. I want to experiment a bit with this, because it might make sense to stick either tabs[curTab].page.Mode or tabs[curTab].mode inside these functions as well, so as to only have 2 or 3 of them.

After doing this refactor, it'll be easier to keep track of application state, with the bonus that display.go will be more organized and readable. At that point, writing the search functionality will involve much less cognitive overhead.

How does that sound to you?

@makew0rld
Copy link
Owner Author

Most of my difficulty with implementing this was due to the fact that cview color tags would need to be used to highlight the search results, and they would need to changed every time the user asks for the next match. These changes would either require a complete re-render of the page content, or some sort of error-prone substitution regex.

Furthermore, inserting the search term color tags but not matching any text already in the tags seemed like a difficult problem to me, that could potentially be solved with a complicated regex.

Here is a simplified example of what the rendered text looks like before it's outputted to the screen:

[red]# Heading[-]

Red blue green, this is text.

[red[] <- that is not a color tag because it is escaped.

Now, if the user searches for the string red, you have to insert color tags like [:blue:] and [:-:] around all the search terms (to change the background to blue), and potentially insert a different color around the term that is currently "selected". But you have to make sure not to attempt to highlight the actual color tag in the first line, as that will cause errors.

With all that said, I see how that part of the code is complicated and large. If a refactor helps you, go for it! Your suggestions seem good, I would have to see them worked out before I can commit to approving anything of course. Thank you for your interest!

@lachrimae
Copy link

lachrimae commented Aug 15, 2020

Thank you for starting the project! And I appreciate you laying out the situation for me like this.

I'll go ahead with the refactor because it'll help me put in the UI work, but I understand that you have to wait and see to approve or not.

I'll take a look at the regex stuff. I suspect there'll be a problem removing the cview tags using just regex, since regex is technically just for regular languages, whereas it's my understanding that languages involving open/close brackets (or open/close tags...) are never regular. This problem may require a proper parser.

@makew0rld
Copy link
Owner Author

I'd be surprised if modern regex can't do this, but we'll see. Really hoping to not need a parser. Let me know if you have any questions!

@lachrimae
Copy link

lachrimae commented Aug 18, 2020

I've been looking into the regex approach, and I have mixed news.
I've worked out a regex expression which I'm fairly confident would match on the cview tags themselves. With a modern regex library, the problem is mostly solved once we can define the cview tags.

\[(#[[:xdigit:]]{6}|aqua|black|blue|fuchsia|gray|green|lime|maroon|navy|olive|purple|red|silver|teal|white|yellow)?(:(#[[:xdigit:]]{6}|aqua|black|blue|fuchsia|gray|green|lime|maroon|navy|olive|purple|red|silver|teal|white|yellow)?(:(-|[lbdiru]+)?)?)?\]

However, the standard library implementation of regex for go does not support lookabouts. If you read between the lines of the first couple of paragraphs here, it sounds like they left those out in order to have an O(n) complexity guarantee. The link they have there suggests that many of the popular implementations are exponential in bad cases. In any case, I don't think I can solve this with just the standard library regex.

So I'm trying to find alternatives to the standard regex library. Does this PCRE library look good to you?. I also found a parser generator goyacc which generates linear-time parsers for Go, in case it looks like we have to go down that road.

@makew0rld
Copy link
Owner Author

Thanks for looking into this.

I think trying to enumerate all the possible content of the tags is the wrong way to go. I just figured out a regex that seems to capture all valid tags and exclude escaped ones. It seems to pass all the tests listed here.

\[.*[^\[]\]

You can try it out here.

Am I missing something?

@lachrimae
Copy link

You're right that your expression matches valid tags and excludes escaped ones. I was trying to figure out the structure of valid tags (with strictly valid colour settings) with the idea that the final regex will likely look very different from that one. Your regex may well be superior for the final code.

I take it that what want to do is to be able to input <Ctrl+F>red<Return>, then to call something like regexp.ReplaceAllString to turn "red" into "[#highlight colour]red[#nonhighlight colour]", except when that "red" occurs within a cview tag. Then we can update the page's Content with the new cview tags. (Let me know if I'm misunderstanding things here!) Typically I would specify this "except when that 'red' occurs..." behaviour with lookabout expressions like (?<!...). I suspect it's impossible in a regex dialect with no lookabouts. I'd be glad to be mistaken, though.

With your expression, we would be able to transform cview tags into other kinds of things. But without lookabouts, it seems to me that we can't do the search+replace we want.

We could do something wacky like

  1. check if the query string might be found inside a tag;
  2. if so, use regex to find the indices of every match, but if not just use regex search+replace and terminate;
  3. assuming we didn't terminate, manually iterate over our match indices, testing if the string is inside a tag, and replacing it if not.

But that seems like reinventing the wheel...

@makew0rld
Copy link
Owner Author

Ah I see now, the lookabouts allow for tag detection while also searching for a substring. I am reluctant to bring in another regex engine or write a full parser. And so, I think your wacky solution might be the best one. Let me restate the steps below to ensure we're on the same page:

  1. Get all the indices of cview tags using FindAllStringIndex and the regex I presented above.
  2. Use the same function to find all the indices of where the user's query appears
  3. Iterate through all the matches of user's query, check if the indices match any of the cview tag indices, and skip over this match if they do. Otherwise add highlighting, etc

Step 1 could be cached whenever a page is loaded, for performance purposes.

Does that align with what you wrote above, am I understanding correctly?


So I'm trying to find alternatives to the standard regex library. Does this PCRE library look good to you?

Unfortunately it does not, as it is not pure Go. I am using only Go deps for this project, as it keeps cross-compiling simple.

@lachrimae
Copy link

Yes, I think we're on the same page about the "wacky solution" 😄. I'll start working in that direction.

There's one complication for step 3, though, assuming I understand the desired behaviour correctly. Supposing [HEADER], [BODY] and [HIGHLIGHT] are well-defined cview tags, I imagine we'd want the following transformation:

[HEADER]Lorem ipsum dolor sit amet[BODY]\nIn the beginning, ...

<Ctrl+F>dolor<Return>

[HEADER]Lorem ipsum [HIGHLIGHT]dolor[HEADER] sit amet[BODY]\nIn the beginning, ...

which means we'll have to backtrack to see what the most recent tag was. Since their indices were cached from step 1, it's not a big problem, though.

Avoiding an external regex engine and keeping everything in Go makes sense to me. Regarding the parser, goyacc's example is 175 SLOC, and I'd be happy to write something like that. But I recognize it's a low power-to-weight ratio, and you may have considerations in terms of the codebase and maintenance I'm not thinking about. As I said, I'll work towards the wacky solution unless it proves untenable for some reason.

@makew0rld
Copy link
Owner Author

Good catch, yes that will be required.

Since their indices were cached from step 1, it's not a big problem, though.

Yep, should just be able to copy those bytes.

But I recognize it's a low power-to-weight ratio, and you may have considerations in terms of the codebase and maintenance I'm not thinking about. As I said, I'll work towards the wacky solution unless it proves untenable for some reason.

Exactly, it seems like a lot of code, which isn't quite worth it to me. Thanks for starting your work, feel free to give me updates or ask questions here!

@makew0rld makew0rld added the help wanted Extra attention is needed label Sep 15, 2020
@makew0rld
Copy link
Owner Author

@lachrimae Have you been able to make any progress on this? No worries if not.

@lachrimae
Copy link

Hey @makeworld-the-better-one I have made some progress. I suddenly had more paperwork to do so completing this feature fell by the wayside for me. Some of that is behind me now so I'm going to get back on the wagon.

@makew0rld
Copy link
Owner Author

All good, happy to hear that! No worries if you don't complete it either, I'd just like to know.

@makew0rld
Copy link
Owner Author

makew0rld commented Nov 17, 2020

@lachrimae Are you still planning on adding this? No worries if not.

Edit on 2020-12-09: I've removed you from the assignees for this issue, and will accept PRs from others on this now. Would be happy to hear back from you at any point, though. Thanks for hashing stuff out with me in thread, it was valuable.

@makew0rld makew0rld added the documentation Improvements or additions to documentation label Dec 15, 2020
@makew0rld
Copy link
Owner Author

makew0rld commented Dec 16, 2020

Note: Vim/less keybindings should also be added. Forward slash to start query, n (forward) and Shift-N (backward) for iterating through results, Esc-u to stop highlighting, etc.

@makew0rld makew0rld added this to the v1.11.0 milestone Dec 8, 2021
@makew0rld makew0rld removed this from the v1.11.0 milestone Feb 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request help wanted Extra attention is needed roadmap A user-facing feature on the roadmap. UI Deals with the visual user interface
Projects
None yet
Development

No branches or pull requests

2 participants