Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when reading logseq-specific markdown files #156

Closed
mrene opened this issue Feb 8, 2023 · 10 comments
Closed

Crash when reading logseq-specific markdown files #156

mrene opened this issue Feb 8, 2023 · 10 comments
Labels
bug Something isn't working

Comments

@mrene
Copy link
Contributor

mrene commented Feb 8, 2023

The following exception occurs when trying to parse logseq markdown files that have additional properties.

Unhandled exception: System.AggregateException: One or more errors occurred. (Exception has been thrown by the target of an invocation.)
 ---> System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.ArgumentOutOfRangeException: Index and length must refer to a location within the string. (Parameter 'length')
   at System.String.Substring(Int32 startIndex, Int32 length)
   at Marksman.Parser.Markdown.scrapeText(Text text) in /build/source/Marksman/Parser.fs:line 172
   at Marksman.Parser.parseText(Text text) in /build/source/Marksman/Parser.fs:line 392
   at Marksman.Workspace.DocModule.mk(PathUri path, RootPath rootPath, FSharpOption`1 version, Text text) in /build/source/Marksman/Workspace.fs:line 45

In logseq, blocks (bullet points) and markdown files can have properties, which are structured as identifier:: contents. It will save the last filter settings and block states (collapsed, expanded) into these properties.

In this case, a file named inbox.md was causing the crash and contained the following:

filters:: {"work" true}

-

Note that it crashes even if the file isn't the file being browsed as it's trying to index the full directory.

@artempyanykh
Copy link
Owner

Thanks for reporting this @mrene!
Unfortunately, so far I couldn't reproduce locally with the provided text.

Could you tell me what is the output of marksman --version and what text editor you use it with? Could you also check the file for any special symbols (e.g. a non-breaking space)?

The stack trace points to an error in header parsing, but I'm not sure what in the contents of the file might cause this.

@artempyanykh artempyanykh added the needs-further-input Further information is required from the reporter label Feb 8, 2023
@mrene
Copy link
Contributor Author

mrene commented Feb 8, 2023

Thanks for looking at this so quickly!

I'm using neovim on NIxOS, I tried the latest commit on the main branch (ecef5e8) and it would crash as well.

Here's the file that reproduces the crash, maybe it's sensitive to some non-visible whitespace content. Starting marksman in an empty dir with only that file (and .git so it finds a root) crashes reliably

@artempyanykh
Copy link
Owner

@mrene thank you! I could reproduce the issue.

@artempyanykh artempyanykh added bug Something isn't working and removed needs-further-input Further information is required from the reporter labels Feb 11, 2023
@artempyanykh
Copy link
Owner

artempyanykh commented Feb 11, 2023

The issue is in one of Marksman's dependencies. I submitted a fix, let's see if/when it gets merged.

@mrene as a workaround just add a newline after the last dash -. Note that the following snippet

-
-

is treated as a 2nd level heading with the text - according to the markdown spec (see setext headings).

If these logseq properties are always at the top of the file, they could be treated similarly to a YAML front-matter, but I'd need someone who uses logseq to confirm 😉

@mrene
Copy link
Contributor Author

mrene commented Feb 13, 2023

Thanks for the fix! These properties can be in any block (denoted by - ) so I don't think it fits the frontmatter concept. They use these to remember which block is collapsed and which one is expanded (amongst other things, these are user-customizable and you can use them to query for things)

@artempyanykh
Copy link
Owner

I see! @mrene could you provide a more complete example of a logseq markdown document with a couple of these blocks? I think adding an option to disable setext-headings would be an easy way to avoid problems with logseq markdown, but I'd need to see a more complete example first.

@mrene
Copy link
Contributor Author

mrene commented Feb 18, 2023

Their own docs are written using this format, here's a few examples

@artempyanykh
Copy link
Owner

@mrene thank you! So, these properties are really just lines of the form bar:: baz. They are not delimited with 2 single dash lines, right?

-
-

If so, I think we should be good.

@mrene
Copy link
Contributor Author

mrene commented Feb 19, 2023

Oh no the two single dash lines just happen by accident when you hit enter in an empty page (they all start with a single one), props are just key:: value

@artempyanykh
Copy link
Owner

Cool! Than we're good here.

artempyanykh added a commit that referenced this issue Oct 27, 2023
This resolves parser crashes from #156 and #235
artempyanykh added a commit that referenced this issue Oct 27, 2023
This resolves parser crashes from #156 and #235
artempyanykh added a commit that referenced this issue Oct 27, 2023
This resolves parser crashes from #156 and #235
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants