FR: Restrict regex to note body #1149

Open
tryonkus opened this issue Aug 25, 2024 · 14 comments
Labels
rule suggestion Suggestion to add or edit a rule

Comments

@tryonkus

tryonkus commented Aug 25, 2024

Is Your Feature Request Related to a Problem? Please Describe.

I’d like to be able to restrict regex rules to just the YAML header or the body of the note.

Describe the Solution You'd Like

Apropos of #1143, I was able to build a regex rule to change double hyphens to em dashes:

(?<!-{1,2})-{2}(?!-{1,2}) to —
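As a rough illustration of what this rule does, here is a minimal JavaScript sketch. The pattern is copied from above; the function name `toEmDash` is hypothetical, and this is not how Linter applies custom replacements internally.

```javascript
// Pattern from the issue: exactly two hyphens, not preceded or followed by another hyphen.
const DOUBLE_HYPHEN = /(?<!-{1,2})-{2}(?!-{1,2})/gm;

// Hypothetical helper: replace each isolated "--" with an em dash (U+2014).
function toEmDash(text) {
  return text.replace(DOUBLE_HYPHEN, "\u2014");
}

console.log(toEmDash("wait--what")); // the pair becomes a single em dash
console.log(toEmDash("---"));        // left alone: three hyphens
console.log(toEmDash("----"));       // left alone: four hyphens
```

Because the lookbehind only needs to see one hyphen to reject a match, `(?<!-)--(?!-)` should behave the same way.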

This works great, even in iOS because I’m using the most recent version. On a related note, Longform misinterprets horizontal rules as YAML delimiters and strips content between them. To prevent this, I change -{3,4} to ***, but that catches the YAML delimiters as well.
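To make the collision concrete, here is a small sketch that runs the -{3,4} replacement over a whole note. The sample note is invented for illustration.

```javascript
// Invented sample: YAML front matter followed by a body with one horizontal rule.
const note = "---\ntitle: Draft\n---\n\nScene one.\n\n---\n\nScene two.\n";

// The rule as described: 3 or 4 hyphens become three asterisks, applied to the whole file.
const naive = note.replace(/-{3,4}/gm, "***");

console.log(naive);
// All three "---" lines are rewritten, so the front matter delimiters are lost too.
```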

Is there a way, or could a feature be added, to restrict regex rules to the body of the note or to the YAML header? I can easily skip the opening YAML ---, but skipping the closing delimiter and the intervening YAML is much more challenging.
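One way the requested behavior could work, sketched in plain JavaScript: split off the front matter first, then run the replacement on the body only. The helper names `splitFrontMatter` and `replaceInBody` are hypothetical and are not part of Linter's API. The sketch assumes front matter starts on the first line and is closed by a line containing only ---.

```javascript
// Hypothetical helper: separate leading YAML front matter from the note body.
function splitFrontMatter(note) {
  // No "m" flag, so ^ anchors to the very start of the note.
  const match = /^---\r?\n(?:[\s\S]*?\r?\n)?---[ \t]*(?:\r?\n|$)/.exec(note);
  if (!match) return { frontMatter: "", body: note };
  return { frontMatter: match[0], body: note.slice(match[0].length) };
}

// Hypothetical helper: apply a regex replacement to the body and leave the YAML untouched.
function replaceInBody(note, pattern, replacement) {
  const { frontMatter, body } = splitFrontMatter(note);
  return frontMatter + body.replace(pattern, replacement);
}

const sample = "---\ntitle: Draft\n---\n\nScene one.\n\n---\n\nScene two.\n";
console.log(replaceInBody(sample, /^-{3,4}$/gm, "***"));
// Only the horizontal rule in the body is rewritten.
```

The same split would also allow the opposite restriction (YAML only) by running the replacement on `frontMatter` instead.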

Please include an example where applicable:

See above

Describe Alternatives You've Considered

I’m able to do this by manually selecting the note body and running a TextSoap cleaner, but then I can’t use Linter’s automation tools. I also can’t run TextSoap anywhere but macOS.

Thanks,

Ken Tryon

tryonkus added the "rule suggestion" label Aug 25, 2024
@pjkaufman
Collaborator

Hey @tryonkus, could you use a negative lookahead to prevent the modification of any two dashes followed by a third?

Adding (?!-) to the end of the pattern should be the negative lookahead that prevents a match when the two dashes are followed by another dash.

I will need to do some testing of the regex to verify that it is possible, but I think that should fix your issue.

If there is a reason a negative lookahead is not feasible, please let me know.
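For reference, a quick sketch comparing a lookahead-only pattern with the pattern from the original post, which uses both a lookbehind and a lookahead. The sample strings are invented. On a run of three or more dashes, the lookahead alone still matches the last two.

```javascript
// Lookahead only: rejects a pair that is followed by another dash.
const lookaheadOnly = /--(?!-)/g;
// Pattern from the original post: also rejects a pair that is preceded by a dash.
const both = /(?<!-{1,2})-{2}(?!-{1,2})/g;

const EM_DASH = "\u2014";
for (const s of ["wait--what", "---", "----"]) {
  console.log(
    JSON.stringify(s),
    "lookahead only:", JSON.stringify(s.replace(lookaheadOnly, EM_DASH)),
    "both:", JSON.stringify(s.replace(both, EM_DASH))
  );
}
```

With the lookahead alone, "---" becomes a single dash followed by an em dash, which is why the lookbehind is needed as well.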

@tryonkus
Author

tryonkus commented Aug 25, 2024 via email

@pjkaufman
Collaborator

I think I see what you are saying. If I am understanding correctly, you are trying to swap HRs to a format that Longform does not consider to be YAML delimiters. But that regex has to match 3 or 4 dashes, so it is also the one that matches the YAML delimiters.

@pjkaufman
Collaborator

For my own reference, the Longform plugin issue is this one: kevboh/longform#265.

@pjkaufman
Collaborator

@tryonkus, I think I would prefer to implement logic for setting the horizontal rule style rather than alter how custom replacement works. Would that be a viable alternative in your opinion?

@tryonkus
Author

That would definitely be a viable solution. I was trying to think of something that would be more generally useful, hence the option to restrict regex rules to YAML or the body. I don't have a ton of documents with horizontal rules to fix, so I can simply change my workflow to typing three asterisks instead of three hyphens, and that will take care of this particular issue. It would be nice if Obsidian allowed the user to define what to insert with a horizontal rule command (as they do with, for instance, comment styles), but I don't see that.

I need to look at your other notes, as I may be able to address this in Longform. There is an option to turn off YAML stripping, but I think that will lead to YAML headers for all my notes being included in the final manuscript. I actually want the YAML header for just the first note to be included, and I'll propose that as a feature request for Longform.

I've just been getting back into writing in Obsidian, and I'm experimenting with my workflow to make it fairly seamless and not require a lot of manual tweaking, hence all the questions and ideas.

@pjkaufman
Collaborator

Gotcha. I may make a PR to Longform to fix the issue with three or more dashes not at the start of a file being considered YAML. I think it is a small change, but I am not sure what unit tests they have set up, so I am not sure whether a fix there will be very convincing to the maintainer.

@tryonkus
Author

tryonkus commented Aug 25, 2024 via email

@tryonkus
Author

tryonkus commented Aug 25, 2024

There is some Obsidian weirdness happening too--if I don't put a blank line before and after a horizontal rule, it doesn't seem to recognize it as a paragraph. If I place a rule directly after a paragraph with no blank line, it formats the paragraph like a header (H1, I think). The text below will break Obsidian and Linter's paragraph spacing rules. If I insert a blank line before the first HR, that will at least fix the paragraph spacing in the text, but it doesn't fix the rules. This is pretty artificial, since I would never put a bunch of rules one after the other—it's just test text.

  • I have paragraph spacing set to exactly one blank line.
  • First regex is (?<!-{1,2})-{2}(?!-{1,2}) with options gm, replace with — (an em dash)
  • Second regex is \n\n-{3,4} with options gm, replace with \n\n***

This works for the HRs, so long as Obsidian and Linter recognize the paragraph break and add the blank line.
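The two rules above can be sketched as an ordered list applied in sequence. This is plain JavaScript for illustration, not Linter's custom replacement code; the `rules` array, `applyRules`, and the sample note are all hypothetical.

```javascript
// The two rules from the list above, applied in order.
const rules = [
  { find: /(?<!-{1,2})-{2}(?!-{1,2})/gm, replace: "\u2014" }, // "--" to em dash
  { find: /\n\n-{3,4}/gm, replace: "\n\n***" },               // HR after a blank line to ***
];

// Hypothetical helper: run every rule over the text, one after another.
function applyRules(text) {
  return rules.reduce((acc, rule) => acc.replace(rule.find, rule.replace), text);
}

// Invented sample note with front matter, a double hyphen, and one horizontal rule.
const sample = "---\ntitle: Draft\n---\n\nOne--two.\n\n---\n\nThree.\n";
console.log(applyRules(sample));
```

In this sample the YAML delimiters survive because neither is preceded by a blank line. A horizontal rule that directly follows a paragraph with no blank line would not be matched either, which is consistent with the behavior described above.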

Note: I don't know how to wrap text in a code block, so this needs to be copied into Obsidian to see the effect.

**Coming up for air**

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
---
----
--
--

@tryonkus
Author

Sorry for the flurry . . .

If this is my text:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
---
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.

Obsidian treats the trailing --- as formatting on the preceding paragraph. Linting inserts a blank line after the HR but not before, which I'm guessing is due to how Obsidian is defining paragraph breaks. Gotta look at the Markdown spec—I don't know if a trailing --- is some kind of block formatting or if this is an Obsidian bug.

@tryonkus
Author

tryonkus commented Aug 25, 2024

Found it: a line of --- directly after a paragraph is an alternative syntax for H2. I always insert a blank line when adding an HR, and this explains why it goes all wonky if I don't.

https://www.markdownguide.org/basic-syntax/#alternate-syntax
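A simplified sketch of the distinction: a dash line counts as a heading underline when the line before it has text, and as a horizontal rule when the line before it is blank. This is not a full Markdown parser. It ignores front matter, code fences, lists, and consecutive rule lines, and it only looks at lines of three or more dashes. The function name `classifyDashLines` is hypothetical.

```javascript
// Hypothetical helper: label each line of 3+ dashes by what precedes it.
function classifyDashLines(markdown) {
  const lines = markdown.split(/\r?\n/);
  const result = [];
  lines.forEach((line, i) => {
    if (!/^-{3,}\s*$/.test(line)) return; // not a dash-only line
    const previous = i > 0 ? lines[i - 1] : "";
    result.push({
      line: i + 1, // 1-based line number
      kind: previous.trim() === "" ? "horizontal rule" : "heading underline",
    });
  });
  return result;
}

console.log(classifyDashLines("Some paragraph text.\n---\n\n---\n"));
// The first dash line follows text, the second follows a blank line.
```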

So long as I add a blank line after paragraphs and before a HR, Linter correctly adds a blank line after a HR and before a paragraph. If I have a line of styled text, e.g. **This is a small paragraph heading** before a paragraph with no blank line between, then Linter/Obsidian does not insert a blank line between the "heading" and the paragraph. Dunno if that's your logic or Obsidian's.

@pjkaufman
Collaborator

I was actually about to mention that syntax. It is considered to be a header, but it is something that the Linter currently does not play well with.

@tryonkus
Author

tryonkus commented Aug 25, 2024

Yeah—it sure looked like an H1 or H2, which is why I checked the spec. I can understand why it was added (that’s a common way to mark a heading in plain text), but I’d never encountered it before in all the time I’ve been using Markdown. Must be a MM or GFM extension, maybe CommonMark—I think those are the main flavors.

Do you know why a line of bolded text isn’t treated as a separate paragraph? It seems intentional, but I haven’t found any documentation.

@tryonkus
Author

tryonkus commented Sep 1, 2024

@pjkaufman I looked for other ways to run a regex search and replace and found a plugin called just that, but it only ran ad hoc rules, meaning, you had to enter the rule every time. Then I tried Regex Pipeline, which is really just a way to save multiple regex rules in ruleset files, then run them as needed. It has slightly wonky syntax, and including newlines in the replacement text requires a workaround (they have to be part of a capturing group), but I was able to get it to:

  • Remove highlights from text by zapping all occurrences of ==
  • Remove all text between arbitrary markers (I used <#>), so that I can easily truncate unfinished sections at the end of a Longform manuscript. I created a couple of Longform scenes with just that text, one of which lives at the end of the document and the other after the last finished scene.

Given that I can put multiple regex substitutions into one ruleset, I'll be able now to do any manuscript post-processing that I don't want to include as Linter rules in one command. Linter is cleaner and easier to use, but Regex Pipeline lets me save rules to catch the oddball stuff that I don't want to do every time I lint. It hasn't been updated in a couple of years, but it's still working . . . .
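For anyone curious what those two substitutions amount to, here is a sketch in plain JavaScript. It does not use Regex Pipeline's ruleset syntax, and the function names are hypothetical. The <#> marker is the one mentioned above.

```javascript
// Hypothetical helper: remove highlight markup by deleting every "==".
function stripHighlights(text) {
  return text.replace(/==/g, "");
}

// Hypothetical helper: delete everything from one marker through the next, markers included.
function removeBetweenMarkers(text, marker = "<#>") {
  // Escape the marker so characters with special meaning in a regex are matched literally.
  const escaped = marker.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // [\s\S]*? matches across newlines, lazily, so each marker pair is handled separately.
  return text.replace(new RegExp(`${escaped}[\\s\\S]*?${escaped}`, "g"), "");
}

console.log(stripHighlights("Some ==highlighted== text."));
console.log(removeBetweenMarkers("Finished scene.\n<#>\nUnfinished draft.\n<#>\nThe end."));
```

Running both in one pass is just a matter of calling one function on the other's output.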

FWIW, Longform lets you create user defined steps in JS, but I haven't figured out the syntax. You have to build a JS module, and I only understand the basics of JS—I can modify an existing script but not build my own.
