Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce consistent formatting for Markdown docs #231

Open
lambda-fairy opened this issue Oct 11, 2020 · 12 comments
Open

Enforce consistent formatting for Markdown docs #231

lambda-fairy opened this issue Oct 11, 2020 · 12 comments

Comments

@lambda-fairy
Copy link
Owner

lambda-fairy commented Oct 11, 2020

https://prettier.io/docs/en/install.html

npx prettier --check .

or https://pypi.org/project/mdformat/

@hukkin
Copy link

hukkin commented Jan 9, 2021

Hi @lambda-fairy it's mdformat author here!

It seems you have a bit of rust code in the Markdown docs. FYI, if you consider mdformat, it's very simple to create a plugin to run cargo fmt on the rust code when running mdformat (and enforce in CI with mdformat --check).

I can do this for you in like 30 mins if you find it useful (it's basically a copy paste of the Go plugin)

@lambda-fairy
Copy link
Owner Author

Hi @hukkinj1!

No rush, but that would be great! I think mdformat is my leading option at the moment, as the opinionated approach + semantic line breaks matches what I'm looking for.

We probably want to call rustfmt, not cargo fmt, as the latter will format the entire project (instead of a single file / stdin). See rustfmt docs.

@lambda-fairy lambda-fairy changed the title Consider using Prettier (or similar) to format Markdown docs Enforce consistent formatting for Markdown docs Jan 14, 2021
@hukkin
Copy link

hukkin commented Jan 14, 2021

Hey, actually already did the plugin 😄
You can try

pip install mdformat-rustfmt

I did some comparison of prettier and mdformat output using this repository as input. The intention was for me to learn if there's ways in which mdformat could improve. You'll find the learnings here executablebooks/mdformat#110. Maybe this makes choosing the right tool easier for you?

Unfortunately points 5 and 6 from executablebooks/mdformat#110 hit this repo quite hard, and CHANGELOG.md is uglified quite a bit. You may want to consider excluding that file or choosing prettier until those issues are resolved.

EDIT: Point 5 is now fixed in mdformat 0.5.5. So the only ugly bit remaining is point 6 (i.e. square bracket escapes).

@lambda-fairy
Copy link
Owner Author

Thanks 😄

The changelog isn't a priority for me, so I'm happy with excluding it for now.

My greater concern (which I realized just now) is with the "hidden setup code" feature. I want to run the code snippets with doctest (#257), but most of the snippets need extra setup code (e.g. imports) that I don't want to show in the rendered documentation. The standard solution in Rust is to prefix the setup lines with # – see mdBook and rustdoc. But this line-by-line approach doesn't interact well with code formatters.

I looked into how rustfmt handles code snippets inside doc comments. (Rust doc comments are Markdown, so whatever approach is taken there should transfer to mdformat.) It looks like they comment out the setup code, format the snippet, then uncomment it back. That should be easy enough to reimplement ourselves.

@hukkin
Copy link

hukkin commented Jan 15, 2021

If I understood correctly, I think something like this would do what you describe: https://github.com/hukkinj1/mdformat-rustfmt/pull/1/files

This doesn't seem to change formatting of https://github.com/lambda-fairy/maud docs however. Maybe you'd be able to provide a rust snippet for a test case (input and expected output)?

@lambda-fairy
Copy link
Owner Author

Yep, that's what I had in mind, thanks!

Maybe you'd be able to provide a rust snippet for a test case (input and expected output)?

Sure! I think Maud actually isn't such a good test case, as most of the examples are macro calls which rustfmt ignores 😛

Here's a test input:

# let x=a();b();c();
let y=d();e();f();

Output:

# let x=a();b();c();
let y = d();
e();
f();

I also found another potential blocker. By default, rustfmt parses its input in item scope, i.e. as a complete module. But most documentation examples are in statement scope, i.e. only make sense inside of a function, so are rejected by rustfmt.

For example, this doesn't work:

$ echo 'let x = 1;' | rustfmt
error: expected item, found keyword `let`
 --> <stdin>:1:1
  |
1 | let x = 1;
  | ^^^ expected item

But this works:

$ echo 'fn main() { let x = 1; }' | rustfmt
fn main() {
    let x = 1;
}

This is by design, as mentioned in the readme:

  • Any fragment of a program (i.e., stability guarantees only apply to whole programs, even where fragments of a program can be formatted today).

I'm not sure how best to solve this. We might need to propose a change to rustfmt...

@hukkin
Copy link

hukkin commented Jan 19, 2021

Here's a test input: ...

Thanks this works perfectly (when wrapped in a fn main() {}), added the test.

I also found another potential blocker.... I'm not sure how best to solve this. We might need to propose a change to rustfmt

The first thing that comes to my mind is a similar hack that we did with the setup code. That is, do something like:

  1. Try a normal rustfmt run. If exit code is zero (success), return the output, if not continue to step 1.
  2. Wrap the text in a fn main() { }.
  3. Run rustfmt. If exit code is non-zero (failure), raise an error, we couldn't find a way to rustfmt successfully. If exit code is zero (success), continue to next step.
  4. Remove fn main() { } wrapping of the formatted text.
  5. Remove one level of indentation.
  6. Voila, return the string.

I haven't tested this. What do you think, could this work? I'm afraid this might be destructive in some weird cases. 😄 Can you think of any?

@lambda-fairy
Copy link
Owner Author

Yeah, I considered that as well, but unfortunately you're right about the weird cases...

  • rustfmt enforces line length. Since the contents of fn main() { } are indented, this means that the code will end up narrower than expected.
  • Indent size is configurable, so you'll have to either parse the rustfmt config or use heuristics to figure that out.
  • Raw strings in Rust are just like Python, in that internal whitespace is preserved. Trimming spaces inside such a string might change the meaning of the code. See playground.

So while this wrapping could work in a pinch, I think long term we'll need changes in rustfmt to do this properly. I'm not familiar with the rustfmt roadmap, though, so I don't know what such a change will look like 😅

@hukkin
Copy link

hukkin commented Jan 20, 2021

Ah I see, damn.

The first two (indentation related) problems might not be that bad. Even if completely ignored, it just means that the target width is 4 chars narrower. If the default is something like 80 then the plugin would use 76 for statement scope code blocks. That is IMHO not very significant at all, and the plugin (and the ability to format statement scope code) might still be net positive like 99.9% of the time.

The raw string issue is the kind of case that I was more afraid of. I'd hate to be changing abstract syntax in any way.

@ezpuzz
Copy link

ezpuzz commented Jan 21, 2021

I like the effort you're making here and hope you don't mind I've made a PR on mdformat-rustfmt hukkin/mdformat-rustfmt#3 to get some stuff working needed to use it on the amethyst docs. I may tackle wrapping the code in fn main() {} when it fails if that's cool with you!

@ezpuzz
Copy link

ezpuzz commented Jan 21, 2021

so I think just increasing max_width by the value of tab_spaces should result in expected formatting right?

@hukkin
Copy link

hukkin commented Jan 21, 2021

Hi @ezpuzz and thanks for the PR! Would be great if @lambda-fairy had the time to have a quick look at it as I'm not well-versed in many things Rust. Already wrote down some thoughts there though.

I may tackle wrapping the code in fn main() {} when it fails if that's cool with you!

Sounds great!

so I think just increasing max_width by the value of tab_spaces should result in expected formatting right?

IIRC rustfmt config is slightly more complex than many other formatters. There might be more than one value that you have to increase by tab_spaces. A simple approach could be to use the --config flag and override all config file configs in the CLI. Just use default values and add the +4 to any max width type configs. A bit more complex (not sure at all if better) is to somehow read Rust configuration in the environment, read tab_spaces and only manipulate the values where tab_spaces needs to be added.

The indentation problem is the lesser concern to me though. I'm more worried about the raw string issue raised by @lambda-fairy . I think a fairly simple way to tackle that is to make a regex to detect a multiline raw string, e.g.

import re
RE_RUST_MULTILINE_RAW_STRING = re.compile(r'r#*".*\n.*"', flags=re.DOTALL)

and if the regex finds any matches, just raise an exception (in which case mdformat doesn't apply formatting to the code block). In the odd case this happens, I feel its better to not do anything than to apply potentially AST breaking changes.

Edit: Btw I made an issue about statement scope code formatting. We might want to continue there to not completely hijack @lambda-fairy's issue 😄

Edit 2: rustfmt seems to have rustfmt --print-config current. That could be used to read the config, and then rustfmt --config=... to override.

lambda-fairy added a commit that referenced this issue Jan 22, 2021
mdformat will preserve semantic line breaks, so we may as well commit to
using them.

cc #231
lambda-fairy added a commit that referenced this issue Jan 22, 2021
mdformat will preserve semantic line breaks, so we may as well commit to
using them.

cc #231
lambda-fairy added a commit that referenced this issue Jan 26, 2021
I don't think they add much, and they conflict with mdformat which
escapes square brackets (hukkin/mdformat#112).

cc #231
lambda-fairy added a commit that referenced this issue Jan 26, 2021
I don't think they add much, and they conflict with mdformat which
escapes square brackets (hukkin/mdformat#112).

cc #231
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants