Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept additional user-defined syntax classes in fenced code blocks #110800

Merged
merged 13 commits into from
Sep 16, 2023

Conversation

GuillaumeGomez
Copy link
Member

@GuillaumeGomez GuillaumeGomez commented Apr 25, 2023

Part of #79483.

This is a re-opening of #79454 after a big update/cleanup. I also converted the syntax to pandoc as suggested by @notriddle: the idea is to be as compatible as possible with the existing instead of having our own syntax.

Motivation

From the original issue: #78917

The technique used by inline-c-rs can be ported to other languages. It's just super fun to see C code inside Rust documentation that is also tested by cargo doc. I'm sure this technique can be used by other languages in the future.

Having custom CSS classes for syntax highlighting will allow tools like highlight.js to be used in order to provide highlighting for languages other than Rust while not increasing technical burden on rustdoc.

What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

  • Allow to disable generation of language-* CSS classes with the custom attribute.
  • Add your own CSS classes to a code block so that you can use other tools to highlight them.

The custom attribute

Let's start with the new custom attribute: it will disable the generation of the language-* CSS class on the generated HTML code block. For example:

/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```

The generated HTML code block will not have class="language-c" because the custom attribute has been set. The custom attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like highlight.js or prism.js), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic language-* CSS class generation, you need to use the custom attribute as well.

This allow users to write the following:

/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}

This will notably produce the following HTML:

<pre class="language-c">
int main(void) {
    return 0;
}</pre>

Instead of:

<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>

To be noted, we could have written {.language-c} to achieve the same result. . and class= have the same effect.

One last syntax point: content between parens ((like this)) is now considered as comment and is not taken into account at all.

In addition to this, I added an unknown field into LangString (the parsed code block "attribute") because of cases like this:

/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}

Without this unknown field, it would generate in the DOM: <pre class="language-class:language-c language-c">, which is quite bad. So instead, it now stores all unknown tags into the unknown field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate <pre class="language-c">. I added tests to cover this.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's Pandoc's syntax. Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from this comment).

Raised concerns

It's not obvious when the language-* attribute generation will be added or not.

It is added by default. If you want to disable it, you will need to use the custom attribute.

Why not using HTML in markdown directly then?

Code examples in most languages are likely to contain <, >, & and " characters. These characters require escaping when written inside the <pre> element. Using the ``` code blocks allows rustdoc to take care of escaping, which means doc authors can paste code samples directly without manually converting them to HTML.

cc @poliorcetics
r? @notriddle

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Apr 25, 2023
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@GuillaumeGomez
Copy link
Member Author

Fixed the feature version errors.

@notriddle
Copy link
Contributor

notriddle commented Apr 25, 2023

I'm surprised that it's not aligning the syntax more closely with RFC 3311.

Last time, on that RFC, I did a Babelmark run to try to design a syntax that would produce reasonable results in most Markdown parsers. The important thing is that this syntax should gracefully degrade in mdBook, docs.rs, crates.io, GitHub, and other widely-used Markdown renderers, so that people can use #[doc=include!("README.md")] and easily write documentation that works in both Rustdoc and in whatever other renderer they're using.

p.s. I still think we should align Rustdoc's handling of code strings as closely as possible with Pandoc's. Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes.

Copy link
Contributor

@notriddle notriddle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Babelmark run for the proposed syntax shows that this syntax behaves badly in other markdown parsers.

@GuillaumeGomez
Copy link
Member Author

GuillaumeGomez commented Apr 25, 2023

I'm surprised that it's not aligning the syntax more closely with RFC 3311.

I wasn't aware of it. Using = sounds good to me as well.

About following pandoc, it depends on one big thing: do we want to allow code-blocks to have user-defined IDs? Also, as you mentioned, it's a bit clunkier. class= is easier to read and doesn't require to implement a new parser for it. And finally, do we want it to be extensible?

EDIT: In any case, both solutions sound good to me. But I really find the pandoc one ugly. ^^'

EDIT2: Thinking about it some more, using an existing syntax would actually be wiser so I think going for the pandoc one is the right move here. I'll update the PR for it and specifically reject id for the time being.

Note for myself: should only allow for class=something and .something. Any other attribute will be ignored.

@notriddle
Copy link
Contributor

And finally, do we want it to be extensible?

We've already got two features that want key-value pairs here (RFC 3311, and this one). It seems pretty arbitrary to think there won't be more.

In any case, both solutions sound good to me. But I really find the pandoc one ugly.

I know their documentation always shows key=value pairs with quoted values, but you don't actually need to quote them. Here's an example where rust {class=test} does what you'd expect.

@GuillaumeGomez
Copy link
Member Author

I switched to the pandoc syntax as suggested and created a new parser for the markdown codeblock attributes. Any attribute which is not a class (so either . or class=) will emit a warning inside "attribute block" (ie {}).

// This will list the wrong items to make them more easily searchable.
// To ensure the most correct hits, it adds back the 'class:' that was stripped.
&format!(
"found these custom classes: class:{}",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs updated.

src/librustdoc/html/markdown.rs Show resolved Hide resolved
src/doc/rustdoc/src/unstable-features.md Show resolved Hide resolved
@GuillaumeGomez
Copy link
Member Author

GuillaumeGomez commented Apr 27, 2023

I added the suggested tests, I updated the documentation and I added the support for double quotes (which is much more permissive than pandoc's).

@notriddle
Copy link
Contributor

I'd like to create an eBNF to describe this grammar.

There's a few aspects of it, like the way quotes and dots are handled, that make writing one difficult. Would you be okay with making this syntax a lot stricter in order to make it easier to write a spec?

@GuillaumeGomez
Copy link
Member Author

I'm open to it. The current handling could have an eBNF as well I think. But you seem like you have an idea for a syntax so don't hesitate to suggest an eBNF. :)

@notriddle
Copy link
Contributor

@GuillaumeGomez

Alright, I've put together an eBNF that seems to "fit the spirit of this change," by passing many of the test cases and generally acting "reasonable." It, and a combine model of it, can be found at https://gitlab.com/notriddle/fuzz-fenced-code-blocks.

@notriddle
Copy link
Contributor

Some noticeable differences between the two of them:

  • It doesn't allow opening quotes in the middle of tokens. {key=v"alue"} is invalid
  • It doesn't allow barewords to contain most ASCII punctuation. You have to quote them. For example, .foo=bar is a particularly nasty and ambiguous parse, which is solved by rejecting it as invalid

Some notable similarities:

  • It allows keys in "key"=value attributes to be quoted
  • It allows lang tokens and dotted classes to be quoted, like "rust" {."myClass"}
  • It allows multiple attribute lists, and they can be reordered, like {.myClass} rust {class=myClass2}
  • Commas are allowed as separators

All of this stuff should be changeable, if any of it is a problem (though none of it seems too onerous to me).

@notriddle notriddle added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 28, 2023
@bors
Copy link
Contributor

bors commented Apr 30, 2023

☔ The latest upstream changes (presumably #111001) made this pull request unmergeable. Please resolve the merge conflicts.

@GuillaumeGomez GuillaumeGomez force-pushed the custom_code_classes_in_docs branch 2 times, most recently from 0a6a99c to ed4a959 Compare May 2, 2023 20:07
@rust-log-analyzer

This comment has been minimized.

@rustbot rustbot added the perf-regression Performance regression. label Sep 16, 2023
@RalfJung
Copy link
Member

Is it possible that this caused a regression? We're seeing rustdoc errors about "custom classes" on nightly Miri CI since this morning. See Zulip.

@GuillaumeGomez
Copy link
Member Author

Yes, it should emit a warning and not an error. I'll fix this right away.

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 17, 2023
…ses_in_docs-into-warning, r=Manishearth

Turn custom code classes in docs into warning

By habit, since it was a new feature gate, I added a check which emitted an error in case the new syntax was used. However, since rustdoc tags parser was accepting *everything*, using the "new" syntax should never ever emit errors. It now emits a warning.

Follow-up of rust-lang#110800.

cc `@Manishearth`
r? `@notriddle`
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 18, 2023
…ses_in_docs-into-warning, r=Manishearth

Turn custom code classes in docs into warning

By habit, since it was a new feature gate, I added a check which emitted an error in case the new syntax was used. However, since rustdoc tags parser was accepting *everything*, using the "new" syntax should never ever emit errors. It now emits a warning.

Follow-up of rust-lang#110800.

cc `@Manishearth`
r? `@notriddle`
bors bot added a commit to rust-num/num-traits that referenced this pull request Sep 18, 2023
286: Bugfix for text codeblock in documentation. r=cuviper a=robamu

Fixes #285 .
From what I have seen, the local output generated with `cargo +nightly doc --open` (rustc v1.74.0 nightly 203c57dbe) does not look differently than the most current one found here: https://docs.rs/num-traits/latest/num_traits/identities/trait.One.html

Propably related to rust-lang/rust#110800 ? If it is, then the issue might be fixed soon and this PR should be ignored..

Co-authored-by: Robin Mueller <robin.mueller.m@gmail.com>
@Mark-Simulacrum Mark-Simulacrum added the perf-regression-triaged The performance regression has been triaged. label Sep 19, 2023
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Sep 19, 2023
…_in_docs-warning, r=notriddle

Custom code classes in docs warning

Fixes rust-lang#115938.

This PR does two things:
 1. Unless the `custom_code_classes_in_docs` feature is enabled, it will use the old codeblock tag parser.
 2. If there is a codeblock tag that starts with a `.`, it will emit a behaviour change warning.

Hopefully this is the last missing part for this feature until stabilization.

Follow-up of rust-lang#110800.

r? `@notriddle`
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Sep 19, 2023
Rollup merge of rust-lang#115947 - GuillaumeGomez:custom_code_classes_in_docs-warning, r=notriddle

Custom code classes in docs warning

Fixes rust-lang#115938.

This PR does two things:
 1. Unless the `custom_code_classes_in_docs` feature is enabled, it will use the old codeblock tag parser.
 2. If there is a codeblock tag that starts with a `.`, it will emit a behaviour change warning.

Hopefully this is the last missing part for this feature until stabilization.

Follow-up of rust-lang#110800.

r? `@notriddle`
@apiraino apiraino removed the to-announce Announce this issue on triage meeting label Sep 21, 2023
notriddle added a commit to notriddle/rust that referenced this pull request Dec 6, 2023
It's not stable yet, and shouldn't be mentioned here.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 6, 2023
docs: remove rust-lang#110800 from release notes

It's not stable yet, and shouldn't be mentioned here. At least, the message shouldn't be written like this.

I realize it's weird to go through an FCP, and then have the feature remain unstable, but this was an unusual case.

Rustdoc used to silently swallow unknown language tokens on code blocks, and now it produces a compatibility warning. The FCP got everyone's sign-off on the warning, not the finished feature, which remains unstable.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 6, 2023
…iaskrgr

Rollup of 7 pull requests

Successful merges:

 - rust-lang#116496 (Provide context when `?` can't be called because of `Result<_, E>`)
 - rust-lang#117563 (docs: clarify explicitly freeing heap allocated memory)
 - rust-lang#117874 (`riscv32` platform support)
 - rust-lang#118516 (Add ADT variant infomation to StableMIR and finish implementing TyKind::internal())
 - rust-lang#118650 (add comment about keeping flags in sync between bootstrap.py and bootstrap.rs)
 - rust-lang#118664 (docs: remove rust-lang#110800 from release notes)
 - rust-lang#118669 (library: fix comment about const assert in win api)

r? `@ghost`
`@rustbot` modify labels: rollup
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Dec 6, 2023
Rollup merge of rust-lang#118664 - notriddle:master, r=Mark-Simulacrum

docs: remove rust-lang#110800 from release notes

It's not stable yet, and shouldn't be mentioned here. At least, the message shouldn't be written like this.

I realize it's weird to go through an FCP, and then have the feature remain unstable, but this was an unusual case.

Rustdoc used to silently swallow unknown language tokens on code blocks, and now it produces a compatibility warning. The FCP got everyone's sign-off on the warning, not the finished feature, which remains unstable.
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Jan 8, 2024
Pkgsrc changes:

 * Remove NetBSD-8 support (embedded LLVm requires newer C++
   than what is in -8; it's conceivable that this could still
   build with an external LLVM)
 * undo powerpc 9.0 file naming tweak, since we no longer support -8.
 * Remove patch to LLVM for powerpc now included by upstream.
 * Minor adjustments, checksum changes etc.


Upstream changes:

Version 1.74.1 (2023-12-07)
===========================

- [Resolved spurious STATUS_ACCESS_VIOLATIONs in LLVM]
  (rust-lang/rust#118464)
- [Clarify guarantees for std::mem::discriminant]
  (rust-lang/rust#118006)
- [Fix some subtyping-related regressions]
  (rust-lang/rust#116415)

Version 1.74.0 (2023-11-16)
==========================

Language
--------

- [Codify that `std::mem::Discriminant<T>` does not depend on any
  lifetimes in T]
  (rust-lang/rust#104299)
- [Replace `private_in_public` lint with `private_interfaces` and
  `private_bounds` per RFC 2145]
  (rust-lang/rust#113126)
  Read more in
  [RFC 2145](https://rust-lang.github.io/rfcs/2145-type-privacy.html).
- [Allow explicit `#[repr(Rust)]`]
  (rust-lang/rust#114201)
- [closure field capturing: don't depend on alignment of packed fields]
  (rust-lang/rust#115315)
- [Enable MIR-based drop-tracking for `async` blocks]
  (rust-lang/rust#107421)

Compiler
--------

- [stabilize combining +bundle and +whole-archive link modifiers]
  (rust-lang/rust#113301)
- [Stabilize `PATH` option for `--print KIND=PATH`]
  (rust-lang/rust#114183)
- [Enable ASAN/LSAN/TSAN for `*-apple-ios-macabi`]
  (rust-lang/rust#115644)
- [Promote loongarch64-unknown-none* to Tier 2]
  (rust-lang/rust#115368)
- [Add `i686-pc-windows-gnullvm` as a tier 3 target]
  (rust-lang/rust#115687)

Libraries
---------

- [Implement `From<OwnedFd/Handle>` for ChildStdin/out/err]
  (rust-lang/rust#98704)
- [Implement `From<{&,&mut} [T; N]>` for `Vec<T>` where `T: Clone`]
  (rust-lang/rust#111278)
- [impl Step for IP addresses]
  (rust-lang/rust#113748)
- [Implement `From<[T; N]>` for `Rc<[T]>` and `Arc<[T]>`]
  (rust-lang/rust#114041)
- [`impl TryFrom<char> for u16`]
  (rust-lang/rust#114065)
- [Stabilize `io_error_other` feature]
  (rust-lang/rust#115453)
- [Stabilize the `Saturating` type]
  (rust-lang/rust#115477)
- [Stabilize const_transmute_copy]
  (rust-lang/rust#115520)

Stabilized APIs
---------------

- [`core::num::Saturating`]
  (https://doc.rust-lang.org/stable/std/num/struct.Saturating.html)
- [`impl From<io::Stdout> for std::process::Stdio`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStdout%3E-for-Stdio)
- [`impl From<io::Stderr> for std::process::Stdio`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStderr%3E-for-Stdio)
- [`impl From<OwnedHandle> for std::process::Child{Stdin, Stdout, Stderr}`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStderr%3E-for-Stdio)
- [`impl From<OwnedFd> for std::process::Child{Stdin, Stdout, Stderr}`]
  (https://doc.rust-lang.org/stable/std/process/struct.Stdio.html#impl-From%3CStderr%3E-for-Stdio)
- [`std::ffi::OsString::from_encoded_bytes_unchecked`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsString.html#method.from_encoded_bytes_unchecked)
- [`std::ffi::OsString::into_encoded_bytes`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsString.html#method.into_encoded_bytes)
- [`std::ffi::OsStr::from_encoded_bytes_unchecked`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsStr.html#method.from_encoded_bytes_unchecked)
- [`std::ffi::OsStr::as_encoded_bytes`]
  (https://doc.rust-lang.org/stable/std/ffi/struct.OsStr.html#method.as_encoded_bytes)
- [`std::io::Error::other`]
  (https://doc.rust-lang.org/stable/std/io/struct.Error.html#method.other)
- [`impl TryFrom<char> for u16`]
  (https://doc.rust-lang.org/stable/std/primitive.u16.html#impl-TryFrom%3Cchar%3E-for-u16)
- [`impl<T: Clone, const N: usize> From<&[T; N]> for Vec<T>`]
  (https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#impl-From%3C%26%5BT;+N%5D%3E-for-Vec%3CT,+Global%3E)
- [`impl<T: Clone, const N: usize> From<&mut [T; N]> for Vec<T>`]
  (https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#impl-From%3C%26mut+%5BT;+N%5D%3E-for-Vec%3CT,+Global%3E)
- [`impl<T, const N: usize> From<[T; N]> for Arc<[T]>`]
  (https://doc.rust-lang.org/stable/std/sync/struct.Arc.html#impl-From%3C%5BT;+N%5D%3E-for-Arc%3C%5BT%5D,+Global%3E)
- [`impl<T, const N: usize> From<[T; N]> for Rc<[T]>`]
  (https://doc.rust-lang.org/stable/std/rc/struct.Rc.html#impl-From%3C%5BT;+N%5D%3E-for-Rc%3C%5BT%5D,+Global%3E)

These APIs are now stable in const contexts:

- [`core::mem::transmute_copy`]
  (https://doc.rust-lang.org/beta/std/mem/fn.transmute_copy.html)
- [`str::is_ascii`]
  (https://doc.rust-lang.org/beta/std/primitive.str.html#method.is_ascii)
- [`[u8]::is_ascii`]
  (https://doc.rust-lang.org/beta/std/primitive.slice.html#method.is_ascii)

Cargo
-----

- [fix: Set MSRV for internal packages]
  (rust-lang/cargo#12381)
- [config: merge lists in precedence order]
  (rust-lang/cargo#12515)
- [fix(update): Clarify meaning of --aggressive as --recursive]
  (rust-lang/cargo#12544)
- [fix(update): Make `-p` more convenient by being positional]
  (rust-lang/cargo#12545)
- [feat(help): Add styling to help output ]
  (rust-lang/cargo#12578)
- [feat(pkgid): Allow incomplete versions when unambigious]
  (rust-lang/cargo#12614)
- [feat: stabilize credential-process and registry-auth]
  (rust-lang/cargo#12649)
- [feat(cli): Add '-n' to dry-run]
  (rust-lang/cargo#12660)
- [Add support for `target.'cfg(..)'.linker`]
  (rust-lang/cargo#12535)
- [Stabilize `--keep-going`]
  (rust-lang/cargo#12568)
- [feat: Stabilize lints]
  (rust-lang/cargo#12648)

Rustdoc
-------

- [Add warning block support in rustdoc]
  (rust-lang/rust#106561)
- [Accept additional user-defined syntax classes in fenced code blocks]
  (rust-lang/rust#110800)
- [rustdoc-search: add support for type parameters]
  (rust-lang/rust#112725)
- [rustdoc: show inner enum and struct in type definition for concrete type]
  (rust-lang/rust#114855)

Compatibility Notes
-------------------

- [Raise minimum supported Apple OS versions]
  (rust-lang/rust#104385)
- [make Cell::swap panic if the Cells partially overlap]
  (rust-lang/rust#114795)
- [Reject invalid crate names in `--extern`]
  (rust-lang/rust#116001)
- [Don't resolve generic impls that may be shadowed by dyn built-in impls]
  (rust-lang/rust#114941)

Internal Changes
----------------

These changes do not affect any public interfaces of Rust, but they represent
significant improvements to the performance or internals of rustc and related
tools.

None this cycle.
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Jun 1, 2024
…de_classes_in_docs, r=rustdoc

Stabilize `custom_code_classes_in_docs` feature

Fixes rust-lang#79483.

This feature has been around for quite some time now, I think it's fine to stabilize it now.

## Summary

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

EDIT(camelid): This description is out-of-date. Using `custom,class:language-c` will generate the output `<pre class="language-class:language-c">` as would be expected; it treats `class:language-c` as just the name of a language (similar to the langstring `c` or `js` or what have you) since it does not use the designed class syntax.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang#110800 (comment))).

r? `@notriddle`
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 1, 2024
…_classes_in_docs, r=rustdoc

Stabilize `custom_code_classes_in_docs` feature

Fixes rust-lang#79483.

This feature has been around for quite some time now, I think it's fine to stabilize it now.

## Summary

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

EDIT(camelid): This description is out-of-date. Using `custom,class:language-c` will generate the output `<pre class="language-class:language-c">` as would be expected; it treats `class:language-c` as just the name of a language (similar to the langstring `c` or `js` or what have you) since it does not use the designed class syntax.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang#110800 (comment))).

r? `@notriddle`
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Jun 5, 2024
…in_docs, r=rustdoc

Stabilize `custom_code_classes_in_docs` feature

Fixes #79483.

This feature has been around for quite some time now, I think it's fine to stabilize it now.

## Summary

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

EDIT(camelid): This description is out-of-date. Using `custom,class:language-c` will generate the output `<pre class="language-class:language-c">` as would be expected; it treats `class:language-c` as just the name of a language (similar to the langstring `c` or `js` or what have you) since it does not use the designed class syntax.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang/rust#110800 (comment))).

r? `@notriddle`
bors added a commit to rust-lang/rust-analyzer that referenced this pull request Jun 20, 2024
…in_docs, r=rustdoc

Stabilize `custom_code_classes_in_docs` feature

Fixes #79483.

This feature has been around for quite some time now, I think it's fine to stabilize it now.

## Summary

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

EDIT(camelid): This description is out-of-date. Using `custom,class:language-c` will generate the output `<pre class="language-class:language-c">` as would be expected; it treats `class:language-c` as just the name of a language (similar to the langstring `c` or `js` or what have you) since it does not use the designed class syntax.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang/rust#110800 (comment))).

r? `@notriddle`
flip1995 pushed a commit to flip1995/rust-clippy that referenced this pull request Jun 28, 2024
…in_docs, r=rustdoc

Stabilize `custom_code_classes_in_docs` feature

Fixes #79483.

This feature has been around for quite some time now, I think it's fine to stabilize it now.

## Summary

## What is the feature about?

In short, this PR changes two things, both related to codeblocks in doc comments in Rust documentation:

 * Allow to disable generation of `language-*` CSS classes with the `custom` attribute.
 * Add your own CSS classes to a code block so that you can use other tools to highlight them.

#### The `custom` attribute

Let's start with the new `custom` attribute: it will disable the generation of the `language-*` CSS class on the generated HTML code block. For example:

```rust
/// ```custom,c
/// int main(void) {
///     return 0;
/// }
/// ```
```

The generated HTML code block will not have `class="language-c"` because the `custom` attribute has been set. The `custom` attribute becomes especially useful with the other thing added by this feature: adding your own CSS classes.

#### Adding your own CSS classes

The second part of this feature is to allow users to add CSS classes themselves so that they can then add a JS library which will do it (like `highlight.js` or `prism.js`), allowing to support highlighting for other languages than Rust without increasing burden on rustdoc. To disable the automatic `language-*` CSS class generation, you need to use the `custom` attribute as well.

This allow users to write the following:

```rust
/// Some code block with `{class=language-c}` as the language string.
///
/// ```custom,{class=language-c}
/// int main(void) {
///     return 0;
/// }
/// ```
fn main() {}
```

This will notably produce the following HTML:

```html
<pre class="language-c">
int main(void) {
    return 0;
}</pre>
```

Instead of:

```html
<pre class="rust rust-example-rendered">
<span class="ident">int</span> <span class="ident">main</span>(<span class="ident">void</span>) {
    <span class="kw">return</span> <span class="number">0</span>;
}
</pre>
```

To be noted, we could have written `{.language-c}` to achieve the same result. `.` and `class=` have the same effect.

One last syntax point: content between parens (`(like this)`) is now considered as comment and is not taken into account at all.

In addition to this, I added an `unknown` field into `LangString` (the parsed code block "attribute") because of cases like this:

```rust
/// ```custom,class:language-c
/// main;
/// ```
pub fn foo() {}
```

Without this `unknown` field, it would generate in the DOM: `<pre class="language-class:language-c language-c">`, which is quite bad. So instead, it now stores all unknown tags into the `unknown` field and use the first one as "language". So in this case, since there is no unknown tag, it'll simply generate `<pre class="language-c">`. I added tests to cover this.

EDIT(camelid): This description is out-of-date. Using `custom,class:language-c` will generate the output `<pre class="language-class:language-c">` as would be expected; it treats `class:language-c` as just the name of a language (similar to the langstring `c` or `js` or what have you) since it does not use the designed class syntax.

Finally, I added a parser for the codeblock attributes to make it much easier to maintain. It'll be pretty easy to extend.

As to why this syntax for adding attributes was picked: it's [Pandoc's syntax](https://pandoc.org/MANUAL.html#extension-fenced_code_attributes). Even if it seems clunkier in some cases, it's extensible, and most third-party Markdown renderers are smart enough to ignore Pandoc's brace-delimited attributes (from [this comment](rust-lang/rust#110800 (comment))).

r? `@notriddle`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This issue / PR is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this PR / Issue. merged-by-bors This PR was explicitly merged by bors. perf-regression Performance regression. perf-regression-triaged The performance regression has been triaged. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.