Adding attributes to fenced markdown code blocks breaks syntax highlighting #62

Closed
janosh opened this issue Oct 25, 2019 · 3 comments · Fixed by #63
Labels: feature-request (request for new features or functionality), help wanted (issues identified as good community contribution opportunities)

Comments

janosh commented Oct 25, 2019

The following highlights fine in VS Code markdown.

```py
def greetings():
    print("Hello world!")
```

If I add attributes like line highlighting or a title, syntax highlighting is gone.

```py{1}:title=greet.py
def greetings():
    print("Hello world!")
```

How about just discarding everything after the first special character when reading the fence's language specifier?
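
A minimal sketch of that suggestion, written in Python for illustration only (the grammar itself is a TextMate regular expression, not Python). The helper name `naive_language` and the rule that only letters, digits, `_` and `-` belong to a language name are assumptions, not something the grammar defines:

```python
import re

def naive_language(info_string: str) -> str:
    """Return the info string truncated at its first 'special' character."""
    # Hypothetical rule: only letters, digits, '_' and '-' are part of the name.
    match = re.match(r"[A-Za-z0-9_-]*", info_string.strip())
    return match.group(0)

print(naive_language("py{1}:title=greet.py"))  # py
print(naive_language("py"))                    # py
```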

mjbvz transferred this issue from microsoft/vscode Oct 25, 2019


mjbvz commented Oct 25, 2019

What markdown engine uses that syntax?

mjbvz added the feature-request label Oct 25, 2019

janosh commented Oct 25, 2019

@mjbvz It's not part of an engine. It's preprocessing, but the practice of adding various attributes to code fences is quite widespread. In my case, the titles are powered by gatsby-remark-code-titles and the line highlighting by gatsby-remark-vscode.

mjbvz commented Oct 25, 2019

OK, I asked because we already ignore attributes, but, as other parsers require, only when a space separates the language name from the attributes.

PRs welcome but keep in mind that some language names do contain special characters (c++, c#, ...). You likely need to change this line and add tests:

(^|\\G)(\\s*)(\`{3,}|~{3,})\\s*(?i:(${identifiers.join('|')})(\\s+[^\`~]*)?$)
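
To see why the `py{1}:title=greet.py` fence from the report is not recognized, here is a rough Python approximation of that pattern. Several things are assumptions: `IDENTIFIERS` is a tiny hypothetical stand-in for the grammar's real language list, `\G` is dropped because Python's `re` does not support it, and `RELAXED` is only one possible loosening, not necessarily what #63 merged.

```python
import re

FENCE = "`" * 3  # built programmatically so this example nests safely in Markdown

# Hypothetical subset of language identifiers; escaped so 'c++' stays literal.
IDENTIFIERS = [re.escape(name) for name in ("python", "py", "c++", "c#", "c")]
LANGS = "|".join(IDENTIFIERS)

# Approximation of the quoted pattern: attributes must follow whitespace.
STRICT = re.compile(rf"^(\s*)(`{{3,}}|~{{3,}})\s*(?i:({LANGS})(\s+[^`~]*)?$)")

# One possible loosening: attributes may also start with '{' or ':'.
RELAXED = re.compile(rf"^(\s*)(`{{3,}}|~{{3,}})\s*(?i:({LANGS})((\s+|[{{:])[^`~]*)?$)")

def language(pattern, line):
    """Return the captured language name, or None if the line does not match."""
    m = pattern.match(line)
    return m.group(3) if m else None

print(language(STRICT, FENCE + "py{1}:title=greet.py"))   # None
print(language(STRICT, FENCE + "py {1}"))                 # py
print(language(STRICT, FENCE + "c++"))                    # c++
print(language(RELAXED, FENCE + "py{1}:title=greet.py"))  # py
```

Because the language is matched against a list of known identifiers rather than cut at the first special character, names such as `c++` and `c#` keep working in both variants.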

mjbvz added the help wanted label Oct 25, 2019
mjbvz closed this as completed in #63 Oct 28, 2019
reenberg added a commit to reenberg/vscode-markdown-tm-grammar that referenced this issue Dec 21, 2023
In microsoft#57 support for Codebraid syntax was added, which essentially is just
Pandoc attribute syntax, but with a specific class attribute added.

Support was added as an extra `identifier` in the language list for each
language that Codebraid supports, for example for Python:
`\\{\\.python.+?\\}`.

The below example would give the following scope: "text.html.markdown
markup.fenced_code.block.markdown fenced_code.block.language.markdown"
to the entire line:

```{.python .cb.nb jupyter_kernel=python3}
```

However the "language scope" should only be given to the "python" part,
and the current support doesn't allow spaces between the curly braces,
and it lacks support for all languages.
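
The first two limitations can be checked against the quoted `identifier` in
isolation.  The sketch below uses Python's `re` purely for illustration (the
grammar itself runs on a TextMate regex engine) and assumes the doubled
backslashes in the quoted pattern are string escaping, so that the effective
pattern is `\{\.python.+?\}`:

```python
import re

# Assumed effective form of the quoted Codebraid identifier for Python.
CODEBRAID_PYTHON = re.compile(r"\{\.python.+?\}")

info = "{.python .cb.nb jupyter_kernel=python3}"
m = CODEBRAID_PYTHON.match(info)
# The "language" match spans the whole attribute block, not just "python".
print(m.group(0) == info)  # True

# A space after the opening brace prevents the match entirely.
print(CODEBRAID_PYTHON.match("{ .python .cb.nb}"))  # None
```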

MkDocs allows a few ways to annotate fenced code blocks, but if
additional classes, id or key/value pairs are used, then the curly
braces must be used and the language must be prefixed with a dot.  In
simple cases where only the language is specified, then the curly braces
and the dot may be omitted.  The following are quick examples:

``` { .python #id .class title="My Title"}
```

or

``` python
```

This change removes the Codebraid support that was attached to specific
languages as an `identifier`, and moves it into the RegEx by defining
two alternative cases: attributes surrounded by curly braces, or
attributes following the language:

1. The case where the entire line after the code fence is wrapped in
   curly braces.  In this case the curly braces are not part of the
   language and attribute scope.
2. The case where the attributes follow the language specification in
   all sorts of ways (I'm specifically thinking of you Gatsby microsoft#62).  In
   this case the curly braces are included in the attribute scope, as it
   is not trivial to handle all the various ways they may be used, and
   since this is the current behavior.
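
The two cases can be sketched as a small parser.  This is an illustration in
Python, not the grammar's actual RegEx: the names `BRACED`, `PLAIN` and
`split_info` are hypothetical, and unlike the grammar it accepts any language
name instead of checking a list of known identifiers.

```python
import re

# Case 1: everything after the fence is wrapped in curly braces;
# the braces themselves are excluded from both captured groups.
BRACED = re.compile(r"^\{\s*\.(?P<lang>[^\s{}]+)\s*(?P<attrs>[^{}]*?)\s*\}$")
# Case 2: a bare language name followed by arbitrary attributes;
# any curly braces stay inside the attribute part.
PLAIN = re.compile(r"^(?P<lang>[^\s{}:]+)(?P<attrs>.*)$")

def split_info(info: str):
    """Split a fence info string into (language, attributes)."""
    info = info.strip()
    m = BRACED.match(info) or PLAIN.match(info)
    if m is None:
        return None, ""
    return m.group("lang"), m.group("attrs").strip()

print(split_info('{ .python #id .class title="My Title"}'))
# ('python', '#id .class title="My Title"')
print(split_info("python"))                # ('python', '')
print(split_info("py{1}:title=greet.py"))  # ('py', '{1}:title=greet.py')
```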

@microsoft-github-policy-service agree

Closes microsoft#153
Refs: https://github.com/Python-Markdown/markdown/blob/master/docs/extensions/fenced_code_blocks.md
reenberg added a commit to reenberg/vscode-markdown-tm-grammar that referenced this issue Dec 22, 2023