Mention that link reference definitions are constructed from paragraphs #605

wooorm · 2019-09-11T13:29:59Z

Problem

According to the spec text:

[
␠␠␠␠# a
␠␠␠␠b
␠␠␠␠]:
␠␠␠␠example.com '
␠␠␠␠line1
␠␠␠␠...
␠␠␠␠'

...is fine: it’s a proper link reference definition. This led me to believe that true streaming, as noted in § Appendix ¶ Phase 1, wouldn’t work: if the last apostrophe weren’t there, we’d need to backtrack. (We’d backtrack to the start, because the opening apostrophe is on the same line as the destination. If it were on its own line, the definition would be valid, but we’d still need to backtrack to parse the title again.)

To my surprise, the following is not a link reference definition (note one fewer space before # a):

[
␠␠␠# a
␠␠␠␠b
␠␠␠␠]:
␠␠␠␠example.com '
␠␠␠␠line1
␠␠␠␠...
␠␠␠␠'

...# a is now a heading! Only then did I see that the Appendix contains:

Reference link definitions are detected when a paragraph is closed; the accumulated text lines are parsed to see if they begin with one or more reference link definitions. Any remainder becomes a normal paragraph.
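To make the quoted Appendix behaviour concrete, here is a minimal Python sketch of "detect definitions when a paragraph is closed". It is not the reference implementation: the function name `close_paragraph` and the `DEFINITION` regex are made up for illustration, and the regex is a deliberately simplified grammar (no escapes, no angle-bracket destinations, no parenthesised titles).

```python
import re

# Simplified, hypothetical grammar for one link reference definition.
DEFINITION = re.compile(
    r"\s*\[(?P<label>[^\[\]]+)\]:"                    # [label]:
    r"\s*(?P<dest>\S+)"                               # destination (no spaces)
    r"(?:\s+(?P<q>['\"])(?P<title>[^'\"]*)(?P=q))?"   # optional quoted title
    r"[ \t]*(?:\n|$)"                                 # nothing else on the line
)


def close_paragraph(lines):
    """Split accumulated paragraph lines into (definitions, remaining text)."""
    # Paragraph continuation lines lose their leading whitespace.
    text = "\n".join(line.strip() for line in lines)
    definitions = {}
    while True:
        m = DEFINITION.match(text)
        if m is None or not m.group("label").strip():
            break
        # Normalize the label: collapse whitespace, ignore case.
        label = " ".join(m.group("label").split()).casefold()
        # The first definition for a label wins.
        definitions.setdefault(label, (m.group("dest"), m.group("title")))
        text = text[m.end():]
    # Whatever is left becomes a normal paragraph.
    return definitions, text


defs, rest = close_paragraph(["[foo]: /url 'title'", "[bar]: /other", "leftover text"])
```

With this approach the parser never has to backtrack across blocks: it only re-reads lines it had already committed to a paragraph.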

Solution

I think it’s good to mention in the main text that link reference definitions are created from paragraphs, and to include a test for it. I’m not entirely sure how to describe this, though. This would also help prevent the blank lines that are currently possible in labels (GH-586).

Extra

As paragraph lines are made into actual paragraphs and definitions, setext heading lines come into play, so relating to GH-395, I think the following may also be interesting to expand upon:

[foo]: /url
'alpha
=
bravo'

[foo]

Dingus:

<h1>'alpha</h1>
<p>bravo'</p>
<p><a href="/url">foo</a></p>


jgm · 2019-09-11T22:32:52Z

The reference parser does construct these from paragraphs (similarly setext headers). That's an implementation detail, though. If we didn't care about efficiency, we could simply have a separate block parser for these and backtrack.

wooorm · 2019-09-11T22:46:00Z

Agreed, implementation details should indeed be in the appendix. But my issue is more that nothing in the spec explains why, to take a perhaps clearer example:

[
# alpha
]: https://example.com

[# alpha][]

Yields a heading.
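A rough Python sketch of why this happens when definitions are built from paragraph lines: an ATX heading is allowed to interrupt a paragraph, so the would-be definition is cut in two before anyone tries to parse it. The names `ATX` and `split_blocks` are hypothetical, and the scanner is heavily simplified (no closing `#` sequences, no indented code, no other block types).

```python
import re

# Up to three spaces of indentation, 1-6 '#', then whitespace or end of line.
ATX = re.compile(r" {0,3}(#{1,6})(?:[ \t]+(.*?))?[ \t]*$")


def split_blocks(lines):
    """Very small block scanner: paragraphs, ATX headings, blank lines."""
    blocks, paragraph = [], []

    def flush():
        # Close the open paragraph, if there is one.
        if paragraph:
            blocks.append(("paragraph", "\n".join(paragraph)))
            paragraph.clear()

    for line in lines:
        m = ATX.match(line)
        if m:
            flush()  # an ATX heading interrupts the open paragraph
            blocks.append(("heading", m.group(2) or ""))
        elif not line.strip():
            flush()  # a blank line closes the paragraph
        else:
            paragraph.append(line.strip())
    flush()
    return blocks


blocks = split_blocks(["[", "# alpha", "]: https://example.com", "", "[# alpha][]"])
```

Here `[` and `]: https://example.com` end up in two separate paragraphs, and neither one is a definition on its own. The same sketch also shows the one-space difference from the first comment: with four leading spaces the `#` line does not match `ATX`, so it stays a paragraph continuation line.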

jgm · 2019-09-12T00:45:16Z

Yes, I agree that more needs to be said about reference link definitions.
I'm just not sure talking about "paragraphs" is the best way to do it.

wooorm · 2019-09-12T09:02:45Z

I can’t see an easy solution.

One way would be to use “interrupting content” instead of “interrupting paragraphs”:

An indented code block cannot interrupt ~~a paragraph~~ a content line. (This allows hanging indents and the like.)

ATX headings need not be separated from surrounding content by blank lines, and they can interrupt ~~paragraphs~~ content lines:

...and then both definition “lines” and paragraphs fall into that category? 🤔

jgm · 2019-09-12T15:13:39Z

Another alternative would be just to say "interrupt a paragraph or a link reference definition."

wooorm · 2019-10-01T18:09:04Z

Yeah, maybe that’s good!
I’m not so sure about the word “paragraph”, as setext headings are made from that construct, but as they are headings, they aren’t really paragraphs.

jgm · 2019-10-01T18:33:19Z

setext headings are made from that construct

That's just how they're handled in the reference implementation (for parsing efficiency). As far as the spec goes, they have nothing to do with paragraphs.

wooorm · 2020-07-04T18:15:33Z

Another point of confusion for me: I don’t understand the interplay between paragraphs, setext headings, and definitions.

E.g.:

[a]: b
    content?

a
=
    content?

Yields:

content?

a

content?

Why can there be code after a setext heading, but not after a definition? I was expecting both content? lines to be paragraphs.
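One possible explanation, sketched in Python under the assumption that definitions are only recognised when the paragraph holding them is closed: the indented line after `[a]: b` arrives while a paragraph is still open, so it is a continuation line, whereas the setext underline closes its block immediately, so the next indented line starts fresh as code. All names here (`parse`, `SETEXT`, `DEFINITION`) are illustrative, and the grammar is simplified (single-line definitions without titles, and definitions inside a setext heading's content are not extracted).

```python
import re

SETEXT = re.compile(r" {0,3}(=+|-+)[ \t]*$")
DEFINITION = re.compile(r"\[([^\[\]]+)\]:[ \t]*(\S+)[ \t]*(?:\n|$)")


def parse(lines):
    blocks, paragraph = [], []

    def close_paragraph():
        text = "\n".join(paragraph)
        paragraph.clear()
        # Peel leading definitions off the closed paragraph.
        m = DEFINITION.match(text)
        while m:
            blocks.append(("definition", m.group(1)))
            text = text[m.end():]
            m = DEFINITION.match(text)
        if text:
            blocks.append(("paragraph", text))

    for line in lines:
        if not line.strip():
            close_paragraph()
        elif paragraph and SETEXT.match(line):
            # The underline turns the open paragraph into a heading and closes it.
            heading = "\n".join(paragraph)
            paragraph.clear()
            blocks.append(("heading", heading))
        elif paragraph:
            # Indented code cannot interrupt a paragraph: continuation line.
            paragraph.append(line.strip())
        elif line.startswith("    "):
            blocks.append(("code", line[4:]))
        else:
            paragraph.append(line.strip())
    close_paragraph()
    return blocks


result = parse(["[a]: b", "    content?", "", "a", "=", "    content?"])
```

In this model the first `content?` is the remainder of the paragraph that contained the definition, and the second one follows an already closed heading.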

vassudanagunta · 2020-07-06T20:19:13Z

It seems to me the discussion above assumes that the CommonMark.js / Dingus behavior is the spec, and thus that the spec needs to be updated to conform to that behavior. I would suggest that this is the wrong way to look at it (with the one exception of backward compatibility, which should be maintained, since that is a CommonMark spec goal).

For example, I'm working on an implementation of the CommonMark spec. It passes all the tests, yet does NOT treat the # alpha in @wooorm's example as a heading. It interprets it as the label of a link ref def.

As far as the spec goes, they have nothing to do with paragraphs.

The reference parser does construct these from paragraphs (similarly setext headers). That's an implementation detail, though. If we didn't care about efficiency, we could simply have a separate block parser for these and backtrack.

This is what my implementation does.

Given Markdown's principles (reader oriented), to me the way one decides is by asking: What does the following look like to most readers?

[
# alpha
]: https://example.com

[# alpha][]

Though at the end of the day, it's an unimportant corner case. If the author of the above Markdown cared about the reader, they would not write something so unnecessary! The line breaks serve no purpose.

But also, by that same token, any inefficiency resulting from rules that would require backtracking (is look-ahead considered backtracking?) would only affect such corner cases.
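For comparison, a hedged Python sketch of the alternative, where a dedicated definition parser is tried first with look-ahead and ordinary block parsing is only the fallback. This is not the implementation mentioned above, only an illustration of the idea; `parse_with_lookahead`, `DEFINITION`, and `ATX` are hypothetical names and the grammar is simplified (no titles, no escapes).

```python
import re

DEFINITION = re.compile(r"\[([^\[\]]+)\]:\s*(\S+)[ \t]*(?:\n|$)")
ATX = re.compile(r" {0,3}#{1,6}(?:[ \t]+|$)")


def parse_with_lookahead(lines):
    blocks, i = [], 0
    while i < len(lines):
        if not lines[i].strip():
            i += 1
            continue
        # Look ahead to the next blank line and try a definition first.
        end = i
        while end < len(lines) and lines[end].strip():
            end += 1
        chunk = "\n".join(line.strip() for line in lines[i:end])
        m = DEFINITION.match(chunk)
        if m:
            label = " ".join(m.group(1).split())
            blocks.append(("definition", label))
            # Count the lines the definition consumed.
            consumed = chunk[:m.end()].count("\n")
            if m.end() == len(chunk):
                consumed += 1
            i += consumed
        else:
            # Fall back to ordinary line-by-line block parsing.
            kind = "heading" if ATX.match(lines[i]) else "text"
            blocks.append((kind, lines[i].strip()))
            i += 1
    return blocks


result = parse_with_lookahead(["[", "# alpha", "]: https://example.com", "", "[# alpha][]"])
```

Here `# alpha` is read as part of the label because the definition attempt runs before the heading rule gets a chance. The extra cost is the re-scan of lines when the attempt fails, which only happens for blocks that start with `[`.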

jgm mentioned this issue Nov 2, 2019

[Link Reference Definition] Block starts in Link Title that spans multiple lines break the definition #622

Open

wooorm mentioned this issue Nov 6, 2019

Clarify wording in spec for character groups #618

Merged

wooorm mentioned this issue Aug 26, 2021

inline parsing algorithm, spec conflicts with implementation #686

Open

vassudanagunta mentioned this issue Sep 5, 2021

link reference definition versus other elements #688

Open

wooorm mentioned this issue Sep 13, 2022

Definition + setext underline/thematic break, creates loose empty paragraph commonmark/commonmark.js#262

Open
