Improving the italic and bold styles #89

IonicaBizau · 2015-04-15T13:24:54Z

This will fix #84. Since I'm not a regex guru I got some inspiration from the marked package.

Italic style fix
Bold style fix
Combined (?)
Handle styles on multiple lines. Example:
```
*Lorem
ipsum
dolor
sit
amet*
```

Before

After

Waiting for feedback on whether the direction is good; then I will continue. 😄

/cc @izuzak

mnquintana · 2015-04-15T13:41:21Z

This is a good start! Eventually you'll want to add specs for this.

(Also, 💄 is deprecated in favor of 🎨 for code structure / formatting changes, and this isn't really a code structure / formatting change – it's really a 🐛 )

IonicaBizau · 2015-04-15T13:46:30Z

Indeed, the art is much better than a lipstick. 😄

OK! Will make sure that the tests are passing and I will also try to add the specs for this.

tpoisot · 2015-04-15T13:56:29Z

Really cool. I think that having the italic/bold marks with a different syntax style than the italic/bold content is useful.

IonicaBizau · 2015-04-15T19:43:48Z

I fixed the bold and bold + italic styles as well:

Is it possible to pass regex flags in this format (I would need to pass m for multiline)? If not, how can the multiline case be fixed (the last two lines in the screenshot are supposed to have bold style)?

Also, I'm not sure where to start fixing the failing tests. 💚
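
The multiline question can be illustrated outside Atom. The following is a minimal Python sketch, not the grammar engine: it assumes the tokenizer hands a `match` rule one line at a time, which is how TextMate-style grammars are generally applied. The simplified pattern and the variable names are illustrative only.

```python
import re

# Simplified bold pattern; [\s\S] matches any character, including newlines.
BOLD = re.compile(r"\*\*([\s\S]+?)\*\*")

text = "**Lorem\nipsum**"

# Applied to the whole buffer, the pattern spans the line break.
whole_buffer_match = BOLD.search(text)

# Applied line by line (the assumed tokenizer behaviour), no single line
# contains both the opening and the closing marker, so nothing matches.
per_line_matches = [BOLD.search(line) for line in text.split("\n")]

print(whole_buffer_match.group(1) if whole_buffer_match else None)
print(per_line_matches)
```

If that assumption holds, a regex flag alone would not help, because the pattern never sees the second line; a `begin`/`end` rule pair is the usual way for such grammars to span lines.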

tpoisot · 2015-04-15T20:59:02Z

I think you should capture the * and _ too, so that they can be separately styled.

And the tests are in the spec/ folder.

IonicaBizau · 2015-04-16T09:00:02Z

> I think you should capture the * and _ too, so that they can be separately styled.

That's already done, though not separately. What I was asking is how to apply the bold style to the last two rows. I guess it's related to the m (multiline) regex flag.

I will try to fix the failing tests and then write the new specs for this issue.

IonicaBizau · 2015-04-17T09:10:20Z

Is it possible to log only the first failing test when running apm test?

izuzak · 2015-04-17T09:22:32Z

> Is it possible to log only the first failing test when running apm test?

I don't know if that's possible (see jasmine/jasmine#414), but you can focus on a specific describe or it block by prefixing it with "f". See https://github.com/atom/jasmine-focused#using. Does that help perhaps?

IonicaBizau · 2015-04-17T09:28:02Z

Yes, that's a good solution. Thanks! 👍

Also, is it possible to merge repetitive snippets like the following ones, into one?

{
  'match': '...',
  'captures': 
    '2': 
       'name': 'markup.bold.italic.gfm'
    '3': 
       'name': 'markup.bold.italic.gfm'
    '4': 
       'name': 'markup.bold.italic.gfm'
}

...maybe something like this:

{
  'match': '...',
  'captures': 
    '2': 
    '3': 
    '4': 
       'name': 'markup.bold.italic.gfm'
}

I'm not very familiar with the cson format.

izuzak · 2015-04-17T09:33:27Z

I don't think that's possible.

IonicaBizau · 2015-04-17T10:05:33Z

grammars/gfm.cson

- 'begin': '(?<=^|[^\\w\\d_])__(?!$|_|\\s)'
- 'end': '(?<!^|\\s)__*_(?=$|[^\\w|\\d])'
- 'name': 'markup.bold.gfm'
+ 'match': '^___([\\s\\S]+?)___(?!_)|^\\*\\*\\*([\\s\\S]+?)\\*\\*\\*(?!\\*)'


I currently have 'match': '(?<=^|[^\\w\\d\\*])(\\*\\*\\*)([\\s\\S]+?)(\\*\\*\\*)(?=$|[^\\w|\\d])', and the tokens are:

[ { "value": "this is ", "scopes": [ "source.gfm" ], "bufferDelta": 8, "hasPairedCharacter": false, "screenDelta": 8 }, { "value": "***bold italic***", "scopes": [ "source.gfm", "markup.bold.italic.gfm" ], "bufferDelta": 17, "hasPairedCharacter": false, "screenDelta": 17 }, { "value": " text", "scopes": [ "source.gfm" ], "bufferDelta": 5, "hasPairedCharacter": false, "screenDelta": 5 } ]

I expected that by doing (\\*\\*\\*) these will be grouped in another element ({ value: "***" }). The regex101 tester works fine in this direction:

How can I catch these characters into a separate group when using apm test? Also, what tool is used for parsing this regex?
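
As a cross-check of the pattern itself, here is a minimal Python sketch of what the three groups capture. Python's `re` is a different engine from the one Atom uses and rejects the variable-width lookbehind, so the lookbehind is rewritten in an equivalent form; the helper name `split_groups` is hypothetical. This says nothing about how the grammar engine turns captures into tokens; it only shows that the regex separates the markers from the content.

```python
import re

# Adapted from the pattern quoted above. (?<=^|[^\w\d\*]) is rewritten as the
# negative lookbehind (?<![\w*]), which Python's fixed-width rule accepts.
BOLD_ITALIC = re.compile(r"(?<![\w*])(\*\*\*)([\s\S]+?)(\*\*\*)(?=$|[^\w|\d])")


def split_groups(line):
    """Return (text, (start, end)) for the opening marker, content, closing marker."""
    m = BOLD_ITALIC.search(line)
    if m is None:
        return []
    return [(m.group(i), m.span(i)) for i in (1, 2, 3)]


parts = split_groups("this is ***bold italic*** text")
for value, span in parts:
    print(repr(value), span)
```

The three spans together cover 17 characters, which agrees with the `bufferDelta` of the single merged token shown above.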

leipert · 2015-04-19T09:11:17Z

The main problem I found: mixed emphasis is handled badly, as are some cases of escaped * and _.

I created this test Markdown document; below it you can see how GitHub renders it. As you can see, some of the GitHub-rendered output is not consistent.

### Emphasis

#### Bold `_`

+ __a__
+ __ab__
+ __abc__
+ __bbc___
+ __x + y + z__

NOT:

+ as __danach__fasd
+ as __danach____fasd
+ davor__asd__ as
+ in__a__word
+ in 1__8__99 numbers
+ __ abc __
+ __ abc__
+ __abc __

MULTILINE:
__sadsad
asdsadasd
__

#### Italic `_`

+ _a_
+ _ab_
+ _abc_
+ _bbc__
+ _x + y + z_

NOT:

+ as _danach_fasd
+ as _danach___fasd
+ davor_asd_ as
+ in_a_word
+ in 1_8_99 numbers
+ _ abc _
+ _ abc_
+ _abc _

#### Bold `*`

+ **a**
+ **ab**
+ **\\**
+ in**a**word
+ in 1**8**99 numbers
+ as **danach**fasd
+ davor**asd** as
+ **abc**
+ **bbc***
+ **x + y + z**

NOT:

+ **\**
+ ** **, *****, **abc*\*
+ ** abc **
+ ** abc**
+ **abc **

#### Italic `*`

+ *a*
+ *ab*
+ *\\*
+ in*a*word
+ as *danach*fasd
+ davor*asd* as
+ in 1*8*99 numbers
+ *abc*
+ *bbc**
+ *x + y + z*

NOT:

+ *\*, * *, ***, *abc*
+ * abc *
+ * abc*
+ *abc *

#### Mixed

Works:

this is ***bold italic*** text

***BoldItalic***
___BoldItalic___
**Bold *BoldItalic* Bold**
__Bold *BoldItalic* Bold__
_Italic **ItalicBold** Italic_
**Bold _BoldItalic_ Bold**
*Italic __ItalicBold__ Italic*
__Bold _BoldItalic_ Bold__

Works not:

*Italic **ItalicBold** Italic*

_Italic __ItalicBold__ Italic_



Should not work:

_\_, _ _, ____
__\__, __ __, _____

__asd \*neither is this\* asd__

**asd \_neither is this\_ asd**

__asd \**neither is this*\* asd__

__asd \_neither is this\_ asd__

#### Rest

This text is _emphasized with underscores_, and this
is *emphasized with asterisks*

This is **strong emphasis** and __with underscores__.

Working:
***bold italic***

Not Working:
*italic **bold italic** italic*
**bold *bold italic* bold**

This is * not emphasized *, and \*neither is this\*.

Emphasis

Bold `_`

a
ab
abc
bbc_
x + y + z

NOT:

as __danach__fasd
as __danach____fasd
davor__asd__ as
in__a__word
in 1__8__99 numbers
__ abc __
__ abc__
__abc __

MULTILINE:
__sadsad
asdsadasd
__

Italic `_`

a
ab
abc
bbc_
x + y + z

NOT:

as _danach_fasd
as _danach___fasd
davor_asd_ as
in_a_word
in 1_8_99 numbers
_ abc _
_ abc_
_abc _

Bold `*`

a
ab
**
inaword
in 1899 numbers
as danachfasd
davorasd as
abc
bbc*
x + y + z

NOT:

**
** **, ****_, *abc_
** abc **
* abc*
*abc *

Italic `*`

a
ab
__
in_a_word
as _danach_fasd
davor_asd_ as
in 1_8_99 numbers
abc
bbc*
x + y + z

NOT:

__, * , *__, *abc
* abc *
* abc*
*abc *

Mixed

Works:

this is _bold italic_ text

_BoldItalic_
_BoldItalic_
Bold BoldItalic Bold
Bold BoldItalic Bold
Italic ItalicBold Italic
Bold BoldItalic Bold
Italic ItalicBold Italic
Bold BoldItalic Bold

Works not:

_Italic _ItalicBold* Italic*

Italic ItalicBold Italic

Should not work:

__, _ _, ____
****, __ , ___

asd neither is this asd

Rest

This text is emphasized with underscores, and this
is emphasized with asterisks

This is strong emphasis and with underscores.

Working:
_bold italic_

Not Working:
_italic _bold italic* italic*
bold bold italic bold

This is * not emphasized _, and _neither is this*.
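
The cases in the test document lend themselves to a table-driven check. The sketch below is a rough Python approximation under stated assumptions: it collapses the `begin`/`end` pair from the earlier diff into a single one-line pattern (so it ignores that an unterminated `begin` would stay open on later lines), rewrites the lookbehinds into forms Python's `re` accepts, and uses a hypothetical helper `failing_cases`. The expectations are a handful of entries from the "Bold `_`" section.

```python
import re

# One-line approximation of the removed begin/end pair.
# (?<=^|[^\w\d_]) becomes (?<!\w) and (?<!^|\s) becomes (?<=\S); these are
# intended as equivalents that satisfy Python's fixed-width lookbehind rule.
BOLD_UNDERSCORE = re.compile(r"(?<!\w)__(?!$|_|\s).*?(?<=\S)__*_(?=$|[^\w|\d])")

# (input, should be bold) pairs taken from the "Bold `_`" section.
CASES = [
    ("__abc__", True),
    ("__x + y + z__", True),
    ("in__a__word", False),
    ("__ abc __", False),
    ("__ abc__", False),
    ("__abc __", False),
]


def failing_cases(pattern, cases):
    """Return the inputs whose match result differs from the expectation."""
    return [text for text, expected in cases
            if (pattern.search(text) is not None) != expected]


failures = failing_cases(BOLD_UNDERSCORE, CASES)
print(failures)
```

A candidate replacement pattern could be dropped into the same table to see which of the listed cases it changes before writing the real specs.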

leipert · 2015-06-11T12:00:05Z

Please have a look at my comment in #44. There is simply no satisfying way to implement multiline emphasized/bold text with the current grammar, as far as I can see.

burodepeper · 2015-08-30T15:43:50Z

Hey @leipert (and others),

I've been looking into this as well (see #120, different thing, but got me thinking about Markdown), and I think multiline emphasis is just not part of the philosophy behind Markdown. From that same philosophy, I believe headings, for example, shouldn't be able to contain inline markup. Markdown is about simplicity. If you want complex markup, use HTML.

I haven't looked at all your examples above, but how about we write specs for how we believe the simple version of Markdown should be implemented, and ignore the weird edge cases?

leipert · 2015-08-30T16:07:50Z

@burodepeper You are right, Markdown is all about simplicity, but it is also really expressive. That is the reason why you are able to compile it to nice-looking HTML, PDF or even PowerPoint with tools like pandoc.

Even the original Markdown definition by John Gruber allows advanced things such as inline HTML: https://daringfireball.net/projects/markdown/syntax. The original spec does not forbid you from writing something like this:

# This is a *fancy* header

This is the reason why most markdown renderers allow you to write inline emphasis.

Furthermore, the syntax highlighting should reflect the actual render. It makes no sense that the language-gfm package shows you a different syntax than the markdown-preview output.

On a side note:
After spending several hours (something like 80) dealing with Atom's rendering while creating language-pfm, I came to several conclusions:

Syntax highlighting with regex sucks, especially Markdown with Atom's grammar regex parser: Markdown syntax relies heavily on empty lines, previous lines and new paragraphs, which cannot be detected with the current grammar parser.
There are too many different Markdown flavors. I hope that http://spec.commonmark.org/ will overcome this in the future.
Reliable syntax highlighting of Markdown can only be done with the help of syntax trees. Hopefully this will become possible in the future, as Markdown is not really made for line-by-line parsing.

burodepeper · 2015-08-30T16:21:46Z

> Furthermore the syntax highlighting should reflect the actual render. It makes no sense, that the language-gfm package shows you another syntax than the markdown-preview output.

I don't agree with that. I believe Markdown is about nothing more than semantics (as HTML is supposed to be). Aren't you supposed to be able to read a Markdown document semantically without any syntax highlighting? I've started using Markdown in my emails, and people understand marking words with asterisks, and specifying headers and <hr>s.

> Syntax highlighting with Regex sucks.

That's something I do agree on! ; )

> I hope, that http://spec.commonmark.org/ will overcome this in the future.

I've heard about it, but haven't really looked into it. Putting it on my list right now.

In short, I use Markdown files a lot, so I don't mind spending time to improve the language-gfm package. And since it's GitHub Flavored, who's in charge of deciding the level of simplicity?

leipert · 2015-08-30T16:42:39Z

I also use Markdown files a lot, which led me to participate in these three packages:

markdown-preview-plus: an awesome Markdown preview (new release coming tonight).
language-pfm: Pandoc Flavored Markdown, a fork of this package that works out a lot of the quirks while maintaining compatibility with this package.
linter-markdown: a Markdown linter.

Sorry for the advertisement. If you find any useful code in language-pfm, I am happy to start writing PRs for language-gfm.

burodepeper · 2015-08-30T16:51:08Z

Thanks, I'm going to take a look at them.

tpoisot · 2015-08-31T19:55:25Z

@burodepeper I don't think your point on nested emphasis or emphasis in headers not being within the markdown spirit is right. In fact, if you look at the original markdown implementation, # a *b* is parsed as <h1>a <em>b</em></h1>. And emphasis is semantics!
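
To make the `# a *b*` example concrete, here is a toy Python sketch. It is not the original Markdown implementation and handles only a level-1 heading with asterisk emphasis; the function name `render_heading` is hypothetical.

```python
import re


def render_heading(line):
    """Toy renderer: a level-1 ATX heading containing *emphasis* only."""
    m = re.match(r"#\s+(.*)$", line)
    if m is None:
        # Not a level-1 heading; return the line unchanged.
        return line
    # Inline emphasis is processed inside the heading text.
    inner = re.sub(r"\*(.+?)\*", r"<em>\1</em>", m.group(1))
    return "<h1>" + inner + "</h1>"


html = render_heading("# a *b*")
print(html)
```

The point of the sketch is only the ordering: the heading is recognized first and inline markup is then applied to its content, which is why emphasis inside a heading renders.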

I would also be happy if CommonMark gained more prevalence. GitHub Flavored Markdown is already more or less CommonMark compliant, by the way.

IonicaBizau · 2015-10-18T13:51:41Z

I'm going to close this since I just feel I'm not ready to understand this regex black magic yet. 😄 🔮

IonicaBizau mentioned this pull request Apr 15, 2015

Unexpected italic style in github.com diff views #84

Closed

🐛 Fixed the inline styles (bold, italic and combined).

9b29017

IonicaBizau force-pushed the style-fix branch from cfbbe28 to 9b29017 Compare April 15, 2015 19:38

Moved the inline styles after the hr separator

402c9f3

IonicaBizau reviewed Apr 17, 2015

izuzak added the work-in-progress label Jun 10, 2015

IonicaBizau closed this Oct 18, 2015

burodepeper mentioned this pull request Nov 15, 2016

Bold across newline is not recognized as bold burodepeper/language-markdown#143

Closed
