Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add smart punctuation #134

Merged
merged 6 commits into from
Jul 9, 2015
Merged

Add smart punctuation #134

merged 6 commits into from
Jul 9, 2015

Conversation

0b10011
Copy link
Contributor

@0b10011 0b10011 commented Jul 7, 2015

Per commonmark/commonmark.js@7bd852b

This PR implements the smart punctuation work done in commonmark.js and fixes a few bugs with this package.

  • Smart quote, ellipsis, and en/em-dash replacement
  • Punctuation tests added from jgm/commonmark.js
  • Fix a unicode issue with parsing (see Environment.php)
  • Fix bug where only a single Processor could be added (see closeBracketParser.php, EmphasisProcessor.php, and InlineParserEngine.php)

@colinodell
Copy link
Member

Wow, thanks so much for taking this on! On first glance this looks pretty good. Would it be possible to split those fixes into separate commits? It would make it a little easier to visualize and manage those changes.

I'll take a closer look later tonight or tomorrow and get back to you with any other thoughts or comments. Thanks again!

@colinodell colinodell self-assigned this Jul 7, 2015
0b10011 added 2 commits July 7, 2015 18:17
Additional processors aren't able to access delimiters
if they're removed by another processor first
and bugs are introduced if delimiters aren't removed.
@0b10011
Copy link
Contributor Author

0b10011 commented Jul 7, 2015

Would it be possible to split those fixes into separate commits? It would make it a little easier to visualize and manage those changes.

Done. Let me know if you need anything else!

@colinodell colinodell added the enhancement New functionality or behavior label Jul 7, 2015
@cebe
Copy link

cebe commented Jul 7, 2015

nice, so this basially implements #98 :)

@colinodell
Copy link
Member

Oh good call @cebe, I forgot about that one.

I think wrapping this up as an Extension would make it even easier to use. The question becomes whether that should be included in this package or perhaps as a separate (yet still) official package?

@0b10011
Copy link
Contributor Author

0b10011 commented Jul 8, 2015

I've amended the most recent commit to remove an unnecessary line (adding a Quote element which didn't turn out to be necessary in the end and I forgot to clean it up).

I think wrapping this up as an Extension would make it even easier to use. The question becomes whether that should be included in this package or perhaps as a separate (yet still) official package?

I think it would make sense in the main repo, since it's part of jgm/commonmark.js. Were you thinking something along the lines of c2c9ec9? It is a barebones extension that, if used by itself, only parses punctuation and paragraphs. Otherwise, it can be added to another environment to inherit from there (as I updated the tests to do).

Just let me know if you want to keep it or not and I can change/remove it as necessary.

@cebe
Copy link

cebe commented Jul 8, 2015

I have no strong opinion on whether this should be a separate package or not, it is not a heavy one and the JS implementation also ships it in one package so it may just stay in this repo.

Am I right that this is not enabled by default? I am not that deep into the code to find that out.

@colinodell
Copy link
Member

@bfrohs yes that is exactly what I had in mind!

@cebe correct, this would not be enabled by default. And I agree with your point, perhaps it's best to just include this specific feature.

I want to do one final read-through of the code tonight and then I will accept the merge. 😄

On a related side-note - I'm also wondering if perhaps we should look at reorganizing the classes a bit; basically separating the core parser, CommonMark elements/parsers, and this feature into 3 separate sub-namespaces. Right now things are all lumped in together. It's fine for now though - just something to think about in the future, especially if other similar features might be coming.

@0b10011
Copy link
Contributor Author

0b10011 commented Jul 8, 2015

@colinodell Fixed the latest commit per c2c9ec9#commitcomment-12068861

// Process the emphasis characters
$delimiterStack->iterateByCharacters(['“', "’"], $callback, $stackBottom);
// Remove gaps
$inlines->removeGaps();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lines 43 and 44 can (should?) probably be removed. I'm thinking this was probably copypasted from the other processor which basically nulls-out some of the inlines to mark them for eventual removal by removeGaps(). We're not doing this here, so the removeGaps() call shouldn't be necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, copy/paste issue. Fixed.

@colinodell
Copy link
Member

Alright I've taken a close look and am really happy with what you've done :) I've just got two final comments. Let me know your thoughts.

@@ -23,6 +23,7 @@
"require-dev": {
"erusev/parsedown": "~1.0",
"jgm/CommonMark": "0.20",
"jgm/SmartPunct": "0.20",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The smart_punct.txt file is included in jgm/CommonMark.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? The smart_punct.txt file is included in jgm/CommonMark.

Is it? jgm/CommonMark points to http://spec.commonmark.org/0.20/spec.txt in composer.json, which doesn't seem to include any smart punctuation tests. Additionally, there is no smart_punct.txt file in jgm/CommonMark anywhere I can find it.

There is, however a smart_punct.txt file in jgm/CommonMark.js, which is what jgm/SmartPunct points to in this file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry, I didn't notice it.

colinodell added a commit that referenced this pull request Jul 9, 2015
@colinodell colinodell merged commit 4032649 into thephpleague:master Jul 9, 2015
@colinodell
Copy link
Member

Merged. Thanks so much for this contribution!

colinodell added a commit that referenced this pull request Jul 9, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New functionality or behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants