-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modularization of library #746
Comments
This idea is intriguing to me, primarily from the perspective of making it easier for the library to communicate what and how it works from an architectural perspective. I don't know enough about Marked to do this work myself. What would you propose as a full folder structure given what's already here? |
hey @joshbruce I worked on marked a bit in past and even changed some of its functionalities based on my needs back then. One thing I found annoying was keeping track of things in the same file even if I was working on a different component. Here's what I did back then:
I found it easy because having independent components in different files, I could easily change/extend one component without worrying about others. The code became smaller, readable and more manageable too. I feel comfortable with marked and am open to make a PR on this. |
@Nalinc: Sorry it's taken a moment - I totally didn't see (or catch) your comments and questions at the bottom. That architecture sounds about right to me; given what I've seen in the issues and PRs. Let me tag in a few of the other recently active community members to weigh in as well (@UziTech, @Feder1co5oave, @KostyaTretyak). There's not an existing card beyond this one; so, congrats on that. :) Regarding submitting a PR, this is where I would like more feedback from the community. Our focus for the 0.4.0 milestone is to try and get functional issues taken care of. We have this card marked for the 0.5.0 milestone...but, if we can do it without much negative impact to cleaning up the other issues - I think it could actually help make things a bit more obvious on what's causing some of the other issues. I also know @UziTech is trying to make some improvements to all things testing, which will help the issue house cleaning and worthwhile for the 0.4.0 milestone as well. |
Been thinking about this a lot lately - blaming the day job as I'm a lead architect and trying to build up my chops in the software architecture practice. Mainly thinking about the open/closed principle and a way to make marked easy to extend while maintaining speed. Would like to know the purpose of each of these, as I can't really discern them from looking at the code:
Most of the flavors of Markdown I've seen don't really go too far afield from the original Markdown specification - what I believe we call "pedantic". Looking at this as an object-oriented problem to solve, we might end up with something like the following as a class inheritance tree (based on specification):
So, if we make "Pedantic" the default - it can be its own class. Then CommonMark can extend Pedantic - overriding what needs overriding and adding anew what needs adding. The GitHubFlavored can extend CommonMark...and so on. At which point, adding in new flavors should be relatively easy (famous last words). CriticMarkup, for example, would extend Markdown. Of course, this introduces an interesting concept - combining specs in an ad-hoc fashion. So, instead of using inheritance to extend a base class (because Inheritance is the Base Class of Evil) we could have renderers for each of the specs - each with only the parts that are different from the others. So, using GFM would include the stuff from CommonMark and Pedantic - without inheritance - it would be through dependency or some other type of inclusion. But, again, I'm not sure what the purpose of each of the recommended JS files is. What is the difference between a parser and a renderer and a lexer?? Also, what about allowing consumers to skip certain parsing steps (a text editor that says, "Don't allow headers," which would no longer need to go through the header check and parse steps)?? What about changing the order of parsing, does that affect performance or output at all?? |
Daring = {}
Daring.prototype.paragraph = function(markdown) {
// whatever this is against the param :)
// /^([^\n]+(?:\n?(?!hr|heading|lheading| {0,3}>|tag)[^\n]+)+)/
return result;
}
CommonMark = {}
CommonMark.prototype.paragraph = function(markdown) {
return Daring.paragraph(markdown);
}
GFM = {}
GFM.prototype.paragraph = function(markdown) {
return Daring.paragraph(markdown);
} (This could be horrible JS, btw) I should be able to override all the things at that point, yeah? (Hyper-extensibility.) We capture the exact differences between each of the specifications. I mean, unless I'm just a complete idiot, which is possible - this move would resolve most of the "feature request" problems that got us into hot water in the first place. Having said that, I have no idea how slow it would be - but Marked isn't the fastest kid on the block anymore anyway. We could also bail on the the configuration options for choosing which flavor, because you're always doing it directly - then divide the architecture by flavor to minimize compilation and package size - for example, Daring would be the lightest weight solution for client-side...but have the least amount of extended features. GFM would require CommonMark - override and add where need be. Add builds for minified versions of each flavor to the Granted I still don't know what the difference is between the lexer, the parser, and all that jazz. But, I mean, if we're going to be doing breaking changes anyway, might as well reconsider our lives a bit on the way. lol Again, just spitballing from the sidelines based on things I've done in other languages to solve similar problems. Tagging a gentleman by the name of @colinalford - He's been helping me with my JS, is wicked smart, and should probably not add this to the list of things he's already doing. rofl |
We can also do it in such a way that the changes aren't breaking for those who haven't modified or extend it - I think. Leave the same entry API as the entry point for the facade to do everything else. |
The question is not really "To ES6 or not to ES6" because I sure don't want to stick with ES5 forever. It's more of a question of when to ES6. ES6+ has lots of improvements for code readability and modularization. We will want to keep an eye on the benchmarks though. P.S. ES2018 will come with named capture groups which would make the RegExp heavy code much more readable. |
@UziTech: How so? Can you drop a small code sample? Pseudo is good enough as long as it does es6 things. :) |
I think before we start modularizing the code base we need to do some serious work on automating testing and benchmarking in node and a browser environment like chrome headless or phantomjs |
That's fair. Before that (or during it), we should probably keep working on the fixing known issues thing as well - this is just future visioning. What do you propose we need? What stops us from getting any of them? (CI against PRs, for example.) Maybe a ticket on this? I don't think we have one yet. |
I'll flag that one as part of 0.4.0 milestone to make sure we don't go out of order. |
@joshbruce Here is the same code with ES6 classes as your example above: class Daring {
paragraph(markdown) {
// whatever this is against the param :)
// /^([^\n]+(?:\n?(?!hr|heading|lheading| {0,3}>|tag)[^\n]+)+)/
return result;
}
}
class CommonMark extends Daring {}
class GFM extends Daring {} Much less boilerplate code |
As far as automated testing is concerned, it would be nice if @chjj could set up CI for PRs on Travis for mac and linux and AppVeyor for windows. Both are free for open source projects. I could create the CI scripts to automatically run the tests on multiple versions of node and different browser engines including benchmarks. That would help a lot when modularizing the code base to prevent accidentally introducing bugs or performance hits. |
Yep. This is where @colinalford really comes into play for this conversation because he knows more about it than me but the I mean every object in JS is not only a definition (a class) but an instance as well; so, by boilerplate are we talking about the "prototype" piece? And functions are first class citizens, yeah? Inheritance of classes can get tricky from the perspective that I can't just inherit parts and pieces (https://youtu.be/bIhUE5uUFOA). And, if this is what makes it easier to modularize, I'm not sure we need/want ES6 for most of it. This is also what makes the prototype system interesting...I can inherit just the things I want - the banana without the gorilla holding it. Similar to traits in PHP, for example. Agreed on the automated CI piece. Is that it; or, is there more we could/should be doing on that score? (I don't know how we can do the CI thing and actually maintain the repo being under @chjj . I mean it took us a lot of work to get to having me able to publish things, thanks for that. Does GitHub have a "I would like to take over this thing" like NPM does?) |
Right, the class syntax is just hiding all the boilerplate "prototype" pieces. The code is doing the same thing just much easier to read. We can just copy parts if we need to: class CommonMark {
// Daring has many other things but
// we only want the paragraph
paragraph(markdown) {
return Daring.paragraph(markdown);
}
} again same as the ES5 code, just cleaner. ES6 has many benefits over ES5 especially engine optimizations but since we will be transpiling it to ES5 anyways developer sanity will be the only benefit we will get until we can drop older browser/node versions. That being said, keeping up with current ES versions will help us get those optimizations when we can drop the old version since we won't have to do an entire rewrite. All we will have to do is change our transpiler configuration. |
Once the class fields proposal lands in ES it will be even easier: class CommonMark {
// Daring has many other things but
// we only want the paragraph
paragraph = Daring.paragraph;
} |
@UziTech: Fair. And I agree. Now the next question, not a recommendation, just want to cover the base: What about TypeScript? It adds keywords that do not exist in the ES spec as of today, but might in the future; could the same argument be made and does it hold the same value? |
This is a great Crockford video, and he advocates what he calls a class free paradigm starting 42:50: https://youtu.be/DePE0ffiMf4 No need for classes, no need to worry about inheritance. Write a constructor, add the parts you want from other classes, and return the new object. Don't have to wait for the new spec, either. Very cool. I am not the biggest fan of classes in JavaScript since it's a false construct. I write Typescript everyday for an Angular project, though, and I don't think it really matters given you understand the underlying JavaScript. Re: using Typescript, I really like it for the type system. It has been incredibly helpful organizing my code. However, it's kind of a pain to setup and adds a bunch of complexity with the build. I don't think I would advocate adding it for such a small project -- ~1300 LOC seems manageable without a type system. |
@KostyaTretyak has done a pretty good job of rewriting marked using TypeScript in marked-ts I'm not opposed to TypeScript but I feel like that would be an extra barrier to getting contributors. Anyone who would want to submit a pull request would need to learn TypeScript. The class-free method seems clever, which is usually not a good thing in the long run. Almost all code smells were once clever hacks. My vote is for sticking with modern standards. It seems like the easiest way to not fall behind in the future. |
@UziTech I hear your concerns about being clever, but it is generally accepted in OOP that we should prefer composition over inheritance. The class free approach means using only composition while ignoring inheritance. There has been a lot of debate in the community since the introduction of classes about whether we should even use them or not. I'm not sure I would say its a modern standard just because its part of the modern spec. But that's JS in a nutshell. Its not like we ever had a PEP like Python or whatever the PHP style guide is. Probably Crockford's Good Parts is the closest thing in the past, and AirBNB's style guide is pretty popular nowadays, so maybe pick that (they prefer classes to prototypes, btw). IMO as long as everyone understands how Javascript really works and that classes don't actually exist in JS, then it is a nice syntax to organize your code. My fears are based around Java/C# devs working on ES6/Typescript projects for the first time without learning the differences between the languages. Also, here's a really good article from Dan Abramov titled "How to Use Classes and Sleep At Night": https://medium.com/@dan_abramov/how-to-use-classes-and-sleep-at-night-9af8de78ccb4 |
@joshbruce Actually, that's a good recommendation. Use AirBnb's Style Guide for the project. |
@colinalford: See PHP-fig - PSRs 1 and 2 specifically - https://www.php-fig.org :) Agreed on the AirBnB style guide. The USWDS also started using it. It definitely seems to be becoming a more accepted standard. Having said that, as I'm sure you already know about me, ultimately it's up to the people actually maintaining the majority of the codebase. ;) (@Feder1co5oave and @UziTech, AirBnB is definitely worth looking into.) I think that's my only issue using the @UziTech: This is what I'm referring to on that score. https://github.com/8fold/php-html-component/blob/master/tests/ComponentTest.php |
@colinalford: Having said that, we are talking linters: |
@colinalford sticking with modern specs is what I meant by modern standards. Specs aren't always the best way to do things but they are the way that most people know and the starting point of future specs. If we want to keep this project from becoming stale again we will want to keep up with the specs. I'm not married to the javascript class either but I think this: class Markdown {
paragraph() {
// code...
}
} is easier to read and write than: function Markdown(){}
Markdown.prototype.paragraph = function () {
// code...
} If we want something to be a javascript "class" we might as well call it so. The class syntax is not really why I want to use ES6+ that was just an easy example of making the code easier to read. ES6+ also comes with:
Again, since we are compiling to ES5 in the build, the only benefit to any of this is ease of development. Being that this is a community project with lots of people working on it (as opposed to a single person writing most of the code) remember that which ever code style we pick will need to be learned and written by anyone who submits a PR. Sticking with modern specs seems to be the easiest way to be sure that most people are able to easily help with the project. |
@UziTech: Agreed on the easier to write; however, it seems a bit more awkward to cherry pick functionality - but I suppose, technically, they're static methods. Also, agree with @colinalford, that as long as everyone understands what's going on under the hood re classes, it's not that big of a deal. I do appreciate a lot of what ES6 offers. I also appreciate the I might have to call a bit on the "lots of people working on it" piece a bit. Most of the PRs submitted since taking over have been from 2 people. Most of the "contributions" have been in the form of issues - not fixes - by other people...this, in my opinion, is what made Marked go stale after reading the issues. People just asking for more and more stuff and people either letting it through (not acting like the hardcore filter @chjj used to be when he was still participating) and the other contributors and Chris seeming to walk away - it just became too much to sustain - but some folks didn't want to walk away when we put the call out - it's got potential. To be honest, if you (specifically) and @Feder1co5oave hadn't showed up, I was actually going to start systematically shutting it down and probably closing the project altogether - after having been made an owner of the package. The world doesn't need another dead NPM package to choose from based on its popularity by being one of the first. #956 - for future reference while working on other things. |
It will be much easier to cherry pick once private methods are introduced. "lots" may have been an exaggeration but my point was that this is a project where the only way it will last very long is if it is easy to hand off to the next person that wants to maintain it. I don't think there would have been two or three people willing to take over this project from Chris if it was written in Elm or ClojureScript. While I enjoy writing TypeScript, I will never use it for one of my projects unless the project is meant for the TypeScript community (like an angular tool). |
Not entirely sure, but it seems like #1077 and the push for a faster move to the 1.0 is clearing scrutiny. It got me thinking about the transition period between 1.0 (marked with current APIs and whatnot) and 2.0 (all the architecture and deprecation changes ever) and backward compatibility while actually making the changes. We might be able to maintain
Then we could compile minified versions for each flavor to allow users to take only the functionality they need into their projects. We could also modify the way the option settings change things in
Doing the update to the docs has me seeing a lot of setting options I'm curious if we need or want as well. |
@joshbruce @UziTech This ticket along with #1288 and #1347 seems to be requesting the same thing. What do you think of closing the duplicates? I would imagine we still want to focus on spec compliance before tearing things apart to modularize, and probably wait for ESM in Node.js to go stable. Since this ticket appears to be the oldest, should we close the other two and get all this discussion in one place? |
I'm down as long as the overarching thread stays connected. Definitely sticking to spec compliance for now: https://github.com/markedjs/marked/milestones - doesn't really change until 1.x or 2.x. I'm not sure we should close #1288 and #1347 to be honest. I'd almost want to keep the newest open, to be honest. |
I'm fine with closing #1347 |
Closing this ticket as a duplicate of #1288 |
There is a rising demand to move towards modularization of *marked * and separate different components in different files(grammar-rules, lexers, parsers and renderers). This though not have been mentioned explicitly, is evident from issues #743, #717, #704.
A properly modularized code would make maintenance easy and also allow addition of new grammar, implementation of custom parser, renderer and would address separation of concerns properly.
Like may be we can have an
/src
folder to fiddle with the code in development and use/lib
for deliverable file.The text was updated successfully, but these errors were encountered: