Question about interface. #14

garry-iglesias · 2015-12-26T13:12:57Z

Well it's an issue, but that would probably requires some refactor, so it's more a 'question'/'suggestion'.

To my knowledge, the renderer interface can't be correlated with some 'user data' (a void pointer). Moreover the "output buffer" have a 'hardcoded' type.

The questions is, if there's a renderer, why can't it be the "owner" of the output buffer ?

My "ideal" interface would be to pass, a "renderer interface + user data pointer" for the output (and the input it would have been great too).

Because with this interface we must "translate" foreign data containers with the library ones, which generates a lot of overhead in both memory and CPU time. Not mentioning the way library buffers are managed is not really optimal, but that wouldn't be a problem if we could just provide the 'in and out streams' to speak simple.

faelys · 2015-12-26T19:22:12Z

Have you checked the opaque field of struct mkd_renderer?

garry-iglesias · 2015-12-27T10:47:35Z

Well I didn't need it for now so I missed it. Sorry for this point.
And what about the renderer vs output buffer ? I mean do we really need to have the output buffer explicit from the markdown() function, couldn't it be up to the renderer to decide where to output ?
I understand that this would lead to quite some changes in the API, but I'm curious, if there are any reason I might have missed.

faelys · 2015-12-27T19:40:25Z

As you correctly point out, the current code does rely on the output buffer being a transparent type. It does so in two different areas:

some default behaviours copy the input buffer directly into the output, like normal text or escaping when no dedicated callback are given,
the parser takes care of all the recrusvity involved in Markdown syntax.

While the first point could easily be changed by a change of semantics in the API, and providing a compatibility layer to help with migration, the second is rooted much deeper in the library design.

The second point is a consequence of the choice in callback semantics: each callback gathers everything needed to render a given element, with contents provided by the parser. Therefore the parser is in charge of feeding the rendered result of inner elements into the callbacks of outer elements.

Taking it a bit further upstream into the design decisions, the core idea of libsoldout is the separation between the parser and the renderer. So the abstract syntax tree (AST) has to be somehow communicated to the latter by the former.

Designing data structures and an API for an AST is quite difficult, so I worked around it by choosing a callback-based interface, conceptually merging the AST building and a certain way of walking the tree. That way, I avoid the difficulties with storing and manipulating an AST, and should an AST ever be needed, it can still be relatively easily be built behind the the existing fusion.

I could have used an event-based interface for a similar effect but a more complex (and bug-prone) library. It's also simplicity of code that made me choose an online algorithm, even though markdown need of a two-pass parsing suppresses the other advantages of online algorithms.

The tree-walking part in libsoldout is designed around the afore-mentioned idea of each callback being called once for each element and with the contents generated by contained elements. So the deepest elements are rendered first, into a temporary buffer, assuming each callback appends to the existing buffer (but that assumption is never used or enforced). Then the containing element is called with that temporary buffer, and so on.

I can't think of any other way of walking the tree that doesn't put much more complexity into the renderer, and since the point of the project was multiple renderers that can be plugged into the same parser, it made sense to factor as much complexity as possible into the parser.

Conversely, if you need a non-linear or more structured output, you have to work around the parser and at least deal with recursivity yourself.

I tried another way of walking the tree in a project named markup-ada with three callbacks per elements: opening, appending data, and closing, without any control of the parser on the rendered output (and I used generic programming so that the output only has an append method available). That would be better for structured output, but has the cost of putting recursivity in the renderer and having to deal with elements scattered accross three callback functions.

To answer the question, the reason it will probably never be that way in libsoldout is that it would involve a deep change not only in the API, but also in the perimeter of the library and in the implementation. With that much to change, what is there to keep? almost thing, so it's not an evolution but the writing of a brand new and very different project, that probably deserve a brand new name too.

After seeing what markup-ada ended it up being, I don't imagine that much progress happening through so much developper work.

garry-iglesias · 2015-12-28T10:38:00Z

Thank you for the great explanation.

I agree this would require too much changes and maybe a different approach. Anyway it does the job well right now, it's just there's a 'tiny lie' in the doc saying that the API is just one function as it requires some more buffer functions, and sometimes "hit into structures". Not a big deal.

Many thanks for the nice work, and thank you again to take the time to make a deep answer.

Bests.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Question about interface. #14

Question about interface. #14

garry-iglesias commented Dec 26, 2015

faelys commented Dec 26, 2015

garry-iglesias commented Dec 27, 2015

faelys commented Dec 27, 2015

garry-iglesias commented Dec 28, 2015

Question about interface. #14

Question about interface. #14

Comments

garry-iglesias commented Dec 26, 2015

faelys commented Dec 26, 2015

garry-iglesias commented Dec 27, 2015

faelys commented Dec 27, 2015

garry-iglesias commented Dec 28, 2015