Added (optional) ability to parse XML Comments #44

Open
wants to merge 3 commits into base: master

mediaupstream

Hey,

I needed to be able to parse XML comments in my project so I forked your code and added a few lines of code which I think do the trick.

You can turn this on when instantiating the parser like so

var parser = xml2js.Parser({ parseComments: true });

Then, in your JS object, you get the comments under a special "$" key, e.g.:

{
  "example": {
    "$":" This is a sample Comment! "
  }
}

Go ahead and pull the updates or make it better, if you think it's a good idea!

Later,

@Leonidas-from-XIV
Owner

Looks good, I have a number of issues, though:

  1. Not sure how useful it is to parse comments in the first place. I am thinking about changing the parser backend, and parsers like JSDom may not parse comments at all.

  2. I would like to use a character other than $, because I want to swap @ with $ in 0.2, so objects can be traversed without the array-access syntax, thus decreasing the clutter.

@Leonidas-from-XIV
Owner

Hey, you still here? I'd love to hear your opinion on my points.

@dpashkevich

Hi, @Leonidas-from-XIV and @mediaupstream!

I just came across the repo and played with the parser for several minutes, but then found out that it doesn't parse comments. I need this because I'm making a tool that generates code from XML, and I want XML comments to turn into the corresponding comments in the code.

Overall, since in XML comments are legit nodes just like other nodes, I see no reason why you shouldn't parse them.

Leonidas, I suggest that you merge the pull request, replacing $ with something else (~ maybe? or even the now free @-sign?).

@Leonidas-from-XIV
Owner

Hi @dpashkevich, thanks for your input.

I'd be happy to merge this, but I'm not yet sure about the best character to map it to. The problem is that JavaScript does not allow foo.@.bar, and foo['@'].bar is a lot of syntactic clutter.
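
The syntax constraint described above can be checked in plain JavaScript. This is only an illustrative sketch: it does not use xml2js, and the `node` object and the `parses` helper are made-up names for the demonstration.

```javascript
// Hypothetical parsed node; '@' and '$' stand in for whichever key the
// parser would use to hold attributes.
const node = { '@': { id: '1' }, $: { id: '1' } };

// Returns true if `source` is syntactically valid as an expression.
// new Function only compiles the code here; it is never called.
function parses(source) {
  try {
    new Function('node', 'return ' + source + ';');
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

const dotAtIsValid = parses('node.@.id');        // '@' cannot appear in an identifier
const bracketAtIsValid = parses("node['@'].id"); // bracket access accepts any string key
const dotDollarIsValid = parses('node.$.id');    // '$' is a valid identifier

console.log(dotAtIsValid, bracketAtIsValid, dotDollarIsValid); // false true true
console.log(node['@'].id, node.$.id);                           // 1 1
```

Only `$` and `_` give dot access among the ASCII punctuation characters, which is why the choice of key matters for readability.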

Also, this pull request is completely missing unit tests, so to integrate it, I'd need to write them first. If you'd like to contribute them, I'm sure it would go much faster.

@dpashkevich

Wow, great that you responded so soon (I noticed the topic is pretty old)!

Hm, looks like there are no special ASCII characters besides $ and _ that can be easily entered from the keyboard and are valid in JavaScript identifiers.

To be honest, the existing $ and _ look cryptic to me, maybe you'd consider a more unified approach like the following:

  • $cdata property to access character data
  • $attr to access attributes
  • $comment to access comment

And they would still be invalid XML attribute names, so you should be good to go. IMHO that's easier to remember and more extensible since, as mentioned, there's no other JS identifier character besides $ and _ that's not some weird Unicode char.
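
A rough sketch of what the proposed naming could look like in practice. The `$attr`, `$comment` and `$cdata` keys are the names suggested above, not an existing xml2js API, and the `item` object and `isXmlNameStart` helper are invented for illustration. The helper covers only the ASCII subset of the XML name-start rule.

```javascript
// Hypothetical result for <item id="7"><!-- note -->text</item> under the
// proposed naming scheme. This shape is an assumption, not xml2js output.
const item = {
  $attr: { id: '7' },
  $comment: [' note '],
  $cdata: 'text',
};

// All three keys are valid identifiers, so plain dot access works.
const id = item.$attr.id;
const firstComment = item.$comment[0];
const text = item.$cdata;

// ASCII-only approximation of the XML NameStartChar rule: a name may start
// with a letter, '_' or ':'. '$' is not allowed anywhere in an XML name,
// so these keys cannot collide with a real element or attribute name.
const isXmlNameStart = (ch) => /^[A-Za-z_:]$/.test(ch);

console.log(id, firstComment.trim(), text); // 7 note text
console.log(isXmlNameStart('$'));           // false
```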

I've never used CoffeeScript but the tests should be pretty straightforward since the submitted code is just using the existing comment parsing functionality of sax-js.

@danactive

I agree that parsing XML comments would be useful. Currently XML comments are removed, which annoys other developers when I automate XML value changes.

-=Dan=-

@0cv

0cv commented Sep 16, 2015

Yeah, I'd like to have that too. The use case, in case it's not clear: I'm currently working on an app that merges XML files with a more or less known structure. These files may contain comments, and ideally these comments should remain once the merge is done.

@tflanagan
Contributor

@Krisa sounds like you need a diff engine, https://www.npmjs.com/search?q=diff

@0cv

0cv commented Sep 16, 2015

No no, the merge is done automatically because there is always a logical way to build the final file. Additionally, the output must be sorted alphabetically, and a diff engine wouldn't work well here. In the end, it was just one use case for why I would need comments with this plugin; apart from the comments, the plugin does exactly what I expect in a few lines of code. I'm sure there are many other use cases for parsing comments.

Unfortunately I'm not at all proficient with CoffeeScript and cannot provide a pull request for that, but the original pull request has a few flaws. If there is more than one comment in the parent node, only the latest comment is saved in the JSON (the previous ones are overwritten). Additionally, the position of the comment is not kept: it is assigned to the parent node, while it may actually refer to something else. I think that whenever a comment is found, it should be assigned to the following XML tag and placed in an array in case there is more than one comment.
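
The behaviour suggested above can be sketched as follows. This is not the pull request's code and not xml2js: it is a toy regex tokenizer that ignores attributes, text and CDATA, written only to show comments being collected in an array and attached to the element that follows them. The function name `parseWithComments` and the `$comment` / `$trailing` keys are assumptions made for the example.

```javascript
// Toy sketch: attach each comment to the NEXT element, keeping all of them
// in an array so that a later comment never overwrites an earlier one.
function parseWithComments(xml) {
  // Alternatives: comment | closing tag | opening or self-closing tag.
  const token = /<!--([\s\S]*?)-->|<\/([\w:.-]+)\s*>|<([\w:.-]+)[^>]*?(\/?)>/g;
  const root = { name: '#root', children: [] };
  const stack = [root];
  let pending = []; // comments seen since the last tag
  let match;
  while ((match = token.exec(xml)) !== null) {
    const [, comment, closeName, openName, selfClosing] = match;
    if (comment !== undefined) {
      pending.push(comment);
    } else if (closeName !== undefined) {
      // Comments right before a closing tag have no following element,
      // so keep them on the element being closed.
      const closed = stack.pop();
      closed.$trailing.push(...pending);
      pending = [];
    } else {
      const element = { name: openName, $comment: pending, $trailing: [], children: [] };
      pending = [];
      stack[stack.length - 1].children.push(element);
      if (selfClosing !== '/') stack.push(element);
    }
  }
  return root.children[0];
}

const doc = parseWithComments(
  '<config><!-- first --><!-- second --><port/><!-- host note --><host></host></config>'
);
const [port, host] = doc.children;
console.log(port.$comment); // [ ' first ', ' second ' ]
console.log(host.$comment); // [ ' host note ' ]
```

Both comments before `<port/>` survive, and the comment before `<host>` stays with `<host>` rather than moving up to `<config>`.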

@jefffriesen

Has there been any success with some version of preserving comments? They would definitely be helpful to me.

@wparad

wparad commented Jul 26, 2016

+1

@brunano21

+1

@alinex

alinex commented Aug 10, 2016

+1

@Leonidas-from-XIV
Owner

I would like to ask you to add your reactions to the topmost comment in the discussion instead of posting "+1" here.

@brunano21

@Leonidas-from-XIV I thought the upvote was enough to show support for comments. Actually, I don't see your point in saying that comments are not relevant. They are relevant. As a matter of fact, think about the parser-builder chain, where we want to modify the original document (e.g. adding/removing attributes) without wiping out existing (and maybe important) comments.

@Leonidas-from-XIV
Owner

Leonidas-from-XIV commented Aug 10, 2016 via email

@ttraenkler

I feel your need to focus on essentials if development on this is not paid. I wonder if a company using xml2js wants to sponsor this.

Why parsing comments is useful: here is a source URL encoded as a comment in a web page saved to disk by Google Chrome:

<!DOCTYPE html>
<!-- saved from url=(0071)https://www.sitepoint.com/10-ways-minimize-reflows-improve-performance/ -->

Regarding the conflicting suggestion to use $ for comments, maybe one can simply use "!--" instead (the Node.js session below shows that JS accepts this as a key)?

> const object = { "!--": "Hello" }
undefined
> object
{ '!--': 'Hello' }
> object['!--']
'Hello'

@xanZ08

xanZ08 commented May 24, 2017

Waiting for this to be merged. Handling comments in XML is something I have been looking for.

@ziracmo

ziracmo commented Oct 24, 2017

+1

@joskuijpers

$attr, $cdata and $comment sound great to me. $ can then map to $attr for backwards compatibility.

@akrantz

akrantz commented Jan 14, 2019

It would be great to have the ability to parse/write comments. Is this PR going to be completed?

@mastakillahBlitzar

Still no solution to this? How can we contribute? It would be a very nice feature.

@lifeart

lifeart commented Sep 2, 2021

I also think it's a very useful feature, because there are cases where you have to modify manually written XML that has comments on every page. It would be great to be able to preserve comments and newlines too.

@daiyral

daiyral commented Nov 12, 2023

Please add this

@a2937

a2937 commented Sep 19, 2024

Did anyone ever write a test for this?
