Bad SYSTEM for DTD DocType and Entity breaks the XML validation. #647

Closed
Oobiewan opened this issue Feb 2, 2022 · 22 comments

@Oobiewan

Oobiewan commented Feb 2, 2022

When I start VS Code with an XML document open, it is validated as expected and problems are reported where needed. However, if I close a file that has problems and open it again, no validation happens and no problems are shown. Likewise, once I fix the problems in a file and validation reports none, I cannot trigger validation again: content that is invalid according to the DTD, and even content that is not well-formed XML, is not reported as a problem. In short, validation only works when VS Code starts.
I am trying to validate xml files against the DocBook 4.4 DTD.

@angelozerr
Contributor

angelozerr commented Feb 2, 2022

Could you share your XML and DTD, please?

I cannot reproduce it.

angelozerr added the bug and validation labels Feb 3, 2022
@Oobiewan
Author

Oobiewan commented Feb 3, 2022

Thanks for the quick reply. I tested some things and the problem seems to be related to some entity references.
An XML file where the validation fails has the following declarations:

<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.docbook.org/xml/4.4/docbookx.dtd" [
    <!ENTITY % xinclude SYSTEM "http://www.docbook.org/xml/4.4/xinclude.mod">
    %xinclude;
    <!ENTITY % document SYSTEM "document.ent">
    %document;
]>

If I delete the %document; parameter entity reference, the validation works again. The contents of document.ent are:

<!ENTITY % project SYSTEM "project.ent"> %project;
<!-- Adjust the document-specific information! -->
<!ENTITY document-state            "&lt;Draft&gt;" >
<!ENTITY document-revision         "&lt;1.0.3&gt;" >

If I keep the %document; reference in the xml file but remove %project; in document.ent, the validation works again. The contents of project.ent are:

<!ENTITY % company SYSTEM "company.ent"> %company;

<!-- Adjust the project-specific information! -->
<!ENTITY project-name              "best_project">
<!ENTITY project-number            "112344">

Unlike in the previous file, removing the %company; reference does not help; the validation still doesn't work. It seems the validation has problems substituting entities across more than two levels. In Oxygen XML Author (22.1), all of the entity files and their contents are resolved without problems, and I can use and validate entities from all three levels/files that we have in this project.

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

I have also tried referencing all entity files directly from the XML file instead, but the result is the same: as long as I refer only to %document;, validation works if document.ent has no further entity references. Referencing %project; and/or %company; in any combination breaks the validation.
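
For anyone reproducing this variant, the flattened internal subset might look roughly like the sketch below. It assumes the nested %project; and %company; declarations were removed from document.ent and project.ent. The order of the declarations is a guess, since the thread does not show the actual file.

```xml
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.4//EN" "http://www.docbook.org/xml/4.4/docbookx.dtd" [
    <!ENTITY % xinclude SYSTEM "http://www.docbook.org/xml/4.4/xinclude.mod">
    %xinclude;
    <!-- Assumption: each entity file is declared and referenced directly here -->
    <!ENTITY % document SYSTEM "document.ent">
    %document;
    <!ENTITY % project SYSTEM "project.ent">
    %project;
    <!ENTITY % company SYSTEM "company.ent">
    %company;
]>
```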

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

Suddenly, after restarting VS Code (for about the 6th time since I first encountered this error), everything worked perfectly, even though all contents, declarations and references were the same as originally. After another restart of VS Code, everything is back to the previously described problem.

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

Not sure if this is useful, but the referenced xinclude entity is mapped to a local file via a catalog. The local file has the following contents:

<!ELEMENT xi:include (xi:fallback?) >
<!ATTLIST xi:include
xmlns:xi            CDATA       #FIXED       "http://www.w3.org/2001/XInclude"
href                CDATA       #REQUIRED
parse               (xml|text)  "xml"
xpointer            CDATA       #IMPLIED
encoding            CDATA       #IMPLIED
accept              CDATA       #IMPLIED
accept-charset      CDATA       #IMPLIED
accept-language     CDATA       #IMPLIED >
<!ELEMENT xi:fallback ANY >
<!ATTLIST xi:fallback
xmlns:xi            CDATA       #FIXED "http://www.w3.org/2001/XInclude" >
<!ENTITY % local.preface.class    "| xi:include" >
<!ENTITY % local.part.class       "| xi:include" >
<!ENTITY % local.chapter.class    "| xi:include" >
<!ENTITY % local.divcomponent.mix "| xi:include" >
<!ENTITY % local.para.char.mix    "| xi:include" >
<!ENTITY % local.info.class       "| xi:include" >

@angelozerr
Contributor

Many thanks @Oobiewan for your detailed information.

I can reproduce the problem. I need to investigate further to understand it.

@angelozerr
Contributor

@Oobiewan it seems http://www.docbook.org/xml/4.4/xinclude.mod is not available?

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

@angelozerr correct, http://www.docbook.org/xml/4.4/xinclude.mod is not available at this specific address, but it is mapped to a local file by a catalog file listed in the VS Code setting xml.catalogs.
The contents of this catalog file are:

<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <system systemId="http://www.docbook.org/xml/4.4/xinclude.mod"
        uri="C:\5.2.10\custom\dtd\4.4\xinclude.mod"/>
</catalog>

In a comment above I have copied the complete contents of this local .mod file. You can create it for yourself and add its location to the catalog if you want to reproduce this part of the implementation I have. But I think you can also refer to the .mod file directly from the XML, or simply remove the xinclude declaration and reference completely from the doctype definition as there is no xinclude element in my document anyway.
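
As a side note for anyone recreating the catalog: the OASIS XML Catalogs format defines the uri attribute as a URI reference, so the same mapping can also be written with a file: URI, as sketched below. Only the form of the path changes. Whether this makes any difference to the behavior reported in this issue is not established here.

```xml
<?xml version="1.0"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <!-- Same mapping as above; the local path is written as a file: URI
         (an assumption for illustration, not the reporter's actual file) -->
    <system systemId="http://www.docbook.org/xml/4.4/xinclude.mod"
        uri="file:///C:/5.2.10/custom/dtd/4.4/xinclude.mod"/>
</catalog>
```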

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

One more test I just did confirms that it might be something with the multi-level entity files:
If I merge the contents of project.ent and company.ent into document.ent (and get rid of the declarations and references for those two files), the validation works fine. So it looks like the plugin cannot resolve entity references inside the referenced entity files.
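
A sketch of what the merged document.ent could look like, built only from the declarations quoted earlier in this thread. The contents of company.ent were never posted, so that part is left as a placeholder comment.

```xml
<!-- document.ent with project.ent and company.ent merged in;
     no nested parameter entity declarations or references remain -->

<!-- Adjust the document-specific information! -->
<!ENTITY document-state            "&lt;Draft&gt;" >
<!ENTITY document-revision         "&lt;1.0.3&gt;" >

<!-- Formerly in project.ent -->
<!ENTITY project-name              "best_project">
<!ENTITY project-number            "112344">

<!-- Formerly in company.ent: contents not shown in this thread -->
```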

@angelozerr
Contributor

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

Yes, I'm pretty sure that was one of the first things I did.

@Oobiewan
Author

Oobiewan commented Feb 3, 2022

@angelozerr One thing I either hadn't tried before, or had tried only while there were still other problems with the setup, is triggering the re-validation manually with the "Revalidate current XML file" command. Apparently, if I trigger this, the validation works properly even with the original setup of the entities. Strangely, if I remove the %document; entity reference, the document is validated continuously in real time; with the %document; reference in place, I need to trigger it manually. So the validation does work after all, but the automatic validation doesn't. It would be great if I didn't have to trigger it manually each time, but at least it works in general.
Allow me to note that aside from this inconvenience, this plugin looks really good with very nice features, thanks a lot to you and all other contributors. And of course, thanks a lot for your support.

angelozerr added a commit to angelozerr/lemminx that referenced this issue Feb 3, 2022
angelozerr added a commit to angelozerr/lemminx that referenced this issue Feb 3, 2022
@angelozerr
Contributor

I have started to fix some errors.

Is it possible to share an XML file which uses elements from xinclude.ent, please?

@Oobiewan
Author

Oobiewan commented Feb 4, 2022

In this zip I have added a set of files with xincludes.
In their current form, the plugin can validate devd_installing_system.xml when triggered manually, but reports that the chapter element has invalid content. The content is actually valid, though; Oxygen XML Author also has no problem with it.
Strangely, if you move the third <xi:include> element above the <section>, the validation doesn't find any problems anymore.
Just in case, I have added my catalog file and the xinclude.mod to the zip, but their contents are the same as described above.
docbook_entities_xinclude.zip

@Oobiewan
Author

Oobiewan commented Feb 4, 2022

Not sure if that's also a validation problem or simply a missing feature, but it seems like the plugin cannot handle references to content within xi:include entities. E.g., if you try to use an <xref> in devd_installing_system.xml that points to an id that is inside devd_building_artifacts_windows.xml, the validation says that the id is missing.

angelozerr added a commit to angelozerr/lemminx that referenced this issue Feb 6, 2022
angelozerr added a commit to angelozerr/lemminx that referenced this issue Feb 6, 2022
angelozerr added a commit to angelozerr/lemminx that referenced this issue Feb 6, 2022
angelozerr added a commit to angelozerr/lemminx that referenced this issue Feb 7, 2022
@angelozerr
Contributor

@Oobiewan I'm working on this issue and I think I have found a solution. For the moment I have created 2 PRs to improve entity support a little:

@Oobiewan
Author

Oobiewan commented Feb 7, 2022

@angelozerr Sounds fantastic, thank you very much. Looking forward to trying it.

@angelozerr
Contributor

angelozerr commented Feb 8, 2022

@Oobiewan could you install the latest vsix from https://download.jboss.org/jbosstools/vscode-xml/staging/?C=M;O=D and let us know whether it works better now?

You should also have:

Please give us feedback if it works and don't hesitate to create detailed issues (I think entities support is not perfect) to improve vscode-xml.

@angelozerr
Contributor

Not sure if that's also a validation problem or simply a missing feature, but it seems like the plugin cannot handle the references to content within xi:include entities. E.g., if you try to use an <xref> in devd_installing_system.xml that points to an id that is inside devd_building_artifacts_windows.xml, the validation says that the id is missing.

please create an issue for that.

@Oobiewan
Author

Oobiewan commented Feb 9, 2022

Hi @angelozerr, the files are now validated continuously even with the parameter reference in place. The features in the referenced pull requests also seem to work fine. Thank you.
Unfortunately there is still an error with validating the xi:include elements and references that rely on the xi:include functionality. I'll create separate issues for those then. Thanks again for the improvements.

@angelozerr
Contributor

angelozerr commented Feb 9, 2022

Thanks for your feedback. To be honest with you, I don't know XInclude.

So please create a detailed issue that describes the expected behavior.

@angelozerr
Contributor

Fixed with eclipse-lemminx/lemminx#1169

angelozerr added this to the 0.18.4 milestone Feb 9, 2022
angelozerr changed the title from "validation only happens on starting VS code" to "Bad SYSTEM for DTD DocType and Entity breaks the XML validation." Feb 11, 2022