Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Element ref not resolved correctly #72

Closed
SimonCockx opened this issue Oct 28, 2024 · 10 comments
Closed

Element ref not resolved correctly #72

SimonCockx opened this issue Oct 28, 2024 · 10 comments

Comments

@SimonCockx
Copy link
Contributor

SimonCockx commented Oct 28, 2024

Describe the bug
A forward reference to an element results in an unresolved reference for the type of that element.

Consider the following schema:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<xs:complexType name="Document">
	    <xs:sequence>
	        <xs:element ref="myElem"/>
	    </xs:sequence>
	</xs:complexType>
	
	<xs:element name="myElem" type="Foo" />
	
	<xs:complexType name="Foo">
	    <xs:attribute name="bar" type="xs:string" />
	</xs:complexType>
</xs:schema>

This correctly results in a complex type Document with a single element myElem, but the type of the element - Foo - is not resolved.

There are two references which should be resolved here: ref="myElem" and type="Foo".

I did some debugging and I think the reason why this is currently not working is as follows:

  1. Before references are resolved, both are represented as an UnsolvedReference.
  2. While resolving ref="myElem", a copy of the target element is made. See XSDParserCore#replaceUnsolvedReference line 560:
    XsdNamedElements substitutionElement = (XsdNamedElements) concreteElement.getElement().clone(oldElementAttributes, concreteElement.getElement().getParent());
    
    Note that this also copies the unsolved reference type="Foo".
  3. For some reason, this new copy is never resolved.

Expected behavior
I would expect no unresolved references.

Library Version
1.2.15

@SimonCockx
Copy link
Contributor Author

Digging further. This is probably related to working on Windows. I've noticed the unsolvedElements map changes from using slashes to using backslashes during reference resolution.

@SimonCockx
Copy link
Contributor Author

This method is the culprit.

    public void addUnsolvedReference(UnsolvedReference unsolvedReference) {
        XsdSchema schema;

        try {
            schema = XsdAbstractElement.getXsdSchema(unsolvedReference.getElement(), new ArrayList<>());
        } catch (ParentAvailableException e) {
            schema = null;
        }

        String localCurrentFile = currentFile;

        if (schema != null) {
            String schemaFilePath = schema.getFilePath();

            if (!localCurrentFile.equals(schemaFilePath)) {
                localCurrentFile = schemaFilePath;
            }
        }

        List<UnsolvedReference> unsolved = unsolvedElements.computeIfAbsent(localCurrentFile, k -> new ArrayList<>());

        unsolved.add(unsolvedReference);
    }

Since localCurrentFile has slashes, and schemaFilePath has backslashes, this fails.

@SimonCockx
Copy link
Contributor Author

SimonCockx commented Oct 28, 2024

This is not the first time I see issues in this library related to a poor handling of file paths. (slash versus backslash, relative versus absolute, ...) I wonder if it's doable to properly clean that up, for example (but not limited to) by using normalized Java Paths instead. @lcduarte any advice?

@lcduarte
Copy link
Member

Hello Simon,

Thanks for the issue and for using the library. Also thanks for the investigation, while I agree that the file paths are a mess in the library I'm not entirely sure that's the source of your issue.

The issue is that the cloned XsdElement myElem doesn't yet the type resolved, which means that while eventually the instance I used to generate the clone will have the type resolved in the next iteration, the cloned instance never will.

This might be solved by solving the type references first and only then other references.

I've added the following line in XSDParserCore#resolveInnerRefs at 464:

Collections.sort(unsolvedReferenceList, new Comparator<UnsolvedReference>() {
    @Override
    public int compare(UnsolvedReference abc1, UnsolvedReference abc2) {
        return Boolean.compare(abc2.isTypeRef(), abc1.isTypeRef());
    }
});

Here:

imagem

Please check on your side if that fixes your issue.

@SimonCockx
Copy link
Contributor Author

SimonCockx commented Oct 29, 2024

I can do you better: I can verify that normalizing the schemaFilePath in addUnsolvedReference solves my issue. While your solution might also solve it, I feel it looks more like a workaround for the actual underlying issue (i.e., unnormalized paths). Resolving cloned references works perfectly fine, but not if the schema path contains backslashes, since it will wrongly be treated as an unsolved reference of a different file.

My current workaround:

    public void addUnsolvedReference(UnsolvedReference unsolvedReference) {
        XsdSchema schema;

        try {
            schema = XsdAbstractElement.getXsdSchema(unsolvedReference.getElement(), new ArrayList<>());
        } catch (ParentAvailableException e) {
            schema = null;
        }

        String localCurrentFile = currentFile;

        if (schema != null) {
            String schemaFilePath = schema.getFilePath().replace("\\", "/"); // SEE FIX HERE

            if (!localCurrentFile.equals(schemaFilePath)) {
                localCurrentFile = schemaFilePath;
            }
        }

        List<UnsolvedReference> unsolved = unsolvedElements.computeIfAbsent(localCurrentFile, k -> new ArrayList<>());

        unsolved.add(unsolvedReference);
    }

@lcduarte
Copy link
Member

I might be missing something but that replace alone doesn't seem to solve the problem with your sample xsd. I'm on windows if that makes any difference. I've created a file issue_72.xsd with your example and created the following test:

imagem

Test code:

@Test
public void testIssue72() {
    XsdParser xsdParser = new XsdParser( getFilePath("issue_72.xsd"));
    XsdComplexType documentComplexType = xsdParser.getResultXsdSchemas().findFirst().get().getChildrenComplexTypes().filter(e -> e.getName().equals("Document")).collect(Collectors.toList()).stream().findFirst().get();

    XsdElement sequenceElement = documentComplexType.getChildAsSequence().getChildrenElements().collect(Collectors.toList()).stream().findFirst().get();
}

With my solution this doesn't happen. Resolving the cloned references wasn't really right, because if I first resolve references that clone elements that themselves have unsolved references in the end it won't be correctly resolved because I don't solve the unsolvedReferences of the clones.

And like you said, the solution is already pretty hardcoded with backslashes and slashes, I'd rather not have more if I can avoid them.

@SimonCockx
Copy link
Contributor Author

SimonCockx commented Oct 29, 2024

Hmm, perhaps I minimized the example too much.

The actual schema I was working with is this one. Note the ref="animal":

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
	<!-- XML root element -->
	<xs:element name="document" type="Document" />
	<xs:complexType name="Document">
	  <xs:sequence>
	    <xs:element ref="animal"/> <!-- Note that the name of this element is determined by the actual type of animal -->
	  </xs:sequence>
	</xs:complexType>
	
	<!-- Element names: `Cow`s should be serialised as an element named `cow`, `Goat`s should be serialised as an element named `goat`. -->
	<xs:element name="animal" type="Animal" abstract="true" />
	<xs:element name="cow" type="Cow" substitutionGroup="animal" />
	<xs:element name="goat" type="Goat" substitutionGroup="animal" />
	
	<!-- Types: `Cow` and `Goat` extend from `Animal` -->
	<xs:complexType name="Animal">
	  <xs:attribute name="name" type="xs:string" />
	</xs:complexType>
	<xs:complexType name="Cow">
	  <xs:complexContent>
	    <xs:extension base="Animal" />
	  </xs:complexContent>
	</xs:complexType>
	<xs:complexType name="Goat">
	  <xs:complexContent>
	    <xs:extension base="Animal" />
	  </xs:complexContent>
	</xs:complexType>
</xs:schema>

Perhaps there are two issues then. My own issue was resolved with replace("\\", "/").

Resolving the cloned references wasn't really right, because if I first resolve references that clone elements that themselves have unsolved references in the end it won't be correctly resolved because I don't solve the unsolvedReferences of the clones.

This is not true though - I have stepped through the code. Upon cloning an UnsolvedReference, the clone is added to the unsolvedReferences, after which the next iteration in resolveInnerRefs will pick it up. See XsdParserCore#resolveInnerRefs line 468. My issue was caused by not adding the clone to the right file inside unsolvedReferences.

@lcduarte
Copy link
Member

This is not true though - I have stepped through the code. Upon cloning an UnsolvedReference, the clone is added to the unsolvedReferences, after which the next iteration in resolveInnerRefs will pick it up. See XsdParserCore#resolveInnerRefs line 468. My issue was caused by not adding the clone to the right file inside unsolvedReferences.

You are right, but due to the stop condition of having the same number of UnsolvedReferences after iterating the list the resolveInnerRefs will stop resolving, at least with the first example you've provided. Anyway adding the sort will always be an improvement, by ordering I'll avoid generating unnecessary UnsolvedReferences in clones which will make it faster to parse.

I can't really replicate your issue, but then again I'm on Windows. Since your suggested solution doesn't seem to break any of the existing tests I'll add it, together with the improvement I mentioned above. Would that work for you?

@SimonCockx
Copy link
Contributor Author

I'm on Windows too, actually, so I'm surprised it's not possible to replicate.

But yes, I agree it might make things more efficient, and yes, if you can add that replacement in that would be great! Cheers.

@lcduarte
Copy link
Member

Great!

The new version is available, 1.2.16. If you find any other issues let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants