Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attribute with fixed enum value + default/fixed values for elements #69

Merged
merged 5 commits into from
Jul 18, 2024

Conversation

jods4
Copy link
Contributor

@jods4 jods4 commented Jul 5, 2024

Fixes #68
... and a bit more because it turned out as a rabbit hole and I got ambitious.

Background on the bug

There was a DefaultValueType that was identical to ReturnType except enums were typed as string.
The reason is that enums are not returned by XTypedService.ParseValue<E>(attr, type, T defaultValue).
This would be best, and doable in modern .net, but the legacy generation code had to do
(E)Enum.Parse(typeof(E), XTypedService.ParseValue<string>(attr, type, defaultValue))
and that's why defaultValue had to be a string for enums.

That works nicely for default values, but the issue is that the same DefaultValueType was used for fixed values and as the linked issue shows, fixed values must be actual enum type and not strings.

Change 1: Get rid of DefaultValueType, fix fixed enums values

I'm simply using ReturnType instead.
The difference is that enum fixed and default values are now declared with their enum type rather than string.

This change fixes #68, here's one example from AIXM schema:
image

Note that I also reversed the check in setter from value.Equals(fixed) to fixed.Equals(value).
I was afraid that if the fixed value is a reference type, such as string, users could pass a null value and the check would throw a NRE instead of the more specific error.

Change 2: fix default values

Of course change 1 broke default enum values because now an enum is passed to ParseValue<string> as explained in Background.
If fixed it by handling default attribute values differently. Now they're checked in the getter itself with a if (attr == null) return defaultValue;. A small benefit is that no more parsing is done for default enums as they're not stored as strings anymore.

Here's an example with an enum (from AIXM):
image

And another example with a decimal (from AIXM):
image

At this point, overload ParseValue<T>(XAttribute, type, T defaultValue) is not needed anymore.
I left a comment but kept the function so that it doesn't break users who upgrade XObjectsCore without regenerating their code.

Change 3: adding element support

I got ambitious and noted that the code that reads fixed and default attributes in XSD schema was commented for elements.
I added it, but there is one slight difference worth noting.

Fixed values work the same way. The getter is always returning the fixed value; the setter throws if value != fixed and that's it.

The getter for default element values is different though. XSD specs say that default attribute values kick in when attribute is absent, hence my change 2 if (attr == null) return defaultValue. But for elements, XSD specs say that the default kicks in for empty elements. Specifically, missing elements are still considered null/absent. So the codegen differs a bit and I scripted if (elem != null && elem.IsEmpty) return defaultValue.

Example with an enum but other values work the same (from 1707__ISYBAU_XML_Schema):
image

(Yes that condition is formatted horribly by CodeDOM 😫)

@jods4
Copy link
Contributor Author

jods4 commented Jul 5, 2024

@mamift the legacy CodeDOM is terrible, what do you think of using a template, text-based approach instead (like Scriban)?

I'm not gonna do it any time soon, I don't have the bandwidth for that currently, but I would like to have your opinion.
Is it a future option or is there a showstopper?

@mamift
Copy link
Owner

mamift commented Jul 7, 2024

Can you please investigate your code further as there are failing tests for this PR, that do not appear in the current master branch:

https://dev.azure.com/mamift1/LinqToXsdCore/_build/results?buildId=302&view=ms.vss-test-web.build-test-results-tab

@mamift
Copy link
Owner

mamift commented Jul 8, 2024

@jods4 In the future can you also please post code as plain text and NOT as a screenshot.

@jods4
Copy link
Contributor Author

jods4 commented Jul 8, 2024

@mamift thanks for reviewing, I'm gonna take a look at the failing test.

I also just realised that my new support for elements with fixed value is slightly incorrect:
The getter shouldn't always return the fixed value for an element: if the element is optional, null is possible as well.
Likewise the setter for an optional fixed element should accept null as a way to remove the element.
I'll fix this as well.

@jods4
Copy link
Contributor Author

jods4 commented Jul 8, 2024

I fixed several issues with the new default/fixed support for Elements.

Here's a rundown of the issues caught by tests and my last changes:

Test failures

First test failure was caused by complex types in default values of elements...
One example is found in windowsTaskSched.xsd that has elements with fixed empty content (a slightly weird idea):

  <!-- DaysOfWeek -->
<xs:complexType name="daysOfWeekType">
    <xs:all>
        <xs:element name="Monday"       fixed="" minOccurs="0" />
        <xs:element name="Tuesday"      fixed="" minOccurs="0" />
        <xs:element name="Wednesday"    fixed="" minOccurs="0" />
        <xs:element name="Thursday"     fixed="" minOccurs="0" />
        <xs:element name="Friday"       fixed="" minOccurs="0" />
        <xs:element name="Saturday"     fixed="" minOccurs="0" />
        <xs:element name="Sunday"       fixed="" minOccurs="0" />
    </xs:all>
</xs:complexType>

This is a can of worms I don't want to get into -> I have simply opted out when the value is a complex type. Only elements with simple types get default value support (probably the main use case anyway).

The next 3 failures were weird because it was mostly the same files as the first failure, and the file didn't seem to contain typeof(void) anyway... They seem to be gone now except 2 in OfficeOpenXml, I'll check it out tomorrow.

Last failure was funny, it's a limitation of the code that creates initializer for default values that was never noticed before (there are more test cases now that I've added elements to the party).
It detected lists with a simple type_as_string.Contains("List") was can easily lead to false positives! Offender here was AuthorListType in shema Pubmed. To make it slightly more robust I used a regular expression with word boundaries, so that when List is part of another type name it won't match.
Also later I noticed that this method didn't handle nullable type names with a ? at the end properly, which caused new issues so I added this as well.

Feature changes

As I said I realised my handling of fixed values was bad for elements, because they should be handled like defaults: an optional element with a fixed value can still be additionally be interpreted as null when the element is absent.

To support this, the first change is that the return type of an element with default/fixed value remains nullable.

When reading, the codegen still checks whether the element is present before returning its fixed value.
When writing, the codegen of an optional elements accepts both the fixed value and null (which would remove the XElement).

Here are a few code examples after those changes for elements:

An boolean element with fixed value true (I used a locally modified windowsTaskSched.xsd.cs for this as there was no test case readily available):

[DebuggerBrowsable(DebuggerBrowsableState.Never)]
private static System.Boolean? EnabledFixedValue = System.Xml.XmlConvert.ToBoolean("true");
        
/// <summary>
/// <para>
/// Occurrence: optional
/// </para>
/// <para>
/// Regular expression: (AllowStartOnDemand?, RestartOnFailure?, MultipleInstancesPolicy?, DisallowStartIfOnBatteries?, StopIfGoingOnBatteries?, AllowHardTerminate?, StartWhenAvailable?, NetworkProfileName?, RunOnlyIfNetworkAvailable?, WakeToRun?, Enabled?, Hidden?, DeleteExpiredTaskAfter?, IdleSettings?, NetworkSettings?, ExecutionTimeLimit?, Priority?, RunOnlyIfIdle?, UseUnifiedSchedulingEngine?, DisallowStartOnRemoteAppSession?)
/// </para>
/// </summary>
public virtual System.Boolean? Enabled {
    get {
        XElement x = this.GetElement(EnabledXName);
        if ((x == null)) {
            return null;
        }
        return EnabledFixedValue;
    }
    set {
        if ((EnabledFixedValue.Equals(value) 
                    || (value == null))) {
        }
        else {
            throw new Xml.Schema.Linq.LinqToXsdFixedValueException(value, EnabledFixedValue);
        }
        this.SetElement(EnabledXName, value, XmlSchemaType.GetBuiltInSimpleType(XmlTypeCode.Boolean).Datatype);
    }
}

Here's an element with a default value, from the same schema:

[DebuggerBrowsable(DebuggerBrowsableState.Never)]
private static System.TimeSpan? ExecutionTimeLimitDefaultValue = System.Xml.XmlConvert.ToTimeSpan("PT72H");
        
/// <summary>
/// <para>
/// Occurrence: optional
/// </para>
/// <para>
/// Regular expression: (AllowStartOnDemand?, RestartOnFailure?, MultipleInstancesPolicy?, DisallowStartIfOnBatteries?, StopIfGoingOnBatteries?, AllowHardTerminate?, StartWhenAvailable?, NetworkProfileName?, RunOnlyIfNetworkAvailable?, WakeToRun?, Enabled?, Hidden?, DeleteExpiredTaskAfter?, IdleSettings?, NetworkSettings?, ExecutionTimeLimit?, Priority?, RunOnlyIfIdle?, UseUnifiedSchedulingEngine?, DisallowStartOnRemoteAppSession?)
/// </para>
/// </summary>
public virtual System.TimeSpan? ExecutionTimeLimit {
    get {
        XElement x = this.GetElement(ExecutionTimeLimitXName);
        if ((x == null)) {
            return null;
        }
        if (((x != null) 
                    && x.IsEmpty)) {
            return ExecutionTimeLimitDefaultValue;
        }
        return XTypedServices.ParseValue<System.TimeSpan>(x, XmlSchemaType.GetBuiltInSimpleType(XmlTypeCode.Duration).Datatype);
    }
    set {
        this.SetElement(ExecutionTimeLimitXName, value, XmlSchemaType.GetBuiltInSimpleType(XmlTypeCode.Duration).Datatype);
    }
}

Considerations about setting attributes to null

It occured to my that setting a default/fixed attribute to null can make sense as well, because it would remove the XAttribute from document.

The problem is that when using value types this is impossible unless the property type is changed to a Nullable, which is a big source-breaking change. In addition, this is quite an edge case whereas reading the non-nullable default value is the main, common use-case. So I did not do it. Default/Fixed attributes are always non-nullable. The workaround to remove them from document is to remove the XAttribute from Untyped, I guess.

For reference types, this could be done with [AllowNull] on the property and code similar to elements in the setter. I did not do it because I don't think it's worth the effort. It could always be added later.

@jods4
Copy link
Contributor Author

jods4 commented Jul 10, 2024

The two new failures were caused by being more strict on Contains("List")... sometimes it's actually IList.
This string-based type detection is very finicky. I have fine-tuned the Regex a bit.

(FYI any Debug assertion or exception is reported by the "no typeof(void)" test although the cause is quite different from what the test report says.)

I had some temporary failures not related to this PR but to DateOnly / TimeOnly and maybe I'm tired but I don't understand.
There are two build targets: .net standard 2.0 and .net 6.0. The former doesn't support DateOnly/TimeOnly, yet the code compiles without #if? I also notice that when parsing I'm calling [Date|Time]Only.Parse(string, IFormatProvider) and that specific overload was added in .net 7.0. So again: how does it successfully compile targetting .net 6.0?? How do the tests run without failure? (well I had some failures that pointed these issues to me, but I changed nothing and now everything works on my machine...)

@mamift tests are green now :)

@mamift
Copy link
Owner

mamift commented Jul 17, 2024

the legacy CodeDOM is terrible, what do you think of using a template, text-based approach instead (like Scriban)?

I'm not gonna do it any time soon, I don't have the bandwidth for that currently, but I would like to have your opinion. Is it a future option or is there a showstopper?

@jods4 I agree; CodeDOM itself is very limiting because it only models the common type system, which doesn't model everything C# can do. It's funny in hindsight, that MS engineers back in 2007 chose CodeDOM (which has an API that supports multi-language code generation) but then added a bunch of hacks that make it all C# specific.

I'm open to using a string/template based solution like Scriban, but my openness to that is very tentative. I've had lots of trouble with T4 templates before - when they go wrong, they're a huge hassle to debug; I always had a really mixed experience, constantly having to resort to Debugger.Break() or Launch() or just dumping values to Debug.Write. While this CodeDOM implementation is bad, it has one redeeming benefit: the entirety of the code generation process is readily debuggable, every class declaration, method definition, property or statement body that gets generated can be debugged easily using the VS user interface (i.e. setting a breakpoint works; hitting the breakpoint never fails and F10, F11 to step in and out always works as expected).

Does Scriban offer the same experience? Haven't looked into it, but I would base my judgement on how easy it is to edit/debug. Also maybe the tooling for T4 templates has improved a lot in 2022 (I last used them extensively in VS2015) and that too, might be worth looking into.

Finally, I did look into using the Roslyn API and this tool: https://github.com/KirillOsenkov/RoslynQuoter
Basically, you feed this tool some C# code and it will generate the Roslyn code to generate the input code (if that makes sense - if you use the tool it'll probably make sense). Something like this would probably be the closest equivalent to what the debugging experience is now with the current CodeDom implementation. The downside with Roslyn is that because it powers the C# compiler, the API surface is massive, much more than CodeDom. Even using RoslynQuoter to scaffold things, we'd probably still have to write some thin or superficial wrapper around many of the required Roslyn APIs [granted that's the case now with CodeDOM because of all the hacks].

But, one upside to using Roslyn is that it might enable opting into generating newer C# lang features more easily.

@mamift
Copy link
Owner

mamift commented Jul 17, 2024

I had some temporary failures not related to this PR but to DateOnly / TimeOnly and maybe I'm tired but I don't understand.
There are two build targets: .net standard 2.0 and .net 6.0. The former doesn't support DateOnly/TimeOnly, yet the code compiles without #if? I also notice that when parsing I'm calling [Date|Time]Only.Parse(string, IFormatProvider) and that specific overload was added in .net 7.0. So again: how does it successfully compile targetting .net 6.0?? How do the tests run without failure? (well I had some failures that pointed these issues to me, but I changed nothing and now everything works on my machine...)

Were you using Rider by chance? The whole solution itself will not build with Rider due to a bug in Rider not correctly building the right targets. It just trips and falls over when it encounters the the .NET standard 2.0 / .net 6 build targeting. Some projects on their own will build though.

@jods4
Copy link
Contributor Author

jods4 commented Jul 17, 2024

@mamift

I'm open to using a string/template based solution like Scriban, but my openness to that is very tentative. I've had lots of trouble with T4 templates before - when they go wrong, they're a huge hassle to debug;

I think you're spot on with this comment.

I have experience with Scriban and it's a great templating engine with good features, esp. to generate source code (amongst others because of white space control), but debugging is not one of them.
There's no source code for the template itself, so when something breaks inside your template expressions you're in for some trial-and-error.

I had some success by putting all my logic outside of templates, producing some specific "ViewModel" objects for binding, and keeping the templates pretty basic, formatting only. This mostly avoids the need to debug templates but of course there's always an occasional hiccup.

Maybe the ideal would be to find a good templating engine that has Code Generator support, so that the template itself becomes just C# that can be easily debugged, as any other code.

Another idea might be to rely on C# template literals (esp. the new triple quoted ones) and a set of helper functions that integrate with interpolation (e.g. to support lists/iteration, etc.). The main drawback of this is that controlling indentation/whitespace of generated code is tricky. A bold move could be to not care at all, and run the output through a formatter (like dotnet format or csharpier).

The downside with Roslyn is that because it powers the C# compiler, the API surface is massive, much more than CodeDom.

I have the same concern. I'm afraid the code will still be way more complicated than it needs to be. Building a DOM tree is so much more difficult, and unreadable, compared to a templated $"if ({member} is null) return null".

@jods4
Copy link
Contributor Author

jods4 commented Jul 17, 2024

Were you using Rider by chance?

No, just VS and sometimes VS Code.

I don't understand how it works and compiles (referencing APIs that are unavailable in target runtime), but hey if it works, it works 😄

@mamift mamift merged commit e7da3b0 into mamift:master Jul 18, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fixed enum values are generated as string, code does not compile
2 participants