Implement "inline schemas": ability to add type hints into the type providers' source documents #1447
@mlaily you might need to rebase here now that some stuff has merged.
Thanks for the heads up!
…ering types from inline schema definitions Ensure backward compatibility + add doc
This means changing inferPrimitiveType to take an IUnitsOfMeasureProvider parameter and return InferedType instead of Type, so we can set the units on the returned values in a way that works with the rest of the code.
where inline schema types override value-inferred types instead of just adding new types to the list of inferred types. This requires adding metadata on InferedType.Primitive to know whether the type should replace other types when merging them. It also requires adding metadata on InferedType.Heterogeneous to keep track of optionality inside a heterogeneous type (since we want to be able to transform a heterogeneous type back into a primitive type when the primitive type is supposed to override other primitive types). Preserving optionality in heterogeneous types effectively means nulls and inline schemas are given the same level of importance when merging.
Not sure why the assert was there, but looking at the similar ConversionsGenerator gives me confidence it shouldn't be here anymore now that we can have units of measure in json...
The existing ones changed to take the BackwardCompatible inference mode. All Json and Xml cases that had value inference enabled have been copied to 3 new cases: one for each value of InferenceMode. The expected content is always the same: enabling inline schemas when no schema is defined in the source file should not change the output.
…nd xml providers)
Hello, any tips on how I could help this PR move forward?
So sorry for the delay, let me review it now.
Took a little while, but I think this looks good.
As a follow-up PR @mlaily, could you add an example in the docs that explains the feature and how to turn it on?
Awesome, thanks!
Woohoo! thank you!
Hello,
I know I should probably have asked about such a feature beforehand, but I was playing with the project and got a bit carried away... sorry!
Let's at least talk about it now!
Does it seem like a welcome addition?
I can see myself using it with the json and xml providers at least.
What do you think?
High level overview:
When enabled (after setting the new InferenceMode static parameter on JsonProvider, XmlProvider, and HtmlProvider to ValuesAndInlineSchemasHints or ValuesAndInlineSchemasOverrides; the default value of the enum should be backward compatible), it's now possible to define inline schemas: string values that are used as hints or overrides for type inference in the providers.
Using ValuesAndInlineSchemasHints, inline schemas are about equivalent to setting values interpreted as these types, but easier and with more control over the result.
The ValuesAndInlineSchemasOverrides mode is probably the most intuitive to use:
Using an inline schema once for a value repeated in a sample list or collection is enough to override value-based inference and indicate to the provider we know better.
Example use case:
I have a sample with Code properties whose values look something like "000", "123", "4E5" or "ABC". This list would be inferred as a heterogeneous list of floats or strings, but I know using these as floats makes no sense, so I only want the string version.
Before inline schemas, I would have to either disable type inference (undesirable because I want to infer other values), or replace all the Code properties in my xml/json sample with something that is unambiguously not a float. This is annoying, especially in large samples.
Note it's also possible to provide a unit of measure! :)
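To make the difference between the hints and overrides modes concrete, here is a toy model of the Code use case above. It is only an illustrative sketch in Python, not the providers' actual F# inference code: the function names, the mode strings, and the "anything float() accepts is a float" rule are all assumptions made for the example.

```python
import re

# Hypothetical, simplified pattern: typeof<name> or typeof{name}, no units.
INLINE_SCHEMA = re.compile(r"^typeof[<{](\w+)[>}]$")


def infer_value(value):
    """Toy value-based inference: anything float() accepts is a 'float'."""
    try:
        float(value)
        return "float"
    except ValueError:
        return "string"


def infer_list(values, mode):
    """Return the set of types inferred for a list of sample values.

    mode is one of 'values_only', 'hints', 'overrides' (names invented
    for this sketch, loosely mirroring the inference modes in the PR).
    """
    inferred, declared = set(), set()
    for v in values:
        m = INLINE_SCHEMA.match(v)
        if m and mode != "values_only":
            declared.add(m.group(1))      # an inline schema, not a value
        else:
            inferred.add(infer_value(v))  # a plain sample value
    if mode == "overrides" and declared:
        return declared                   # the schema wins over the values
    return inferred | declared            # hints only add to the list


codes = ["000", "123", "4E5", "ABC"]
print(infer_list(codes, "values_only"))                     # float and string
print(infer_list(codes + ["typeof<string>"], "overrides"))  # string only
```

In this model, a single typeof<string> entry in the sample is enough to collapse the list to string in overrides mode, while in hints mode it behaves like one more string-typed value.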
Example:
(Old screenshot. The enum names have changed)
I added typeof{int{metre}} and typeof{datetimeoffset} in the sample list, so the type provider generates the necessary members to access the value with these types.
Syntax
typeof<typeName>
typeof{typeName}
typeof<typeName<measureName>>
typeof{typeName{measureName}}
List of type names allowed: int, int64, bool, float, decimal, date, datetimeoffset, timespan, guid and string.
The optional variants (int option, int64 option, bool option, float option, decimal option, date option, datetimeoffset option, timespan option, guid option and string option) are not allowed.
This is the same list as the one already supported by the CsvProvider in its schemas, minus the nullable and optional types.
(I started by allowing optional type definitions, but it led to weird bugs with weird generated types when the parent is also optional, so I decided to only allow "normal" types, and continue relying only on structural inference to generate optionals.)
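The syntax and type list above can be sketched as a small recognizer. This is a Python illustration only, not the parser used by the providers: parse_inline_schema and ALLOWED_TYPES are names invented here, matching is assumed to be case-sensitive, brackets are assumed to need matching pairs, and the measure name is returned without being checked against any list of known units.

```python
import re

# Type names listed in the PR description (optional variants excluded).
ALLOWED_TYPES = {
    "int", "int64", "bool", "float", "decimal",
    "date", "datetimeoffset", "timespan", "guid", "string",
}

# Either typeof<type> / typeof<type<measure>>
# or     typeof{type} / typeof{type{measure}}.
_PATTERN = re.compile(
    r"^typeof(?:"
    r"<(?P<t1>\w+)(?:<(?P<u1>[^<>{}]+)>)?>"
    r"|"
    r"\{(?P<t2>\w+)(?:\{(?P<u2>[^<>{}]+)\})?\}"
    r")$"
)


def parse_inline_schema(text):
    """Return (type_name, measure_or_None), or None if text is not
    a recognized inline schema."""
    m = _PATTERN.match(text)
    if m is None:
        return None
    type_name = m.group("t1") or m.group("t2")
    measure = m.group("u1") or m.group("u2")
    if type_name not in ALLOWED_TYPES:
        return None
    return type_name, measure


print(parse_inline_schema("typeof{int{metre}}"))      # ('int', 'metre')
print(parse_inline_schema("typeof<datetimeoffset>"))  # ('datetimeoffset', None)
print(parse_inline_schema("typeof<int option>"))      # None
```

Anything that does not match is treated as an ordinary sample value, which is what keeps samples without inline schemas unaffected.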
Units of measure allowed: the ones already supported by the default IUnitsOfMeasureProvider in the project. I believe these are the standard SI units.
Either <> or {} can be used, so it's easy to define an inline schema in the JsonProvider, the XmlProvider, and the HtmlProvider.
Remarks
I don't know if this is the best approach, but I did my best to stay backward compatible.
I created an enum for the public api (because we need to be able to create constant values for the compiler to accept them on a provider, and because there is a special default value for backward compat). This public enum is then mapped to an internal DU used to represent valid cases in an unambiguous way (enums are just vaguely typed ints, whereas DUs enforce the list of possible values).
There are currently 4 distinct modes of inference:
After experimenting a bit, I feel like these are all the choices I would want. What do you think?
Is there any way to mark a provided static parameter obsolete? It does not seem to be implemented (I tried to implement it, but the attribute didn't seem to be picked up, by VS Code at least).
I realized after seeing the broken docs that I may have changed public apis I should have been more careful not to change, but I don't know what is considered as the public api that shouldn't change too much, and what is fair game. Please advise.
✨ I made an effort to have clean commits with minimum overlap. It should hopefully be easy to review them independently ✨
This PR should fix #1323 and provide a better workaround for #1418 and #1221