SDMX / VTL - Type / Value domain #402

Open
NicoLaval opened this issue Mar 21, 2024 · 3 comments

@NicoLaval
Collaborator

Hi all,

I need your advice regarding the SDMX --> VTL mapping.

Considering the below SDMX DSD fragment (included in a Component):

<str:LocalRepresentation minOccurs="1" maxOccurs="1">
    <str:Enumeration>urn:sdmx:org.sdmx.infomodel.codelist.Codelist=FR1:CL_TEST(1.0)</str:Enumeration>
</str:LocalRepresentation>

The CL_TEST code list is composed of strings (but this string type is not defined anywhere).
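
For reference, the codelist reference carried by the Enumeration element can be split into its parts mechanically. The following is only a minimal Python sketch, assuming the URN always has the `agency:id(version)` shape shown in the fragment above; `parse_sdmx_urn` is a hypothetical helper, not a function from any SDMX library.

```python
import re

# Pattern for URNs shaped like the one in the fragment above. This is an
# assumption: other SDMX URN forms (e.g. with an item suffix) are not handled.
_URN_RE = re.compile(
    r"^urn:sdmx:org\.sdmx\.infomodel\.(?P<package>\w+)\.(?P<artefact>\w+)="
    r"(?P<agency>[^:]+):(?P<id>[^()]+)\((?P<version>[^)]+)\)$"
)


def parse_sdmx_urn(urn: str) -> dict:
    """Split an SDMX URN into package, artefact class, agency, id and version."""
    match = _URN_RE.match(urn.strip())
    if match is None:
        raise ValueError(f"Unrecognised SDMX URN: {urn!r}")
    return match.groupdict()


ref = parse_sdmx_urn(
    "urn:sdmx:org.sdmx.infomodel.codelist.Codelist=FR1:CL_TEST(1.0)"
)
print(ref["agency"], ref["id"], ref["version"])
```

The `id` part (CL_TEST here) is the name that would be a candidate for the VTL value domain name.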

My questions are:

  • what type should I give to my VTL component?
  • can you confirm that the VTL value domain is CL_TEST?
  • if so, how can I use it in the VTL world? In what type of operation is this useful other than validation? Do I have to check that the data of the associated dataset only take values in this range?

Thanks in advance

@vpinna80
Collaborator

Hi Nicolas,
Unfortunately, the answer to your questions is "it depends"...
You can find all the rules for mapping SDMX artifacts to VTL in the SDMX Specifications section 6, chapter 12.

@antonio-olleros

antonio-olleros commented Apr 9, 2024

Hi Nico,

Very interesting (and hot) question... In my view (but it is just a view...) all enumerated types should be strings, even if the codes can be represented as integers. The reason is that you will never operate on those codes, and in the VTL world types matter because of the operations you want to perform with them.

Regarding the value domain, well, it certainly depends... and mainly on the practice of the SDMX modeller (which I often find not to be the best...). It is quite likely that CL_TEST contains a lot of values that are not used in the dataset, and modellers may have chosen to use region constraints to limit those values. If so, I think those constraints would provide the relevant value domain.
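
To make the idea of a constraint narrowing the codelist concrete, here is a small Python sketch. It assumes the constraint can be reduced to a plain set of allowed codes; the codes `A` to `D` and the helper `effective_value_domain` are made up for illustration and do not come from the DSD in this issue.

```python
def effective_value_domain(codelist_codes, constrained_codes=None):
    """Return the codes that may actually appear in the dataset.

    With no constraint the whole codelist applies; otherwise only the codes
    that are both in the codelist and allowed by the constraint are kept.
    """
    codes = set(codelist_codes)
    if constrained_codes is None:
        return codes
    return codes & set(constrained_codes)


# Hypothetical content of CL_TEST and of a constraint attached to the dataflow.
cl_test = {"A", "B", "C", "D"}
constraint = {"A", "C"}

domain = effective_value_domain(cl_test, constraint)
print(sorted(domain))
```

Under this reading, the value domain seen from VTL would be the narrowed set rather than the full codelist.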

And for your third point, if I understand your question correctly, I agree with you, and that was my main point in the meeting in Paris (and in the Word document we shared after that regarding the data model). We should define why we have value domains and other objects in VTL. Compatibility with GSIM is not a good reason, for me, because dropping things does not make VTL incompatible: it is already a subset of GSIM, so we can drop artifacts while keeping compatibility. In my view:

  • Enumerated value domains are necessary if we want to use the Pivot operator; without value domains it is not usable.
  • I cannot find any other use case for value domains in the definition of a dataset. For me, validation is not a use case. For that we have SDMX engines, which should do that automatically. And, if you want to validate within VTL, you should explicitly use VTL, for instance writing: val1 := check(DS1#Comp1 in value_domain1 errorcode "1001"). That's the correct way to validate with VTL.
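
For readers less familiar with VTL, the intent of the check(...) expression above can be mimicked in a few lines of Python. This is only an analogy under stated assumptions: `check_in_domain`, the sample values and the output keys are hypothetical, and this is not how a VTL engine implements check.

```python
def check_in_domain(values, value_domain, errorcode):
    """Report, for each value, whether it belongs to the value domain.

    Values that pass carry no error code; values that fail carry `errorcode`.
    """
    results = []
    for value in values:
        passed = value in value_domain
        results.append(
            {
                "value": value,
                "passed": passed,
                "errorcode": None if passed else errorcode,
            }
        )
    return results


# Hypothetical component values and value domain.
comp1 = ["A", "C", "X"]
value_domain = {"A", "B", "C"}

val1 = check_in_domain(comp1, value_domain, "1001")
failures = [row for row in val1 if not row["passed"]]
print(failures)
```

The point is that membership in the value domain is tested explicitly by the rule, rather than being implied by the dataset definition.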

@vpinna80
Collaborator

vpinna80 commented Apr 9, 2024

Actually, there are quite a lot of examples of integer enumerated sets, most of them coming from the statistical domain, where you have surveys with numbered responses plus special values (e.g. -99: invalid, -101: no response, and so on).
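
As an illustration of such a set, here is a Python sketch of an integer enumeration mixing numbered responses with special values. The response labels and the raw data are invented; only the special values -99 and -101 are taken from the comment above.

```python
from enum import IntEnum


class Response(IntEnum):
    """Hypothetical survey answers encoded as integers."""

    INVALID = -99
    NO_RESPONSE = -101
    DISAGREE = 1
    NEUTRAL = 2
    AGREE = 3


SPECIAL_VALUES = {Response.INVALID, Response.NO_RESPONSE}

# Raw integers as they might arrive in a dataset (made-up data).
raw = [1, 3, -99, 2, -101, 3]

# Response(v) raises ValueError for any integer outside the enumeration.
answers = [Response(v) for v in raw]
substantive = [a for a in answers if a not in SPECIAL_VALUES]
print([a.name for a in substantive])
```

Here the codes are integers, but they are still drawn from a closed, enumerated set.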

Creating and manipulating hierarchies is also a use case for codes. For example, you can have a system that automatically imports hierarchy rulesets from the DSD (I used this while working with Pacific Community Hub).
