Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sidecar validation #154

Closed
happy5214 opened this issue Jan 22, 2021 · 4 comments
Closed

Sidecar validation #154

happy5214 opened this issue Jan 22, 2021 · 4 comments

Comments

@happy5214
Copy link
Member

happy5214 commented Jan 22, 2021

Over the past couple of weeks, @VisLab and I have discussed the procedure for validating sidecars as part of BIDS datasets. Here is the plan we came up with, slightly modified based on some thoughts I've had since then:

  • Parse sidecars.
  • Run syntax-only validation on all event sidecars, exiting on failure.
  • Per TSV:
    • Find sidecars for that TSV, based on filename.
    • Parse TSV.
    • Run syntax-only validation on HED TSV column, skipping to next TSV on failure.
    • Parse and validate definitions.
    • Run semantic validation on sidecars and HED column, skipping to next TSV on failure of this or definition validation.
    • Substitute sidecar columns.
    • Run dataset validation on TSV.
@VisLab
Copy link
Member

VisLab commented Jan 22, 2021

In bullet item 2,

I think that there has to be more than syntax-only validation on all side cars. I think it should actually check that tags that are not library tags are in the base schema. This will cover 99.99% of the errors.

I am wondering whether we shouldn't try to come up with a different way of declaring library schema so that we don't have to have them embedded in the files. I don't have any ideas on this, but it would make things a lot easier..... thoughts?

@happy5214
Copy link
Member Author

Since it appears we're going to be passing in the library schemas with the base schema version and not as definition tags (per hed-standard/hed-specification#156), the separate passes are no longer necessary. Therefore, the new proposal is this:

  • Parse sidecars.
  • Run string-level validation on all event sidecars, exiting on any errors (not warnings).
  • Per TSV:
    • Find sidecars for that TSV, based on filename.
    • Parse TSV.
    • Run string-level validation on HED TSV column, skipping to next TSV on failure.
    • Substitute sidecar columns.
    • Run event- and dataset-level validation (including definitions) on TSV.

@VisLab
Copy link
Member

VisLab commented Feb 2, 2021

A few thoughts/questions:

For sidecars: Does string-level validation includes handling # placeholders? This problem is complicated by placeholders in definitions as explained below.

Definitions: How do definitions get incorporated into this? Would the parser recognize Def/XXX and Def-expand/YYY as defined names. Would the parser return a dictionary of definitions as well as a list of Def/XXX and Def-expand/YYY's that appear.

In order to conveniently use definitions with placeholders, the specification is going to allow the following:

A definition tag-group can contain at most one # placeholder. If it has a #, then when it is used, a value must be supplied:

Example: Def/XXX/1 Hz

My original thought was to postpone supporting this until later, but when we have worked out practical tagging examples, this appears to be very convenient and useful. Since implementing it may change the way some parsing is done, we should probably think about it now.

@VisLab
Copy link
Member

VisLab commented Oct 29, 2021

@IanCa @happy5214

Can you both confirm that we have finalized our strategy (both python and javascript) and that this issue can be closed?

@VisLab VisLab closed this as completed Nov 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants