Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In text tables, validate attempts to match pattern associated with data_type before checking Special_Constants #837

Closed
cgobat opened this issue Feb 23, 2024 · 11 comments · Fixed by #893
Assignees
Labels

Comments

@cgobat
Copy link
Contributor

cgobat commented Feb 23, 2024

Checked for duplicates

Yes - I've already checked

May be part of #832 and/or related to #831, but I figured I'd open this for another example case if nothing else.

🐛 Describe the bug

When validating an ASCII table of times that uses "99:99:99" as a Special_Constants/not_applicable_constant value, validate fails due to the presence of the 99:99:99 value(s), with the message

Value does not match its data type 'ASCII_Time': Could not parse 99:99:99 using these patterns 'hh:mm:ss[.ffffff][Z]'

🕵️ Expected behavior

I expected that instances of Special_Constants would get special treatment, and would be allowed as exceptions to data_type format rules.

📜 To Reproduce

  1. Label an ASCII table that includes value(s) not normally allowed by the pattern associated with the applicable <data_type>
  2. Include a definition of these value(s) as <Special_Constants> in the label
  3. Run validate. Note resulting failure reason.

🖥 Environment Info

  • Version of this software: v3.3.3 & v.3.5.0-SNAPSHOT
  • Operating System: Ubuntu 20.04
  • Java: OpenJDK 17.0.9

🩺 Test Data / Additional context

Sample data:

01:20:33
02:04:51
07:36:22
99:99:99
13:18:48
23:49:04

Relevant label section:

<Table_Delimited>
    <name>local_time</name>
    <offset unit="byte">0</offset>
    <parsing_standard_id>PDS DSV 1</parsing_standard_id>
    <records>6</records>
    <record_delimiter>Line-Feed</record_delimiter>
    <field_delimiter>Comma</field_delimiter>
    <Record_Delimited>
        <fields>1</fields>
        <groups>0</groups>
        <Field_Delimited>
            <name>time</name>
            <data_type>ASCII_Time</data_type>
            <maximum_field_length unit="byte">8</maximum_field_length>
            <description>Local (solar) time at boresight/target intercept</description>
            <Special_Constants>
                <not_applicable_constant>99:99:99</not_applicable_constant>
            </Special_Constants>
        </Field_Delimited>
    </Record_Delimited>
</Table_Delimited>

The sample data file, a functionally complete label for it, and a failing validate report can be found in this attached gzipped tar file.

🦄 Related requirements

No response

⚙️ Engineering Details

Hopefully fairly simple fix to change ordering of checks here.

@cgobat cgobat added bug Something isn't working needs:triage labels Feb 23, 2024
@cgobat cgobat changed the title In text tables, validate attempts to match pattern associated with data_type before checking Special_Constants In text tables, validate attempts to match pattern associated with data_type before checking Special_Constants Feb 23, 2024
@jordanpadams
Copy link
Member

@cgobat we will take a look at this when we work on #832. Hopefully in the next couple months.

@al-niessner
Copy link
Contributor

@cgobat @jordanpadams `

This is a chicken and egg problem. The value in the table is being converted to compare to the special constants. In most cases that we have, floats convert to out of range floats or ints or longs or doubles. In this case, the conversion to a time is trouble before it can be compared to the special constant.

Error is here:

addTableProblem(ExceptionType.ERROR, ProblemType.FIELD_VALUE_DATA_TYPE_MISMATCH,

and is duplicated here:

addTableProblem(ExceptionType.ERROR, ProblemType.FIELD_VALUE_DATA_TYPE_MISMATCH,

but check for special constants is done further down and only for min/max. The best suggestion is during the handling of the exception check to see if the string content matches any of the special characters and if it does, do not generate the error. I am sure this will lead to some cases where the match should generate an error but others not. If the proposed solution sounds reasonable I will make the fix.

@cgobat
Copy link
Contributor Author

cgobat commented May 1, 2024

Yeah, it makes sense to me to do special constant checking in text fields as a string comparison operation (rather than value-based after conversion/interpretation).

@al-niessner
Copy link
Contributor

@cgobat @jordanpadams

I made the changes but still not to sure we want to do this. The model says that special constants should be in the format of the data. If it cannot be converted, then it is not in the format. Not in the format, then breaks the model. If you agree @jordanpadams then delete the PR and branch. Otherwise it is there and done.

@jordanpadams
Copy link
Member

@al-niessner that is a good question. I will bring this to some other folks for thought.

@rchenatjpl
Copy link
Contributor

@jordanpadams @al-niessner @cgobat Not sure what to do about this. If validate matches the special constants by string, the value of the special constant can be anything with any type. In the attached example, the .xml has

          <data_type>ASCII_Time</data_type>
          <Special_Constants>
            <unknown_constant>XOXOXOXO</unknown_constant>
            <not_applicable_constant>99:99:99</not_applicable_constant>
          </Special_Constants>

and validate passes the .csv, which has

01:20:33,22  
02:04:51,-999
XOXOXOXO,22  
99:99:99,22  
13:18:48,22  
23:49:04,XXX 

Do we accept that? The PDS3 standard wanted strings "N/A", "UNK", and something else in numerical ASCII fields. I think that hurt processing software (besides validate), maybe because such software couldn't directly read data into numerical variables. But the only way I see for marking a fake time value is to exceed the type limitations of ASCII_Time.

@rchenatjpl
Copy link
Contributor

val837.zip

@jordanpadams
Copy link
Member

@rchenatjpl good point. We probably need to take this to the DDWG for clarification.

@matthewtiscareno
Copy link

matthewtiscareno commented Aug 30, 2024

It seems to me that the value being rejected (specifically 99:99:99) has the correct syntax (hh:mm:ss), but that they are being rejected as not being valid ASCII_Time values because of their numerical value (99>24)? This would be different from how XOXOXO would be rejected.

@jordanpadams
Copy link
Member

@matthewtiscareno actually, what I believe @rchenatjpl is saying is that both of those constants (XOXOXOXO and 99:99:99) are now allowable constants because it is not required that these values meet the data type of the parent object (ASCII_Time).

@rchenatjpl not sure what the best answer is here. currently these Special_Constants values are not required to be that type, only valid_maximum, valid_minimum, *_saturation constants require that. we could throw a warning?

@matthewtiscareno
Copy link

Wait, what? I thought the whole point of special constants was that they have the same data type as the rest of the column but encode a specific meaning. Is this what is usually done but not actually required by Validate? Is that intentional? What do the standards say?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment