Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update cdflib version #606

Closed
subagonsouth opened this issue Jun 10, 2024 · 9 comments · Fixed by #991
Closed

Update cdflib version #606

subagonsouth opened this issue Jun 10, 2024 · 9 comments · Fixed by #991
Assignees
Labels
CDF Related to CDF files
Milestone

Comments

@subagonsouth
Copy link
Contributor

subagonsouth commented Jun 10, 2024

Algorithm Description:

Update to the latest version of cdflib. The major relevant changes to be aware of are:

Other Notes:

- The code that sets the to_datetime keyword argument in imap_processing.cdf.utils.load_cdf should be removed.
to_datetime was already removed as part of another PR.

Related Issues/PRs:

#595 - this was the ticket that added defaulting to to_datetime = True

@subagonsouth subagonsouth added the CDF Related to CDF files label Jun 10, 2024
@subagonsouth subagonsouth self-assigned this Aug 26, 2024
@subagonsouth subagonsouth moved this to In Progress in IMAP Aug 26, 2024
@subagonsouth subagonsouth moved this from In Progress to Todo in IMAP Aug 27, 2024
@subagonsouth
Copy link
Contributor Author

I started on this but didn't make appreciable progress. The one thing that I did identify is that there is a bug in cdflib in that it tries to put arrays with dtype=uint64 into CDF_UINT8 which does not exist. All variables that are specified as uint64 should be changed to int64.

@greglucas
Copy link
Collaborator

Upgrading to v1.3.1 produces 5 failures within the test suite. All of which, I'm not able to address, so I'm going to punt to others on this as well 😂

FAILED imap_processing/tests/hi/test_l1a.py::test_sci_de_decom - KeyError: 'CDF_UINT8'
FAILED imap_processing/tests/hi/test_l1a.py::test_app_hist_decom - cdflib.xarray.xarray_to_cdf.ISTPError: ISTP Compliance Warning: LABL_PTR_1 attribute is required for variable...
FAILED imap_processing/tests/mag/test_mag_decom.py::test_mag_raw_cdf_generation - cdflib.xarray.xarray_to_cdf.ISTPError: ISTP Compliance Warning: LABL_PTR_1 attribute is required for variable...
FAILED imap_processing/tests/mag/test_mag_l1b.py::test_cdf_output - cdflib.xarray.xarray_to_cdf.ISTPError: Variable epoch was determined to be an ISTP 'Epoch' variable, but it i...
FAILED imap_processing/tests/swe/test_swe_l1a.py::test_cdf_creation - Exception: Cannot include both LABLAXIS and LABL_PTR_2 in the attributes to variable science_data.

@bryan-harter : There is an issue with CDF_UINT8, if we use a np.uint64 as a numpy array variable. The CDF format doesn't have a UINT8. https://cdaweb.gsfc.nasa.gov/pub/software/cdf/doc/cdf34/cdf340ifd.pdf
image
@subagonsouth are you willing to use a different data type (either uint32 or int64)? Or do we need to make two variables to store this field on Hi?

Most of the other failures are related to ISTP attributes, specifically this one:

Exception: Cannot include both LABLAXIS and LABL_PTR_2 in the attributes to variable science_data.

However, even when I remove all LABLAXIS values from the yaml definitions, the error still occurs. Somewhere along the line LABLAXIS is getting injected into the CDF attributes. @maxinelasp I think this might have to do with the AttributeManager, but I am not positive. There are still a bunch of warnings for the attribute manager schema that I think were going to be solved with SAMMI? So do we need to wait for that and what do we estimate the timescale to be?

@tech3371
Copy link
Contributor

I can comment on what I found on LABLAXIS error. We tried to remove LABLAXIS for data with 2 or more dimensions since it will need to use LABL_PTR_i. But even after removing LABLAXIS from yaml file, it was being added back again because it's defined as required in this line. I believe we need to change logic in this and that line.

This blocks show example of what CDF attrs definition looks like for data with 2 or more dimensions.

science_data:
  CATDESC: Decompressed Counts
  DEPEND_0: epoch
  DEPEND_1: spin_angle
  DEPEND_2: polar_angle
  LABL_PTR_1: spin_angle_label
  LABL_PTR_2: polar_angle_label
  DISPLAY_TYPE: spectrogram
  FIELDNAM: Decompressed Counts
  FORMAT: I5
  UNITS: counts
  VALIDMAX: 66539
  VALIDMIN: 0
  VAR_TYPE: data
  FILLVAL: -9223372036854775808
  VAR_TYPE: data
  SCALETYP: linear

@tech3371
Copy link
Contributor

I believe we get those errors because updated cdflib already follows this scheme from SPDF website

LABLAXIS --- required if not using LABL_PTR_1
should be a short character string (approximately 10 characters, but preferably 6 characters - more only if absolutely required for clarity) which can be used to label a y-axis for a plot or to provide a heading for a data listing.
LABL_PTR_1, LABL_PTR_2, etc. --- required if not using LABLAXIS
is used to label a dimensional variable when one value of LABLAXIS is not sufficient to describe the variable or to label all the axes. LABL_PTR_i is used instead of LABLAXIS, where i can take on any value from 1 to n where n is the total number of dimensions of the original variable. The value of LABL_PTR_1 is a variable which will contain the short character strings which describe the first dimension of the original variable. The actual labels should be short as described above for LABLAXIS. The value of the attribute must be a variable in the same CDF data set. See example.

@maxinelasp
Copy link
Contributor

@tech3371 I believe if you update the schema here to set LABLAXIS required=false it will fix the error. This is also something that is reworked in SAMMI - I'm not sure if the same issue exists there.

@maxinelasp
Copy link
Contributor

swxsoc/sammi#7

@maxinelasp
Copy link
Contributor

Sorry @greglucas , I missed your question to me at the end! This is eventually going to be fixed in SAMMI, probably on the scale of weeks. But, like I mentioned in my prior comment, it's an easy bandaid fix for now. We just need to make sure manually to include either LABLAXIS or LABL_PTR_n until it's actually fixed in SAMMI, but cdflib checks that so we have a safety net still.

@subagonsouth
Copy link
Contributor Author

@subagonsouth are you willing to use a different data type (either uint32 or int64)? Or do we need to make two variables to store this field on Hi?

Yep... see my initial comment. They should all be changed to int64 by modifying the default_uint64_attrs -> default_int64_attrs in imap_hi_variable_attrs.yaml.

@subagonsouth
Copy link
Contributor Author

subagonsouth commented Sep 5, 2024

However, even when I remove all LABLAXIS values from the yaml definitions, the error still occurs. Somewhere along the line LABLAXIS is getting injected into the CDF attributes.

What confuses me is that Hi has two dimensional variables that don't have this issue. I didn't have time to dig down to understand the differences. Perhaps it is because I am calling with check_schema=False?

attr_mgr.get_variable_attributes(f"hi_pset_{var_name}", check_schema=False)

@greglucas greglucas removed their assignment Sep 19, 2024
@tech3371 tech3371 self-assigned this Sep 30, 2024
@tech3371 tech3371 added this to the Oct 2024 milestone Oct 7, 2024
@tech3371 tech3371 moved this from Todo to In Progress in IMAP Oct 10, 2024
@github-project-automation github-project-automation bot moved this from In Progress to Done in IMAP Oct 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CDF Related to CDF files
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

4 participants