CDF Updates #660

anamanica · 2024-06-25T15:22:14Z

Change Summary

Overview

Cleaning up the CdfAttributeManager class code in order to ensure schema validation when a user is accessing global and variable attributes. This is done by changing .global_attributes and .variable_attributes to be private properties, establishing the methods .get_global_attributes and .get_variable_attributes as the way to retrieve a validated, specific dictionary.

This particular change has been slightly debated. Originally, the methods .get_global_attributes and .get_variable_attributes were going to be changed into Python properties, but that proved to be unintuitive with the functionality of getters and setters. Then, we were going to create a new class to place validation into the __getitem__ functionality of dictionaries, but this would require creating a class with multiple inheritance and doing some funky things. Overall, Maxine and I decided that the best current solution is the one described above.
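A minimal sketch of the pattern described above, using a simplified stand-in class. The class name `AttributeManagerSketch`, the constructor arguments, the `"common"` key for shared global attributes, and the "keep only keys the schema defines" filter are all illustrative assumptions, not the actual CdfAttributeManager implementation.

```python
from typing import Any, Dict, Optional


class AttributeManagerSketch:
    """Hypothetical, simplified stand-in for CdfAttributeManager."""

    def __init__(
        self,
        global_schema: Dict[str, Any],
        variable_schema: Dict[str, Any],
        global_attrs: Dict[str, Dict[str, Any]],
        variable_attrs: Dict[str, Dict[str, Any]],
    ) -> None:
        self.global_attribute_schema = global_schema
        self.variable_attribute_schema = variable_schema
        # The leading underscore marks the raw dictionaries as private, so
        # callers go through the getters and always get a checked result.
        self._global_attributes = global_attrs
        self._variable_attributes = variable_attrs

    @staticmethod
    def _filter_by_schema(schema: Dict[str, Any], attrs: Dict[str, Any]) -> Dict[str, Any]:
        # Assumed validation rule for this sketch: drop keys the schema does not define.
        return {key: value for key, value in attrs.items() if key in schema}

    def get_global_attributes(self, instrument_id: Optional[str] = None) -> Dict[str, Any]:
        # instrument_id is optional, so None has to be handled here.
        attrs = dict(self._global_attributes.get("common", {}))
        if instrument_id is not None:
            attrs.update(self._global_attributes.get(instrument_id, {}))
        return self._filter_by_schema(self.global_attribute_schema, attrs)

    def get_variable_attributes(self, variable_name: str) -> Dict[str, Any]:
        # variable_name is required, so no None handling is needed.
        return self._filter_by_schema(
            self.variable_attribute_schema, self._variable_attributes[variable_name]
        )


manager = AttributeManagerSketch(
    global_schema={"Project": {}, "Logical_source": {}},
    variable_schema={"CATDESC": {}, "UNITS": {}},
    global_attrs={"common": {"Project": "demo"}, "inst": {"Logical_source": "demo_source"}},
    variable_attrs={"var": {"CATDESC": "A variable", "EXTRA": 1}},
)
```

Using explicit getter methods rather than properties keeps the optional `instrument_id` argument and the required `variable_name` argument visible at the call site, which a property cannot express.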

This branch was created from cdf-attribute-tests in order to carry the testing work from that branch onto this one.

New Dependencies

None

New Files

None

Deleted Files

None

Updated Files

cdf_attribute_manager.py
- Changing .global_attributes and .variable_attributes to private attributes.
- Finishing the code of the method .get_variable_attributes.

ultra_l1_utils.py
- Changing lines 37 and 51 to use .get_variable_attributes instead of .variable_attributes
codice_l1b.py
- Changing line 66 to use .get_variable_attributes instead of .variable_attributes
codice/utils.py
- Changing line 158 to use .get_variable_attributes instead of .variable_attributes

Testing

All covered in test_cdf_attribute_manager.py.

closes #593

maxinelasp

A few changes, but looking good so far!

imap_processing/cdf/cdf_attribute_manager.py

maxinelasp · 2024-06-25T17:25:53Z

imap_processing/cdf/cdf_attribute_manager.py

+            if variable_name in self._variable_attributes:
+                output = self._variable_attributes[variable_name]
+            elif (
+                variable_name is not None


I don't think variable_name should ever be None, so this check is not needed. Actually, this whole elif block should be removed, because the person calling get_variable_attributes should always provide variable_name.

This was necessary in get_global_attributes because the instrument_id is an optional parameter, so it can be none.

It looks like this is why your tests are failing too!

maxinelasp · 2024-06-25T17:30:21Z

imap_processing/cdf/cdf_attribute_manager.py

+            ):
+                output = self._variable_attributes["variable_name"][attr_name]
+            elif (
+                variable_name not in self.variable_attribute_schema


Rather than checking variable_name against the schema, we should check the attributes against the schema. So you would rewrite this block as something like:

    for attribute in self.variable_attribute_schema.keys():
        if attribute in self._variable_attributes[variable_name]:
            add to output
        if attribute not in self._variable_attributes[variable_name] and attribute is required:
            add to output as None

Hahah, yes, thank you so much! I caught this moments after I submitted the PR. Felt very silly. I should have probably started this one as a draft.
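The attribute-by-attribute check suggested in the pseudocode above could be sketched roughly as follows. The function name `validate_variable_attributes` and the schema layout (each entry carrying a `required` flag) are assumptions made for illustration; the real schema file may be structured differently.

```python
from typing import Any, Dict


def validate_variable_attributes(
    schema: Dict[str, Dict[str, Any]], attributes: Dict[str, Any]
) -> Dict[str, Any]:
    """Check one variable's attributes against the schema (hypothetical helper)."""
    output: Dict[str, Any] = {}
    for attribute, rules in schema.items():
        if attribute in attributes:
            # Attribute is defined for this variable: copy it through.
            output[attribute] = attributes[attribute]
        elif rules.get("required", False):
            # Required by the schema but missing: include it as None so the
            # gap is visible to the caller.
            output[attribute] = None
    return output


example_schema = {
    "CATDESC": {"required": True},
    "FILLVAL": {"required": True},
    "UNITS": {"required": False},
}
example_result = validate_variable_attributes(
    example_schema, {"CATDESC": "Example variable", "UNITS": "s"}
)
```

Iterating over the schema rather than over the variable's attributes means that anything the schema does not define is dropped from the output.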

maxinelasp · 2024-06-25T17:31:04Z

imap_processing/cdf/imap_cdf_manager.py

@@ -11,11 +11,28 @@


 class ImapCdfAttributes(CdfAttributeManager):
-    """Contains IMAP specific tools and settings for CDF management."""
+    """


Looks like this PR includes some stuff from your test PR - that will go away once you merge the test PR.

subagonsouth

This is a ton of work and things are looking pretty good. I agree with the decision to remove the property accessors.

Let me know if you need any clarification on the comments that I left.

subagonsouth · 2024-06-25T16:40:20Z

imap_processing/cdf/tests/__init__.py

Commenting on the __init__.py file but this is relevant to all files in this directory...

All .py files in this directory should be moved to imap_processing/tests/cdf/
All .yaml files should be moved under the parent directory imap_processing/tests/cdf/test_data

Basically, we want the directory structure under imap_processing/tests to mirror the directory structure under imap_processing, with the exception of test data.

Sweet! I had to leave one .yaml file under /tests/cdf due to the way the code is written in __init__, but I will be sure to ask Maxine about that.

The reason we left the tests where they were, is because this code is going to be moved to another repo very soon and hopefully removed from here. So, I wanted it to be isolated from everything else.

But, in general, all tests should go under imap_processing/tests. So I don't feel strongly about this, but I'd slightly prefer to keep it all together to make it easier to move and feel confident I got all of it.

imap_processing/cdf/cdf_attribute_manager.py

subagonsouth · 2024-06-25T17:14:50Z

imap_processing/cdf/tests/shared/default_variable_cdf_attrs_schema.yaml

I would suggest reducing this test schema file to have the minimum number of entries needed for complete test coverage. I think that you can get away with as few as two entries under attribute_key. They can be completely made up entries with values specified such that you can test everything that you need to.

Ah yes! Initially I had made a default_test_schema file, but then was asked to use the actual schema file since that is a bit more concrete than the individual instrument files.

I think I see where some of the confusion is coming from. Either you should use the test_schema file (if there is some testing reason to keep it) and point to it in the tests (so change the code in Tim's other comment) OR continue using the real schema, leave the tests the same, and remove these unused schemas.

If I'm missing something here, please let me know!

imap_processing/cdf/tests/shared/default_global_cdf_attrs_schema.yaml

imap_processing/cdf/tests/test_cdf_attribute_manager.py

subagonsouth · 2024-06-25T17:28:11Z

imap_processing/cdf/tests/test_cdf_attribute_manager.py

+    """
+
+    # Initialize CdfAttributeManager object which loads in default info
+    cdf_manager = CdfAttributeManager(Path(__file__).parent.parent / "config")


Is there a reason to first load the default (not for testing) schema and default global attributes? It makes this test a bit confusing to have that cross pollination that isn't very obvious. Can the test schema be used instead so that this is totally isolated to using test specific schema and global attribute defaults?

This comment applies to all of the test functions in this file.

I believe the reason we are using the default schema for these tests is because they are actually loaded into the CdfAttributeManager class from the __init__ function. I would have to change the code here if I wanted to use test schema, which I have been told is not ideal.

I prefer to use the real schema against the test data so we can avoid the real schema and test schema from getting out of sync, so I asked Ana to do it this way.

But, since you got a comment on it, can you add a short comment in the code explaining why you're loading in the default schema?

@maxinelasp, Does this mean that you are considering these tests a validation of the real schema? I think of unit tests as only covering the code implementation, not validating the schema.

Yeah, good question. I can see either argument, but in this case, I think it makes sense to have the real schema be part of the unit tests because it's closely tied to the code. But, I can definitely see the philosophical argument that it's not really part of the "unit" so it shouldn't be tested.

subagonsouth · 2024-06-25T17:37:02Z

imap_processing/cdf/tests/test_imap_cdf_manager.py

Nice job implementing succinct tests in this file that cover only the functionality added by the ImapCdfAttributes class!

maxinelasp

Some additional comments but it's definitely getting close!!

imap_processing/cdf/cdf_attribute_manager.py

maxinelasp · 2024-06-27T16:56:24Z

imap_processing/cdf/cdf_attribute_manager.py

+            # Case to handle DEPEND_i schema issues
+            elif attr_name == "DEPEND_i":
+                # range(3) because the highest DEPEND_i value is 3.
+                for i in range(3):


You can't assume that there are only 3 DEPEND_i cases. To address your todo comment below, I also believe depend_0 does need to exist and it needs to be epoch.

Suggested rewrite:

    elif attr_name == "DEPEND_i":
        # Find all the attributes of variable_name that contain "DEPEND"
        variable_depend_attrs = [
            key
            for key in self._variable_attributes[variable_name].keys()
            if "DEPEND" in key
        ]
        # Confirm that each DEPEND_i attribute is unique
        if len(set(variable_depend_attrs)) != len(variable_depend_attrs):
            logger.warning(
                f"Found duplicate DEPEND_i attribute in variable "
                f"{variable_name}: {variable_depend_attrs}"
            )
        for variable_depend_attr in variable_depend_attrs:
            output[variable_depend_attr] = self._variable_attributes[variable_name][
                variable_depend_attr
            ]

Also as a TODO, please add a comment to add some additional validation for the depend attributes.

Ah! This works way better. Much more sophisticated. Thanks!

Regarding DEPEND_0, I actually talked with Sean about his tests for l0_l1a, and that is where he told me that DEPEND_0 does not need to be present in some cases he is working with, when epoch is not one of the dimensions. I am doing a poor job of explaining this, so I can send you the message he sent me if you would like.

Ok, for posterity: according to Bryan Harter, DEPEND_0 must be an epoch, but it isn't actually required for every variable. So we will be updating the schema to make it not required and removing the special cases from the code.
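One possible shape for the extra DEPEND validation left as a TODO in this thread, sketched as a standalone function. The helper name `collect_depend_attributes` and the stricter `DEPEND_<number>` pattern are assumptions for illustration, not something the PR settled on. Because DEPEND_0 is not required for every variable, a variable with no DEPEND attributes simply yields an empty dictionary.

```python
import re
from typing import Any, Dict

# Assumed naming rule: "DEPEND_" followed by one or more digits. This is
# stricter than the substring check '"DEPEND" in key' from the suggestion.
DEPEND_PATTERN = re.compile(r"^DEPEND_\d+$")


def collect_depend_attributes(variable_attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Return every DEPEND_<n> attribute of a variable, however many there are."""
    return {
        key: value
        for key, value in variable_attrs.items()
        if DEPEND_PATTERN.match(key)
    }


example_attrs = {
    "CATDESC": "Example variable",
    "DEPEND_0": "epoch",
    "DEPEND_1": "energy",
    "DEPEND_10": "angle",
    "DEPENDENCY_NOTE": "not a DEPEND_i attribute",
}
example_depends = collect_depend_attributes(example_attrs)
```

Matching on a pattern avoids the hard-coded `range(3)` and also skips keys that merely contain the text "DEPEND".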

imap_processing/cdf/cdf_attribute_manager.py

imap_processing/cdf/tests/imap_test_variable.yaml

imap_processing/cdf/tests/test_cdf_attribute_manager.py

imap_processing/hi/l1c/hi_l1c.py

subagonsouth

Thanks for persevering through this challenging PR. Sorry that my off-nominal use of the CdfAttributeManager caused problems. I'm happy with this end result.

imap_processing/hi/l1a/science_direct_event.py

subagonsouth · 2024-06-27T20:04:03Z

imap_processing/tests/hi/test_hi_l1b.py

@@ -21,6 +23,7 @@ def test_hi_l1b_hk():
    assert l1b_dataset.attrs["Logical_source"] == "imap_hi_l1b_45sensor-hk"


+@pytest.mark.xfail()


I will write myself a ticket to fix this test.

maxinelasp

Nice job!

anamanica added 30 commits June 13, 2024 09:49

Writing initial TODO comments.

272ba6f

Troubleshooting

537b341

Schema tests passed

526d0d4

I think I understand instrument_id

2407f18

Fixing test file

32c568d

Var attribute tests

3f19842

First draft

42face3

Merge branch 'IMAP-Science-Operations-Center:dev' into cdf-attribute-…

e08ea48

…tests

Deleting file

649d2da

Test

d0c3be2

Fixing pulled errors

2033c8b

Merge branch 'IMAP-Science-Operations-Center:dev' into cdf-attribute-…

ac9d8d9

…tests

Quick

c1f60ff

Trying different things to pass pre-checks

308d09a

Testing

b964266

TEST

191c234

Fixing failed pre-check

a023663

Test2

4719b28

Fixing pre-commit issues

7fcb53e

Fixing prechecks

97db92b

Removing test statements

5e6c8e5

Moving files

c1b6a55

Fixing PR draft comments

706e981

Adding additional tests

a8ec74c

Breaking up test functions

8b99182

Adding more tests

f2f45c1

Getting closer to final draft

46eb203

Finishing touches

7032161

Merge branch 'IMAP-Science-Operations-Center:dev' into cdf-attribute-…

8fa0535

…tests

Codcov check

7906ed6

anamanica requested review from subagonsouth and maxinelasp June 25, 2024 15:22

anamanica added 3 commits June 25, 2024 09:28

Fixing pre-commits

cdb2d27

Reverting changes

9d9cada

Looking for failing tests in pre-commits

7a3ac5e

maxinelasp reviewed Jun 25, 2024

subagonsouth reviewed Jun 25, 2024

anamanica added 12 commits June 25, 2024 14:22

Merge branch 'dev' of github.com:IMAP-Science-Operations-Center/imap_…

7fd4225

…processing into cdf-updates

Push before Pull

0a9395a

Resolving conflicts

0297af0

Test change

7c6320a

Fixing test errors

8138cfb

Updating code, and marking test as fail.

e5e5b05

Stupid solution

d904c81

Added DEPEND_i case

1d4c487

Added check_schema tag so hi tests pass

f288f6a

Excluding DEPEND_0 as required just to get tests to pass.

0099a81

Cleaning up my code

87c64a3

Changing file locations

69dff37

maxinelasp reviewed Jun 27, 2024

Cleaning up code, changing attriburte schema required, adding _i cases.

4875d20

anamanica requested review from Alrobbertz, maxinelasp and subagonsouth June 27, 2024 19:22

subagonsouth approved these changes Jun 27, 2024

maxinelasp approved these changes Jun 27, 2024

anamanica added 2 commits June 27, 2024 15:50

Merge branch 'dev' of github.com:IMAP-Science-Operations-Center/imap_…

a02fa7e

…processing into cdf-updates Updating branch

PR comment changes

b32f77b

anamanica merged commit a9debee into IMAP-Science-Operations-Center:dev Jul 1, 2024
17 checks passed

anamanica deleted the cdf-updates branch July 1, 2024 16:51
