Science Direct Event Decompression Re-write #397

sdhoyt · 2024-04-11T21:15:44Z

Change Summary

Overview

This PR re-writes the Lo Science Direct Event Decompression code based on recent discussions with and updated documents from Lo Flight Software. This PR also removes the use of the bitstring library and creates fake data in the unit tests so they can be run without sample data.

New Dependencies

None

New Files

imap_processing/lo/l0/utils/binary_string.py
- simple class to iterate through a binary string
imap_processing/lo/l0/utils/set_dataclass_attr.py
- sets object attributes to matching space packet parser output fields

Deleted Files

imap_processing/lo/l0/lo_base.py
- This parent class was removed and put into the set_dataclass_attr.py function

Updated Files

imap_processing/lo/l0/data_classes/science_direct_events.py
- Updated decompression algorithm based on updated document and conversations with Lo FSW
- Added MODE field
imap_processing/lo/l0/decompression_tables/decompression_tables.py
- Updated tables based on updated documents

Testing

Re-wrote imap_processing/tests/lo/test_science_direct_events.py unit tests
Added imap_processing/tests/lo/test_binary_string.py to test the BinaryString class

greglucas · 2024-04-11T21:48:58Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        self.software_version = software_version
+        self.packet_file_name = packet_file_name
+        self.ccsds_header = CcsdsData(packet.header)
+        # TODO: Is there a better way to initialize these arrays?


Could you initialize this to an array of fill values right away. This might also simplify some of your if/append logic down below to not have to set this every time and only update a specific position in the array if it is present.

np.ones(self.COUNT) * GlobalConstants.DOUBLE_FILLVAL

Thanks, good idea!

greglucas · 2024-04-11T21:50:21Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+
+    def __init__(self, packet, software_version: str, packet_file_name: str):
+        """Initialize Science Direct Events Data class."""
+        set_attributes(self, packet)


Could you bring this function into this class?

self.set_attributes(packet) def _set_attributes(self, packet): # inline in this file instead of elsewhere in another module

I made this an outside function because I am going to re-use it for every data class. Do you think it would still be better within the class?

Yes, I personally think so. One other suggestion then would be to inherit your dataclasses and set this function in the base of the inheritance.

class BaseLoClass: def _set_attributes(self, packet): ... @dataclass class ScienceDE(BaseLoClass): # Should have your method here class ScienceMessageClass(BaseLoClass): # Another class with the same method inherited

Yeah, that what it's doing prior to this PR. LoBase is currently a parent class with that function in it and the direct events data class is inheriting from it. I removed that in this PR because during a discussion about the ENA direct event data classes, I got convinced that the LoBase class isn't a true parent class of the data classes (and having that as a parent class would complicate things if ENAs are able to create parent classes for our direct events), so I created the outside function instead.

I'm not crazy about importing that either, so I might revert back to using the LoBase and change that in the future if we start using a DE parent class across ENAs.

😂 Sorry I missed all of these discussions! I personally think that is a good decision, and re-evaluating it later seems reasonable. Funny that you had what I suggested earlier and I just didn't realize it :)

greglucas · 2024-04-11T21:53:03Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        # determine which case number we are working with.
+        # The case number is used to determine how to
+        # decompress the TOF values.
+        case_number = int(data.next_bits(4), 2)


Are you able to do loop indexing into the data field and keep track of the pointer directly here/passed in somehow? I think data[start:start + 4] is a bit easier to follow than having to create a separate class to keep track of the position, but that is just a personal preference and can be ignored if this is easier/makes more sense to you.

Yeah, I'll remove that class. When I wrote the class I was iterating over the bit string across a few different methods which made me feel like it was easier to keep track with the class. I didn't rethink that after I simplified things more.

I think I changed my mind about this class. I'd rather avoid incrementing the bit position so many times inside the data class. I think it's a little distracting from the purpose of the method it's happening in. I should be able to reuse the class for a bunch of other Lo classes and possible HIT as well. I'll think about it a little more though

OK, that is fine. It is somewhat just a personal preference I think.

Another thing I just thought about though is whether you should bring the for _ in range(self.COUNT): self._parse_data(data) inside this function directly and iterate here (only calling it once in the initializer then)?

index = 0 for i in range(self.COUNT): self.TIME[i] = int(data.next_bits(DATA_BITS.TIME), 2) # Or keeping track of index without the class self.TIME[i] = int(data[index:index + DATA_BITS.TIME), 2) index += DATA_BITS.TIME ...

Yeah, that's a good point. I'll move everything in _parse_data into _decompress_data and remove _parse_data entirely.

greglucas · 2024-04-11T21:56:02Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        # transmitted for this case. Then grab the bits from
+        # the binary and perform a bit shift to the left. The


The bitshift description is a little confusing to me currently. I think you're bitshifting the integer representation of those bits, where initially I read this as shifting the binary data you're reading one to the left at first.

Suggested change

# transmitted for this case. Then grab the bits from

# the binary and perform a bit shift to the left. The

# transmitted for this case. Then grab the bits from

# the binary, turn these into an integer, and perform a bit shift to the left on that integer value. The

greglucas · 2024-04-11T21:57:32Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        if case_decoder.TOF0:
+            tof0 = int(data.next_bits(DATA_BITS.TOF0), 2) << DE_BIT_SHIFT
+        self.TOF0 = np.append(self.TOF0, tof0)


If you take the previous suggestion of creating the array initially with fill values, where index would need to be passed in for your current case.

Suggested change

if case_decoder.TOF0:

tof0 = int(data.next_bits(DATA_BITS.TOF0), 2) << DE_BIT_SHIFT

self.TOF0 = np.append(self.TOF0, tof0)

if case_decoder.TOF0:

self.TOF0[index] = int(data.next_bits(DATA_BITS.TOF0), 2) << DE_BIT_SHIFT

greglucas · 2024-04-11T21:58:48Z

imap_processing/lo/l0/decompression_tables/decompression_tables.py

I will just say my preference to avoid so many modules again, but feel free to ignore this comment.

I liked keeping this as lo/l0/decompression_tables.py without the extra nesting.

I nested this because the direct events decompression is different than the decompression for science counts and star sensor packets, so the decompression_tables folder will have multiple files in it. In the Star Sensor / Science Counts PR I have three CSVs in there for that decompression algorithm. I don't have a strong preference on how to organize this though, so if you think those CSVs should be in a nested folder and this decompression_tables.py file should be in l0, I can make that change.

Sorry for not seeing the other PRs that came in!

I would put all of those into a single file if that makes sense.
decompression_tables.py that has all of those various ScienceCountsDecompression, DirectEventsDecompression, ...

The data files could be in l0/data or just in l0 directly, which it looks like SWE is doing: https://github.com/IMAP-Science-Operations-Center/imap_processing/tree/dev/imap_processing/swe/l1b

greglucas · 2024-04-11T22:00:11Z

imap_processing/lo/l0/utils/binary_string.py

See comment above about potentially eliminating this altogether, but if you keep it my preference would be for this to be l0/utils.py and this be a class within that utils.py file rather than nested further.

greglucas · 2024-04-11T22:00:52Z

imap_processing/lo/l0/utils/set_dataclass_attr.py

Suggest moving this into the class that uses above directly rather than the indirection to another module.

greglucas · 2024-04-11T22:01:24Z

imap_processing/tests/hit/test_housekeeping.py

@@ -82,6 +82,7 @@ def test_houskeeping():
    ]

    for pkt_idx, packet in enumerate(packets):
+        print(packet)


greglucas · 2024-04-11T22:02:29Z

imap_processing/tests/lo/test_science_direct_events.py

-        "POS": [],
-        "CKSM": [],
-    }
+    absent = "1010"  # case 10


This is a really nice way to do this testing :) Well done!

tech3371

minor comments and questions. Looks great!

tech3371 · 2024-04-12T16:51:20Z

imap_processing/lo/l0/decompression_tables/decompression_tables.py

+# must be shifted by 1 bit to the left
+DE_BIT_SHIFT = 1
+
+DataFields = namedtuple(


Can you add note here that explanation of how these are used is explained in detailed in science_direct_events.py

tech3371 · 2024-04-12T16:53:09Z

imap_processing/lo/l0/decompression_tables/decompression_tables.py

+    (12, 1): TOFFields(False, False, True, False, False, True),
+    (12, 0): TOFFields(False, False, True, True, False, False),
+    (13, 0): TOFFields(False, False, True, False, False, False),
+}


Nice way of organizing these different scenarios!

tech3371 · 2024-04-12T16:59:30Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+
+    Bit Shifting:
+    TOF0, TOF1, TOF2, TOF3, and CKSM all must be shifted by one bit to the
+    left. All other fields do not need to be bit shifted.


Really nice documentation!

tech3371 · 2024-04-12T17:17:28Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+
+    Methods
+    -------
+    __init__(packet, software_vesion, packet_file_name):


can you add a docstring about packet parameter? I think it's a list of decom packets, is that right?

That's correct. Yes, I'll add that to the __init__ docstring

tech3371 · 2024-04-12T17:19:43Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+
+    def _decompress_data(self):
+        """Decompress the Lo Science Direct Events data."""
+        data = BinaryString(self.DATA)


A stupid question. Where is this self.DATA and self.COUNT initialized? I feel like I am missing something.

Not a stupid question! Right now it's being initialized in the set_attributes function which gets passed self and the packet. I made that update in this PR, but I'm going to go back to having a parent class LoBase that has a method that works the same as set_attributes. So it will work the same, but instead of calling set_attributes it will call that as a parent method.

tech3371 · 2024-04-12T17:23:31Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        # decompress the TOF values.
+        case_number = int(data.next_bits(4), 2)
+
+        # time, energy, and mode are always transmitted.


nice comments!

tech3371 · 2024-04-12T17:28:44Z

imap_processing/tests/lo/test_binary_string.py

+    next_four_bits = data.next_bits(4)
+
+    assert first_three_bits == "000"
+    assert next_four_bits == "1110"


tech3371 · 2024-04-12T17:31:15Z

imap_processing/tests/lo/test_science_direct_events.py

-        "POS": [],
-        "CKSM": [],
-    }
+    absent = "1010"  # case 10


greglucas

A few minor comments and all can be addressed in future PRs/restructuring if desired.

greglucas · 2024-04-15T19:12:49Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        # cases, so these can be initialized to the
+        # CDF fill val and stored with this value for
+        # those cases.
+        self.TIME = np.ones(self.COUNT) * GlobalConstants.DOUBLE_FILLVAL


Do these need to be a specific datatype like "uint8" rather than "float"?

As an aside and ignored for this PR: Another thought I had after looking at some of this repetition is if we should think about Structured/named Arrays instead of dataclasses?
https://numpy.org/doc/stable/user/basics.rec.html#
So then everything would be in a single numpy array, but lookup via the name. self.data["energy"] for example.

greglucas · 2024-04-15T19:22:23Z

imap_processing/lo/l0/data_classes/science_direct_events.py

+        self.POS = np.ones(self.COUNT) * GlobalConstants.DOUBLE_FILLVAL
+        self._decompress_data()
+
+    def _decompress_data(self):


Nice! I like that this is more consolidated now and I think it flows nicely with the indexing to set the variables.

greglucas · 2024-04-15T19:28:25Z

imap_processing/lo/l0/utils/lo_base.py

Because this is a class it seems out of place in the utils area to me. What about putting these things into larger files to avoid the imports and needing to figure out where things are coming from?

lo/l0/packets.py (or whatever name would make sense to hold all of your packet class definitions
which would contain:

class LoBase: ... class ScienceDirectEvent(LoBase): ... class OtherPacketClasses(LoBase): ...

Compared to right now I think each of those classes is in a dedicated file.

I definitely have a preference for smaller, more specific files. I'm ok with grouping things together in the same file if that's the style we're aiming for. However, for this in particular, I'm going to have at least 12 data classes, many of which have a lot more attributes than the direct events class, so that file would get very large and I think it would be a pain to find things in it.

👍 Sounds good, I think most other instruments have been going the more files route on this project than I'm used to, so I think this does match up with the rest of the code base and is consistent.

I still would have a slight preference for moving this class/file that serves as the base for inheritance to be moved into the same folder as the other classes. But, I'm not blocking on that and this seems good as-is.

sdhoyt added 5 commits March 25, 2024 17:08

simplified decompression tables, removed bitstring

f89272f

initial update sci de algorithm based on new document

066f73c

add unit tests

401bb3b

merged dev

3f48b85

removed bit shift for POS

a178a3c

sdhoyt added Ins: Lo Related to the IMAP-Lo instrument Level: L0 Level 0 processing labels Apr 11, 2024

sdhoyt requested review from joycecj, KenWFairchild and a team April 11, 2024 21:15

sdhoyt self-assigned this Apr 11, 2024

sdhoyt requested review from bourque, greglucas, subagonsouth, tech3371, vmartinez-cu, laspsandoval and maxinelasp and removed request for a team April 11, 2024 21:15

sdhoyt added 2 commits April 11, 2024 15:21

fixed binary string unit test

d25e602

spelling fixes

6aadd20

greglucas reviewed Apr 11, 2024

View reviewed changes

sdhoyt added 2 commits April 12, 2024 11:10

comment updates

84d771a

moved decompression table back into decompression folder

fd4f1b5

tech3371 approved these changes Apr 12, 2024

View reviewed changes

sdhoyt added 5 commits April 15, 2024 07:54

comment updates

dec4d5b

added back lo_base, fixed unit tests

82ea545

updated Lo docs

a52cbf1

update sci de docstring to fix doc issue

5ac472c

added utils and data_classes to conf.py

2b3474b

greglucas approved these changes Apr 15, 2024

View reviewed changes

sdhoyt merged commit 8a8278c into IMAP-Science-Operations-Center:dev Apr 16, 2024
15 checks passed

sdhoyt deleted the sci-de-redo branch April 16, 2024 18:20

sdhoyt linked an issue Apr 19, 2024 that may be closed by this pull request

Remove Bitstring from Lo Sci DE data class #359

Closed

		# transmitted for this case. Then grab the bits from
		# the binary and perform a bit shift to the left. The

Science Direct Event Decompression Re-write #397

Science Direct Event Decompression Re-write #397

Conversation

sdhoyt commented Apr 11, 2024

Change Summary

Overview

New Dependencies

New Files

Deleted Files

Updated Files

Testing

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

tech3371 left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sdhoyt Apr 12, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

greglucas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

sdhoyt Apr 12, 2024 •

edited

Loading