-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Science Direct Event Decompression Re-write #397
Science Direct Event Decompression Re-write #397
Conversation
self.software_version = software_version | ||
self.packet_file_name = packet_file_name | ||
self.ccsds_header = CcsdsData(packet.header) | ||
# TODO: Is there a better way to initialize these arrays? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you initialize this to an array of fill values right away. This might also simplify some of your if/append logic down below to not have to set this every time and only update a specific position in the array if it is present.
np.ones(self.COUNT) * GlobalConstants.DOUBLE_FILLVAL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, good idea!
|
||
def __init__(self, packet, software_version: str, packet_file_name: str): | ||
"""Initialize Science Direct Events Data class.""" | ||
set_attributes(self, packet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you bring this function into this class?
self.set_attributes(packet)
def _set_attributes(self, packet):
# inline in this file instead of elsewhere in another module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made this an outside function because I am going to re-use it for every data class. Do you think it would still be better within the class?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I personally think so. One other suggestion then would be to inherit your dataclasses and set this function in the base of the inheritance.
class BaseLoClass:
def _set_attributes(self, packet):
...
@dataclass
class ScienceDE(BaseLoClass):
# Should have your method here
class ScienceMessageClass(BaseLoClass):
# Another class with the same method inherited
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that what it's doing prior to this PR. LoBase is currently a parent class with that function in it and the direct events data class is inheriting from it. I removed that in this PR because during a discussion about the ENA direct event data classes, I got convinced that the LoBase class isn't a true parent class of the data classes (and having that as a parent class would complicate things if ENAs are able to create parent classes for our direct events), so I created the outside function instead.
I'm not crazy about importing that either, so I might revert back to using the LoBase and change that in the future if we start using a DE parent class across ENAs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂 Sorry I missed all of these discussions! I personally think that is a good decision, and re-evaluating it later seems reasonable. Funny that you had what I suggested earlier and I just didn't realize it :)
# determine which case number we are working with. | ||
# The case number is used to determine how to | ||
# decompress the TOF values. | ||
case_number = int(data.next_bits(4), 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you able to do loop indexing into the data field and keep track of the pointer directly here/passed in somehow? I think data[start:start + 4]
is a bit easier to follow than having to create a separate class to keep track of the position, but that is just a personal preference and can be ignored if this is easier/makes more sense to you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'll remove that class. When I wrote the class I was iterating over the bit string across a few different methods which made me feel like it was easier to keep track with the class. I didn't rethink that after I simplified things more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I changed my mind about this class. I'd rather avoid incrementing the bit position so many times inside the data class. I think it's a little distracting from the purpose of the method it's happening in. I should be able to reuse the class for a bunch of other Lo classes and possible HIT as well. I'll think about it a little more though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, that is fine. It is somewhat just a personal preference I think.
Another thing I just thought about though is whether you should bring the for _ in range(self.COUNT): self._parse_data(data)
inside this function directly and iterate here (only calling it once in the initializer then)?
index = 0
for i in range(self.COUNT):
self.TIME[i] = int(data.next_bits(DATA_BITS.TIME), 2)
# Or keeping track of index without the class
self.TIME[i] = int(data[index:index + DATA_BITS.TIME), 2)
index += DATA_BITS.TIME
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a good point. I'll move everything in _parse_data
into _decompress_data
and remove _parse_data
entirely.
# transmitted for this case. Then grab the bits from | ||
# the binary and perform a bit shift to the left. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The bitshift description is a little confusing to me currently. I think you're bitshifting the integer representation of those bits, where initially I read this as shifting the binary data you're reading one to the left at first.
# transmitted for this case. Then grab the bits from | |
# the binary and perform a bit shift to the left. The | |
# transmitted for this case. Then grab the bits from | |
# the binary, turn these into an integer, and perform a bit shift to the left on that integer value. The |
if case_decoder.TOF0: | ||
tof0 = int(data.next_bits(DATA_BITS.TOF0), 2) << DE_BIT_SHIFT | ||
self.TOF0 = np.append(self.TOF0, tof0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you take the previous suggestion of creating the array initially with fill values, where index would need to be passed in for your current case.
if case_decoder.TOF0: | |
tof0 = int(data.next_bits(DATA_BITS.TOF0), 2) << DE_BIT_SHIFT | |
self.TOF0 = np.append(self.TOF0, tof0) | |
if case_decoder.TOF0: | |
self.TOF0[index] = int(data.next_bits(DATA_BITS.TOF0), 2) << DE_BIT_SHIFT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will just say my preference to avoid so many modules again, but feel free to ignore this comment.
I liked keeping this as lo/l0/decompression_tables.py
without the extra nesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I nested this because the direct events decompression is different than the decompression for science counts and star sensor packets, so the decompression_tables
folder will have multiple files in it. In the Star Sensor / Science Counts PR I have three CSVs in there for that decompression algorithm. I don't have a strong preference on how to organize this though, so if you think those CSVs should be in a nested folder and this decompression_tables.py file should be in l0
, I can make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for not seeing the other PRs that came in!
I would put all of those into a single file if that makes sense.
decompression_tables.py
that has all of those various ScienceCountsDecompression, DirectEventsDecompression, ...
The data files could be in l0/data
or just in l0
directly, which it looks like SWE is doing: https://github.com/IMAP-Science-Operations-Center/imap_processing/tree/dev/imap_processing/swe/l1b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above about potentially eliminating this altogether, but if you keep it my preference would be for this to be l0/utils.py
and this be a class within that utils.py
file rather than nested further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest moving this into the class that uses above directly rather than the indirection to another module.
@@ -82,6 +82,7 @@ def test_houskeeping(): | |||
] | |||
|
|||
for pkt_idx, packet in enumerate(packets): | |||
print(packet) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove?
"POS": [], | ||
"CKSM": [], | ||
} | ||
absent = "1010" # case 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really nice way to do this testing :) Well done!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor comments and questions. Looks great!
# must be shifted by 1 bit to the left | ||
DE_BIT_SHIFT = 1 | ||
|
||
DataFields = namedtuple( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add note here that explanation of how these are used is explained in detailed in science_direct_events.py
(12, 1): TOFFields(False, False, True, False, False, True), | ||
(12, 0): TOFFields(False, False, True, True, False, False), | ||
(13, 0): TOFFields(False, False, True, False, False, False), | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice way of organizing these different scenarios!
|
||
Bit Shifting: | ||
TOF0, TOF1, TOF2, TOF3, and CKSM all must be shifted by one bit to the | ||
left. All other fields do not need to be bit shifted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really nice documentation!
|
||
Methods | ||
------- | ||
__init__(packet, software_vesion, packet_file_name): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a docstring about packet
parameter? I think it's a list of decom packets, is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. Yes, I'll add that to the __init__
docstring
|
||
def _decompress_data(self): | ||
"""Decompress the Lo Science Direct Events data.""" | ||
data = BinaryString(self.DATA) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A stupid question. Where is this self.DATA
and self.COUNT
initialized? I feel like I am missing something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a stupid question! Right now it's being initialized in the set_attributes
function which gets passed self
and the packet
. I made that update in this PR, but I'm going to go back to having a parent class LoBase
that has a method that works the same as set_attributes
. So it will work the same, but instead of calling set_attributes
it will call that as a parent method.
# decompress the TOF values. | ||
case_number = int(data.next_bits(4), 2) | ||
|
||
# time, energy, and mode are always transmitted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice comments!
next_four_bits = data.next_bits(4) | ||
|
||
assert first_three_bits == "000" | ||
assert next_four_bits == "1110" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
"POS": [], | ||
"CKSM": [], | ||
} | ||
absent = "1010" # case 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments and all can be addressed in future PRs/restructuring if desired.
# cases, so these can be initialized to the | ||
# CDF fill val and stored with this value for | ||
# those cases. | ||
self.TIME = np.ones(self.COUNT) * GlobalConstants.DOUBLE_FILLVAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do these need to be a specific datatype like "uint8" rather than "float"?
As an aside and ignored for this PR: Another thought I had after looking at some of this repetition is if we should think about Structured/named Arrays instead of dataclasses?
https://numpy.org/doc/stable/user/basics.rec.html#
So then everything would be in a single numpy array, but lookup via the name. self.data["energy"]
for example.
self.POS = np.ones(self.COUNT) * GlobalConstants.DOUBLE_FILLVAL | ||
self._decompress_data() | ||
|
||
def _decompress_data(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I like that this is more consolidated now and I think it flows nicely with the indexing to set the variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is a class it seems out of place in the utils
area to me. What about putting these things into larger files to avoid the imports and needing to figure out where things are coming from?
lo/l0/packets.py
(or whatever name would make sense to hold all of your packet class definitions
which would contain:
class LoBase:
...
class ScienceDirectEvent(LoBase):
...
class OtherPacketClasses(LoBase):
...
Compared to right now I think each of those classes is in a dedicated file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely have a preference for smaller, more specific files. I'm ok with grouping things together in the same file if that's the style we're aiming for. However, for this in particular, I'm going to have at least 12 data classes, many of which have a lot more attributes than the direct events class, so that file would get very large and I think it would be a pain to find things in it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Sounds good, I think most other instruments have been going the more files route on this project than I'm used to, so I think this does match up with the rest of the code base and is consistent.
I still would have a slight preference for moving this class/file that serves as the base for inheritance to be moved into the same folder as the other classes. But, I'm not blocking on that and this seems good as-is.
Change Summary
Overview
This PR re-writes the Lo Science Direct Event Decompression code based on recent discussions with and updated documents from Lo Flight Software. This PR also removes the use of the bitstring library and creates fake data in the unit tests so they can be run without sample data.
New Dependencies
None
New Files
Deleted Files
Updated Files
Testing