SWE science L1A algorithm #154

Merged · 104 commits · Sep 21, 2023
Conversation

@tech3371 (Contributor) commented Sep 12, 2023

Change Summary

This PR contains the L1A algorithm implementation for SWE science data. It implements issue #103.

Overview

I moved files around to match the folder structure we decided to use: imap_processing/{instrument}/packet_definitions/.

New Files

  • swe_l1a.py
    • This code reads the L0 file and, based on the ApID, starts processing science or housekeeping data and so on.
  • swe_science.py
    • This code contains the decompression algorithm and stores the processed data in an xarray dataset.

Updated Files

  • swe_packet_definition.xml
    • Updated to match the latest packet definition.
  • decom_swe.py and test_decom.py
    • Updated decom with the new folder path and updated the import in the test file.
  • pyproject.toml
    • Updated to include bitstring.

Testing

  • test_swe_science.py
    • Wrote basic tests for SWE science (a sketch of this style of test follows below).
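A minimal sketch of this style of test, under the assumption that decom_swe.decom_packets is the decom entry point and that swe_science returns the xarray dataset (the sample file path is hypothetical):

```python
from imap_processing.swe import decom_swe
from imap_processing.swe.l1a.swe_science import swe_science


def test_swe_science_shape():
    # hypothetical sample packet file; the real test data path differs
    decom_data = decom_swe.decom_packets("tests/swe_science_sample.bin")
    dataset = swe_science(decom_data=decom_data)
    # each packet holds 15 seconds x 12 energy steps x 7 CEMs of counts
    assert dataset["SCIENCE_DATA"].shape[1:] == (15, 12, 7)
```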

@tech3371 added the "Ins: SWE" (Related to the SWE instrument) and "Level: L1" (Level 1 processing) labels on Sep 12, 2023
@tech3371 added this to the SDC SIT-3 milestone on Sep 12, 2023
@tech3371 requested a review from a team on Sep 12, 2023
@tech3371 self-assigned this on Sep 12, 2023
@tech3371 requested review from bourque and removed the request for a team on Sep 12, 2023
GFMoraga and others added 2 commits September 13, 2023 17:30
…ounts (IMAP-Science-Operations-Center#158)

Added packet definitions for housekeeping, sectored, angular, and priority counts.
- Updated the way of checking whether a description exists; updated the generator template.
- Updated the generator to use the correct IntegerParameterType.
@greglucas (Collaborator) left a comment


Looks good; just a few minor suggestions to move toward more numpy functions where possible, but nothing blocking, and those can be made later if you want as well.

imap_processing/swe/l1a/swe_science.py (outdated; resolved)
Comment on lines 142 to 144
bit_array = bitstring.ConstBitStream(bin=binary_data)
# chunk bit array into 1260 units, each with 8-bits
byte_data = np.frombuffer(bit_array.bytes, dtype=np.uint8)
Collaborator:
Do you need to create the bit_array at all, or can you just grab the bytes from binary_data directly? I don't know what the .bytes attribute is doing, whether it returns utf-8 bytes or something else, but I'm guessing there is a way to avoid bitstring here.

Suggested change:

Before:
bit_array = bitstring.ConstBitStream(bin=binary_data)
# chunk bit array into 1260 units, each with 8-bits
byte_data = np.frombuffer(bit_array.bytes, dtype=np.uint8)

After:
# chunk bit array into 1260 units, each with 8-bits
byte_data = np.frombuffer(bytes(binary_data), dtype=np.uint8)

Contributor (author):
Yeah, I tried that, but it wouldn't work. If I call bytes(binary_data), it expects a bytes-like object. And to convert a binary string to a bytes-like object, it looks like I need to do this:

binary_string = "0b0110011001100001011011000110011001100101"
bytes_object = bytes(int(binary_string, 2).to_bytes((len(binary_string) + 7) // 8, byteorder='big'))

That looks more complicated and confusing, which is why I didn't do it. Do you know another way, though?

Collaborator:

Ugh, I forgot that it was a string :( I was thinking it was already a BytesIO object. I think you're right that that is what you'd have to do. I agree it looks more complicated and confusing, but it removes a potentially slow step and an import. You could also get rid of some of the bit padding and checks, though, since you already know the exact length in bytes:

int(s, 2).to_bytes(len(s) // 8, byteorder='big')
# Or since you know the exact length already just use it directly
int(s, 2).to_bytes(1260, byteorder='big')

One other question: is binary_data really a string, or is it already a bitstream that space_packet_parser returns to you, so that you could just call binary_data.bytes directly on that object instead?

Contributor (author):

The binary_data that space_packet_parser returns is a string type, so I can't call .bytes on it. The example I wrote above was wrong; it actually looks like this:

binary_string = "0000000011"  # without 0b at the start

Sorry for the confusion. But I think this suggestion makes sense:

int(s, 2).to_bytes(1260, byteorder='big')
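For reference, a self-contained sketch of that round trip (the short string here is illustrative; the real SWE payload is exactly 1260 bytes):

```python
import numpy as np

# illustrative binary string as returned by space_packet_parser (no "0b" prefix)
binary_data = "0000000011110011"
n_bytes = len(binary_data) // 8  # 1260 for the real SWE science payload

# parse the base-2 digits into an int, then serialize as big-endian bytes
byte_data = int(binary_data, 2).to_bytes(n_bytes, byteorder="big")
print(byte_data)  # b'\x00\xf3'

# view the bytes as unsigned 8-bit compressed counts
raw_counts = np.frombuffer(byte_data, dtype=np.uint8)
print(raw_counts)  # [  0 243]
```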

imap_processing/swe/l1a/swe_science.py (outdated; resolved)
imap_processing/swe/l1a/swe_science.py (outdated; resolved)

dataset = xr.Dataset(
    {"SCIENCE_DATA": science_xarray},
    coords={"met_time": metadata_arrays["SHCOARSE"]},
Collaborator:

I think we will all be using some form of "met_time", so is that what we are using on all of the other instruments as well? Or do we want to call the variable "MET", which I've seen in most of the POC documents?

We should also put in a short_description tag with the label "Mission Elapsed Time" and a units field of "seconds since start of the mission", or something like that, as well.

Contributor (author):

I am not sure how we will use this, tbh. I thought that this packet creation time could be something of use.

But I did change it to have those inputs: I created an xr.DataArray for SHCOARSE and added those attrs.

imap_processing/swe/l1a/swe_science.py (outdated; resolved)
Comment on lines 140 to 144
# read raw data as binary array using bitstring
# Eg. "0b0101010111"
bit_array = bitstring.ConstBitStream(bin=binary_data)
# chunk bit array into 1260 units, each with 8-bits
byte_data = np.frombuffer(bit_array.bytes, dtype=np.uint8)
Contributor (author):

Suggested change:

Before:
# read raw data as binary array using bitstring
# Eg. "0b0101010111"
bit_array = bitstring.ConstBitStream(bin=binary_data)
# chunk bit array into 1260 units, each with 8-bits
byte_data = np.frombuffer(bit_array.bytes, dtype=np.uint8)

After:
# read raw data
binary_data = data_packet.data["SCIENCE_DATA"].raw_value
# read binary string as an int and then convert it to bytes.
# Eg. "0000000011110011" --> b'\x00\xf3'
byte_data = int(binary_data, 2).to_bytes(1260, byteorder='big')
# convert bytes to numpy array of uint8
raw_counts = np.frombuffer(byte_data, dtype=np.uint8)

@greglucas Does this change look ok?

Collaborator:

Yeah, I personally think that looks better, and then you can remove the explicit dependency from our requirements. But it looks good either way to me!

Contributor (author):

Perfect. I can remove the import in my code, but I can't remove bitstring from the dependencies because the IDEX code is using it.


science_xarray = xr.DataArray(
    science_array,
    dims=["number_of_packets", "seconds", "energy_steps", "cem_counts"],
Contributor:

I'm not quite sure how to change this around, but when we run the xarray-to-CDF conversion, time should be the first dimension.

Contributor:

Actually, should the first dimension be number_of_packets? It seems like "number_of_packets" could just be "met_time"; then you don't need the intermediate dimension of "number_of_packets". You'd also need to change the dimensions of "met_time" below to just be "met_time" (a variable is allowed to be its own dimension within xarray).

Contributor (author):

That's good to know. I didn't know that. I will write a suggested change based on this discussion in a new conversation.


# We know we can only have 8 bit numbers input, so iterate over all
# possibilities once up front
decompression_table = np.array([uncompress_counts(i) for i in range(256)])
Contributor:

If the table is static and doesn't change, does it make sense to just have it live in a csv file somewhere? I guess it's less flexible in case the table needs to be changed, but it might help people understand what is going on a little better.

Contributor (author):

They said that it will hardly ever change. But do you mean run the uncompression once for all 256 values and store the results in a csv, then read the csv and use it directly? That could work (see the sketch below).

I will be using a lookup table for L1B that could change sometimes. I was going to ask you about how to store that lookup table, and I think you gave me ideas for that. Thank you!
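A minimal sketch of that idea, assuming uncompress_counts (this PR's 8-bit decompression function) is importable from swe_science and using a hypothetical file name:

```python
import numpy as np

from imap_processing.swe.l1a.swe_science import uncompress_counts  # from this PR

# run the uncompression once for every possible 8-bit input value
decompression_table = np.array([uncompress_counts(i) for i in range(256)])

# store it next to the code (hypothetical file name)
np.savetxt("swe_decompression_table.csv", decompression_table, fmt="%d", delimiter=",")

# at runtime, read the csv and use it directly instead of recomputing
decompression_table = np.loadtxt("swe_decompression_table.csv", dtype=np.int64, delimiter=",")
```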

description="Mission elapsed time",
units="seconds since start of the mission",
),
)
Contributor:

Is this time since launch? At some point we should convert the time to a Unix timestamp (seconds since 1970). The xarray_to_cdf routine has a flag to convert a Unix timestamp to CDF_TT2000 time.

Contributor (author):

I totally agree; I am not sure of that myself. I asked Nick Dutton, and his guess is that it may be the time when we start getting the first series of data packets from the spacecraft, or the time when we start getting science data. We don't know for sure, though.

Collaborator:

That is a good question of when we want to convert time, because you can't get to a Unix timestamp from MET without using SPICE. So I think we'll need @laspsandoval's routines to handle that conversion.

Can we convert to Unix time right away in L1A? Or do we need to carry MET around in every file for the SPICE queries? We can presumably carry around both MET and Unix time, just under different variable names.
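A minimal sketch of carrying both along, with a placeholder epoch offset (the real MET-to-Unix conversion needs SPICE, so the constant below is purely illustrative):

```python
import numpy as np
import xarray as xr

met = np.array([100, 101, 102], dtype=np.int64)  # toy MET values in seconds

# HYPOTHETICAL: a fixed offset stands in for the real SPICE-based
# conversion (which must account for leap seconds); do not use as-is
MET_EPOCH_UNIX = 0
unix_time = met + MET_EPOCH_UNIX

# keep MET as the coordinate and carry Unix time as a separate variable
dataset = xr.Dataset(coords={"met_time": met})
dataset["unix_time"] = xr.DataArray(unix_time, dims=["met_time"])
```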

# where 1260 = 15 seconds x 12 energy steps x 7 CEMs
# Take the "raw_counts" indices/counts mapping from
# decompression_table and then reshape the return
uncompress_data = np.take(decompression_table, raw_counts).reshape(15, 12, 7)
Contributor:

I had to look up what the np.take function does; I guess it does the same thing as the "fancy indexing" I'm used to! I believe that decompression_table[raw_counts] actually returns the same thing.

Contributor (author):

Same! I had to look it up too.
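A toy illustration of the equivalence (the table values here are made up):

```python
import numpy as np

# toy 4-entry "decompression" table and some compressed byte values
table = np.array([0, 10, 100, 1000])
raw_counts = np.array([3, 0, 2, 2, 1], dtype=np.uint8)

# np.take and fancy indexing give identical results for this use
assert np.array_equal(np.take(table, raw_counts), table[raw_counts])
print(table[raw_counts])  # [1000    0  100  100   10]
```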



# If appId is science, then the file should contain all data of science appId
if decom_data[0].header["PKT_APID"].raw_value == 1344:
    logging.info("Processing science data")
    swe_science(decom_data=decom_data)
Contributor:

Should this function return the output of swe_science?

Contributor (author):

Yes, good catch. I should return the data here. Then we can think about how we go from this to CDF in the future.
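A sketch of that fix, returning the processed dataset so a future caller can hand it to the CDF step:

```python
# If appId is science, then the file should contain all data of science appId
if decom_data[0].header["PKT_APID"].raw_value == 1344:
    logging.info("Processing science data")
    return swe_science(decom_data=decom_data)
```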







Comment on lines 138 to 166
science_xarray = xr.DataArray(
    science_array,
    dims=["number_of_packets", "seconds", "energy_steps", "cem_counts"],
)

met_time = xr.DataArray(
    metadata_arrays["SHCOARSE"],
    name="MET",
    dims=["number_of_packets"],
    attrs=dict(
        description="Mission elapsed time",
        units="seconds since start of the mission",
    ),
)

dataset = xr.Dataset(
    {"SCIENCE_DATA": science_xarray},
    coords={"met_time": met_time},
)

# create xarray dataset for each metadata field
for key, value in metadata_arrays.items():
    if key == "SHCOARSE":
        continue

    dataset[key] = xr.DataArray(
        value,
        dims=["number_of_packets"],
    )
@tech3371 (Contributor, author) commented Sep 20, 2023:

@bryan-harter, will it look something like this when we make time the first dimension?

Suggested change:

Before:
science_xarray = xr.DataArray(
    science_array,
    dims=["number_of_packets", "seconds", "energy_steps", "cem_counts"],
)
met_time = xr.DataArray(
    metadata_arrays["SHCOARSE"],
    name="MET",
    dims=["number_of_packets"],
    attrs=dict(
        description="Mission elapsed time",
        units="seconds since start of the mission",
    ),
)
dataset = xr.Dataset(
    {"SCIENCE_DATA": science_xarray},
    coords={"met_time": met_time},
)
# create xarray dataset for each metadata field
for key, value in metadata_arrays.items():
    if key == "SHCOARSE":
        continue
    dataset[key] = xr.DataArray(
        value,
        dims=["number_of_packets"],
    )

After:
met_time = xr.DataArray(
    metadata_arrays["SHCOARSE"],
    name="met_time",
    dims=["met_time"],
    attrs=dict(
        description="Mission elapsed time",
        units="seconds since start of the mission",
    ),
)
science_xarray = xr.DataArray(
    science_array,
    dims=["met_time", "seconds", "energy_steps", "cem_counts"],
)
dataset = xr.Dataset(
    {"SCIENCE_DATA": science_xarray},
    coords={"met_time": met_time},
)
# create xarray dataset for each metadata field
for key, value in metadata_arrays.items():
    if key == "SHCOARSE":
        continue
    dataset[key] = xr.DataArray(
        value,
        dims=["met_time"],
    )

description="Mission elapsed time",
units="seconds since start of the mission",
),
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree. I am not sure of that myself. I asked Nick Dutton and his guess is that it may be the time when we start getting first series of data packet from spacecraft or time when we start getting science data. We don't know for sure though.

* added generators for hit and lo, updated telem generator

* removed hit and lo files

* better None/NAN handling, using both short and long description tags
@sdhoyt (Contributor) left a comment

Great job!

# read binary string to an int and then convert it to
# bytes. This is to convert the string to bytes.
# Eg. "0000000011110011" --> b'\x00\xf3'
byte_data = int(binary_data, 2).to_bytes(1260, byteorder="big")
Contributor:

Where are you getting the 1260 byte length from?

Contributor (author):

1260 comes from 15 seconds x 12 energy steps x 7 CEMs. I had a comment about it before but removed it in recent changes; I will add it back. Thank you!
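Putting the pieces of this thread together, a runnable sketch of the decode path (the all-zero payload is a stand-in for real packet data):

```python
import numpy as np

# stand-in payload: 1260 bytes (10080 bits) of zeros instead of real data
binary_data = "0" * (1260 * 8)

# binary string -> big-endian bytes -> uint8 compressed counts
byte_data = int(binary_data, 2).to_bytes(1260, byteorder="big")
raw_counts = np.frombuffer(byte_data, dtype=np.uint8)

# 1260 = 15 seconds x 12 energy steps x 7 CEMs
print(raw_counts.reshape(15, 12, 7).shape)  # (15, 12, 7)
```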

@tech3371 (Contributor, author):

I have addressed all of the feedback and created an issue for the time-conversion question; we will revisit it in a future PR. Going to squash the commits and merge this PR.

@tech3371 merged commit b1a7b65 into IMAP-Science-Operations-Center:dev on Sep 21, 2023
12 checks passed
@tech3371 deleted the dev branch on September 21, 2023 23:45
@tech3371 restored the dev branch on September 21, 2023 23:53
maxinelasp pushed a commit to maxinelasp/imap_processing that referenced this pull request on Sep 29, 2023:
First implementation of the SWE L1A algorithm. This decoms raw data, uncompresses raw counts, and saves the data to an xarray dataset.
maxinelasp pushed a commit to maxinelasp/imap_processing that referenced this pull request on Oct 2, 2023:
First implementation of the SWE L1A algorithm. This decoms raw data, uncompresses raw counts, and saves the data to an xarray dataset.
@tech3371 linked an issue on Oct 4, 2023 that may be closed by this pull request
@tech3371 modified the milestones (SDC SIT-3; Decom and L1A Complete) on Oct 4, 2023
laspsandoval pushed a commit to laspsandoval/imap_processing that referenced this pull request on Nov 15, 2023:
First implementation of the SWE L1A algorithm. This decoms raw data, uncompresses raw counts, and saves the data to an xarray dataset.
@tech3371 removed the "Ins: SWE" (Related to the SWE instrument) label on Oct 14, 2024
Labels: Level: L1 (Level 1 processing)
Projects: Status: Done
Development: Successfully merging this pull request may close these issues: "Uncompress counts for SWE_SCIENCE", "Implement L1A algorithm for SWE"
5 participants