MNT: Cdflib update #991

Merged
merged 8 commits into from
Oct 11, 2024
@@ -12,7 +12,6 @@ default_attrs: &default
FIELDNAM: ""
FILLVAL: -9223372036854775808
FORMAT: I12
LABLAXIS: ""
REFERENCE_POSITION: ""
RESOLUTION: ""
SCALETYP: linear
21 changes: 19 additions & 2 deletions imap_processing/cdf/config/imap_glows_l1b_variable_attrs.yaml
@@ -5,6 +5,23 @@ max_uint16: &max_uint16 65535
min_epoch: &min_epoch -315575942816000000
max_epoch: &max_epoch 3155630469184000000

# <=== Label Attributes ===>
# LABL_PTR_i expects VAR_TYPE of metadata with char data type.
# We need to define this if we have DEPEND_1 or more.
# TODO: I am not sure what the FIELDNAM should be.
# I tried best to match this: https://spdf.gsfc.nasa.gov/istp_guide/variables.html#Metadata_eg1
within_the_second_label:
CATDESC: Direct events recorded in individual seconds
FIELDNAM: Direct events within a second
FORMAT: A5
VAR_TYPE: metadata

bins_label:
CATDESC: Histogram bin number
FIELDNAM: Bin number
FORMAT: A4
VAR_TYPE: metadata

default_attrs: &default_attrs
# TODO: Remove unneeded attributes once SAMMI is fixed
RESOLUTION: ' '
@@ -108,10 +125,10 @@ histogram:
CATDESC: Histogram of photon counts in scanning-circle bins
DEPEND_0: epoch
DEPEND_1: bins
LABL_PTR_1: bins_label
FIELDNAM: Histogram of photon counts
FORMAT: I4
DISPLAY_TYPE: time_series
LABLAXIS: Counts
FILL_VAL: *max_uint16
UNITS: counts
VAR_TYPE: data
@@ -566,10 +583,10 @@ direct_event_glows_times:
direct_event_pulse_lengths:
<<: *support_data_defaults
DEPEND_1: within_the_second
LABL_PTR_1: within_the_second_label
VAR_TYPE: data
CATDESC: Pulse lengths for direct events
FIELDNAM: Pulse lengths for direct events
LABLAXIS: Pulse lengths

missing_packets_sequence: # Used to be missing_packets_sequence
<<: *support_data_defaults
10 changes: 9 additions & 1 deletion imap_processing/cdf/config/imap_hi_variable_attrs.yaml
@@ -1,3 +1,11 @@
# <=== Label Attributes ===>
# LABL_PTR_i expects VAR_TYPE of metadata with char data type
hi_hist_angle_label:
CATDESC: Angle bin centers for histogram data.
FIELDNAM: ANGLE
FORMAT: A5
VAR_TYPE: metadata

# ------- Default attributes section -------
default_attrs: &default
DEPEND_0: epoch
@@ -183,8 +191,8 @@ hi_hist_counters:
FIELDNAM: "{counter_name} histogram"
VALIDMAX: 4095
DEPEND_1: angle
LABL_PTR_1: angle_label
FORMAT: I4
LABLAXIS: "{counter_name}"

# ======= L1B DE Section =======
hi_de_coincidence_type:
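
The hunk above points `LABL_PTR_1` at a separate label variable, so the pointer only works if that variable actually exists in the dataset. The following is a minimal sketch of that lookup in plain Python, not the check that cdflib performs; the function name `find_missing_label_targets` and the attribute dictionaries are made up for illustration.

```python
import re

# Matches LABL_PTR_1, LABL_PTR_2, ... (the "LABL_PTR_i" family).
LABL_PTR_PATTERN = re.compile(r"^LABL_PTR_\d+$")


def find_missing_label_targets(variables: dict) -> list:
    """Return (variable, attribute, target) for each LABL_PTR_i whose target is absent.

    `variables` maps a variable name to its attribute dictionary. This is a
    hypothetical stand-in for a dataset, not a cdflib or xarray structure.
    """
    missing = []
    for var_name, attrs in variables.items():
        for attr_name, target in attrs.items():
            if LABL_PTR_PATTERN.match(attr_name) and target not in variables:
                missing.append((var_name, attr_name, target))
    return missing


# Illustrative attributes only; the label variable has not been added yet.
variables = {
    "angle": {"VAR_TYPE": "support_data"},
    "qual_ab": {"DEPEND_1": "angle", "LABL_PTR_1": "angle_label"},
}
print(find_missing_label_targets(variables))

# Once the label variable is present, nothing is reported.
variables["angle_label"] = {"VAR_TYPE": "metadata", "FORMAT": "A5"}
print(find_missing_label_targets(variables))
```

Running a check like this before writing the file would surface a dangling pointer early, assuming the attribute dictionaries mirror what ends up in the dataset.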
13 changes: 12 additions & 1 deletion imap_processing/cdf/config/imap_mag_l1_variable_attrs.yaml
@@ -1,3 +1,14 @@
# <=== Label Attributes ===>
# LABL_PTR_i expects VAR_TYPE of metadata with char data type.
# We need to define this if we have DEPEND_1 or more.
# TODO: I am not sure what the FIELDNAM should be.
# I tried best to match this: https://spdf.gsfc.nasa.gov/istp_guide/variables.html#Metadata_eg1
direction_label:
CATDESC: magnetic field vector data
FIELDNAM: Magnetic Field Vector
FORMAT: A3
VAR_TYPE: metadata

default_attrs: &default
# Assumed values for all variable attrs unless overwritten
DEPEND_0: epoch
@@ -23,8 +34,8 @@ raw_vector_attrs:
<<: *default_coords
CATDESC: Raw unprocessed magnetic field vector data in bytes
DEPEND_1: direction
LABL_PTR_1: direction_label
FIELDNAM: Magnetic Field Vector
LABLAXIS: Raw binary magnetic field vector data
FORMAT: I3

vector_attrs:
6 changes: 4 additions & 2 deletions imap_processing/cdf/config/imap_swe_l1a_variable_attrs.yaml
@@ -60,8 +60,8 @@ default_attrs: &default
SCALETYP: linear

raw_counts:
<<: *default
CATDESC: Raw Counts stored in 8bits length
DEPEND_0: epoch
DEPEND_1: spin_angle
DEPEND_2: polar_angle
LABL_PTR_1: spin_angle_label
@@ -72,11 +72,12 @@ raw_counts:
UNITS: counts
VALIDMAX: 255
VALIDMIN: 0
FILLVAL: -9223372036854775808
VAR_TYPE: data

science_data:
<<: *default
CATDESC: Decompressed Counts
DEPEND_0: epoch
DEPEND_1: spin_angle
DEPEND_2: polar_angle
LABL_PTR_1: spin_angle_label
@@ -87,6 +88,7 @@ science_data:
UNITS: counts
VALIDMAX: 66539
VALIDMIN: 0
FILLVAL: -9223372036854775808
VAR_TYPE: data

shcoarse:
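
The numeric limits in these attribute files are standard integer bounds. As a quick arithmetic check (the constant names below are illustrative and do not come from the repository), the `FILLVAL` used above is the minimum signed 64-bit integer, the `VALIDMAX` of 255 for `raw_counts` is the largest unsigned 8-bit value, and the `max_uint16` anchor of 65535 is the largest unsigned 16-bit value:

```python
# Illustrative constant names; only the numeric values appear in the YAML files.
INT64_MIN = -(2**63)
UINT8_MAX = 2**8 - 1
UINT16_MAX = 2**16 - 1

print(INT64_MIN)   # -9223372036854775808
print(UINT8_MAX)   # 255
print(UINT16_MAX)  # 65535
```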
6 changes: 2 additions & 4 deletions imap_processing/cdf/config/imap_swe_l1b_variable_attrs.yaml
@@ -84,8 +84,8 @@ default_attrs: &default
SCALETYP: linear

science_data:
<<: *default
CATDESC: Electron count rates organized by voltage step and spin sector and CEM
DEPEND_0: epoch
DEPEND_1: energy
DEPEND_2: angle
DEPEND_3: cem
@@ -96,7 +96,6 @@ science_data:
FIELDNAM: Counts rate by volt step and spin sector and CEM
FORMAT: E14.7
FILLVAL: -1.0000000E+31
LABLAXIS: Count Rates
UNITS: counts/sec
VALIDMAX: 0.000015514
VALIDMIN: 0
@@ -106,8 +105,8 @@ science_data:
Dividing max counts by acq_duration gave validmax

sci_step_acq_time_sec:
<<: *default
CATDESC: Acquisition time organized by voltage step and spin sector and CEM
DEPEND_0: epoch
DEPEND_1: energy
DEPEND_2: angle
DEPEND_3: cem
@@ -117,7 +116,6 @@ sci_step_acq_time_sec:
DISPLAY_TYPE: spectrogram
FIELDNAM: Acquisition by volt step and spin sector and CEM
FILLVAL: -1.0000000E+31
LABLAXIS: Count Acq Time
UNITS: sec
VAR_TYPE: support_data
VAR_NOTES: >
@@ -6,15 +6,15 @@ attribute_key:
description: >
fixed (0AD, 1900, 1970 (POSIX), J2000 (used by CDF_TIME_TT2000),
4714 BC (Julian)) or flexible (provider-defined)
required: true # NOT Required in ISTP Guide
required: false # NOT Required in ISTP Guide
overwrite: false
valid_values: null
alternate: null
RESOLUTION:
description: >
Using ISO8601 relative time format, for example: "1s" = 1 second.
Resolution provides the smallest change in time that is measured.
required: true # NOT Required in ISTP Guide
required: false # NOT Required in ISTP Guide
overwrite: false
valid_values: null
alternate: null
@@ -190,7 +190,7 @@ attribute_key:
description: >
Used to label a plot axis or to provide a heading for a data listing. This field is generally
6-10 characters. Only one of LABLAXIS or LABL_PTR_i should be present.
required: true
required: false
overwrite: false
valid_values: null
alternate: LABL_PTR_1
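
The description above says only one of `LABLAXIS` or `LABL_PTR_i` should be present on a variable. A minimal sketch of that rule as a check over one variable's attributes follows; the function name `has_conflicting_labels` and the sample dictionaries are hypothetical, and the sketch flags only the "both present" case described in the text.

```python
def has_conflicting_labels(attrs: dict) -> bool:
    """Return True when a variable carries both LABLAXIS and any LABL_PTR_i."""
    has_lablaxis = "LABLAXIS" in attrs
    has_label_ptr = any(key.startswith("LABL_PTR_") for key in attrs)
    return has_lablaxis and has_label_ptr


# Hypothetical attribute sets for a two-dimensional variable.
both = {"DEPEND_1": "bins", "LABL_PTR_1": "bins_label", "LABLAXIS": "Counts"}
pointer_only = {"DEPEND_1": "bins", "LABL_PTR_1": "bins_label"}

print(has_conflicting_labels(both))          # True
print(has_conflicting_labels(pointer_only))  # False
```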
15 changes: 15 additions & 0 deletions imap_processing/glows/l1b/glows_l1b.py
@@ -71,6 +71,12 @@ def glows_l1b(input_dataset: xr.Dataset, data_version: str) -> xr.Dataset:
dims=["bins"],
attrs=cdf_attrs.get_variable_attributes("bins_attrs"),
)
bin_label = xr.DataArray(
bin_data.data.astype(str),
name="bins_label",
dims=["bins_label"],
attrs=cdf_attrs.get_variable_attributes("bins_label"),
)

output_dataarrays = process_histogram(input_dataset)
# TODO: Is it ok to copy the dimensions from the input dataset?
@@ -79,6 +85,7 @@ def glows_l1b(input_dataset: xr.Dataset, data_version: str) -> xr.Dataset:
coords={
"epoch": data_epoch,
"bins": bin_data,
"bins_label": bin_label,
"bad_angle_flags": bad_flag_data,
"flag_dim": flag_data,
"ecliptic": eclipic_data,
@@ -107,6 +114,13 @@ def glows_l1b(input_dataset: xr.Dataset, data_version: str) -> xr.Dataset:
dims=["within_the_second"],
attrs=cdf_attrs.get_variable_attributes("within_the_second"),
)
# Add the within_the_second label to the xr.Dataset coordinates
within_the_second_label = xr.DataArray(
input_dataset["within_the_second"].data.astype(str),
name="within_the_second_label",
dims=["within_the_second_label"],
attrs=cdf_attrs.get_variable_attributes("within_the_second_label"),
)

flag_data = xr.DataArray(
np.arange(11),
@@ -119,6 +133,7 @@ def glows_l1b(input_dataset: xr.Dataset, data_version: str) -> xr.Dataset:
coords={
"epoch": data_epoch,
"within_the_second": within_the_second_data,
"within_the_second_label": within_the_second_label,
"flag_dim": flag_data,
},
attrs=cdf_attrs.get_global_attributes("imap_glows_l1b_de"),
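
The label variables in this file are string copies of numeric coordinates, and their YAML entries carry character formats such as `A4` and `A5`. Below is a small sketch of how such a width could be derived, assuming the `A<n>` format is meant to hold the longest label string. That link is an assumption and is not stated in the PR. The helper name `char_format_for` and the sample values are hypothetical, and plain `str()` stands in for the `.astype(str)` call used in the diff.

```python
def char_format_for(labels: list) -> str:
    """Return a Fortran-style character format wide enough for every label."""
    width = max(len(label) for label in labels)
    return f"A{width}"


# Made-up numeric coordinate values, converted to strings one by one.
bin_labels = [str(value) for value in range(0, 1000, 250)]

print(bin_labels)
print(char_format_for(bin_labels))
```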
7 changes: 7 additions & 0 deletions imap_processing/hi/l1a/histogram.py
@@ -108,6 +108,13 @@ def allocate_histogram_dataset(num_packets: int) -> xr.Dataset:
dims=["angle"],
attrs=attr_mgr.get_variable_attributes("hi_hist_angle"),
)
coords["angle_label"] = xr.DataArray(
Contributor

Label variables should not be added to the coordinates list. They are just variables.

Contributor Author

LABL_PTR_i confused me a lot. The changes in this file and in the YAML files are related to it. This requirement comes from SPDF and ISTP: when your data is an array with more than one dimension, it requires a label for each DEPEND_1 and onward. We need to create an xr.DataArray for each of those labels and add them to the coordinates; otherwise cdflib will throw an error.

FAILED imap_processing/tests/hi/test_l1a.py::test_app_hist_decom - cdflib.xarray.xarray_to_cdf.ISTPError: ISTP Compliance Warning: variable qual_ab listed angle_label as its LABL_PTR_1.  However, it was not found in the dataset.

Collaborator

Suggested change
coords["angle_label"] = xr.DataArray(
data_vars["angle_label"] = xr.DataArray(

I think that is what @subagonsouth is suggesting. cdflib fails with that as well? Coordinates should be optional, so this might be an issue to raise with cdflib.

Contributor Author

@tech3371 tech3371 Oct 11, 2024

Moving this to a data variable. I will refactor the other instruments' labels in a new PR. I do want to point out that other past missions stored their coordinate labels in coordinates as well. I wonder if they did that to make it easier to select and filter data later on. That's something we should think about. I will check with Andriy first before I refactor the other instruments.

coords["angle"].data.astype(str),
name="angle_label",
dims=["angle_label"],
attrs=attr_mgr.get_variable_attributes("hi_hist_angle_label"),
)

data_vars = dict()
data_vars["ccsds_met"] = xr.DataArray(
np.empty(num_packets, dtype=np.uint32),
12 changes: 11 additions & 1 deletion imap_processing/mag/l0/decom_mag.py
@@ -131,6 +131,12 @@ def generate_dataset(
dims=["direction"],
attrs=attribute_manager.get_variable_attributes("raw_direction_attrs"),
)
direction_label = xr.DataArray(
direction.astype(str),
name="direction_label",
dims=["direction_label"],
attrs=attribute_manager.get_variable_attributes("direction_label"),
)

# TODO: Epoch here refers to the start of the sample. Confirm that this is
# what mag is expecting, and if it is, CATDESC needs to be updated.
@@ -151,7 +157,11 @@ def generate_dataset(
logical_id = f"imap_mag_l1a_{mode.value.lower()}-raw"

output = xr.Dataset(
coords={"epoch": epoch_time, "direction": direction},
coords={
"epoch": epoch_time,
"direction": direction,
"direction_label": direction_label,
},
attrs=attribute_manager.get_global_attributes(logical_id),
)

3 changes: 3 additions & 0 deletions imap_processing/tests/codice/test_codice_l1a.py
@@ -89,6 +89,7 @@ def test_l1a_data(request) -> xr.Dataset:
return dataset


@pytest.mark.xfail(reason="Epoch variable data needs to monotonically increase")
Contributor Author

I created a new issue to capture this: #992

@pytest.mark.parametrize(
"test_l1a_data, expected_logical_source",
list(zip(TEST_PACKETS, EXPECTED_LOGICAL_SOURCE)),
@@ -110,6 +111,7 @@ def test_l1a_cdf_filenames(test_l1a_data: xr.Dataset, expected_logical_source: s
assert dataset.attrs["Logical_source"] == expected_logical_source


@pytest.mark.xfail(reason="Epoch variable data needs to monotonically increase")
@pytest.mark.parametrize(
"test_l1a_data, expected_shape",
list(zip(TEST_PACKETS, EXPECTED_ARRAY_SHAPES)),
@@ -167,6 +169,7 @@ def test_l1a_data_array_values(test_l1a_data: xr.Dataset, validation_data: Path)
)


@pytest.mark.xfail(reason="Epoch variable data needs to monotonically increase")
Contributor Author

Captured this work in #993.

@pytest.mark.parametrize(
"test_l1a_data, expected_num_variables",
list(zip(TEST_PACKETS, EXPECTED_NUM_VARIABLES)),
1 change: 1 addition & 0 deletions imap_processing/tests/mag/test_mag_l1b.py
@@ -66,6 +66,7 @@ def test_mag_attributes(mag_l1a_dataset):
assert output.attrs["Data_level"] == "L1B"


@pytest.mark.skip(reason="Epoch variable data need to be monotonically increasing")
def test_cdf_output():
l1a_cdf = load_cdf(
Path(__file__).parent / "imap_mag_l1a_burst-magi_20231025_v001.cdf"
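
The tests above are marked xfail or skipped because the epoch data must increase monotonically. Here is a minimal sketch of such a check on a plain list of epoch values, assuming "monotonically increasing" means strictly increasing (the source does not say whether repeated values are allowed). The function names and sample values are hypothetical.

```python
from typing import Optional


def is_strictly_increasing(values: list) -> bool:
    """Return True when every value is larger than the one before it."""
    return all(earlier < later for earlier, later in zip(values, values[1:]))


def first_non_increasing_index(values: list) -> Optional[int]:
    """Return the index of the first value that does not exceed its predecessor."""
    for index in range(1, len(values)):
        if values[index] <= values[index - 1]:
            return index
    return None


# Made-up epoch values in nanoseconds; the third entry repeats the second.
epochs = [100, 200, 200, 300]

print(is_strictly_increasing(epochs))      # False
print(first_non_increasing_index(epochs))  # 2
```

Reporting the first offending index, not just a boolean, can make it easier to trace a bad epoch back to a specific packet.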
18 changes: 10 additions & 8 deletions poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion pyproject.toml
@@ -31,7 +31,7 @@ classifiers = [
]

[tool.poetry.dependencies]
cdflib = "==1.2.6"
cdflib = "==1.3.1"
imap-data-access = ">=0.5.0"
python = ">=3.9,<4"
space_packet_parser = "^4.2.0"