-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1026 logical source parser #1044
1026 logical source parser #1044
Conversation
imap_processing/hi/utils.py
Outdated
regex_str = ( | ||
r"^(?P<mission>imap)_" # Required mission | ||
r"(?P<instrument>[^_]+)_" # Required instrument | ||
r"(?P<data_level>[^_]+)_" # Required data level | ||
r"((?P<sensor>\d{2}sensor)?-)?" # Optional sensor number | ||
r"(?P<descriptor>[^_]+)" # Required descriptor | ||
r"(_(?P<start_date>\d{8}))?" # Optional start date | ||
r"(-repoint(?P<repointing>\d{5}))?" # Optional repointing field | ||
r"(?:_v(?P<version>\d{3}))?" # Optional version | ||
r"(?:\.(?P<extension>cdf|pkts))?$" # Optional extension | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need other fields or can you do something similar to this and then apply regex on descriptor
? I am afraid of losing track of this in the future. I am wondering now if it's an feature we should implement in imap-data-access
, especially update this function.
regex_str = ( | |
r"^(?P<mission>imap)_" # Required mission | |
r"(?P<instrument>[^_]+)_" # Required instrument | |
r"(?P<data_level>[^_]+)_" # Required data level | |
r"((?P<sensor>\d{2}sensor)?-)?" # Optional sensor number | |
r"(?P<descriptor>[^_]+)" # Required descriptor | |
r"(_(?P<start_date>\d{8}))?" # Optional start date | |
r"(-repoint(?P<repointing>\d{5}))?" # Optional repointing field | |
r"(?:_v(?P<version>\d{3}))?" # Optional version | |
r"(?:\.(?P<extension>cdf|pkts))?$" # Optional extension | |
) | |
regex_str = ( | |
r"((?P<sensor>\d{2}sensor)?-)?" # Optional sensor number | |
r"(?P<descriptor>[^_]+)" # Required descriptor | |
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, perhaps it is overkill to parse all of the fields. I doubt I will need the mission
or instrument
field but I could see needing the start_date
or repoint
. I based this regex on the one in the ScienceFilePath
class. I don't think that one should be modified because it needs to be very strict to have invalid file names fail where I don't want to be strict. Parsing using split()
could be done but it is lots of hard coding indices and gets messy with some of the optional fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you like to keep it as it is, that's ok with me. I do want @laspsandoval and @sdhoyt to review these two function especially and see if they can benefit from it. If so, then we may want to move this in common utils function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't come across a situation where I need to parse anything from the logical source. I agree with Tim though about the start_date
and pointing
. I could see parsing those two fields being useful for instruments that need to reprocess because there's data spread across CCSDS files for two pointings and they need to put some bits of data from Day 2 into the Day 1 data product. In that case I think there would be multiple L0 dependency files with different dates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tech3371, I decided to move this function into the cdf/utils.py module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tim, that's a good place to put it since it's related to CDF attrs. Cool. I will approve this PR then. Then if others find that function helpful, it's there to use.
}, | ||
), | ||
( | ||
"imap_hi_l1c_90sensor-pset_20250415_v001.cdf", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should add a pointing number to this test case if the regex parsing stays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be helpful.
94324f5
to
bb4115e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Besides the previous comment about may be moving it to common utils, everything else looks good to me. I will let Sean and Laura approve this PR.
}, | ||
), | ||
( | ||
"imap_hi_l1c_90sensor-pset_20250415_v001.cdf", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be helpful.
…d or full filename
bb4115e
to
9a4b6e9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add test with repointing number in the filename like you were thinking? Otherwise, it looks good to me.
978bae9
into
IMAP-Science-Operations-Center:dev
Change Summary
Overview
Updated Files
Closes: #1026