-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid receiver-specific behavior #160
Comments
As it is now, yes, the data would be handled incorrectly after the beam mislabelling is fixed. Once the mislabelling is fixed, my plan was to set a date constraint; e.g., if the receiver is Ka and after the fix, then proceed as usual, if Ka and before the fix, then invert the labels here. There's no way of telling that the beam labels are flipped from the SDFITS metadata. I do not like leaving it up to the user to correct this kind of problem. It has not worked well in the past. I've seen in other observatories that fixes to data issues are implemented as part of the data reduction package (e.g., DPPP for LOFAR will fix bad metadata if you are working with observations taken during specific time ranges). For receiver specific issues (like the Ka beam mislabelling), What other attributes, instead of receiver name, could we use to address this kind of problem? How much effort would be required to retroactively update the engineering fits files to label the beams correctly? |
I'm not saying that Dysh shouldn't handle this. This should be transparent to the user, but also easy for us to maintain as new quirks like this inevitably pop up. I'm also not saying this is a high priority at this stage of development -- this is a very minor thing. But one of the big issues that we have with things like configtool, SDFITS filler, the original pre-2018 version of OOF/poof is that there are lots of receiver-specific things being done all over the place, and trying to modify behavior becomes difficult. So, I'd like to make sure that we address this early Specifically what I'm worried about here is two-fold:
For example, if we continue adding cases like this, it can get out of hand fairly quickly if rx == "Rcvr26_40" and (
(datetime(2016, 1, 1) < date_obs < datetime(2016, 6, 1))
or (datetime(2019, 1, 1) < date_obs < datetime(2022, 1, 1))
):
# Switch the polarizations to match the beams.
if fdnum == 0:
plnum = 1
elif fdnum == 1:
plnum = 0
elif rx == "RcvrArray75_115" and (datetime(2023, 1, 1) < date_obs < datetime(2023, 6, 1)):
...
... Then imagine we have more receivers here, and that this sort of branching was distributed across various methods/files. Every time we add a new receiver or modify an existing one, lots of places need to change -- but which places? How do we find them? So I'd propose introducing some sort of receiver data structure, and either housing this logic there or in some sort of If we were just trying to solve the string literal issue, something like an enum is typically used for this. e.g. from enum import Enum
class Receiver(Enum):
RcvrPF_1: "RcvrPF_1"
...
Rcvr40_52: "Rcvr40_52"
Rcvr68_92: "Rcvr68_92"
RcvrArray75_115: "RcvrArray75_115"
if rx == "Rcvr_26_40": # silent failure due to typo
...
if rx == Receiver.Rcvr_26_40: # typo causes error
... But realistically we'll probably end up with is something resembling a receiver database, because you at the very least will need to map between the M&C name and the common name. Rather than having a bunch of dictionaries, it's easier to have a tabular structure, either in JSON/CSV/etc or SQLite For the OOF rewrite we made a CSV file that looked like this (I've snipped off a few columns): Modern Python has a dataclass now, so regardless of how we represented this in the DB, we'd end up with a proper from dataclasses import dataclass
@dataclass
class Receiver:
mc_name: str
abbrev: str
num_beams: int
num_pols: int
def __str__(self):
return f"{self.abbrev} ({self.mc_name})"
Rcvr26_40 = Receiver(mc_name="Rcvr26_40", abbrev="Ka", num_beams=2, num_pols=2)
# ^ These would be generated for each receiver from CSV/JSON/SQL
print(Rcvr26_40) # Prints 'Ka (Rcvr26_40)' Last thing I'll say is that doing this sort of swap silently is problematic. For example, imagine a case where a user has manually fixed their data files, because they are using them in some other pipeline. Dysh is unaware of the swap, so it swaps them back, and doesn't inform the user. How would they go about debugging this? How would they know there was anything to debug? We at the very least need to be logging that the swap has happened (I see that there's no logging in Dysh at all right now; this is something we can address separately) And a larger question: is there any way to automatically detect swapped beams? i.e. can we check for errors here, or check when beams need to be swapped based on some heuristic rather than a date range/receiver name? |
Reminder for later: There was something similar with ARGUS where two beams were mislabeled before a certain date |
Spinning this into a new issue
Originally posted by @astrofle in #153 (comment)
If I'm understanding correctly, Ka data prior to the mislabeling, and any taken after it is fixed, will be handled incorrectly in Dysh? Unless there is a way to detect the mislabeling based on FITS data, I don't think there's any way to handle this in code. We could for example provide a way of mapping labels to beams via a config file, leaving it up to the user to determine when the mappings should be inverted
In general we should avoid receiver-specific behavior (i.e. conditionals based on receiver name). There will perhaps be cases where it is unavoidable, but in general we can base our conditionals on attributes of a receiver other than its name
The text was updated successfully, but these errors were encountered: