When searching for all Beach Lab data, OpenGrid only displays data through the 2016 beach season. We recently reorganized the data on the data portal, moving the primary timestamp from `culture_sample_1_timestamp` to `dna_sample_timestamp`. It appears that Plenario has not been updating the beach data.
This move creates an issue because Plenario expects a consistent date field. Without a timestamp in `culture_sample_1_timestamp`, the ETL will fail. The problem is that if Plenario switches the primary timestamp to the new field, it will not ingest the older data that has no DNA test.
@levyj - we might need to reorganize the beach data again. I can think of at least two options: (1) have a master timestamp that corresponds to the day of the test, perhaps the first available test for that day, or (2) move to a more granular row breakdown so that each row represents a particular kind of test (i.e., DNA or culture) at a beach on a day. Any other suggestions or thoughts?
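To make option (2) concrete, here is a rough sketch of splitting one wide record into one row per test. It assumes records arrive as plain dictionaries; the output keys `test_type` and `timestamp` are hypothetical names, not fields in the current dataset, and the sample values are made up.

```python
from typing import Dict, List, Optional

# Source timestamp fields named in this issue, paired with a
# hypothetical label for the kind of test each one belongs to.
TEST_FIELDS = (
    ("culture", "culture_sample_1_timestamp"),
    ("dna", "dna_sample_timestamp"),
)


def split_by_test(record: Dict[str, Optional[str]]) -> List[Dict[str, Optional[str]]]:
    """Return one row per test that actually has a timestamp."""
    rows = []
    for test_type, field in TEST_FIELDS:
        timestamp = record.get(field)
        if timestamp:  # skip tests that were not run (None or empty string)
            rows.append(
                {
                    "beach": record.get("beach"),
                    "test_type": test_type,  # hypothetical column
                    "timestamp": timestamp,  # hypothetical column, never null
                }
            )
    return rows


# Illustrative record with both tests on the same day.
wide = {
    "beach": "Example Beach",
    "culture_sample_1_timestamp": "2017-07-01T12:00:00",
    "dna_sample_timestamp": "2017-07-01T06:30:00",
}
rows = split_by_test(wide)
```

With this shape every row has exactly one non-null timestamp, at the cost of changing the row meaning from "beach on a day" to "test at a beach on a day".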
I think I prefer the master timestamp. We are already doing something similar with latitude, longitude, location, and beach. We pull from the culture value if it is present, otherwise from the DNA value.
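A minimal sketch of that fallback rule applied to the timestamp, assuming each record is available as a dictionary. The function name `master_timestamp` and the sample values are illustrative only; how the data portal would actually compute the column is not shown here.

```python
from typing import Dict, Optional


def master_timestamp(record: Dict[str, Optional[str]]) -> Optional[str]:
    """Prefer the culture timestamp; fall back to the DNA timestamp."""
    for field in ("culture_sample_1_timestamp", "dna_sample_timestamp"):
        value = record.get(field)
        if value:  # treats None and empty strings as missing
            return value
    return None  # neither test has a timestamp


# Illustrative records: an older culture-only row and a newer DNA-only row.
old_row = {"culture_sample_1_timestamp": "2016-07-01T12:00:00", "dna_sample_timestamp": None}
new_row = {"culture_sample_1_timestamp": None, "dna_sample_timestamp": "2017-07-01T06:30:00"}

old_master = master_timestamp(old_row)
new_master = master_timestamp(new_row)
```

Both the older and the newer rows end up with a populated value, so a single field could serve as the primary timestamp. Note that this follows the culture-first rule described above, not the "first available test for that day" variant, which would compare the two values instead.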
To clarify, though (and I think this is for @HeyZoos), if a date or location value is not present for a record, does the ETL fail outright or merely skip that record?
@HeyZoos - Good to know. I am surprised this has not bitten us earlier. Nulls from missing values are not unheard of in datasets, especially in the location fields.
/cc @nicklucius and @HeyZoos as FYI.