Beach Lab Data not available in OpenGrid since 2016 #309

Closed
tomschenkjr opened this issue Jul 5, 2017 · 3 comments

When searching for all Beach Lab data, OpenGrid only displays data through the 2016 beach season. We recently reorganized the data on the data portal and the primary timestamp moved from culture_sample_1_timestamp to dna_sample_timestamp. It appears that Plenario has not been updating the beach data.

This move creates an issue because Plenario expects a consistent date field. Without a timestamp in culture_sample_1_timestamp, the ETL will fail. The problem is that if Plenario switches the primary timestamp to the new field, it will not ingest the old data that does not have a DNA test.

@levyj - we might need to reorganize the beach data again. I can think of at least two options: (1) have a master timestamp that corresponds to the day of the test, perhaps the first available test for that day, or (2) move to a more granular row breakdown so each row represents a particular kind of test (i.e., DNA or culture) at a beach on a day. Any other suggestions or thoughts?
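
For illustration, a rough sketch of what either option could look like, assuming pandas; only the two timestamp columns are the real portal fields, everything else is made up:

```python
import pandas as pd

# Illustrative rows only; culture_sample_1_timestamp and dna_sample_timestamp
# are the real portal fields, the beaches and values are made up.
df = pd.DataFrame({
    "beach": ["Montrose", "Oak Street"],
    "culture_sample_1_timestamp": [pd.Timestamp("2017-07-05 08:30"), pd.NaT],
    "dna_sample_timestamp": [pd.NaT, pd.Timestamp("2017-07-05 09:15")],
})

# Option 1: a master timestamp -- the earliest test timestamp available for the day.
df["master_timestamp"] = df[
    ["culture_sample_1_timestamp", "dna_sample_timestamp"]
].min(axis=1)

# Option 2: one row per test at a beach on a day, so no single test column
# has to serve as the primary timestamp.
long_form = df.melt(
    id_vars=["beach"],
    value_vars=["culture_sample_1_timestamp", "dna_sample_timestamp"],
    var_name="test_type",
    value_name="sample_timestamp",
).dropna(subset=["sample_timestamp"])
```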

/cc @nicklucius and @HeyZoos as FYI.


levyj commented Jul 5, 2017

I think I prefer the master timestamp. We are already doing something similar with latitude, longitude, location, and beach. We pull from the culture value if it is present, otherwise from the DNA value.

To clarify, though (and I think this is for @HeyZoos), if a date or location value is not present for a record, does the ETL fail outright or merely skip that record?
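
A minimal sketch of that fallback pattern, with hypothetical field names (not the actual ETL code):

```python
def coalesce(record, culture_field, dna_field):
    """Return the culture value when present, otherwise the DNA value."""
    value = record.get(culture_field)
    return value if value not in (None, "") else record.get(dna_field)

# Hypothetical record: the culture latitude is missing, so the DNA value is used.
row = {"culture_latitude": None, "dna_latitude": "41.9655"}
latitude = coalesce(row, "culture_latitude", "dna_latitude")  # -> "41.9655"
```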


HeyZoos commented Jul 5, 2017

@levyj It will fail outright, as the underlying schema expects those values to never be null.
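
A minimal sketch of why a NOT NULL constraint aborts the load outright rather than skipping the record (illustrative table and column names only, not Plenario's actual schema):

```python
from sqlalchemy import Column, DateTime, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class BeachObservation(Base):
    """Illustrative table only -- not the real Plenario schema."""
    __tablename__ = "beach_observations"
    id = Column(Integer, primary_key=True)
    beach = Column(String)
    observation_timestamp = Column(DateTime, nullable=False)  # NOT NULL

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(BeachObservation(beach="Montrose", observation_timestamp=None))
    try:
        session.commit()       # the whole transaction fails here,
    except IntegrityError:     # not just the offending record
        session.rollback()
        print("load aborted: NOT NULL constraint on observation_timestamp")
```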


levyj commented Jul 5, 2017

@HeyZoos - Good to know. I am surprised this has not bitten us earlier; nulls are not unheard of in datasets, especially in location fields.
