Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Repeat observation not allowed in PUT brapi/v1/observations anymore #3630

Open
ctucker3 opened this issue Jul 20, 2021 · 8 comments
Open

Repeat observation not allowed in PUT brapi/v1/observations anymore #3630

ctucker3 opened this issue Jul 20, 2021 · 8 comments
Assignees
Labels
Priority: High Issue/PR significantly impacts users. Type: Implementation Issue proposes a non-feature change to implementation.

Comments

@ctucker3
Copy link
Contributor

There was a changed in this PR:

b339934

So that repeat observations are now ignored. Not sure if this was an intentional change, but it would be useful to still be able to store repeat observations. @bellerbrock Do you remember if you had a reason for that change?

@MFlores2021
Copy link
Contributor

it affects v2 /observations also

@ctucker3
Copy link
Contributor Author

Also, the guilty line is 568.

@lukasmueller
Copy link
Member

needs more discussion :-)
Is this a feature or a bug?
Others have requested upload of multiple observations per trial and trait...
Do not allow values with the same timestamp though
The problem is the download, it assumes one value per trial/trait, so that needs to be fixed

@ctucker3
Copy link
Contributor Author

Not sure its original intention, but the observations parse method in repos/sgn/lib/CXGN/Phenotypes/ParseUpload/Plugin/Observations.pm only throws an error for non-unique, observation unit, variable, timestamp combos. So it seems like if it started as a bug, somewhere along the line it turned into a feature.

@lukasmueller lukasmueller added Priority: High Issue/PR significantly impacts users. Type: Implementation Issue proposes a non-feature change to implementation. labels Aug 2, 2021
@bellerbrock
Copy link
Member

Definitely a complicated one. Historically, multiple measurements over time has usually been handled via trait ontologies. First built into the ontology itself (CBSD 3-month eval, 6-month eval, 9-month eval, etc.), then factored out into independent time ontologies (day1-day365, week1-week52, month1-month12 ) that can be combined with the core ontology via 'Compose a new trait'.

Until recently most data has not had timestamps attached, and from what I can tell the default behavior (that repeat observations of the same trait would be stored if the user didn't chose to overwrite) was an oversight/long-running bug. It caused strange behavior like negative values in the summary stats %missing column on trial detail pages, and was not accommodated for by any analysis tools or download formats.

I like the idea of handling things differently though. Seems like there's a lot of demand for storing repeated measurements differentiated just via timestamp, especially with the fieldbook -breedbase api connection making observation submissions with timestamps so easy. It'll mean lots of changes though, to upload parsers, summary and analysis tools, and download formats.

Lukas is back at then end of next week, maybe we can discuss in more detail then?

@MFlores2021
Copy link
Contributor

Sounds like a good idea. Let me know when works better

@bellerbrock
Copy link
Member

Features from the discussion today:

Upload:

  • Separate upload phenotypes tool into 'upload phenotypes' and 'update phenotypes', analagous to BrAPI /observations POST and PUT, respectively.
  • Improve data checks. Similar to diagram below but check for uniqueness of observationunit+ traitname + timestamp before checking value

upload decisions

Display:

  • Fix % missing calulations in summary stats table to handle multiple observations
  • Add additional summary features (table with timestamps? time series visualization?) to display multiple observations.

Download:

  • Add option to existing download format to extract single value from multiple observations. Choice to calculate avg, or use first value or latest value
  • Add new download format consistent with fieldbook database format (column for observationunits, column for trait names, column for values, column for timestamps, column for operators)
  • Explore additional download formats. Maybe append '_timestamp' to trait names in column header and append timestamp value to trait value in the spreadsheet cell, with multiple measurements separated by semicolons?

@hkmanching
Copy link

@lukasmueller I uploaded a phenotype file (in fieldbook export format) that had multiple timestamps. I realized it only kept the data with the first timestamp. I went back and split the data by date to upload individually and I was able to upload the second timestamp data okay, however, the 3rd timestamp through an error (see attachment) and I can not upload anymore with out this error. I also still can't delete trials or trait data in the database, so I can't go back and try again. I think this relates to this issue.
Screenshot 2024-07-04 at 11 20 40 AM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Priority: High Issue/PR significantly impacts users. Type: Implementation Issue proposes a non-feature change to implementation.
Projects
None yet
Development

No branches or pull requests

5 participants