Calibrated C14 measurements are displayed with a standard error #46

Closed
joeroe opened this issue Aug 24, 2021 · 8 comments

@joeroe
Contributor

joeroe commented Aug 24, 2021

E.g. in the measurements table. My understanding was that this was not meaningful since a calibrated distribution is (very) not normal?

@MartinHinz
Collaborator

That is true! For reasons unclear to me, a decision was made during the original development that this was reasonable. It should be changed to a 95% confidence interval, I guess? That needs to be reflected in the db, the views and the api... (api 2.0?).

@joeroe
Contributor Author

joeroe commented Aug 24, 2021

I would say so yeah. I'll leave this NULL in the upcoming import, then.

@joeroe
Contributor Author

joeroe commented May 27, 2022

We discussed this today @MartinHinz, and agreed that it would be better to move to a confidence interval soonish (continuing to code around the cal_bp/cal_std columns in the frontend is an unnecessary maintenance burden). In terms of implementing that, we don't want to calibrate the date anew every time this is needed.

Instead, we could create a table which caches the calibrated date for each combination of bp+std+a calibration curve. This is potentially a large table: all dates for 0–50000 BP, with a range of errors between 0–100 (though we have errors as high as 17789 in the current database!), and two curves for the northern and southern hemispheres = 10,000,000 rows. But it can be efficiently indexed and we can just add rows as needed.
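A minimal sketch of the "add rows as needed" idea, using a plain Ruby hash as a stand-in for the proposed table. The class name `CalibrationCache`, the curve labels and the placeholder result are assumptions for illustration only; the real calibration step and schema are not specified in this thread.

```ruby
# Hypothetical in-memory stand-in for the proposed cache table.
# Key: [bp, std, curve]; value: whatever the calibration step returns.
class CalibrationCache
  def initialize(&calibrate)
    @calibrate = calibrate # expensive calibration step, supplied by the caller
    @rows = {}
  end

  # Return the cached result, calibrating only on the first request for a key.
  def fetch(bp, std, curve)
    key = [bp, std, curve]
    @rows.fetch(key) { @rows[key] = @calibrate.call(bp, std, curve) }
  end

  def size
    @rows.size
  end
end

# Upper bound quoted above: 50,000 ages x 100 error values x 2 curves.
MAX_ROWS = 50_000 * 100 * 2

calls = 0
cache = CalibrationCache.new do |bp, std, curve|
  calls += 1
  { bp: bp, std: std, curve: curve } # placeholder, not a real calibration
end

cache.fetch(5000, 50, "intcal20")
cache.fetch(5000, 50, "intcal20") # served from the cache, no second calibration
cache.fetch(5000, 50, "shcal20")  # different curve, so a new row
```

In a database the same key would presumably become a unique index over the three columns, so the table only ever holds combinations that have actually been requested.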

@MartinHinz
Collaborator

To pick this up again: since we are now using calibrator and only display the ranges in the view, this issue is effectively solved. However, we should perhaps now store the calibration results in the database so that they become searchable (e.g. for filtering by calibrated date).

Here is what I have in mind:

New column calibrator_json, type: json

In c14s_helper:

calib = c14.calibrator_json
if calib.blank?
  # Not stored yet: calibrate once and persist the result on the record.
  ... calibrator call ...
  c14.calibrator_json = calib
  c14.save!
end

Then we can also trigger the calibration with a rake task:

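require "#{Rails.root}/app/helpers/c14s_helper"
include C14sHelper

namespace :c14 do
  # Depend on :environment so the Rails app and the C14 model are loaded.
  task calibrate: :environment do
    # find_each iterates over all records in batches.
    C14.find_each do |date|
      date.calibrator
    end
  end
end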

@joeroe
Contributor Author

joeroe commented Feb 17, 2023

Are you sure about coupling our data model so closely to the implementation of calibrator? What would it look like in e.g. API output?

For querying I imagine all we really need is the 2 sigma range.

@MartinHinz
Collaborator

The JSON output of calibrator is rather generic; here is an example:

{"date":{"bp":[5925,5920,5915,5910,5905,5900,5895,5890,5885,5880,5875,5870,5865,5860,5855,5850,5845,5840,5835,5830,5825,5820,5815,5810,5805,5800,5795,5790,5785,5780,5775,5770,5765,5760,5755,5750,5745,5740,5735,5730,5725,5720,5715,5710,5705,5700,5695,5690,5685,5680,5675,5670,5665,5660,5655,5650,5645,5640,5635,5630,5625,5620,5615,5610,5605,5600,5595,5590,5585],"probabilities":[1.6829345934613167e-05,3.103756663857961e-05,5.893881263473874e-05,0.00012607811506225024,0.0003056562260941562,0.0006926157892147185,0.0013336871077356046,0.0021008418694186183,0.002781319711032586,0.003177851958319404,0.0032433075196323462,0.003176447081104458,0.003176447081104458,0.0033096102995565663,0.003643949453715534,0.003973810455991539,0.004166187752090025,0.004588191091358068,0.004806148141163276,0.004352394576712823,0.003108329566380915,0.0023369115318920795,0.0020992314226773567,0.001820258148278945,0.0015150968338394418,0.0013760997931970191,0.0014683136711878479,0.0015651647658921218,0.0015158701940154017,0.0014224825084158622,0.0013760997931970191,0.0013768531753302362,0.0016146379007851148,0.0019858987334671688,0.002779798871490282,0.003842949450021829,0.004806426739496424,0.005283817812220969,0.005491282189160613,0.005561152085102163,0.005522224086472058,0.005347777341493072,0.0051334785844728,0.004857404582150913,0.004587840553388982,0.004530394724557036,0.004588191091358068,0.004805869337768756,0.005002505765345976,0.005047607835207576,0.005091476741426194,0.005212272660237865,0.005248973134649066,0.004587840553388982,0.003108329566380915,0.0019858987334671688,0.0015158701940154017,0.0015143232612592997,0.0017654894305608889,0.002155901930976141,0.0025209816788661406,0.003172927888382773,0.00390696013516625,0.0035758788834244814,0.0022755770626076994,0.0010078855386214797,0.0003651952662840762,0.00010147063346558377,1.6719617245594494e-05],"sigma_ranges":[{"begin":5894,"end":5802,"sigma_level":0.954},{"begin":5797,"end":5780,"sigma_level":0.954},{"begin":5769,"end":5602,"sigma_level":0.954}],"uncal_bp":5000,"uncal_error":50}}

I am happy to adapt it to whatever XRONOS needs, since it is designed first and foremost for use in XRONOS (and it includes the 2 sigma ranges by default if you specify -r).

How do you feel about this?
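For the querying use case, a rough sketch of how a single outer 2 sigma interval could be pulled out of that JSON. The sample below is shortened (the "bp" and "probabilities" arrays are truncated), the helper name `outer_range` is hypothetical, and it assumes "begin" is the older bound and "end" the younger bound, as the example output suggests.

```ruby
require "json"

# Shortened sample: "bp" and "probabilities" are truncated here; the other
# values are copied from the example output above.
sample = <<~SAMPLE
  {"date":{"bp":[5925,5920],
   "probabilities":[1.6829345934613167e-05,3.103756663857961e-05],
   "sigma_ranges":[{"begin":5894,"end":5802,"sigma_level":0.954},
                   {"begin":5797,"end":5780,"sigma_level":0.954},
                   {"begin":5769,"end":5602,"sigma_level":0.954}],
   "uncal_bp":5000,"uncal_error":50}}
SAMPLE

# Collapse the (possibly disjoint) ranges at one sigma level into a single
# outer interval. Assumes larger values are older.
def outer_range(calibrator_json, sigma_level = 0.954)
  ranges = JSON.parse(calibrator_json)
               .dig("date", "sigma_ranges")
               .select { |r| r["sigma_level"] == sigma_level }
  return nil if ranges.empty?

  { "begin" => ranges.map { |r| r["begin"] }.max,
    "end"   => ranges.map { |r| r["end"] }.min }
end

range = outer_range(sample)
```

Two integer columns holding these bounds would be enough for a simple range filter, although the outer interval hides the gaps between the three sub-ranges.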

@joeroe
Contributor Author

joeroe commented Mar 1, 2023

Fair enough. We're essentially creating a submodel within a single attribute, though, which doesn't feel quite right...

We could separate out the probability distribution and the sigma range into two attributes (still both jsonb). And in that case, do we really need to store the full probability distribution?

Thinking about future-proofing as well: an uncalibrated date can potentially have multiple calibrated dates (e.g. with different calibration curves or reservoir corrections). Following this logic, we could make C14Calibration its own model, which would bring some benefits in terms of sticking to CRUD (i.e. calibrating a date is creating a C14Calibration) and caching (we only need one C14Calibration for each unique combination of age, error and cal curve).
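To make the relationship concrete, here is a plain-Ruby sketch with no Rails involved. `CalibrationStore`, the field names, the curve labels and the `:placeholder_ranges` value are all assumptions for illustration; in the app these would presumably be ActiveRecord models with a unique index on age, error and curve.

```ruby
# Hypothetical stand-in for the proposed model.
C14Calibration = Struct.new(:bp, :std, :curve, :result)

class CalibrationStore
  def initialize
    @by_key = {}
  end

  # "Calibrating a date is creating a C14Calibration", but only once per
  # unique combination of age, error and curve. The block performs the
  # calibration and is only called when no record exists yet.
  def find_or_create(bp:, std:, curve:)
    @by_key[[bp, std, curve]] ||= C14Calibration.new(bp, std, curve, yield)
  end
end

store = CalibrationStore.new

# Two measurements with the same age and error share one calibration...
a = store.find_or_create(bp: 5000, std: 50, curve: "intcal20") { :placeholder_ranges }
b = store.find_or_create(bp: 5000, std: 50, curve: "intcal20") { raise "should not recalibrate" }

# ...while the same measurement can have a second calibration under another curve.
c = store.find_or_create(bp: 5000, std: 50, curve: "shcal20") { :placeholder_ranges }
```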

@joeroe
Contributor Author

joeroe commented Mar 15, 2023

We discussed this a few days ago and agreed that a new model is probably the way to go. @MartinHinz suggested generalising this to a "CalendarDate" that can store the calendar TAQs, TAPs, and probability distribution of any xron, which I think is a very good solution.
