The keys of PD_CALC, PD_MEAS and PD_PROC #159
Comments
This is not directly answering your question, but I’ll comment on my motivation here way back in the distant past. In the most common case, one collects a diffraction pattern and fits those points. One loop. In less common cases, one collects a pattern at too fine a point spacing and for fitting, some of the observed points are merged together into a processed pattern that is used for fitting. Two loops: one for observed data & one for processed & calc. I wanted to use the same data names for obs & calc in both cases, database normalization (which was added later as a CIF goal) be damned.
On Jun 22, 2023, at 8:40 AM, Antanas Vaitkus ***@***.***> wrote:
This is a bit of a technical question related to the parent-child relationships between looped categories.
The looped PD_DATA category is intended to function as "a 'container' category that is defined in order to allow raw, processed, and calculated data points in a diffraction data set to be optionally tabulated together". This is reflected by the fact that the looped PD_CALC, PD_MEAS and PD_PROC categories have it as their parent category. Now I have the following questions:
1. This allows the same point to have properties from all three categories, i.e. the same point can be described using items from PD_CALC, PD_MEAS and PD_PROC. Is this the intention or could this lead to some data anomalies?
2. All four categories have composite keys in the form of [ _cat.point_id, _cat.diffractogram_id]. All _cat.point_id data items (e.g. _pd_calc.point_id) are properly directly linked to the key of the parent category (_pd_data.point_id). However, all _cat.diffractogram_id items, including the one from the parent PD_DATA category, are linked to the _pd_diffractogram.id. Strictly from a formal point of view -- is this allowed (i.e. software should figure this out) or should links of _pd_calc.diffractogram_id, _pd_meas.diffractogram_id and _pd_proc.diffractogram_id actually be linked to _pd_data.diffractogram_id (and thus only transiently to _pd_diffractogram.id via _pd_data.diffractogram_id)?
@vaitkus' suggestion (2) is better, indeed the The current parent-child arrangement of these categories was created (by me) as a way to use the parent-child relationships in DDLm to properly fit these categories into a relational scheme, so that data names from apparently different categories could be looped together, as Brian T wanted. The only "wrinkle" is that the
Ok, I'll create a PR for that.
I wonder if there won't be a slight problem here due to people incorrectly assuming that the Consider the following example in which the values from separate categories get incorrectly merged due to reused point IDs:
Joint
I am not saying that anything should be redesigned here, but maybe a disclaimer of some sort on the shared point identifier namespace should be added?
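A hypothetical illustration of the shared point identifier namespace, sketched in Python. The rows, the identifiers and the column names (`meas_2theta`, `proc_2theta` and so on) are all invented for this sketch and are not taken from any dictionary or data file. It assumes a reader that joins the child categories on the composite key (`diffractogram_id`, `point_id`), and a processed pattern that was rebinned from four measured points into two while restarting its point IDs at 1.

```python
# Composite key shared by PD_DATA and its child categories.
KEY = ("diffractogram_id", "point_id")

# Hypothetical measured points (invented values).
pd_meas = [
    {"diffractogram_id": "d1", "point_id": "1", "meas_2theta": 10.00, "meas_counts": 100},
    {"diffractogram_id": "d1", "point_id": "2", "meas_2theta": 10.01, "meas_counts": 104},
    {"diffractogram_id": "d1", "point_id": "3", "meas_2theta": 10.02, "meas_counts": 98},
    {"diffractogram_id": "d1", "point_id": "4", "meas_2theta": 10.03, "meas_counts": 102},
]

# Hypothetical processed points: pairs of measured points averaged together,
# but numbered from 1 again, so the IDs collide with the measured ones.
pd_proc = [
    {"diffractogram_id": "d1", "point_id": "1", "proc_2theta": 10.005, "proc_intensity": 102.0},
    {"diffractogram_id": "d1", "point_id": "2", "proc_2theta": 10.025, "proc_intensity": 100.0},
]


def join_on_key(*categories):
    """Merge rows from several child categories that share the same key."""
    merged = {}
    for rows in categories:
        for row in rows:
            key = tuple(row[name] for name in KEY)
            values = {k: v for k, v in row.items() if k not in KEY}
            merged.setdefault(key, {}).update(values)
    return merged


pd_data = join_on_key(pd_meas, pd_proc)

# Point "2" now pairs a measurement at 10.01 with a processed value at 10.025:
# two unrelated points have been presented as one.
print(pd_data[("d1", "2")])
```

Nothing in the join itself can detect the mismatch, because both loops are formally valid. Only the convention that a given point ID means the same point in every child category prevents it.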
Yes. I think the NiSi example CIF does this. There isn't a one-to-one correspondence in point_id, and it isn't just an offset; you can see that the last peaks in both datasets correspond, and so on, just not linearly with point id. I think this ties in with a previous correspondence I had with James re the number of permissible PD_DATA loops in a block, which I think is summarised here. In particular, how can we deal with measured and processed diffractograms where there isn't a one-to-one correspondence of data points? If we're averaging, smoothing, splining, or otherwise altering data points (i.e. changing the data points such that there isn't a one-to-one correspondence), can they be considered to be the same diffractogram (as in have identical Maybe there could be
This only allows a proc dataset to be derived from a single meas dataset, whereas in practice it could be more, but it is a starting point. (maybe it could have a
This is just saying that there are no calc items for that data point, which may be entirely legit; consider an excluded region. Strictly, the data points excluded should be given a
But yes, your concern is definitely a legitimate one and one that I've seen in the wild.
Maybe add some text to the descriptions of `_pd_calc|meas|proc.point_id`? The current descriptions are (essentially):
Could be changed to something like:
For reference, the description of
This is a bit of a technical question related to the parent-child relationships between looped categories.
The looped `PD_DATA` category is intended to function as "a 'container' category that is defined in order to allow raw, processed, and calculated data points in a diffraction data set to be optionally tabulated together". This is reflected by the fact that the looped `PD_CALC`, `PD_MEAS` and `PD_PROC` categories have it as their parent category. Now I have the following questions:
1. This allows the same point to have properties from all three categories, i.e. the same point can be described using items from `PD_CALC`, `PD_MEAS` and `PD_PROC`. Is this the intention or could this lead to some data anomalies?
2. All four categories have composite keys in the form of [`_cat.point_id`, `_cat.diffractogram_id`]. All `_cat.point_id` data items (e.g. `_pd_calc.point_id`) are properly directly linked to the key of the parent category (`_pd_data.point_id`). However, all `_cat.diffractogram_id` items, including the one from the parent `PD_DATA` category, are linked to the `_pd_diffractogram.id`. Strictly from a formal point of view -- is this allowed (i.e. software should figure this out) or should links of `_pd_calc.diffractogram_id`, `_pd_meas.diffractogram_id` and `_pd_proc.diffractogram_id` actually be linked to `_pd_data.diffractogram_id` (and thus only transiently to `_pd_diffractogram.id` via `_pd_data.diffractogram_id`)?
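To make question 2 concrete, here is a minimal Python sketch of the two linking arrangements. It shows only `PD_CALC`, and models the links as a plain dictionary from a data name to the data name it is linked to. That representation and both helper functions are assumptions made for illustration; this is not how any particular DDLm tool stores or resolves links.

```python
# Arrangement described in the question: the child diffractogram_id links
# straight to _pd_diffractogram.id, bypassing the parent category's key.
links_current = {
    "_pd_calc.point_id": "_pd_data.point_id",
    "_pd_calc.diffractogram_id": "_pd_diffractogram.id",
    "_pd_data.diffractogram_id": "_pd_diffractogram.id",
}

# Alternative from the question: the child links to the parent key item,
# and reaches _pd_diffractogram.id only through it.
links_proposed = {
    **links_current,
    "_pd_calc.diffractogram_id": "_pd_data.diffractogram_id",
}


def link_chain(item, links):
    """Follow links from `item` until an item with no further link is reached."""
    chain = [item]
    while chain[-1] in links:
        target = links[chain[-1]]
        if target in chain:
            raise ValueError(f"circular link at {target}")
        chain.append(target)
    return chain


def child_key_maps_to_parent_key(child_key, parent_key, links):
    """True if every child key item links directly to the matching parent key item."""
    return all(links.get(c) == p for c, p in zip(child_key, parent_key))


child_key = ("_pd_calc.point_id", "_pd_calc.diffractogram_id")
parent_key = ("_pd_data.point_id", "_pd_data.diffractogram_id")

print(link_chain("_pd_calc.diffractogram_id", links_current))
print(link_chain("_pd_calc.diffractogram_id", links_proposed))
print(child_key_maps_to_parent_key(child_key, parent_key, links_current))
print(child_key_maps_to_parent_key(child_key, parent_key, links_proposed))
```

Both arrangements end at `_pd_diffractogram.id`. Only the second lets software match the child key to the parent key item by item without extra inference, which is the "software should figure this out" case in the question.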