Long formats for conditions and experiments (timecourses) #586

Open
dilpath opened this issue Jul 17, 2024 · 10 comments

@dilpath
Member

dilpath commented Jul 17, 2024

Follow-up to #585 (no need to read that).

Here are specs for long formats of the conditions and experiments (timecourses) tables. Additional feedback is very welcome!

Conditions table

conditionId   inputId                   inputType                    inputValue
PETAB_ID      NON_ESTIMATED_ENTITY_ID   constant OR initial OR ...   PETAB_MATH
e.g.
cond1         rate1                     constant                     1
cond2         species1                  initial                      species1 + 5

Row and column ordering are arbitrary, although using the above column ordering may improve human readability.

Additional columns are allowed, for example, to specify a human-friendly name for the condition.

Other optional columns we could officially support include conditionName, but this might mean duplicating the same condition name across all rows with that condition ID...

Detailed field description

  • conditionId [PETAB_ID, REQUIRED]
    Unique identifier for the simulation/experimental condition, to be used in the experiments table.
  • inputId [NON_ESTIMATED_ENTITY_ID, REQUIRED]
    An entity that will be changed in this condition.
  • inputType [constant OR initial OR ..., REQUIRED]
    How the value inputValue changes the entity inputId.
    • constant
      The entity inputId is fixed to the value inputValue. The entity must be static in time while the condition is active, e.g. a model parameter.
    • initial
      The entity inputId is initialized to the value inputValue. The entity must be dynamic and defined in terms of time-derivative information, e.g. a model species involved in some reaction or specified by an ordinary differential equation.
    • rate/assignment/relativeRate/relativeAssignment
      These are currently not supported, until a tool implements them. However, they are reserved to mean changes equivalent to setting a new SBML rateRule or assignmentRule for the entity. relative indicates relative changes to pre-existing rates or assignments. edit: These can only be applied to entities (inputId) that are already governed by these kinds of dynamics, i.e. rate can only apply to entities that already have a rate rule in the original model, assignment/relativeAssignment can only apply to entities that already have an assignment rule in the original model, and relativeRate can only apply to entities that already have either a rate rule or reactions.
  • inputValue [PETAB_MATH, REQUIRED]
    The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is activated (edit: or active, for time-varying inputTypes like rate), as defined in the experiments table.
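
For illustration only (not part of the proposal): a minimal Python sketch of reading the long-format conditions table and grouping its rows by conditionId. It assumes the table is tab-separated like other PEtab tables; the function name read_conditions and the two constant sets are hypothetical, and inputValue is kept as an unparsed string rather than evaluated as PEtab math.

```python
import csv
import io
from collections import defaultdict

# inputType values from the proposal above; the reserved ones are not yet supported.
SUPPORTED_INPUT_TYPES = {"constant", "initial"}
RESERVED_INPUT_TYPES = {"rate", "assignment", "relativeRate", "relativeAssignment"}


def read_conditions(tsv_text):
    """Group long-format rows by conditionId (hypothetical helper).

    Returns {conditionId: [(inputId, inputType, inputValue), ...]}.
    """
    conditions = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        input_type = row["inputType"]
        if input_type in RESERVED_INPUT_TYPES:
            raise NotImplementedError(f"inputType {input_type!r} is reserved")
        if input_type not in SUPPORTED_INPUT_TYPES:
            raise ValueError(f"unknown inputType {input_type!r}")
        conditions[row["conditionId"]].append(
            (row["inputId"], input_type, row["inputValue"])
        )
    return dict(conditions)


# The two example rows from the conditions table above.
EXAMPLE = (
    "conditionId\tinputId\tinputType\tinputValue\n"
    "cond1\trate1\tconstant\t1\n"
    "cond2\tspecies1\tinitial\tspecies1 + 5\n"
)
conditions = read_conditions(EXAMPLE)
```

Because rows are grouped by ID, the arbitrary row ordering allowed above does not affect which changes belong to a condition.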

Experiments table

experimentId   time              conditionId
PETAB_ID       NUMERIC OR -inf   conditionId
e.g.
timecourse1    -inf              cond1
timecourse1    0                 cond2

Row and column ordering are arbitrary, although using the above column ordering may improve human readability.

Additional columns are allowed, for example, to specify a human-friendly name for the experiment.

Detailed field description

  • experimentId [PETAB_ID, REQUIRED]
    Unique identifier for the experiment, to be used in the measurements table.
  • time [NUMERIC OR -inf, REQUIRED]
    The time when the condition will become active, in the time unit specified in the model. -inf indicates pre-equilibration (e.g. for drug treatments, the model would be pre-equilibrated with the no-drug condition).
  • conditionId [conditionId, REQUIRED]
    A conditionId from the conditions table.
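
A rough sketch of how a tool might turn experiments-table rows into ordered periods, assuming the rows have already been read into (experimentId, time, conditionId) tuples. The name build_periods is hypothetical. Conditions that share a start time are grouped into one period, and -inf sorts first, so pre-equilibration always comes before the finite-time periods.

```python
from collections import defaultdict


def build_periods(rows):
    """Order each experiment's conditions by start time (hypothetical helper).

    rows: iterable of (experimentId, time, conditionId); time may be a
    number or a string such as "-inf".
    Returns {experimentId: [(time, [conditionId, ...]), ...]} sorted by time.
    """
    by_experiment = defaultdict(lambda: defaultdict(list))
    for experiment_id, time, condition_id in rows:
        by_experiment[experiment_id][float(time)].append(condition_id)
    return {
        experiment_id: sorted(periods.items())
        for experiment_id, periods in by_experiment.items()
    }


# Rows from the example above, deliberately out of order.
rows = [
    ("timecourse1", "0", "cond2"),
    ("timecourse1", "-inf", "cond1"),
]
periods = build_periods(rows)
```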

Measurements table

Only the required or changed columns are included here (other optional columns, e.g. noiseFormula, are still supported but irrelevant to this discussion).

observableId   [experimentId]   time             measurement
observableId   [experimentId]   NUMERIC OR inf   NUMERIC
e.g.
obs1           experiment1      5                2

Detailed field description
observableId and measurement are unchanged.

  • experimentId [experimentId, OPTIONAL]
    An experimentId from the experiments table. This replaces the preequilibrationConditionId and simulationConditionId in PEtab v1. If unspecified, then the simulation will be performed with the default parameters in the model.
  • time [NUMERIC OR inf, REQUIRED]
    Time point of the measurement in the time unit specified in the SBML model. inf (lower-case) indicates steady-state measurements. Cannot be lower than the lowest finite time in the experiments table.
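
The lower bound on measurement times could be checked along these lines. This is a sketch with a hypothetical function name, and it assumes that an experiment with no finite start time (only -inf) imposes no lower bound, which the text above does not specify.

```python
import math


def check_measurement_time(measurement_time, period_start_times):
    """Return True if a measurement time is allowed for an experiment.

    period_start_times: start times of the experiment's periods, possibly
    including -inf for pre-equilibration.
    """
    t = float(measurement_time)  # float("inf") marks a steady-state measurement
    finite_starts = [s for s in map(float, period_start_times) if math.isfinite(s)]
    if not finite_starts:
        return True  # assumption: no finite start time means no lower bound
    return t >= min(finite_starts)
```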

Example

Conditions table

conditionId inputId inputValue inputType units
cond1 rate1 0 constant mg/s
cond1 rate2 1 constant m/s
cond2 species1 0 initial mol
preeq_cond1 rate1 1 constant g/s
switch_on switch 1 constant dimensionless
switch_off switch 0 constant dimensionless

Experiments table

experimentId time conditionId
timecourse1 -inf preeq_cond1
timecourse1 0 cond1
timecourse1 10 cond2
experiment1 -5 cond1
experiment1 -5 cond2
switch_sequence 0 switch_on
switch_sequence 1 switch_off
switch_sequence 2 switch_on
switch_sequence 3 switch_off
switch_sequence 4 switch_on
switch_sequence 5 switch_off

timecourse1 has a PEtab v1 preequilibrationConditionId (preeq_cond1), a PEtab v1 simulationConditionId (cond1), and then a 3rd timecourse period at t=10 with condition cond2.

experiment1 is not a timecourse, rather a single-condition simulation starting at t=-5 where two conditions are applied simultaneously.

switch_sequence is a repeating timecourse, equivalent to a nested timecourse (see #585).

Open points

  1. There is currently some undefined behavior in the conditions table -- do we clarify that now or in a future PEtab v2.1 when the use cases are clearer? For example, what happens when a user specifies a parameter in the conditions table with inputType=constant, but then an SBML event affects the same parameter? We could simply disallow this for now.
  2. How are simultaneous conditions handled (e.g. experiment1 in the example)? We could decide that they are only allowed if they change different entities. Otherwise we would need to care about some ordering.
  3. I'm happy to change naming, e.g. inputId->targetId, or experimentId->timecourseId. Let me know what you prefer. experimentId was chosen because most users won't care about timecourses, but then would still need to use that table for their single-condition "timecourses".
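
For open point 2, the "only allowed if they change different entities" rule could be checked roughly as follows. This is a sketch under that assumption; find_conflicts and the mapping from each conditionId to the entities it changes are hypothetical names, not an agreed API.

```python
from collections import defaultdict


def find_conflicts(experiment_rows, condition_targets):
    """Find entities changed by more than one condition at the same time.

    experiment_rows: iterable of (experimentId, time, conditionId).
    condition_targets: {conditionId: set of inputIds changed by that condition}.
    Returns a sorted list of (experimentId, time, inputId) conflicts.
    """
    changed_by = defaultdict(list)
    for experiment_id, time, condition_id in experiment_rows:
        for input_id in condition_targets[condition_id]:
            changed_by[(experiment_id, float(time), input_id)].append(condition_id)
    return sorted(key for key, conds in changed_by.items() if len(conds) > 1)


# experiment1 from the example: cond1 and cond2 both start at t=-5
# but change different entities, so no conflict is reported.
targets = {"cond1": {"rate1", "rate2"}, "cond2": {"species1"}}
rows = [("experiment1", -5, "cond1"), ("experiment1", -5, "cond2")]
conflicts = find_conflicts(rows, targets)
```

If the alternative (some priority or ordering) were chosen instead, the same grouping could be used to decide where an ordering is actually needed.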
@paulflang
Contributor

Thanks a lot Dilan for accommodating the suggestions from #585. 🙏 Looks good to me for the most part, but I have two questions:

rate/assignment/relativeRate/relativeAssignment
These are currently not supported, until a tool implements them. However, they are reserved to mean changes equivalent to setting a new SBML rateRule or assignmentRule for the entity. relative indicates relative changes to pre-existing rates/reactions or assignments.

Can you give examples for the use of these? Cause to me, only the rate makes intuitive sense. I.e. rate means that the model is really perturbed. assignment would just mean that the current way of calculating the value of an SBML species/parameter/compartment is overridden with an assignment rule, right? But that would violate the SBML specs, which state "an assignment rule cannot be defined for a species that is created or destroyed in a reaction unless that species is defined as a boundary condition in the model." Not that PEtab has to stick to the SBML specs here, but still.

inputValue [PETAB_MATH, REQUIRED]
The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is active, as defined in the experiments table.

Do you mean "is activated" instead of "is active"?

@dilpath
Member Author

dilpath commented Jul 17, 2024

Thanks a lot Dilan for accommodating the suggestions from #585. 🙏 Looks good to me for the most part, but I have two questions:

Sure! Thanks for the feedback.

rate/assignment/relativeRate/relativeAssignment
These are currently not supported, until a tool implements them. However, they are reserved to mean changes equivalent to setting a new SBML rateRule or assignmentRule for the entity. relative indicates relative changes to pre-existing rates/reactions or assignments.

Can you give examples for the use of these? Cause to me, only the rate makes intuitive sense. I.e. rate means that the model is really perturbed.

Given an inputId=species1 and inputValue=5 and original rate rule for species1: d(species1)/dt = 2*species1

  • inputType=rate means d(species1)/dt = 5, i.e. simply change the whole dynamic of the species
  • inputType=relativeRate means d(species1)/dt = 2*species1 + 5, i.e. a modification of the original rate is made, which is somewhat like adding a new reaction for this species to the system

Given an original assignment rule for species1: species1(t) = 2*t + k1

  • inputType=assignment means species1(t) = 5
  • inputType=relativeAssignment means species1(t) = 2*t + k1 + 5
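
The four cases above can be written as a small Python sketch. This is illustration only: apply_input is a hypothetical helper, inputValue is a plain number here rather than a PEtab math expression, and k1 = 3 is an arbitrary value chosen for the example.

```python
def apply_input(original, input_type, input_value):
    """Return the changed rule as a function of (x, t).

    original: the model's rule for the entity, called as original(x, t).
    input_value: a number here; in PEtab it would be a math expression.
    """
    if input_type in ("rate", "assignment"):
        return lambda x, t: input_value  # replace the original rule
    if input_type in ("relativeRate", "relativeAssignment"):
        return lambda x, t: original(x, t) + input_value  # add to the original rule
    raise ValueError(f"unsupported inputType: {input_type!r}")


k1 = 3.0  # hypothetical value for the parameter k1


def original_rate(species1, t):
    return 2 * species1  # d(species1)/dt = 2*species1


def original_assignment(species1, t):
    return 2 * t + k1  # species1(t) = 2*t + k1


new_rate = apply_input(original_rate, "rate", 5)  # d(species1)/dt = 5
rel_rate = apply_input(original_rate, "relativeRate", 5)  # 2*species1 + 5
new_assign = apply_input(original_assignment, "assignment", 5)  # species1(t) = 5
rel_assign = apply_input(original_assignment, "relativeAssignment", 5)  # 2*t + k1 + 5
```

With species1 = 10, rel_rate gives 2*10 + 5 = 25, while new_rate gives 5 regardless of the state.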

I was trying to capture all possibilities, including the "relative" and "isDelta" changes discussed in #564 and #585, and the "bolus" vs. "infusion" that you implemented in PumasQSP [1] via the duration column in that dosing table.

assignment would just mean that the current way of calculating the value of an SBML species/parameter/compartment is overridden with an assignment rule, right? But that would violate the SBML specs, which state "an assignment rule cannot be defined for a species that is created or destroyed in a reaction unless that species is defined as a boundary condition in the model." Not that PEtab has to stick to the SBML specs here, but still.

I agree, I'm not sure how to resolve this best. This is one reason why I limited constant to things that are already "constant" in the model, like parameters, and initial to things that are specified by time-derivative information. Similarly, I would limit rate/(relative)Assignment to things that are already defined by rate/assignment rules in the model. However, I think relativeRate can be interpreted as a new reaction for a species, so could apply to species defined by either reactions or rate rules. I clarified this with a bold edit in the first message now.

inputValue [PETAB_MATH, REQUIRED]
The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is active, as defined in the experiments table.

Do you mean "is activated" instead of "is active"?

For the inputTypes constant and initial, this is equivalent to "is activated". But for e.g. rate, then "is active" is more accurate, since it will be evaluated "continuously" during the timecourse period with this condition. I can see this is confusing though... I clarified it with a bold edit in the first message now.

[1] https://help.juliahub.com/pumasqsp/stable/tutorials/petabimport_tutorial/#Detailed-field-description

@dweindl
Member

dweindl commented Aug 2, 2024

  • inputType [constant OR initial OR ..., REQUIRED]
    How the value inputValue changes the entity inputId.

    • constant
      The entity inputId is fixed to the value inputValue. The entity must be static in time while the condition is active, e.g. a model parameter.
    • initial
      The entity inputId is initialized to the value inputValue. The entity must be dynamic and defined in terms of time-derivative information, e.g. a model species involved in some reaction or specified by an ordinary differential equation.

If constant is only allowed for entities that are already constant, this could as well be replaced by initial, right? This would be coherent with initialAssignments in SBML.

  • rate/assignment/relativeRate/relativeAssignment
    These are currently not supported

Then I'd leave them out for now.

  • inputValue [PETAB_MATH, REQUIRED]
    The value that will be used to change the entity inputId. If a PEtab math expression involves time-dependent entities, then they represent their values at the simulation time when the condition is activated (edit: or active, for time-varying inputTypes like rate), as defined in the experiments table.

Related to previous discussions, we could introduce a priority column or something, which would potentially make simultaneous compartment size and concentration changes more intuitive. The interpretation of the entity symbol would be different then.

  • There is currently some undefined behavior in the conditions table -- do we clarify that now or in a future PEtab v2.1 when the use cases are clearer? For example, what happens when a user specifies a parameter in the conditions table with inputType=constant, but then an SBML event affects the same parameter? We could simply disallow this for now.

This could be clarified by replacing constant by initial, unless one really wants to disable events affecting a certain entity. Do we need that?

  • How are simultaneous conditions handled (e.g. experiment1 in the example)? We could decide that they are only allowed if they change different entities. Otherwise we would need to care about some ordering.

I'd go for specifying some priority as suggested for the conditions table.

  • I'm happy to change naming, e.g. inputId->targetId, or experimentId->timecourseId. Let me know what you prefer. experimentId was chosen because most users won't care about timecourses, but then would still need to use that table for their single-condition "timecourses".

experimentId is good. Slight preference for targetId, targetValue over input*.

dweindl added a commit to PEtab-dev/libpetab-python that referenced this issue Dec 9, 2024
Related to PEtab-dev/PEtab#586

* constants for new yaml fields / table columns / ...
* read/write experiment table
* add experiments table to Problem, and populate from yaml
* add first validation functions
* include missing modules in API docs

To be complemented by separate pull requests.
dweindl added a commit to PEtab-dev/libpetab-python that referenced this issue Dec 18, 2024
Add basic support for PEtab version 2 experiments (see also PEtab-dev/PEtab#586, and PEtab-dev/PEtab#581). Follow-up to #334.

Partially supersedes #263, which was started before petab.v1/petab.v2 were introduced and before PEtab-dev/PEtab#586.

* updates the required fields in the measurement table
* updates some validation functions to not expect the old `simulationConditionId`s (but does not do full validation yet)
* extends PEtab v1 up-conversion to create a new experiment table.

---------

Co-authored-by: Dilan Pathirana <59329744+dilpath@users.noreply.github.com>
@dweindl
Member

dweindl commented Dec 19, 2024

Brief update from some further discussions with @dilpath:

For the condition table, more appropriate column names might be

  • operationType instead of inputType, the respective values could be setRate, setAssignment, addToRate, ...
  • targetId instead of inputId
  • targetValue instead of inputValue

I would suggest to consolidate inputType=constant and inputType=initial and just have operationType=setCurrentValue (or similar), because I don't really see any added value in distinguishing those.

Another issue was: If I want to have pre-equilibration with the default model parameters and then switch to some other condition, how could I specify that? Previously, this could have been implemented by an all-NaN condition in the conditions table. This is no longer possible. After considering a couple of alternatives (e.g., empty conditionId in the experiment table; some conditionId in the experiment table that does not occur in the conditions table), the most reasonable one seemed to be introducing some kind of no-op operationType that would be the same as an all-NaN condition in the PEtab v1 condition table (i.e., using the model state from the previous period, or in case of the first period, use the model without any changes).

Feedback is welcome.

@FFroehlich
Collaborator

Brief update from some further discussions with @dilpath:

For the condition table, more appropriate column names might be

  • operationType instead of inputType, the respective values could be setRate, setAssignment, addToRate, ...
  • targetId instead of inputId
  • targetValue instead of inputValue

Fully agreed.

I would suggest to consolidate inputType=constant and inputType=initial and just have operationType=setCurrentValue (or similar), because I don't really see any added value in distinguishing those.

Yes.

Another issue was: If I want to have pre-equilibration with the default model parameters and then switch to some other condition, how could I specify that? Previously, this could have been implemented by an all-NaN condition in the conditions table. This is no longer possible. After considering a couple of alternatives (e.g., empty conditionId in the experiment table; some conditionId in the experiment table that does not occur in the conditions table), the most reasonable one seemed to be introducing some kind of no-op operationType that would be the same as an all-NaN condition in the PEtab v1 condition table (i.e., using the model state from the previous period, or in case of the first period, use the model without any changes).

For me, the most natural thing would be the described alternative of conditionId that appears in the experiment table but not the conditions table. I see how this could make linting/validation more difficult and typos are more dangerous, but wouldn't making sure that all conditions in the condition table appear in the experiments table be enough to catch the worst errors?

@dweindl
Member

dweindl commented Dec 19, 2024

For me, the most natural thing would be the described alternative of conditionId that appears in the experiment table but not the conditions table. I see how this could make linting/validation more difficult and typos are more dangerous, but wouldn't making sure that all conditions in the condition table appear in the experiments table be enough to catch the worst errors?

That typo issue seemed relevant to me and I thought the proposed no-op makes the intent more explicit. I'd have some preference for explicitness, but I could live with either.

Whether unused conditionIds should be considered illegal, or just optionally trigger some warning, is another question that should be clarified (same as, for example, unused observables -- I don't think there is anything in the specs). But even if we consider it illegal, we still wouldn't know if some conditionId in the experiment table was left undefined on purpose or not. Maybe that argument isn't that strong, given that we have a number of optional fields, and allow empty experimentIds in the measurement table for trivial timecourses...

@FFroehlich
Collaborator

For me, the most natural thing would be the described alternative of conditionId that appears in the experiment table but not the conditions table. I see how this could make linting/validation more difficult and typos are more dangerous, but wouldn't making sure that all conditions in the condition table appear in the experiments table be enough to catch the worst errors?

That typo issue seemed relevant to me and I thought the proposed no-op makes the intent more explicit. I'd have some preference for explicitness, but I could live with either.

Fair point

Whether unused conditionIds should be considered illegal, or just optionally trigger some warning, is another question that should be clarified (same as, for example, unused observables -- I don't think there is anything in the specs). But even if we consider it illegal, we still wouldn't know if some conditionId in the experiment table was left undefined on purpose or not. Maybe that argument isn't that strong, given that we have a number of optional fields, and allow empty experimentIds in the measurement table for trivial timecourses...

Also good point, I generally would prefer warnings only as it makes it a bit easier to reuse tables across problems.
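
A sketch of what such a check might look like, with unused conditionIds reported as warnings and undefined ones optionally treated as errors. The function name and the undefined_is_error switch are hypothetical; whether an undefined conditionId should be an error or an implicit no-op is exactly the open question here.

```python
import warnings


def check_condition_ids(defined_ids, used_ids, undefined_is_error=True):
    """Compare conditionIds in the conditions table with those used in experiments.

    Returns (unused, undefined) as sorted lists.
    """
    defined, used = set(defined_ids), set(used_ids)
    unused = sorted(defined - used)  # defined but never referenced
    undefined = sorted(used - defined)  # referenced but never defined
    if unused:
        warnings.warn(f"conditionIds never used in an experiment: {unused}")
    if undefined and undefined_is_error:
        raise ValueError(f"conditionIds missing from the conditions table: {undefined}")
    return unused, undefined
```

Warning only on unused IDs keeps a conditions table reusable across problems, while the strict setting for undefined IDs catches typos at the cost of ruling out the "undefined means default" convention.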

@m-philipps

Brief update from some further discussions with @dilpath: ...

I support the renaming.

On the pre-equilibration issue: I'm generally in favour of one explicit way of formulating this.
Would it be an option to permit empty conditionIds in the experiment table where the model would be simulated with the default model parameters?

@dilpath
Member Author

dilpath commented Jan 13, 2025

Would it be an option to permit empty conditionIds in the experiment table where the model would be simulated with the default model parameters?

Just so we're talking about the same thing, I'll take an example from Daniel: the current suggestion is to have a conditions and experiments table like

conditionId    operatorType    targetId    targetValue
foo            setValue        p           1
preeq          no-op

experimentId    conditionId    time
e1              preeq          -inf
e1              foo            0

preeq is some explicit, interpretable label that describes the preequilibration with default model parameters.

Your question suggests an experiments table like

experimentId    conditionId    time
e1                             -inf
e1              foo            0

which is a single way of formulating it, but I don't see it as explicit: did the user intend to omit a condition ID there, or just forget? I would consider a single explicit formulation to instead be some reserved conditionId NOOP. Also fine for me, but less interpretable than a user-defined ID like preeq.
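
To make the no-op variant concrete, here is a minimal sketch of how a tool might apply the two tables from the first example. It only tracks parameter values at the start of each period and ignores the simulated model state; the function name and the default value p = 0.5 are hypothetical.

```python
def values_per_period(default_values, conditions, periods):
    """Return the parameter values in effect at the start of each period.

    default_values: {targetId: value} taken from the unchanged model.
    conditions: {conditionId: [(operatorType, targetId, targetValue), ...]}.
    periods: [(time, conditionId), ...] already sorted by time.
    """
    values = dict(default_values)
    result = []
    for time, condition_id in periods:
        for operator_type, target_id, target_value in conditions[condition_id]:
            if operator_type == "no-op":
                continue  # keep whatever the previous period (or the model) set
            if operator_type == "setValue":
                values[target_id] = target_value
            else:
                raise ValueError(f"unsupported operatorType: {operator_type!r}")
        result.append((time, dict(values)))
    return result


conditions = {
    "foo": [("setValue", "p", 1.0)],
    "preeq": [("no-op", None, None)],
}
periods = [(float("-inf"), "preeq"), (0.0, "foo")]
history = values_per_period({"p": 0.5}, conditions, periods)
```

Here pre-equilibration runs with the model default for p, and p is set to 1 at t=0. A reserved conditionId such as NOOP would need a special case in the lookup instead of a row in the conditions table.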

@m-philipps

Your question suggests an experiments table like
experimentId    conditionId    time
e1                             -inf
e1              foo            0

which is a single way of formulating it, but I don't see it as explicit: did the user intend to omit a condition ID there, or just forget? I would consider a single explicit formulation to instead be some reserved conditionId NOOP. Also fine for me, but less interpretable than a user-defined ID like preeq.

Yes, that's it. No, it's not an explicit solution, I just wanted to get your opinion.

In principle, I would prefer something like NOOP, but I don't see the advantage of defining a condition and making it no-op over having a DEFAULT or NOOP option for the experiment table conditionId.
