simultaneous pulse migrations with sum of proportions > 1 (again!) #100

grahamgower · 2021-07-06T08:08:51Z

The following should be valid (although not recommended), with 0.6 of B being replaced by A, and then immediately afterwards 0.6 of B is replaced by C.

pulses:
- {source: A, dest: B, time: 10, proportion: 0.6}
- {source: C, dest: B, time: 10, proportion: 0.6}

The realised ancestry proportions after these pulses (forwards in time) should be that B is composed of 0.6 ancestry from C and 0.6*(1 - 0.6) = 0.24 ancestry from A.
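A minimal sketch of the arithmetic above, assuming the pulses are applied in the listed order and that each source deme is treated as a single unadmixed ancestry label. The function name `apply_pulses` and the dictionary layout are illustrative only, not part of any demes API.

```python
def apply_pulses(ancestry, pulses):
    """Apply pulses in order; each one dilutes the ancestry already in the dest deme."""
    for source, proportion in pulses:
        # Ancestry already present is scaled down by (1 - proportion) ...
        ancestry = {deme: frac * (1 - proportion) for deme, frac in ancestry.items()}
        # ... and the source deme contributes the remaining fraction.
        ancestry[source] = ancestry.get(source, 0.0) + proportion
    return ancestry


# B starts as entirely "B" ancestry; the A pulse is applied first, then the C pulse.
result = apply_pulses({"B": 1.0}, [("A", 0.6), ("C", 0.6)])
```

Under these assumptions, `result` holds roughly 0.6 for C, 0.24 for A and 0.16 for the original B ancestry, up to floating-point rounding.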

We currently have checks (in the reference implementation, demes-python, and demes-c) that the sum of proportions entering a deme at any given time doesn't exceed 1. I'm not sure why I thought we wanted those checks. Sorry!

apragsdale · 2021-07-06T15:14:42Z

Good catch. You're right that we don't want to check that the sum of pulse proportions stays at or below 1, just that each pulse proportion satisfies 0 <= f <= 1. Do we have checks on the continuous migration rates, as discussed before? The sum of incoming continuous migration rates should still not exceed one.

grahamgower · 2021-07-06T15:24:12Z

Yes, there are checks that the sum of incoming continuous migrations doesn't exceed one.

apragsdale · 2021-07-06T19:40:13Z

Just had a conversation with @sgravel about tracts and demes and model specification. An idea came up, and I'm curious to hear thoughts. To get around the question of what to do when there are pulses at the same time into the same population, we could modify the spec to allow a list of sources and a corresponding list of proportions in the same pulse event. Then if you want 1/4 from both A and B into C at the same time, we don't have to work out A->C with proportion 1/3 (so that 1/3 x 3/4 = 1/4) followed by B->C with proportion 1/4. Instead we can specify it as

pulses:
- source: [A, B]
  dest: C
  proportion: [1/4, 1/4]
  time: 10

as an option to avoid the ambiguity and non-clarity of

pulses:
- source: A
  dest: C
  proportion: 1/3
  time: 10
- source: B
  dest: C
  proportion: 1/4
  time: 10

Both would still be valid, and we wouldn't change the current approach of sequential application of separately defined pulses. And proportions would still only need to be <= 1 if a single source, or sum(proportion) <= 1 if multiple sources.
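The back-calculation that the list form would spare users can be sketched as follows. This is only an illustration, assuming pulses are applied in list order forwards in time; `sequential_proportions` is a hypothetical helper, not an existing demes function. Each pulse's proportion is its desired final share divided by whatever is left after all later pulses have taken theirs.

```python
from fractions import Fraction


def sequential_proportions(desired):
    """Convert desired final ancestry shares into per-pulse proportions.

    `desired` lists the final share from each source, in the order the
    pulses are applied (forwards in time).
    """
    if any(p < 0 for p in desired) or sum(desired) > 1:
        raise ValueError("desired shares must be non-negative and sum to at most 1")
    result = []
    later = Fraction(0)  # total share taken by pulses applied after this one
    for p in reversed(desired):
        remaining = 1 - later
        # If later pulses replace everything, this share is necessarily 0.
        result.append(p / remaining if remaining else Fraction(0))
        later += p
    result.reverse()
    return result


# 1/4 from A and 1/4 from B into C, with the A pulse applied first.
props = sequential_proportions([Fraction(1, 4), Fraction(1, 4)])
```

For the example above, `props` is `[Fraction(1, 3), Fraction(1, 4)]`, matching the A->C and B->C proportions in the sequential form.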

molpopgen · 2021-07-06T19:48:45Z

I like this. I'll admit that I had to read @grahamgower's example a few times to arrive at his statement of the expected outcome, so there is indeed an ambiguity there (to me, at least).

apragsdale · 2021-07-06T20:06:38Z

I think the current sequential approach is not ambiguous as far as implementation goes, as long as sorts are done properly. But I definitely agree it is quite confusing, and I think this proposal would go a long way to relieving confusion about pulses into the same population at the same time.

molpopgen · 2021-07-06T21:28:20Z

Yes, the implementation is not ambiguous. But the plus of the new proposal is that a user new to all of this can just write down their desired ancestry proportions without having to back-calculate anything. IMO, that's a big win.

grahamgower · 2021-07-07T05:31:40Z

I'm not sold on this. Why can't a user just write this down using the existing list of ancestors and proportions? Or even writing continuous migrations with a time span of 1 generation? Pulse migrations are really something that ought to be avoidable most of the time. We need clear semantics for when folks do use pulses, and when pulses coincide, but these are not expected to be nominal use cases. For this reason, we throw up a warning when pulses do coincide!

apragsdale · 2021-07-07T14:42:59Z

> Why can't a user just write this down using the existing list of ancestors and proportions?

This requires defining an entirely new deme, which might not be preferable in a given scenario, as there may be conceptual continuity of a deme that would have to be split in two with this suggestion.

> Or even writing continuous migrations with a time span of 1 generation?

This approach should largely be avoided in specifying demographic models, I'd say. In some discrete time forward simulators, the translation to the simulator may in fact result in the same migration matrices and events, but that won't be the case for many (most?) downstream software applications. For some, like dadi and moments, it would result in excessively large scaled mutation rates and break the numerics, when instantaneous pulse events are handled without issue.

I think it's also a more unfriendly way of writing down the model for the user. The point I'm going for here is to reduce confusion and ambiguity for users, and this suggested approach would end up being only more confusing.

> Pulse migrations are really something that ought to be avoidable most of the time. We need clear semantics for when folks do use pulses, and when pulses coincide, but these are not expected to be nominal use cases. For this reason, we throw up a warning when pulses do coincide!

I don't agree with that, really. Instantaneous migration events, while biological simplifications, are extremely common in the literature and widely understood and implemented by simulation software and users. (There are even simulation/inference software out there that don't handle continuous migration but instead exclusively add pulse migration edges to handle gene flow.) Pulse migrations and instantaneous admixture events are a mainstay of pop gen methods, and I don't see a reason to say they should be avoided, or a reason to make cases that users will inevitably run into more confusing than they need to be.

The reason we throw up the warning is because there is the danger of confusion and ambiguity. We should be looking for a way to remove that ambiguity and make the implementation of cases that users will want to model less confusing.

grahamgower · 2021-07-07T15:03:45Z

Having multiple pulses at the same time should be a rare use case. Yes, I can imagine reasonable uses, but are they sufficiently common to warrant such a change?

Consider that this would be a backwards-incompatible change. It would need changes to the parsers (ref implementation, demes-python, demes-c), and then each of moments, fwdpy11, msprime, demesdraw, demes-slim... That's quite a lot of work.

molpopgen · 2021-07-07T15:39:49Z

> Consider that this would be a backwards-incompatible change. It would need changes to the parsers (ref implementation, demes-python, demes-c), and then each of moments, fwdpy11, msprime, demesdraw, demes-slim... That's quite a lot of work.

I'm okay with this. We should not (IMO) be considering the spec a 1.0 at this point. It has only been looked at by a few people with fairly narrow perspectives (e.g., the implications for only a few tools have been thought through). Once more downstream tool builders see the spec, and get their feedback in, I'd be more comfortable with being hesitant about backwards compatibility.

apragsdale · 2021-07-07T15:58:03Z

I also think that as more people look at demes and consider using it in their applications, we'll find these issues that we want to iron out (we've been inviting other software developers to give it a try for exactly this reason!). So this is what happened here - tracts implementation cares about this case, and in my own experience the scenario has come up when using tracts quite a bit. Additionally, if we as the developers of demes still get confused by the pulses-at-same-time scenario (this is far from the first time it's come up), imagine how users down the road will find it.

What I'm suggesting is fairly minimal. Currently, pulses take a deme name as the source and a number as the proportion. I'd suggest we allow the source to be a deme name or a list of deme names, and the proportion to be a number or a list of numbers. The same checks apply (deme(s) with that/those name(s) are in the Graph, and the proportion or sum of proportions is within [0, 1]). To me, the removal of ambiguity and confusion is worth the time spent changing the 0.x implementations.

jeromekelleher · 2021-07-07T16:25:33Z

I agree we should make any breaking changes now - this is why we're getting feedback from people.

grahamgower · 2021-07-07T18:07:45Z

> What I'm suggesting is fairly minimal. Currently, pulses take a deme name as the source and a number as the proportion. I'd suggest we allow the source to be a deme name or a list of deme names, and the proportion to be a number or a list of numbers.

Maybe the pulse source/proportion should be changed to sources/proportions instead? This would be consistent with deme's ancestors/proportions fields that are spelled as plurals. It's a breaking change anyway, and forcing lists is much simpler than allowing both lists and non-lists.

jeromekelleher · 2021-07-08T08:42:41Z

Strong +1 on changing to lists rather than having mixed types - it adds a lot of complexity having to support lists or not-lists.

apragsdale · 2021-07-08T14:57:30Z

Sounds good to me! Lists instead of values also makes it consistent with the merge/admix format. I think this will be a very helpful change.

I can open a PR tackling the demes-python change, if we want an idea of how it could look before diving in elsewhere.

grahamgower · 2021-07-08T17:30:45Z

> Sounds good to me! Lists instead of values also makes it consistent with the merge/admix format. I think this will be a very helpful change.
>
> I can open a PR tackling the demes-python change, if we want an idea of how it could look before diving in elsewhere.

Sounds good. There are also hundreds of YAML files in the spec repository that will need updating. Some semi-automated search/replace might do the trick.
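One possible shape for that conversion, sketched on already-parsed model data (nested dicts and lists) rather than on raw text. The function names are hypothetical, and reading and writing the YAML files themselves is left to whichever YAML library is in use.

```python
def convert_pulse(pulse):
    """Return a copy of a pulse mapping with the singular fields renamed to list fields."""
    converted = dict(pulse)
    if "source" in converted:
        converted["sources"] = [converted.pop("source")]
    if "proportion" in converted:
        converted["proportions"] = [converted.pop("proportion")]
    return converted


def convert_model(data):
    """Return a copy of a parsed model with every pulse converted."""
    converted = dict(data)
    if "pulses" in converted:
        converted["pulses"] = [convert_pulse(p) for p in converted["pulses"]]
    return converted


old = {"pulses": [{"source": "A", "dest": "B", "time": 10, "proportion": 0.6}]}
new = convert_model(old)
```

Pulses that already use `sources`/`proportions` pass through unchanged, so the conversion can be run more than once over the same files.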

grahamgower · 2022-02-08T10:56:34Z

The original issue raised here is still unresolved. The following file should be accepted (see original post), but is rejected.

time_units: generations
demes:
- name: A
  epochs:
    - start_size: 100
- name: B
  epochs:
    - start_size: 100
- name: C
  epochs:
    - start_size: 100
pulses:
- {sources: [A], dest: B, time: 10, proportions: [0.6]}
- {sources: [C], dest: B, time: 10, proportions: [0.6]}

Error from reference implementation:

Traceback (most recent call last):
  File "/home/grg/src/demes/demes-spec/reference_implementation/resolve_yaml.py", line 15, in <module>
    graph = parser.parse(data)
  File "/home/grg/src/demes/demes-spec/reference_implementation/parser.py", line 805, in parse
    graph.validate()
  File "/home/grg/src/demes/demes-spec/reference_implementation/parser.py", line 570, in validate
    raise ValueError(
ValueError: Pulse proportions into B at time 10 sum to more than 1

Proportions for a given pulse must have 0 <= sum(proportions) <= 1, but additional pulses defined at the same time can cause the sum of pulse migrations to exceed 1, in which case the pulses are applied sequentially. Closes popsim-consortium#100.
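A rough sketch of the rule as stated: proportions are checked within each pulse, and nothing is summed across separate pulses that share a time and destination. This is not the reference implementation's code; `validate_pulses` and the plain-dict pulse layout are assumptions made for illustration.

```python
def validate_pulses(pulses):
    """Check each pulse on its own; coincident pulses are not summed together."""
    for pulse in pulses:
        sources = pulse["sources"]
        proportions = pulse["proportions"]
        if len(sources) != len(proportions):
            raise ValueError("sources and proportions must have the same length")
        if any(not (0 <= p <= 1) for p in proportions):
            raise ValueError("each proportion must be within [0, 1]")
        if sum(proportions) > 1:
            raise ValueError("proportions within a single pulse sum to more than 1")


# The two pulses from the file above: same dest and time, 0.6 each.
validate_pulses([
    {"sources": ["A"], "dest": "B", "time": 10, "proportions": [0.6]},
    {"sources": ["C"], "dest": "B", "time": 10, "proportions": [0.6]},
])
```

The call above returns without error, whereas a single pulse with `proportions: [0.6, 0.6]` would raise `ValueError` under this sketch.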

apragsdale mentioned this issue Jul 8, 2021

Simultaneous sources in pulse events popsim-consortium/demes-python#353

Merged

grahamgower mentioned this issue Jul 9, 2021

change pulse source/proportion to support simultaneous sources in pulse events popsim-consortium/demes-python#359

Closed

grahamgower mentioned this issue Jul 21, 2021

correctly handle pulses occuring simultaneously grahamgower/demes-slim#6

Open

This was referenced Dec 2, 2021

convert pulses: source->sources and proportion->proportions grahamgower/demes-c#12

Closed

convert pulses: source->sources and proportion->proportions grahamgower/demes-slim#8

Closed

grahamgower mentioned this issue Feb 8, 2022

Error conditions for pulses? #38

Closed

grahamgower mentioned this issue Feb 8, 2022

MDM: Simultaneous pulses occur in the order specified #39

Closed

grahamgower added the reference implementation label Feb 8, 2022

grahamgower mentioned this issue Feb 10, 2022

Permit "simultaneous" pulses with sum(proportions) > 1. #115

Merged

grahamgower added this to the 1.0 milestone Feb 10, 2022

grahamgower closed this as completed in #115 Feb 10, 2022

grahamgower mentioned this issue Jun 29, 2022

Interpretation of multiple "identical" pulses #155

Open

molpopgen mentioned this issue Jun 29, 2022

Add example of resolving sequential pulses. #156

Merged
