Cycle checking results in inconsistent calculations #749
One thing that makes this hard to investigate is that it manifests in many ways, some of which can be seen on a small scale (as in the test above) and some of which only appear at a larger scale. I've created a gist with the trace of a calculation based on a version of Core that removes cycle checks altogether. This trace is from executing this Mes Aides test. It's 10K lines long - that's how deep the stack reaches. One of the things worth noticing in there is which variables we keep attempting to calculate.

The hunch I'm investigating is that this isn't about detecting "cycles" at all. For this particular observation, the problem can also be framed as: why are we even attempting that computation in the first place? Is there a clue we could follow? It seems like there is:
Well, it doesn't have a default value! That makes sense, because there's really no "default" base salary in salary computation - but there can be the absence of a salary. And in fact we could short-circuit a bunch of calculations if we could detect this. Unfortunately, the code for variables as written has a notion of a "default default value", and so we don't take advantage of this clue.

However, this doesn't save us yet. I've tried forcing Core to skip the computation if the variable being computed was salaire_imposable - and we still hit a cycle, but a different one, which is a clue that cycles in fact abound in the France model. (A good hint that we should solve this problem in a general way and not piecemeal, I think.) Here is the resulting trace.

This looks more like a "true" circular definition: we have the eligibility for AAH computed from a bunch of income sources which include another aid called ASI/ASPA, whose eligibility in turn hinges on a bunch of income sources including AAH itself. However, it's important to note that it's not a real cycle, in the sense that each time around we shift the period of evaluation towards the past. It would be reasonable to compute a situation in which we want to know, for instance, what fluctuations exist over time as a result of being granted aids in the past, which cause someone's income to go over a threshold, which in turn makes them ineligible for this aid.

What doesn't really make sense in the context of the test is this: we are evaluating eligibility for ASI/ASPA at infinite depth for a situation where its age condition is not fulfilled. Again we have a semantic understanding that we could short-circuit a computation - this time not because a variable is absent but because a variable is present. This is not a new issue, and it is linked to the vectorization of the computations. In short, we've assumed so far that vector computation necessarily means eager evaluation. We are evaluating ASI/ASPA for a 38-year-old on the chance that the array might contain elderly people. We've tried to improve this before.

This is absurd in the singular case, but it also makes kind of an absurdity, unless I'm very mistaken about the execution behaviour, of our performance story for vector computations. The bulk of the population is under 65, so just how much performance we're throwing away, even in a vector context, by running one loop of superfluous evaluations for everyone - relative to what we'd be throwing away by performing the computation sequentially - is an open question.

The good news is that we don't have to choose between the two evils of sequential computation and vast amounts of unnecessary computation, at least I don't believe we do. There seem to be technical solutions (lazyarray, cytoolz), or in the worst case we could syntactically elevate eligibility conditions to a special status, always evaluate them before computing income bases (or other expensive aggregates), and filter the arrays post-eligibility so as to perform the income basis computation only on entities that pass the first level of eligibility.

But anyway, the TLDR here is: I'd like to reframe this problem from detecting cycles to short-circuiting computation.
Continuing on from the previous episode… I've investigated short-circuiting as suggested above. One problem that immediately arises is that it's not clear how to change the "shape" of a call to `calculate`.
What I'd initially envisioned was something like the following scenario, using
My rationale for this approach was that if, for instance, AAH eligibility and ASI eligibility have something that syntactically looks like a circular dependency, but in practice AAH and ASI eligibility are mutually exclusive, or depend on inputs that only satisfy them in particular periods (which is in fact the case), then this ensures that the vector eventually becomes empty and the computation terminates. The problem here is that we do not have a way to (temporarily, in stack-like fashion) change the vector being operated on into another, because this context is not part of the execution stack, so this fails at step 3. At this point I decided to step back from the question "how to short-circuit computations", and instead go back to investigating the original problem report: figuring out why the calculations come out inconsistent. The reason this happens is two-fold:
Instrumenting the code in
We see that there is a cycle error in one of the computations; this is why the results come out inconsistent. On branch simplify-cycle-detection, we can see the effect of removing the second rule above (re-raise the error until we hit a call that specifies a `max_nb_cycles` - see the sketch at the end of this comment): all of the tests keep passing (except the unit tests that checked this rule of cycle detection). So we might have a quick-and-dirty fix for Mes Aides' problem until we figure out a more permanent solution.

@guillett Would you please test your problematic calculations from Mes Aides against this particular branch of Core?
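Schematically, that second rule behaves like this (a paraphrase of the behaviour described above, not Core's literal code; all names here are hypothetical):

```python
class CycleError(Exception):
    pass

def compute_with_cycle_rule(compute, variable, period, max_nb_cycles, default):
    # A cycle error raised deep in the computation is re-raised up the stack
    # until some caller has specified max_nb_cycles; only then is a default
    # value substituted for the cyclic computation.
    try:
        return compute(variable, period)
    except CycleError:
        if max_nb_cycles is None:
            raise  # no max_nb_cycles here: let an enclosing call handle it
        return default
```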
Also note that by defaulting `max_nb_cycles` to 0:

```python
# simulations.py:166
max_nb_cycles = parameters.get('max_nb_cycles')
if max_nb_cycles is not None:
    self.max_nb_cycles = max_nb_cycles
else:
    self.max_nb_cycles = max_nb_cycles = 0
```

we obtain the desired result:
This increases the stack's maximum recursion depth, and makes 7 of France's 1508 tests fail:
By defaulting `max_nb_cycles` to 1 instead:

```python
# simulations.py:166
max_nb_cycles = parameters.get('max_nb_cycles')
if max_nb_cycles is not None:
    self.max_nb_cycles = max_nb_cycles
else:
    self.max_nb_cycles = max_nb_cycles = 1
```

we also obtain the desired result:
But this, too, increases the stack's maximum recursion depth.
Making progress on short-circuiting. Consider the following method we might add to `openfisca_core/entities.py` (lines 205 to 210 at 6ba2e55):
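The embedded snippet itself isn't reproduced above; purely as an illustration, a method with the described behaviour might look like this (the names and the `self.simulation.calculate` call are assumptions, not the actual entities.py code):

```python
import numpy

def condition(self, eligibility_variable, amount_variable, period):
    # Hypothetical sketch: compute the eligibility variable first, and skip
    # the expensive amount computation entirely when no member of the entity
    # vector is eligible.
    eligible = self.simulation.calculate(eligibility_variable, period)
    if not eligible.any():
        return numpy.zeros(len(eligible))  # short-circuit: nobody eligible
    return eligible * self.simulation.calculate(amount_variable, period)
```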
This allows a new syntax-like construct for formulas:

```python
class aah_base(Variable):
    calculate_output = calculate_output_add
    value_type = float
    entity = Individu
    definition_period = MONTH

    def formula(individu, period, parameters):
        return individu.condition('aah_eligible', 'aah_base_montant', period)
```

Obviously this requires a bit of work on existing formulas. Is it worth it, though? I tried a change of this sort in just two places: the eligibility for AAH and that for PAJE. (No particular reason - they're the ones that popped up first when looking for likely instances of the general pattern "take the array-wise logical AND of an eligibility condition and an amount".) The results are impressive: I get a 35% gain on execution time for the Mes Aides tests (from 315s to 206s) and a 20% total gain on the entire test suite. Not so bad for so few lines.

An obvious objection is that the above implementation does not realize these performance gains in the general vector case (n > 1), only for singleton simulations or those where these eligibility conditions are homogeneous. The gains in the singleton case alone could be worth it, just in terms of helping local test execution or CI load. But I think we can realize the gains in the general vector case as well, by partitioning computation. Say we start with a vector of 1000, and encounter an eligibility condition which only 25% pass. Then we can run the cheaper computation on 75% of the vector and restrict the more expensive one to the remaining 25%. Of that 25%, the ones that require deeper computation are also probably going to encounter more eligibility conditions, each of which will reduce the vector further.
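For illustration, a minimal numpy sketch of this partitioning (the sizes, percentages, and toy formula are assumptions taken from the example above, not from the France model):

```python
import numpy

n = 1000
eligible = numpy.random.rand(n) < 0.25   # ~25% pass the eligibility condition

result = numpy.zeros(n)
result[~eligible] = 0.0                  # cheap branch for the ~75% who fail

# Expensive branch restricted to the eligible subset only:
subset_incomes = numpy.random.rand(eligible.sum()) * 3000
result[eligible] = numpy.maximum(810.0 - subset_incomes, 0)  # assumed toy formula
print(result[:10])
```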
You could achieve this with boolean indexing and fancy indexing; however, memory performance should be tested, as both create copies of the arrays instead of just views. An example:

```python
import numpy

montant_aah = numpy.zeros(int(1e5))
# array([0., 0., 0., ..., 0., 0., 0.])

aah_eligible = numpy.random.choice(numpy.array([True, False]), int(1e5))
# array([ True,  True,  True, ...,  True, False,  True])

aah_eligible_indexes = numpy.where(aah_eligible)[0]
# array([    0,     1,     2, ..., 99994, 99997, 99999])

# let's say this is the formula for now
aah_base_resource_formula = 100

montant_aah[aah_eligible_indexes] += aah_base_resource_formula
# montant_aah[aah_eligible_indexes] is now array([100., 100., ..., 100.])

montant_aah
# array([100., 100., 100., ..., 100.,   0., 100.])
```
Hi @Morendil, thanks for the investigation of the possible scenarios.

This is all theoretical, but let's say, for a given `n`:

```python
import numpy

n = int(1e5)  # example size; the benchmark is run for different values of n
random = numpy.random.rand(n)
condition = numpy.random.choice(numpy.array([True, False]), n)

# Operation over the full vector:
random - random

# Same operation restricted to the eligible subset, via fancy indexing:
eligible = random[numpy.where(condition)[0]]
random[numpy.where(condition)[0]] = eligible + eligible
```

These are the results for different values of `n`:
We can see that:
Of course, all of this has to be tested against actual populations (we could generate some with https://github.com/openfisca/openfisca-survey-manager/blob/master/openfisca_survey_manager/tests/test_scenario.py).
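For reference, a hypothetical harness for producing such timings (the actual benchmark script isn't shown in the thread) might look like:

```python
import timeit
import numpy

for n in (int(1e3), int(1e5), int(1e7)):
    random = numpy.random.rand(n)
    condition = numpy.random.choice(numpy.array([True, False]), n)

    # Time the full-vector operation vs the fancy-indexed subset operation.
    full = timeit.timeit(lambda: random - random, number=100)
    masked = timeit.timeit(
        lambda: random[numpy.where(condition)[0]] + random[numpy.where(condition)[0]],
        number=100)
    print(n, round(full, 4), round(masked, 4))
```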
Linking to this old discussion on short-circuiting: #363.
To recap some progress in the few weeks since this was opened, some avenues we explored for cracking this were the following:
What we are lacking, it seems to me, is some clarity on how OpenFisca is supposed to work; a specification of any kind, formal or informal, would probably help a lot. One of the examples that @guillett came up with was:

```python
class pension(Variable):
    definition_period = MONTH

    def formula(entity, period):
        return entity.calculate('pension', period.last_month)
```

While this is pathological, it's also a simplified example of the kind of thing we have been writing without meaning to. It is not properly speaking a cycle, since each recursive call shifts the requested period one month into the past. In fact, this code will terminate if there is an input for `pension` at some past period.
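A self-contained toy model of that behaviour (illustrative only - the names and structure are assumptions, not Core's actual evaluation code):

```python
computation_stack = []

def evaluate(formulas, inputs, variable, period):
    """Toy evaluator: a true cycle is the exact same (variable, period) pair
    appearing twice on the stack; stepping back in time never triggers it."""
    frame = (variable, period)
    if frame in computation_stack:
        raise RuntimeError("true cycle on %s at period %s" % frame)
    if frame in inputs:
        return inputs[frame]  # an input value stops the recursion
    computation_stack.append(frame)
    try:
        return formulas[variable](period)
    finally:
        computation_stack.pop()

# 'pension' defined in terms of itself one month earlier - a spiral, not a cycle.
inputs = {('pension', 1): 100.0}
formulas = {'pension': lambda period: evaluate(formulas, inputs, 'pension', period - 1)}
print(evaluate(formulas, inputs, 'pension', 4))  # terminates, printing 100.0
# Without that input, the recursion would never bottom out (Python would hit
# its recursion limit) - exactly the pathology described above.
```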
Computing the dependency graph could help us statically assess the situation, and would have other uses, such as helping visualize the model and helping navigate variables in the API and its exploration UI.

Our lack of precision in defining "what we talk about when we talk about OpenFisca formulas" may have led us to write formulas that will stop making sense when we reexamine them in the light of a more precise framework, one that incorporates notions of "free variables" and "calculable formulas". The ease of using 0 as a default value for just about everything, the fact that variables with formulas can also be used as inputs at the drop of a hat, and other "conveniences" in Core might mean that we have some (or a lot of) work to do to make the models conform to a calculability specification a posteriori. But it might be well worth it in terms of the accuracy and quality of our computations.

To take an example from France, one of the cycles we explicitly disallow is […]. However, the other dependency of […]. Once the calculation of […]
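As a rough illustration of the static-analysis idea (a sketch under assumptions - this is not an existing Core facility), one could approximate a formula's dependencies by scanning its source for calls whose first argument is a string literal, the common `entity('variable_name', period)` pattern:

```python
import ast
import inspect
import textwrap

def formula_dependencies(formula):
    """Collect string literals used as the first argument of any call - a
    rough proxy for the variables a formula depends on."""
    tree = ast.parse(textwrap.dedent(inspect.getsource(formula)))
    deps = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call) and node.args:
            first = node.args[0]
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                deps.add(first.value)
    return deps

def formula(entity, period):
    return entity.calculate('pension', period.last_month)

print(formula_dependencies(formula))  # {'pension'}
```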
I am not sure I understand everything, but I can give you some input on how we got there for France.
Well, if we're looking at what the user provides, we can also look at all the inputs; after all, if we can make a decision on whether a formula is calculable or not based on that data, we shouldn't have the user do any work. If we can't make that decision, then it's hard to tell the user how to make it. What advice would we give the user?
The first option ("try the oldest variable") is what I experimented with in my initial investigations, after discussing the problem with @guillett, and it didn't seem to solve the problem very effectively. In some cases the dependencies are so deep that a few cycles are enough to overflow the stack. So if you have a wide range of dates, you're stuck with the problem. Admittedly, I did this for all variables and not on a variable-by-variable basis. On the other hand, if you have to do this for all variables, it could get very tedious to do it by hand…

I agree with the logic that we should keep the formulas in the country package agnostic of how they're going to be computed, and rely on the inputs. But I think we should be able to "follow the values back in time" automatically, not force the user to do it; and I think we should give strong guarantees that computations always terminate, without ever pushing onto the user the responsibility for termination (which is what `max_nb_cycles` does today).
Thanks @Morendil and @maukoquiroga for all this great investigation; this is really interesting and could offer a lot of potential improvements for OpenFisca!
This is caused by the fact that an OpenFisca model can be used for different use cases. For instance, a "net salary" is usually an input if you're calculating social benefits, but an output if you're calculating social contributions.
Also, FYI, there is already something in Core to print the trace of a calculation:

```python
simulation = Simulation(tax_benefit_system, simulation_dict, trace = True)
simulation.calculate('disposable_income', '2017-01')
simulation.tracer.print_computation_log()
```

will print the computation log.
Fixed in #817
Minimal repro:

The results of computing `ass` and `rfr` should not depend on the order of evaluation, but they do.

This is impacting France and indirectly Mes-Aides (preventing an upgrade to the latest version of France in production). The France model is very ramified and cycles abound; on inspection I think the current solution (`max_nb_cycles`) is misguided.

Connected to openfisca/openfisca-france#1211
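The repro itself isn't captured above; schematically, the order dependence looks like this (hypothetical code - the periods and setup are assumptions, and `tax_benefit_system` / `simulation_dict` stand in for the actual test fixture):

```python
# Two identical simulations, variables requested in opposite orders.
simulation_a = Simulation(tax_benefit_system, simulation_dict)
ass_first = simulation_a.calculate('ass', '2018-01')   # assumed period
rfr_second = simulation_a.calculate('rfr', '2016')     # assumed period

simulation_b = Simulation(tax_benefit_system, simulation_dict)
rfr_first = simulation_b.calculate('rfr', '2016')
ass_second = simulation_b.calculate('ass', '2018-01')

# Expected: ass_first == ass_second and rfr_first == rfr_second.
# Observed: with the current cycle checking, the results differ depending
# on which variable was computed first.
```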