Allow for calmar on number of positive value #324

sylvainipp · 2024-12-04T08:48:36Z

New features

Introduce the possibility to use calmar on variables modified by an expression starting with a space, with an associated target: for instance 'wage > 0' to target the number of positive wages in the population, or '(wage > 0) * (pension > 0)' to target the number of persons with both a positive wage and a positive pension amount.
- Update test
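
To make the two example targets in the description concrete, here is a minimal sketch of the quantities they stand for, assuming plain NumPy arrays. The array names and values are made up for illustration; this is not the calmar implementation itself.

```python
import numpy as np

# Toy data, invented for illustration only.
wage = np.array([0.0, 1200.0, 0.0, 2500.0])
pension = np.array([800.0, 0.0, 0.0, 300.0])
weight = np.array([10.0, 20.0, 30.0, 40.0])

# 'wage > 0': weighted number of persons with a positive wage.
positive_wage_count = float(np.sum(weight * (wage > 0)))

# '(wage > 0) * (pension > 0)': weighted number of persons with both
# a positive wage and a positive pension.
both_positive_count = float(np.sum(weight * ((wage > 0) * (pension > 0))))
```

The calibration target attached to each expression would then be the desired value of the corresponding weighted sum.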

benjello · 2024-12-04T09:16:46Z

CHANGELOG.md


-# 3.0.1 [#322](https://github.com/openfisca/openfisca-survey-manager/pull/322)
+* New feature
+  - Introduce the possibility to use calmar on variables with a suffix "_number", with target the weighted sum of the number of positive value of the variable. Only the variable without "_number" should be in the tax benefit system.


Sorry, I didn't catch this change.
I would use another suffix, like _positive.
But the best solution is to use an expression like 'X > 0' and check the type of the result of the variable or the expression. If the result is boolean, then target a count; otherwise, target a mass.
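
The type check suggested here could look roughly like the sketch below. The helper name `weighted_total` and the 'count'/'mass' labels are hypothetical and only serve to illustrate the idea.

```python
import numpy as np

def weighted_total(values, weights):
    """Return (kind, total): 'count' for boolean results, 'mass' otherwise."""
    values = np.asarray(values)
    weights = np.asarray(weights, dtype=float)
    # A boolean result means the target is a number of entities.
    kind = "count" if values.dtype == np.bool_ else "mass"
    return kind, float(np.sum(weights * values))

# Toy data, invented for illustration only.
wage = np.array([0.0, 1200.0, 2500.0])
weight = np.array([10.0, 20.0, 30.0])

count_result = weighted_total(wage > 0, weight)  # boolean result: count
mass_result = weighted_total(wage, weight)       # float result: mass
```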

The idea is, for instance, to calibrate on the number of wages. Do you mean we could use ' > 0' as a suffix, or something else? The problem I was trying to tackle is how to get back the original variable, in order to call adaptative_calculate_variable on it, when calibrating on the number of entities with a positive value.

There are at least two problems:

- The _number suffix is IMHO inappropriate (at least use _positive_count or _strictly_positive_count).
- Dealing only with strictly positive values could easily be generalized, so it would be nice to have that.

You can use any expression like 'wage > 0' as a key for the margin and do the computation: just replace wage with self.simulation.adaptative_calculate_variable(wage, period = period) and evaluate. You can then use any suffix you like if it is linked to the expression.
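
A rough sketch of this substitute-and-evaluate idea follows. It assumes a callable that plays the role of adaptative_calculate_variable; the names `evaluate_margin_expression`, `fake_calculate` and `DATA` are hypothetical stand-ins, and the code merged in the PR may work differently.

```python
import ast
import numpy as np

def evaluate_margin_expression(expression, calculate, period):
    """Evaluate `expression` after computing every variable it names."""
    tree = ast.parse(expression.strip(), mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    values = {name: calculate(name, period) for name in names}
    # No builtins: only the computed variables are visible to the expression.
    return eval(compile(tree, "<margin>", "eval"), {"__builtins__": {}}, values)

# Hypothetical stand-in for simulation.adaptative_calculate_variable.
DATA = {
    "wage": np.array([0.0, 1200.0, 2500.0]),
    "pension": np.array([500.0, 0.0, 100.0]),
}

def fake_calculate(variable, period):
    return DATA[variable]

result = evaluate_margin_expression(
    "(wage > 0) * (pension > 0)", fake_calculate, period=2024
)
```

Here the expression string itself serves as the margin key, so no suffix convention is needed.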

Thanks, I understand better now. I will try to implement it.

I have seen your changes. This is better. But can't we deal with more general expressions?
If you are in a hurry, we can leave it as is.

What kind of expressions do you have in mind? Here we can already deal with inequalities and arithmetic transformations, the only constraint being that the expression starts with the name of a variable (we can do something like 'var ** 2 * (var>0)', for example). I would be happy to generalize it, but I don't see in which direction this would be useful. Maybe using several variables at once, but in that case it is more natural to create a new variable, in my opinion.

Imagine you have a lot of targets you want to satisfy, with as many expressions, for example conditionally on satisfying some boolean expressions (male average salary, female average salary, age dependency, etc.).
It may be very tedious to create new variables.
And you can use numexpr to help you with that.
But again, if you do not need it now, it is okay to leave it as is.
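
As an illustration of such conditional targets, the sketch below computes a weighted average salary by sex as the ratio of two weighted totals. It uses plain NumPy rather than numexpr, and the 1/2 coding of `sex` as well as the data are assumptions made up for the example.

```python
import numpy as np

# Toy data, invented for illustration; the 1/2 coding of sex is an assumption.
sex = np.array([1, 2, 1, 2])
salary = np.array([1000.0, 2000.0, 3000.0, 0.0])
weight = np.array([1.0, 1.0, 3.0, 3.0])

def weighted_conditional_mean(condition, values, weights):
    """Weighted mean of `values` over the rows where `condition` holds."""
    mass = np.sum(weights * condition * values)  # weighted total of values
    count = np.sum(weights * condition)          # weighted number of rows
    return float(mass / count)

mean_salary_sex_1 = weighted_conditional_mean(sex == 1, salary, weight)
mean_salary_sex_2 = weighted_conditional_mean(sex == 2, salary, weight)
```

Each average is thus determined by a mass expression such as (sex == 1) * salary together with a count expression such as (sex == 1).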

I tried to generalize by allowing several variables of the same entity in the expression. The code doesn't allow text content other than openfisca variable names (no sum, for instance), but I think it is already much better: one can calibrate on (sex == 1) * salary and (sex == 1) * (salary > 0), even if not directly on np.mean((sex == 1) * salary) or (sex == 1) * salary * weights / sum(weights * (salary == 1)). Thanks for your advice! I will keep looking a bit, but I'm not sure I will be able to correctly separate the openfisca variables from code words and keep a good level of error handling.

Is that OK for you @benjello, or is it preferable to allow text other than variable names (for instance by checking explicitly whether each string is in the list of the tax and benefit system variables)? I think it might be error-prone (and make the simulation longer), but if there are some use cases it might be worth it. Thanks again for the improvements you proposed!
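
One possible way to do the explicit check mentioned above is to parse the expression and compare the names it contains with the list of known variables. This is only a sketch using the standard library `ast` module; `split_expression_names` and `KNOWN` are hypothetical names, and in practice the list would come from the tax and benefit system.

```python
import ast

def split_expression_names(expression, known_variables):
    """Return (variables, unknown): names found in `expression`, split by
    whether they belong to `known_variables`."""
    tree = ast.parse(expression.strip(), mode="eval")
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    variables = sorted(names & set(known_variables))
    unknown = sorted(names - set(known_variables))
    return variables, unknown

# Hypothetical list of variable names, for illustration only.
KNOWN = {"sex", "salary", "wage"}

ok = split_expression_names("(sex == 1) * salary", KNOWN)
mixed = split_expression_names("sum((sex == 1) * salary)", KNOWN)
```

Names in the second list (such as `sum` here) could either be rejected with a clear error or looked up in an explicit allow-list of functions.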

sylvainipp added 3 commits December 4, 2024 09:34

Allow for calmar var_number counting positive value

f157bb0

Update test

b3b21ab

Bump

b514ff2

benjello reviewed Dec 4, 2024

sylvainipp added 4 commits December 4, 2024 16:34

Use expressions instead of _number

e15e885

Lint

c6fff04

Adapt test

3686d5c

Adapt CHANGELOG

14fae5f

sylvainipp requested a review from benjello December 5, 2024 07:49

sylvainipp added 6 commits December 5, 2024 14:16

Allow for several variables in expression

2d89339

Test several variables

5752caf

Lint

ccc5c4a

Avoid problems in case of reforms

70b6d36

Allow weight update in reforms

686dd55

Avoid multiple frame.insert warning

f86251f

sylvainipp commented Dec 4, 2024 (edited)

benjello Dec 4, 2024

sylvainipp Dec 4, 2024

benjello Dec 4, 2024 (edited)

sylvainipp Dec 4, 2024

benjello Dec 5, 2024

sylvainipp Dec 5, 2024

benjello Dec 5, 2024

sylvainipp Dec 5, 2024 (edited)

sylvainipp Dec 11, 2024
