feat: add core population #1279

bonjourmauko · 2024-10-14T17:00:55Z

Depends on #1274
Depended upon by #1238

New features

Introduce populations.CorePopulation
- Allows for testing and better subclassing custom populations

bonjourmauko · 2024-10-16T15:32:29Z

openfisca_core/populations/_core_population.py

+    entity: t.CoreEntity
+
+    #: ???
+    ids: Sequence[str] = []


@benjello What are the ids here? household, or Juan?

They should be the ids of the different households, or of the different people for the atomic/person entity.

So a string, like household_1, person_2, and so on. Attributes that only exist in the context of a simulation.

I would say so. But I am not the person who refactored that part. @Morendil is your man.

It makes sense, in a way, although it is a bit confusing, that there is no data class to represent an individual member or entry of a population, as these are ultimately just the values at index n of the corresponding variable vectors.

So, say we have tax_1 and tax_2 on a SinglePopulation with count = 3 of entity Person. Behind the scenes, we should have two Holders for a given period:

holder_tax_1 = numpy.array([100, 200, 300])
holder_tax_2 = numpy.array([300, 200, 100])
ids = ["Mauko", "Mahdi", "Sandra"]

How much of tax_1 have we got for "Mauko"?

tax_1_mauko = holder_tax_1[[ids.index("Mauko")]] # [100]

Now we group "Mauko" and "Mahdi" in a GroupEntity, Foundation.

How much of tax_1 have we got for "OpenFisca Foundation"? Fancy-indexing:

tax_1_foundation = sum(holder_tax_1[[ids.index("Mauko"), ids.index("Mahdi")]]) # 300

So the id of each entity entry in a simulation is sort of ephemeral. There is no class modeling it, and ids exist internally only as reference records used to get values out of holders.

Is that how it works, @Morendil ?
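The lookup described above can be sketched with a position map built once, rather than calling ids.index() for every access. This is only an illustration of the idea under discussion, not how CorePopulation is implemented; the names position and values_for are hypothetical.

```python
import numpy as np

# Hypothetical data mirroring the example in this thread.
ids = ["Mauko", "Mahdi", "Sandra"]
holder_tax_1 = np.array([100, 200, 300])

# Map each id to its position once, so lookups do not rescan the list.
position = {id_: index for index, id_ in enumerate(ids)}


def values_for(holder, members):
    """Return the holder values of the given members, in the order requested."""
    return holder[[position[member] for member in members]]


tax_1_mauko = values_for(holder_tax_1, ["Mauko"])
tax_1_foundation = values_for(holder_tax_1, ["Mauko", "Mahdi"]).sum()

print(tax_1_mauko)  # [100]
print(tax_1_foundation)  # 300
```

The ids themselves never enter the holder; they only translate a name into positions used for fancy indexing.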

@benjello Why can't we model a population as a structured array or a dataframe? I mean, it looks like it is more or less the way it works now. As long as you have such indices, then you can link columns the way you like.

I know that pandas and polars are great for this kind of use.

One challenge there is the periods. It would be like a huge 3D sparse matrix.

A dataframe per period would be ok.
Or we could add the period to the index: every value would be identified by an index (period, id) for a column variable in the corresponding entity dataframe.
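As a rough sketch of the (period, id) index idea, assuming pandas is available, one entity dataframe could look like the following. The periods, ids and values are made up for illustration, and nothing here reflects existing OpenFisca code.

```python
import pandas as pd

# Hypothetical periods, ids and values, for illustration only.
periods = ["2024-01", "2024-02"]
ids = ["Mauko", "Mahdi", "Sandra"]

# Every row is identified by a (period, id) pair.
index = pd.MultiIndex.from_product([periods, ids], names=["period", "id"])
person = pd.DataFrame({"tax_1": [100, 200, 300, 110, 210, 310]}, index=index)

# One value: tax_1 for "Mauko" in 2024-01.
tax_1_mauko = person.loc[("2024-01", "Mauko"), "tax_1"]

# One vector: tax_1 for everyone in 2024-02, in the original id order.
tax_1_february = person.xs("2024-02", level="period")["tax_1"].to_numpy()

print(tax_1_mauko)  # 100
print(tax_1_february)  # [110 210 310]
```

Selecting a period with xs gives back a plain vector, which is close to what a holder stores today for a single period.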

I am concerned about performance, but we may experiment with different backends.

Historically, we stuck to numpy because it was the most stable package at the time. Pandas was in its infancy. Polars didn't exist yet.

An example:

import numpy as np

dtype = [
    ("people", "U10"),
    ("variables", [("tax", np.int32), ("pension", bool), ("benefit", np.uint8)]),
]

array = np.array(
    [("juan", (100, True, 1)), ("jean", (200, False, 2)), ("john", (300, True, 3))],
    dtype=dtype,
)

print(array["people"])  # ['juan' 'jean' 'john']
print(array["variables"])  # [(100, True, 1) (200, False, 2) (300, True, 3)]
print(array["variables"]["tax"])  # [100 200 300]
print(array[array['people'] == 'juan']['variables']['tax'])  # [100]

# operation only in sub-vector
array["variables"]["tax"] *= 2

# it didn't copy data
assert array["variables"]["tax"].base is not None
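Extending the structured-array idea, linking people to a group entity could be done by storing the position of each person's household as a column and aggregating with np.bincount. This is a minimal sketch; the field names and household ids below are assumptions for illustration, not part of the PR.

```python
import numpy as np

# Hypothetical layout: each person row stores the position of its household.
person_dtype = [("id", "U10"), ("household", np.intp), ("tax", np.int32)]
persons = np.array(
    [("juan", 0, 100), ("jean", 0, 200), ("john", 1, 300)],
    dtype=person_dtype,
)
household_ids = np.array(["household_1", "household_2"])

# Sum the tax of every person into the slot of its household.
tax_per_household = np.bincount(
    persons["household"],
    weights=persons["tax"],
    minlength=len(household_ids),
)

print(dict(zip(household_ids.tolist(), tax_per_household.tolist())))
# {'household_1': 300.0, 'household_2': 300.0}
```

Note that np.bincount with weights returns floats, so an integer variable would need to be cast back if its dtype matters.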

coveralls · 2024-10-17T12:43:25Z

coverage: 82.273% (+0.006%) from 82.267%
when pulling 6b12b80 on feat/add-core-population
into 877d87f on master.

bonjourmauko added the kind:feat label Oct 14, 2024

bonjourmauko requested review from a team October 14, 2024 17:00

bonjourmauko self-assigned this Oct 14, 2024

bonjourmauko mentioned this pull request Oct 14, 2024

docs: add types to tracers (1/3) #1280

Merged

bonjourmauko force-pushed the docs/add-docs-to-experimental branch from 0da8886 to 0e10d12 Compare October 16, 2024 15:26

bonjourmauko requested a review from benjello October 16, 2024 15:27

bonjourmauko force-pushed the feat/add-core-population branch 2 times, most recently from 1d53eeb to 20887aa Compare October 16, 2024 15:30

benjello approved these changes Oct 16, 2024

Base automatically changed from docs/add-docs-to-experimental to master October 17, 2024 12:31

bonjourmauko added 10 commits October 17, 2024 14:32

feat: add core population (#1238 #1276 #1277)

fa06c33

refactor: memory usage (#1238 #1276 #1277)

6379644

test: get memory usage (#1238 #1276 #1277)

b125d96

test: get holder (#1238 #1276 #1277)

8b4fd6c

test: check period (#1238 #1276 #1277)

7991c18

test: check array (#1238 #1276 #1277)

f952744

test: get index (#1238 #1276 #1277)

8b7bbc7

test: __call__ (#1238 #1276 #1277)

0eacabe

docs(pops): count and ids (#1238 #1276 #1277)

a3f75d6

chore: version bump

6b12b80

bonjourmauko force-pushed the feat/add-core-population branch from 20887aa to 6b12b80 Compare October 17, 2024 12:37

bonjourmauko merged commit 5a131cf into master Oct 17, 2024
29 checks passed

bonjourmauko deleted the feat/add-core-population branch October 17, 2024 13:59