feat: compute packagings stats #7949
I approve, but I would prefer to have the OpenAPI documentation shipped with it (see my comment).
scripts/gen_packaging_stats.pl
my $total = 0;

my $packagings_stats_ref = {};
I think it's better to document this kind of structure where we declare it, so that readers don't have to go through the full algorithm to understand the structure.
Or even better, we could document it as a JSON schema (in YAML) in docs/reference (it could also be considered an API), and put the path to the OpenAPI schema here.
There's a description of the structure at the top of the file:

Aggregation counts are stored in a structure of the form:

{
    countries => {
        "en:world" => ..
        "en:france" => {
            categories => {
                "all" => .. # stats for all categories
                "en:yogourts" => {
                    shapes => {
                        "en:unknown" => ..
                        "all" => .. # stats for all shapes
                        "en:bottle" => {
                            materials_parents => .. # stats for parents materials (e.g. PET will also count for plastic)
                            materials => {
                                "all" => ..
                                "en:plastic" => 12, # number of products sold in France that are yogurts and that have a plastic bottle packaging component
                            }
                        },
                        ..
                    }
                },
                ..
            }
        },
        ..
    }
}

Regarding the doc in OpenAPI: why not, but at this point this is completely experimental, and I don't know yet whether we will keep that structure. We can document it in OpenAPI once it has stabilized a bit.
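The nested structure described above can be filled with plain Perl hash references. The following is only a minimal sketch of the idea, not the code from scripts/gen_packaging_stats.pl: the helper name `count_component` and the sample values are hypothetical, it handles a single country per call, and it counts one unit per packaging component (the actual script may count each product only once).

```perl
use strict;
use warnings;

# Hypothetical helper (not taken from the PR): record one packaging component.
# Every dimension is counted under its specific value and under an aggregate key
# ("en:world" for countries, "all" for categories, shapes and materials).
sub count_component {
    my ($stats_ref, $country, $category, $shape, $material) = @_;
    foreach my $c ($country, "en:world") {
        foreach my $cat ($category, "all") {
            foreach my $s ($shape, "all") {
                foreach my $m ($material, "all") {
                    # intermediate hashes are created on demand (autovivification)
                    $stats_ref->{countries}{$c}{categories}{$cat}{shapes}{$s}{materials}{$m}++;
                }
            }
        }
    }
    return;
}

my $packagings_stats_ref = {};
count_component($packagings_stats_ref, "en:france", "en:yogourts", "en:bottle", "en:plastic");
count_component($packagings_stats_ref, "en:france", "en:yogourts", "en:pot", "en:glass");

my $yogurts_ref = $packagings_stats_ref->{countries}{"en:france"}{categories}{"en:yogourts"};
print $yogurts_ref->{shapes}{"en:bottle"}{materials}{"en:plastic"}, "\n";    # prints 1
print $yogurts_ref->{shapes}{"all"}{materials}{"all"}, "\n";                 # prints 2
```

Counting under both the specific key and the aggregate key at insertion time keeps lookups simple: reading "all shapes, all materials" for a category is a single hash access rather than a sum over sub-keys.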
ok.
Suggested change:
my $packagings_stats_ref = {};
becomes:
# this will contain the final result
# see structure at the top of this file
my $packagings_stats_ref = {};
Co-authored-by: Alex Garel <alex@garel.org>
Thanks for all the suggestions @alexgarel, I think I implemented them all in the last commit. Please don't merge yet, there are a few details I want to change (like filtering out bogus entries in countries).
I'm not quite sure what to make of this (I looked at the JSON beautified).
I added a special export just for French yogurts, to make the data easier to explore in just a browser:
So for instance we can easily see the materials used for pots of yogurts:
Note that there are fields for "shape" / "shape_parents", and "material", "material_parents". This is so that we can see stats for all "pots" even if we have some components listed as "pots" and others as "individual pots".
Same thing for materials: in "materials" you see the exact values entered by users, and in "materials_parents" you have all the parents values as well.
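The difference between "materials" and "materials_parents" can be illustrated with a small sketch. It assumes a hypothetical `%parents_of` map and hypothetical material ids such as "en:pet"; the real script would presumably derive parents from the packaging materials taxonomy, which is not shown here.

```perl
use strict;
use warnings;

# Hypothetical parent map, for illustration only (not the real taxonomy).
my %parents_of = (
    "en:pet"     => ["en:plastic"],
    "en:plastic" => [],
);

# Count the exact value in "materials", and the value plus all of its
# parents in "materials_parents".
sub count_material {
    my ($shape_stats_ref, $material) = @_;
    $shape_stats_ref->{materials}{$material}++;
    foreach my $m ($material, @{$parents_of{$material} // []}) {
        $shape_stats_ref->{materials_parents}{$m}++;
    }
    return;
}

my $pot_stats_ref = {};
count_material($pot_stats_ref, $_) foreach ("en:pet", "en:plastic", "en:pet");

foreach my $field ("materials", "materials_parents") {
    foreach my $m (sort keys %{$pot_stats_ref->{$field}}) {
        printf "%s %s: %d\n", $field, $m, $pot_stats_ref->{$field}{$m};
    }
}
# materials en:pet: 2
# materials en:plastic: 1
# materials_parents en:pet: 2
# materials_parents en:plastic: 3
```

With this layout, "materials" answers "what did users enter exactly?", while "materials_parents" answers "how many components are some kind of plastic?" without the reader having to know every child entry.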
Kudos, SonarCloud Quality Gate passed!
This is a script to compute some packaging stats for categories of products. #7929
Sample output: https://world.openfoodfacts.org/data/categories_stats/categories_packagings_stats.packagings-with-weights.json
(for products that have some packaging weights)
Currently the stats are only on country / category / shape / material.
We will also add stats for packaging weight.