Processing of input data with a different number of categories #60

Draft: wants to merge 18 commits into base: master


RemeshevskiyValeriy
Contributor

@RemeshevskiyValeriy RemeshevskiyValeriy commented Aug 23, 2024

Related issues: #47

@ivanbarsukov ivanbarsukov force-pushed the process-inputs-with-different-categories-count branch from 11614bf to ac7508d Compare September 4, 2024 13:21
@KolesovDmitry
Contributor

I Mixing of code levels and diffusion of responsibility

Try not to mix different levels of code too much. Currently, the interface (neuralnetworkwidget.py) knows too much about the internal structure of the algorithmic modules. See, for example:

Here the interface code analyzes how many categories are stored in the initial and final rasters, and it "knows" that these categories should be combined with each other. As a result, the processing logic is spread out across several modules.

Ideally, all of this logic should be kept in one place, for example by putting the code into a separate function/method get_categories(init_raster, final_raster). Moreover, that function/method should be defined not in the interface part of the plugin but in the algorithmic modules, in the same module where the main analysis of the state rasters occurs (AreaAnalysis? CrossTabs? Sampler?).
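A minimal sketch of what such a helper could look like. It assumes the rasters are available as nested lists of integer category values; the plugin's real raster class, any nodata handling, and the module that should own the function are not shown in this comment, so everything except the name get_categories(init_raster, final_raster) is illustrative.

```python
from typing import Iterable, List


def get_categories(init_raster: Iterable[Iterable[int]],
                   final_raster: Iterable[Iterable[int]]) -> List[int]:
    """Return the sorted union of category values found in both rasters.

    Assumption: each raster is an iterable of rows of integer categories.
    The real plugin would read these values from its own raster objects.
    """
    categories = set()
    for raster in (init_raster, final_raster):
        for row in raster:
            categories.update(row)
    return sorted(categories)


# Hypothetical data: the final raster contains a category (4)
# that is absent from the initial raster.
init = [[1, 2], [2, 3]]
final = [[1, 2], [4, 4]]
categories = get_categories(init, final)
print(categories)
```

With a helper like this, the widget would only call get_categories(...) and would not need to know how the two category lists are combined.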

A remark (this is a matter of taste). In my opinion, there is no need to check whether the categories in the two rasters are the same, since they are processed through set() anyway. Whether the categories are the same or not, the result will be identical. Checking for category equality saves some computation time, but the gain is negligible. Eliminating the check simplifies the code and makes it easier to read, which is the more important priority for me (but, as I said, this is a matter of taste; there is no fundamental difference here).
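To illustrate the point, here is a small sketch comparing the two variants. The function names and the list-based inputs are assumptions made for the example and are not taken from the pull request.

```python
def combine_with_check(init_categories, final_categories):
    """Variant with an explicit equality shortcut."""
    if set(init_categories) == set(final_categories):
        return sorted(set(init_categories))
    return sorted(set(init_categories) | set(final_categories))


def combine(init_categories, final_categories):
    """Simpler variant: always take the union of both category sets."""
    return sorted(set(init_categories) | set(final_categories))


same = ([1, 2, 3], [1, 2, 3])
different = ([1, 2, 3], [2, 3, 4])

# Both variants agree whether or not the inputs share the same categories.
for init_categories, final_categories in (same, different):
    assert combine(init_categories, final_categories) == \
        combine_with_check(init_categories, final_categories)
```

The union of a set with an equal set is that same set, so the shortcut branch never changes the result; it only adds a code path to read and test.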

II Duplication of code

The same code is placed in several different locations, for example:

(there may be other examples as well; I haven't looked at all of them)

I strongly recommend creating a dedicated function/method for calculating categories. Otherwise, if we decide to calculate the categories using a new scheme, we will have to hunt through the code to catch errors: we will need to remember every place where these categories are calculated and make the correction in each of them.

3 participants