update normalization #32
We forgot about the need to specify a list of means/std values for |
- `data_type` data type (e.g. float32)
- `data_range` tuple of (minimum, maximum)
- `axes` string of axes identifying characters from: btczyx
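For illustration, a consumer could check an `axes` string against the allowed letters roughly as follows. This is only a sketch: the function name `validate_axes` is hypothetical, and rejecting empty strings and repeated letters is an assumption, not something the spec text above states.

```python
ALLOWED_AXES = "btczyx"  # letters listed in the spec line above


def validate_axes(axes: str) -> str:
    """Return `axes` unchanged if it is valid, otherwise raise ValueError.

    Hypothetical helper; the empty-string and no-repeat rules are assumptions.
    """
    if not axes:
        raise ValueError("axes must not be empty")
    unknown = sorted(set(axes) - set(ALLOWED_AXES))
    if unknown:
        raise ValueError(f"unknown axis letters: {unknown}")
    if len(set(axes)) != len(axes):
        raise ValueError(f"repeated axis letter in {axes!r}")
    return axes
```

With this sketch, `validate_axes("bczyx")` passes, while a custom letter such as in `"bcqyx"` is rejected.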
Do we have a place to give detailed definition for the axes and the meaning? Are we allowed to give a custom letter to it?
I think we should be as restrictive with these letters as possible to keep them meaningful/useful from a consumer software perspective, e.g. restrict them to btczyx. We can have a (separate) discussion on the axes keys (use HW instead of xy, etc.). I'd prefer to delay that for 0.3.1+; for now, in a given input the `description` field could add specific meaning in the model context for humans? I feel an `axes_description` field would go a bit too far anyway, but again, let's leave that for future discussion if necessary.
I do agree that for now it is too ambitious to get this kind of specification. I think it was already discussed at some point, but outputs might also be better described with rows and columns. While in NumPy arrays those could be understood as HW, displaying them as tables could need a different kind of description.
Looks good to me.
OK for the normalization as well. PS: In a quick search, I found some nice comments about normalization/standardization/stretching:
So normalizing data is not the same as normalization I suppose... I see the ambiguity. So how about we change the name of the All in favor of renaming our newly introduced |
- `axes` subset of input `axes` to normalize independently (e.g. 'c')
- `mean` mean to normalize with (only applies for `mode` 'fixed'). This may be a (nested) list depending on `axes`, e.g. for `axes` 'c' a list of means for each channel; or for axes: 'cz' a list for each channel c of a nested list of means for each z position of that channel.
- `std` standard deviation to normalize with (only applies for `mode` 'fixed'). This may be a (nested) list depending on `axes` analogously to `mean`.
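To make the nested-list convention concrete, here is a minimal NumPy sketch of how a consumer might apply `mode` 'fixed'. The function name `normalize_fixed` and the `data_axes` parameter are hypothetical, and the formula `(data - mean) / std` is assumed from the usual meaning of mean/std normalization; none of this is prescribed by the spec text.

```python
import numpy as np


def normalize_fixed(data, data_axes, axes, mean, std):
    """Apply (data - mean) / std with statistics given per entry of `axes`.

    data_axes: axes string of `data`, e.g. "czyx" (hypothetical parameter)
    axes:      non-empty subset of data_axes to normalize independently, e.g. "c" or "cz"
    mean, std: (nested) lists whose nesting follows the order of `axes`
    """
    mean = np.asarray(mean, dtype=float)
    std = np.asarray(std, dtype=float)

    # The nested lists must match the sizes of the selected axes.
    expected = tuple(data.shape[data_axes.index(a)] for a in axes)
    if mean.shape != expected or std.shape != expected:
        raise ValueError(f"mean/std must have shape {expected} for axes {axes!r}")

    # Reorder the statistics to follow the axis order of `data`, then add
    # singleton dimensions so they broadcast over all remaining axes.
    order = sorted(range(len(axes)), key=lambda i: data_axes.index(axes[i]))
    bshape = [data.shape[i] if a in axes else 1 for i, a in enumerate(data_axes)]
    mean = mean.transpose(order).reshape(bshape)
    std = std.transpose(order).reshape(bshape)
    return (data - mean) / std
```

For `axes` 'c' on a "cyx" array with two channels, `mean` and `std` are flat lists of length two; for `axes` 'cz', each is a list per channel of a list per z position, as described above.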
We like explicit normalization descriptions like this, but would like to talk about supported normalization schemes sooner rather than later. Maybe at one of the next meetings?
Let's put it on the agenda 👍
5338dbf to bff31d8
We decided in today's bioimage.io meeting:
valid preprocessings we start with:
todos:
bff31d8 to d7bc5e1
open todos moved to #37
README.md will be updated shortly...