resampling during outlier_detection generates unused products #8638
Comments
Comment by Mihai Cara on JIRA: The code that would allow this is pending in a couple of PRs in drizzle and jwst packages. However, there is an issue with models that creates context arrays regardless. Specifically, here is an example:
In [1]: from stdatamodels.jwst.datamodels import ImageModel
In [2]: m = ImageModel((5, 5))
In [3]: m.data
Out[3]:
array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]], dtype=float32)
In [4]: m.con
Out[4]:
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]], dtype=int32)
In [5]: m.con = None
In [6]: m.con
Out[6]:
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]], dtype=int32)
In [7]: del m.con
In [8]: m.con
Out[8]:
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]], dtype=int32)
So, until the data models are modified to allow some arrays to be None, I am afraid these arrays will always be re-created.
The >> from stdatamodels.jwst.datamodels import ImageModel
>> m = ImageModel((5, 5))
>> 'con' in m.instance
False
>> m.con
>> 'con' in m.instance
True
There are more details in the documentation about the datamodels:
Correct. But that means we cannot completely get rid of this memory:
If we never assign to |
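The behaviour discussed in this exchange (an array that is absent from `m.instance` until the attribute is first read) can be illustrated with a toy class. This is only a sketch: `LazyModel` and its `_defaults` table are hypothetical names invented for illustration, they use plain Python lists instead of NumPy arrays, and they are not how stdatamodels is actually implemented.

```python
class LazyModel:
    """Toy stand-in for a data model with lazily created default arrays.

    Hypothetical illustration only; not the stdatamodels implementation.
    """

    # Hypothetical table of default factories, keyed by attribute name.
    _defaults = {
        "con": lambda shape: [[0] * shape[1] for _ in range(shape[0])],
    }

    def __init__(self, shape):
        self.shape = shape
        self.instance = {}  # holds only the arrays that have been materialized

    def __getattr__(self, name):
        # __getattr__ runs only when normal attribute lookup fails,
        # i.e. for names such as "con" that are not regular attributes.
        defaults = type(self)._defaults
        if name not in defaults:
            raise AttributeError(name)
        instance = self.__dict__["instance"]
        if name not in instance:
            # The first read allocates the default array as a side effect.
            instance[name] = defaults[name](self.shape)
        return instance[name]


m = LazyModel((5, 5))
print("con" in m.instance)  # nothing allocated yet
m.con                       # merely reading the attribute allocates it
print("con" in m.instance)  # now present
```

In a model that behaves like this, the only way to avoid the allocation is for the calling code never to touch the attribute at all.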
Comment by Ned Molter on JIRA: I'm attaching memory profiles from before (outlier_mihai.html) and after (outlier_mihai_2.html) the change that removed allocation of the context array. It does appear that the context array is no longer allocated, so I think this ticket can be resolved once that PR is merged. The new peak memory usage shows that sigma_clip is using a factor of ~4 more memory than the weight array passed into it. This raises the question of whether that, too, can be optimized. But I think this should be its own ticket, especially given that the offending line lives in stcal. https://github.com/spacetelescope/stcal/blob/dfe1d6d51f0c2a400dff918a52c779328ae19b7d/src/stcal/outlier_detection/utils.py#L80
Comment by Ned Molter on JIRA: Attaching two more flamegraphs here, showing the effect of my fix to AL-875 along with Mihai's PR. The memory usage for on-disk=True decreased by 25% from ~4 GB to ~3 GB. The file names are
Comment by Melanie Clarke on JIRA: Fixed by #8866
Issue JP-3685 was created on JIRA by Brett Graham:
When resampling is performed during outlier detection, the underlying GWCSDrizzle generates a context array which is unused. This array can be quite large: for N 2D inputs of size H x W, the context array will be of size H x W x ((N // 32) + 1). Investigate disabling this (and any other unnecessary intermediate products) to save on memory and computation during outlier detection.
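To give a sense of scale, the size formula in the description can be evaluated directly. The sketch below assumes 4 bytes per element, matching the int32 dtype that `m.con` shows in the comments; the helper names and the example of 50 inputs at 2048 x 2048 are hypothetical, not taken from the pipeline.

```python
def context_array_shape(n_inputs, height, width):
    """Shape H x W x ((N // 32) + 1), as stated in the issue description."""
    n_planes = (n_inputs // 32) + 1
    return (height, width, n_planes)


def context_array_nbytes(n_inputs, height, width, itemsize=4):
    """Approximate size in bytes, assuming int32 elements (itemsize=4)."""
    h, w, planes = context_array_shape(n_inputs, height, width)
    return h * w * planes * itemsize


# Hypothetical example: 50 input images resampled onto a 2048 x 2048 grid.
shape = context_array_shape(50, 2048, 2048)
nbytes = context_array_nbytes(50, 2048, 2048)
print(shape)               # (2048, 2048, 2)
print(nbytes / 2**20)      # 32.0 (MiB)
```

Because the number of planes grows in steps of 32 inputs, the cost jumps by a full H x W int32 plane each time the input count crosses a multiple of 32.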