For mksurfdata_map: use maps with no source masking, applying mask separately #286
Comments
My understanding of how the ESMF regridding works is that this isn't something you can do. But I could be wrong. The mapping files that get created have the masks inherently embedded in them, and I don't know of an easy way to extract them. You could generate the maps without a mask, but that means that when you apply the mapping you will be averaging in data that is outside the mask. That's what I think we want to ensure doesn't happen. I don't really think the burden is that high for carrying around grid files at the same resolution but with multiple masks. Right now there are three half-degree, five 3x3 minute, and two 10x10 minute grids. So you'd get some speedup from this, but the 1km-merge-10min_HYDRO1K-merge-nomask grid is still the one that far and away takes the most time. |
@ekluzek Unless I'm overlooking something: You can have masks embedded in the grid files when creating the mapping files, but you don't have to. If you don't, then you need to do a bit more work in the mapping routine, but we actually already have code in place to do this: You may be right that the burden isn't that high for the different grid files, but the burden is higher for the combinatoric mapping files. I know we've talked about moving away from storing all of them eventually, though. In the end, I don't have strong feelings about whether this should be done. I think it's a good idea, but I'm not sure if it gains us enough to be worth the development time. |
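For readers following along, here is a minimal sketch of what "a bit more work in the mapping routine" could look like. It is not the mksurfdata_map code (that is Fortran, and the code referred to above is not reproduced in this thread). It assumes the unmasked map is available as sparse triplets named `rows`, `cols`, `weights` (0-based here), and that the source mask is read from the raw data file. The function name and the `fill` parameter are hypothetical. Source cells outside the mask are skipped when the weights are applied, and each destination value is renormalized by the sum of the weights that were actually used, so no out-of-mask data gets averaged in.

```python
def remap_with_source_mask(rows, cols, weights, src_data, src_mask, n_dst, fill=0.0):
    """Apply unmasked sparse weights, applying the source mask at apply time.

    rows, cols, weights: sparse triplets (destination index, source index, weight).
    src_mask: 1 where the source cell is valid, 0 where it is masked out.
    Returns (values, masked_weight_sum) for the destination cells.
    """
    num = [0.0] * n_dst  # sum of weight * data over unmasked source cells
    den = [0.0] * n_dst  # sum of weight over unmasked source cells
    for r, c, w in zip(rows, cols, weights):
        if src_mask[c]:
            num[r] += w * src_data[c]
            den[r] += w
    # Renormalize; destination cells with no valid source cells get the fill value.
    values = [n / d if d > 0.0 else fill for n, d in zip(num, den)]
    return values, den


# Two destination cells, four source cells; source cell 1 is masked out.
rows = [0, 0, 1, 1]
cols = [0, 1, 2, 3]
weights = [0.5, 0.5, 0.25, 0.75]
src_data = [10.0, 999.0, 2.0, 4.0]
src_mask = [1, 0, 1, 1]

values, valid_weight = remap_with_source_mask(rows, cols, weights, src_data, src_mask, n_dst=2)
print(values)        # [10.0, 3.5]
print(valid_weight)  # [0.5, 1.0]
```

The second return value is the weight sum that survived the mask for each destination cell. Whether a destination cell with only partial valid coverage should be kept or filled is a design choice this sketch leaves open.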
I think the right approach is to create these mapping files on the fly -
and not store them. I had demonstrated that this was feasible and showed
acceptable performance by simply translating the input format of the input
grid files. Only one file did not work well with this. I think that this is
the approach that should be pursued rather than still working with storing
mapping files.
…On Tue, Feb 13, 2018 at 3:51 PM, Bill Sacks ***@***.***> wrote:
@swensosc <https://github.com/swensosc> suggested this 2015-10-19, and it
seems like a good idea to me: When mapping files from their raw data grid
to CLM resolutions, we could use maps with no source masking, and then
apply the mask in a separate step.
We currently have a LOT of mapping files from the mksurfdata_map raw data
files to the CLM grids. Much of the reason we have so many is that we have
a separate set of mapping files for each raw data mask - e.g., even if many
of the raw data files are at the same 3' resolution, we need different
mapping files for the different masks.
Sean pointed out that we should be able to use mapping files without
masks, and then tweak the mapping algorithms to apply the source masks
separately. I think we do things like that in other parts of CESM (e.g., in
the coupler?). This would greatly reduce the number of mapping files we
need to maintain. Furthermore, if a raw dataset is updated, and this update
involves changing the mask, you wouldn't need to remake mapping files.
(This was Sean's original motivation, as he is updating the lake dataset in
this way.) Instead, mksurfdata_map would simply read the mask off of the
(updated) raw data file.
|
@mvertens I don't disagree. However, I'll point out that implementing this suggestion could help at least as much if we're generating mapping files on the fly, because we'd only need to generate, say, 1/2 or 1/3 as many mapping files. |
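As a rough illustration of that reduction, the sketch below counts mapping files for just the three resolution families listed earlier in the thread (three half-degree, five 3x3 minute, and two 10x10 minute grids). The number of target CLM grids is a made-up placeholder, and raw grids outside these three families are not modeled, so the overall saving would differ from the ratio shown here.

```python
# Source grid files per resolution, as listed earlier in this thread.
masked_grids = {"0.5 degree": 3, "3x3 minute": 5, "10x10 minute": 2}

# Hypothetical number of target CLM grids; not a figure from the thread.
n_target_grids = 10


def n_mapping_files(n_source_grids, n_targets):
    """One mapping file is needed per (source grid, target grid) pair."""
    return n_source_grids * n_targets


# With masks embedded: one source grid per (resolution, mask) combination.
with_masks = n_mapping_files(sum(masked_grids.values()), n_target_grids)

# With masks applied separately: one unmasked source grid per resolution.
without_masks = n_mapping_files(len(masked_grids), n_target_grids)

ratio = without_masks / with_masks
print(with_masks, without_masks)  # 100 30
```

The ratio does not depend on the placeholder target count, since it cancels out.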
@mvertens As @billsacks says, yes: if that line of development (creating mapping files for mksurfdata_map on the fly) is taken up again, this change should happen along with it. It will both shorten the time to make the mapping files and, as @billsacks pointed out, reduce how many are required. If we move to a paradigm of creating them on the fly, we want to create as few as possible, as fast as possible. The time to create them needs to be sufficiently short, though, for that to become the standard mechanism. |
Tasks that I see mentioned above... Sean's suggestion:
Mariana's suggestion:
|
Corrected previous post to say #644 |
@slevisconsulting - yes, Mariana's suggestion is in the scope of #644, so this issue relates to Sean's suggestion, which we thought was a good idea for the sake of both dataset management and efficiency. Note that this will require changes to mksurfdata_map as well as to the scripts / xml related to our dataset management. |
bug fixes to multi-threading
An update:
|
For now |
Resubmitted with a couple of changes and seems to work for the I will open a PR soon to share my code mods to-date. |