Split bathymetry subroutines into separate file #978
If you have any general thoughts on this @apcraig - that would be great. It's just marked as draft because we could add more (related) changes in this PR. |
I like this idea in general, but there are potential challenges.
Before we do anything, I think we need to scope out how this is going to look and have a plan.
|
The change becomes notably simpler (and possibly neater) if we change the interface to
I think it's fairly minor for build systems to change the included files (we do this all the time :))
Yeah - I messed around with this for a while and this seems like the tricky part. The limitation of the current set-up is that it's not possible to move the subroutines in the current The options I've come up with so far are:
I don't think there is a performance problem with this? But it is messy!
And then point an instance of this type to the right variables. The subroutine calls become somewhat neater, e.g.
This option is probably quite confusing. |
The circular logic is the problem. I think the question is whether any of the three proposals above (and others we might think about) is better than what we have now. If we move data out of ice_grid and all the use statements elsewhere in the code have to change, that's quite a change too. What if we make it a requirement that we keep the public data in ice_grid and don't add a datatype or a bunch of arguments? That would limit what we could do. We'd have to keep the data and higher level methods in ice_grid. How much of the implementation could we extract into a grid_infrastructure layer, and could we create new methods that could be used by multiple high/mid level methods to improve code reuse? Would that be useful? Is the motivation that we would like each type of grid to have its own module? But the "drivers" are still calling init_grid1, init_grid2, etc. This seems like a lot of work and change to create this separation at the middle layer just to create individual files for different grid files/types. There is also a lot of reuse in init_grid1 and init_grid2. So we'd want to create a popgrid, latlongrid, cpomgrid and rectgrid file/module? Then add a momgrid file/module?
We have often run across exactly these types of issues when trying to clean up some code. We end up having a peeling onion effect and the code cleanup ends up being a much bigger deal than it should be. And unless we are refactoring a bunch of code to make it perform better or add new features that require a refactor, we end up deciding to leave things more-or-less as they are. I'm open, but want to make sure we consider the cost vs benefit. |
I think this is the best approach. Yes - there are 47 files impacted but the changes to each individual file are small. (And fairly low risk - it's just changing the name of the module).
This would be very limited in what we can do. Basically all the subroutines are in this file because they use the data in ice_grid, and are called in another ice_grid function. Possibly we could put the grid_average_X2Y subroutines/interface in a different file (it would need new arguments for the masks sometimes). (And the change in this PR would be achievable too)
This would be good but is a significant re-design and doesn't really fit into the current structure. My main motivation is to make the ~4000 lines in the current file easier to handle / read / work with.
I think separating into separate modules by grid_format makes sense - e.g. netcdf, binary, idealised/rectangular, mom_netcdf - that's basically the same as your suggestion. This is only ~1500 lines of the total file though.
On this note - can we deprecate some of the grid_types? e.g. binary, CPOM, latlongrid? |
I don't think we can deprecate grid_types. Maybe CPOM, but I think the other grid types are in use. I think we run into issues if we separate by grid types as you suggest. For instance, the pop grid can be binary or netcdf. We don't want to duplicate a bunch of the grid initialization in two files where the only difference is how the file is read. I could see having a pop, mom, latlon, etc module but I'm not sure we should separate by file format. Another idea would be to keep the data in module ice_grid and have that be the "data" module. Then we could move the current public methods into a separate module called something like ice_grid_methods. Then the use statement only has to change for CICE subroutines that use the methods, not the data. We could then add some middle layers called something like ice_grid_popgrid, ice_grid_momgrid, ice_grid_othergrids and also some infrastructure files like ice_grid_bathy and ice_grid_average if that would work. Personally, I'm not convinced this is the way to go overall. I'm not that troubled by the ice_grid file as it stands and I worry that if things get split apart, we add some risk that a change in one file for one grid breaks another grid implementation. If it's all in one file, we sort of force it to be "one". The other problem is that how we define the grid isn't even as simple as pop/mom/latlon OR binary/netcdf. We use terms like 'displaced_pole', 'tripole', 'latlon'. What if MOM has a tripole grid or a latlon grid? It's almost like we need to refactor how to specify the grid to 'pop', 'mom', 'latlon', 'internal' and then add information about whether it has special characteristics like tripole and then the format of the file (which we already have separated). I think we need to think carefully how we want to specify, define, and separate the grids. The refactor strategy should be driven by an effort to clean up and improve the implementation, not by the notion that we are trying to split up a big file. 
What are the shortcomings of the current implementation (beyond the file being big and difficult to deal with) that we should fix? Where does that lead us in redesign? |
If the bathymetry is split into a separate file then we should also consider where to put it. This is only used by the dynamics. Thus for this I would move data and function here and call it from e.g. init_evp. The hardcoded 40 layers originate from the original implementation of landfast and only relate to a specific NEMO simulation. This can probably be implemented more generically, or the option can be left out. Environment Canada should be heard. |
I will convert this back to a couple of issues: The first one on the bathymetry suggestion is here : #987 |
Closing this issue and attempting to summarise the discussion in #988 |
As a precursor to #807, we should split ice_grid into several files, as it's >4000 lines.
This PR is for comment on the approach to do this with the least disruption, as several groups have their own copies or changes in their CICE forks. If this change looks ok, I will do the same for other sections of the ice_grid file (e.g. make an ice_gridbox.F90, ice_grid_average.F90 etc)
The changes won't change the interface to the ice_grid module, nor change the functionality of any routines.
PR checklist
Move bathymetry routines into ice_grid_bathy.F90 file
@anton-seaice
@apcraig @phil-blain @daveh150
Move bathymetry routines into separate ice_grid_bathy.F90 file to improve readability and maintainability