data compression in fms2_io #1035

Closed
junwang-noaa opened this issue Sep 7, 2022 · 10 comments · Fixed by #1043

@junwang-noaa

Is your question related to a problem? Please describe.
Currently we are using fms2_io to write out MOM6 history/restart files in the UFS weather model. There is a request to compress the data to reduce its size. I am wondering if there is still an option to do data compression with fms2_io.

Describe what you have tried
With fms_io, we have been setting deflate_level in mpp_io to compress the data. But it is not clear to me whether there is a way to compress the data in fms2_io, or how to set the namelist for it.

Thanks for your help.

@junwang-noaa
Author

@jiandewang @yangfanglin @pjpegion @yuejianzhu-noaa FYI.

@bensonr
Contributor

bensonr commented Sep 7, 2022

@junwang-noaa - Are you looking for lossy or lossless compression? While the history/diagnostics data could possibly tolerate lossy compression, the restarts must be lossless. In both cases, there would need to be a clear benefit from the reduction in size relative to the time spent compressing. I would suggest an offline test for the cost-benefit analysis, which would also serve as a prototype for implementation.
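
One possible starting point for such an offline test is sketched here in Python. It is only a rough proxy: netCDF-4 deflate is zlib-based, so the standard-library zlib module can give a first impression of how compression level trades time against size, and of what byte shuffling does to the ratio. The helper names (synthetic_field, shuffle_bytes, measure) are hypothetical, and the input is a synthetic float32 wave, not MOM6 output. A real test would write actual history and restart fields through the netCDF library, where chunking also affects the result.

```python
import math
import struct
import time
import zlib


def synthetic_field(n):
    """Return n float32 values of a smooth wave as raw bytes (a stand-in, not model output)."""
    values = [math.sin(i / 50.0) + 0.1 * math.cos(i / 7.0) for i in range(n)]
    return struct.pack(f"<{n}f", *values)


def shuffle_bytes(raw, width=4):
    """Group byte 0 of every element, then byte 1, and so on (the idea behind a shuffle filter)."""
    return b"".join(raw[k::width] for k in range(width))


def measure(raw, level):
    """Compress raw at the given zlib level and return (size ratio, seconds spent)."""
    start = time.perf_counter()
    packed = zlib.compress(raw, level)
    elapsed = time.perf_counter() - start
    # Lossless check: the round trip must reproduce the input exactly.
    assert zlib.decompress(packed) == raw
    return len(raw) / len(packed), elapsed


if __name__ == "__main__":
    raw = synthetic_field(200_000)
    for label, data in (("plain", raw), ("shuffled", shuffle_bytes(raw))):
        for level in (1, 5, 9):
            ratio, seconds = measure(data, level)
            print(f"{label:9s} level={level} ratio={ratio:5.2f} time={seconds * 1e3:7.2f} ms")
```

The numbers it prints depend entirely on the input, so they should be read as a way to compare levels against each other, not as a prediction for real model files.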

@junwang-noaa
Author

@bensonr Thanks for the suggestion. Would you confirm that we can still use deflate_level from the mpp_io library for data compression with fms2_io? I thought mpp_io is going to be removed and that we can't use it with fms2_io.

@jiandewang

I just checked the MOM6-example infra/FMS2 code; deflate_level still exists in mpp_io:

FMS/mpp/mpp_io.F90

Lines 1075 to 1079 in 0721d59

integer :: deflate_level = -1
logical :: cf_compliance = .false.
namelist /mpp_io_nml/header_buffer_val, global_field_on_root_pe, io_clocks_on, &
shuffle, deflate_level, cf_compliance

I will do some offline testing on this

@bensonr
Contributor

bensonr commented Sep 7, 2022

@junwang-noaa - mpp_io/fms_io and fms2_io are independent I/O layers with no interconnections. The future of I/O support has been described multiple times. A coming release will remove mpp_io/fms_io from the default compile of FMS, though one can still include it within the library via a compile-time macro. A subsequent release will remove the logic completely.

If you compile MOM6 with the infra/FMS1 bindings, you should still be able to use the existing compression from within the mpp_io/fms_io layer. If you choose infra/FMS2 for MOM6, all I/O will be performed via the fms2_io layer.

@thomas-robinson
Member

After consulting with @menzel-gfdl, it looks like deflate_level is not supported in fms2_io: it is not passed when calling the netCDF variable-definition functions.

FMS/fms2_io/netcdf_io.F90

Lines 899 to 903 in 7fafa4f

err = nf90_def_var(fileobj%ncid, trim(variable_name), vtype, dimids, varid)
deallocate(dimids)
else
err = nf90_def_var(fileobj%ncid, trim(variable_name), vtype, varid)
endif

This would have to be added to the subroutine netcdf_add_variable as an optional argument. The code would then look like this:

 integer, optional, intent(in) :: deflate_level !< The netcdf deflate level
...
    if (present(dimensions)) then
      allocate(dimids(size(dimensions)))
      do i = 1, size(dimids)
        dimids(i) = get_dimension_id(fileobj%ncid, trim(dimensions(i)),msg=append_error_msg)
      enddo
      err = nf90_def_var(fileobj%ncid, trim(variable_name), vtype, dimids, varid, deflate_level=deflate_level)
      deallocate(dimids)
    else
      err = nf90_def_var(fileobj%ncid, trim(variable_name), vtype, varid) ! scalar variables have no dimensions to chunk, so no deflate_level here
    endif

The implementation is simple enough, but, like @bensonr suggested, you will likely take a performance hit from doing this.

@junwang-noaa
Author

@bensonr @thomas-robinson We still hope to get the compression feature in fms2_io. Would you mind providing us with a test version of FMS? We'd like to run some tests to see the impact on data size and computational performance. For the UFS weather model at the current resolution, MOM6 does not dominate the computation time, so we have a little room to get a smaller data size. Thank you very much!

@uramirez8707 self-assigned this Sep 13, 2022

@thomas-robinson
Member

@mcallic2 will work on this and get back within a week. @junwang-noaa do you have a repo you would like her to submit a PR to when it's ready? She will be working off of the tag 2022.04-alpha4 unless there is some other recent starting point you would like her to work from.

@junwang-noaa
Author

@rem1776 Thank you very much for the code updates! It looks to me like we also need to update the restart writing code to use the option. @bensonr is there a plan to update the dycore restart code to add the compression and chunksize options? Thank you!

@bensonr
Contributor

bensonr commented Sep 22, 2022 via email

5 participants