Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should logging be enabled all the time (but still not turned on by default)? #1845

Closed
edwardhartnett opened this issue Sep 22, 2020 · 8 comments

Comments

@edwardhartnett
Copy link
Contributor

Debugging netCDF I/O problems on HPC systems can be very challenging.

Using a debugger is usually hard or impossible, and with code running on hundreds or thousands of processors, it's pretty hard to know what is going on.

One thing that helps a lot is logging - especially in PIO, where I have adjusted the logging code which I originally wrote for netcdf-4 to work better on multiple processors, giving one log file for each processor, with the logging for that processor. I have an issue into netcdf-c to get this improvement in the netCDF-4 logging as well (#1762).

As we know, logging is only enabled if --enable-logging is used. I did this out of an excess of caution. I didn't want to slow performance with logging, and this option guarantees that it does not. However, there is a serious cost. When a HPC system experiences problems, to install a re-built netcdf-c, with logging turned on, is a significant difficulty. Partly this is because these systems are complex, partly because only a few people are given permissions to install software, and those people are always backed up, with a long list of things to do.

What would be really useful would be if logging were always available. This will not effect performance because even when logging is available, it does nothing when the log level is not set. So a LOG(()) call in netcdf-c code will check the log level and exit, which should not matter to performance, especially if we are careful to avoid LOG(()) statements inside deep loops, and we mostly do that now, because such log statements produce more output than is useful.

With this change, then, HPC users could start debugging just by inserting an nc_set_log_level() statement (or nf_set_log_level() for Fortran). There would be no need of a separate netcdf-c install, so system admins would not have to be involved.

This must be carefully tested to ensure there are no performance impacts. But I don't think there will be. Most logging is done in the metadata code, which is not vital for performance. In the data read/write code, there are LOG(()) statements, but we can ensure they do not degrade performance.

@WardF
Copy link
Member

WardF commented Sep 22, 2020

I've wondered why logging was not enabled by default, but had assumed it was for historical reasons; thank you for elaborating, Ed. Always better to move forward with an abundance of caution. I'd be open to this once we determine there is no obvious overhead introduced. We should also make sure to retain the --disable-logging option (and associated internal logic), since 'severity of overhead introduced by logging' is a subjective measure, and there's no reason to not let people remove it if they really wanted to.

@edwardhartnett
Copy link
Contributor Author

@WardF yes fully agree about leaving --disable-logging. Also will present some performance runs as part of this work. However, with logging enabled but not in use, there should be no performance issue. Only when you turn logging on do the printf statements get called and slow everything down.

@DennisHeimbigner
Copy link
Collaborator

Ed and I discussed a similar issue some time ago.
My take is that the netcdf-4 logging is used for two distinct
things: (1) errors and warning and (2) tracing.
Ideally, #1 is enabled all the time and #2 should be optionally enabled.
As I recall, Ed did make some changes so that logging with level 0 is
treated as an error. Levels > 0 were treated as tracing. But you still had
to enable/disable both at the same time.
I should also note that libdap2 and dap4 have a separate logging capability
that is basically #1; it has no corresponding trace capability.

@edwardhartnett
Copy link
Contributor Author

Well @DennisHeimbigner rasies an interesting point, which is log level 0, which gives warnings and errors.

In fact, I believe we will want to turn this off by default. In other words, make --enable-logging default on, but change the default initial log value from 0 to -1. (-1 turns off the warnings and errors that come with log 0).

The reason is, if we left it at log 0, then programs that currently do things that result in errors, but handle them, do not get extra stdout.

But as I look through the code I see that LOG 0 statements are mostly at places which cannot really occur anyway...

@DennisHeimbigner
Copy link
Collaborator

One important use for logging errors is to provide extra
information when a error code is reported (e.g. returning NC_EINVAL
is pretty useless if there is not information about what was invalid).

@edwardhartnett edwardhartnett changed the title Should logging be enabled all the time? Should logging be enabled all the time (but still not turned on by default)? Oct 29, 2020
@edwardhartnett
Copy link
Contributor Author

BTW just to clarify, what I am proposing is just that --enable-logging is the default. There will still have to be a call to nc_set_log_level() to turn on logging, even level 0 logging, because the default level is -1.

So yes, @DennisHeimbigner we could use log statements to provide extra error info, but this would only take effect when the user turns on logging...

@DennisHeimbigner
Copy link
Collaborator

It occurs to me that there is another minor? issue here.
If --enable-logging is set, but all one wants is error reports,
the user still incurs the cost of calling the log function for traces
even though tracing is not needed. This is a argument for separate
logging and tracing functions.

@edwardhartnett
Copy link
Contributor Author

edwardhartnett commented Aug 14, 2024

I think we should leave --enable-logging as it is, but just make sure it's documented better, which I've done in the new logging documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants