Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SDK] Setup Logging Verbose Level in TrainingClient #1946

Closed
andreyvelich opened this issue Nov 16, 2023 · 9 comments · Fixed by #1973
Closed

[SDK] Setup Logging Verbose Level in TrainingClient #1946

andreyvelich opened this issue Nov 16, 2023 · 9 comments · Fixed by #1973
Labels

Comments

@andreyvelich
Copy link
Member

To followup on this thread, I created a new issue: #1944 (comment).

I agree, that we should have appropriate logger for the SDK (e.g. logging.getLogger(__name__)) to not use root logger, but I still think that we should use logging to log some data for the user.

The problem with print is that user can't identify from which library the output was generated, but with logging we can configure it.

What do you think about this:

  • We can introduce a new parameter to TrainingClient() called verbose, and we can configure this parameter for various levels of logging: https://docs.python.org/3/library/logging.html#levels. We can start with (INFO, WARNING, and ERROR).
  • Depends on this parameter we can configure our logger in the TrainingClient constructor. For example, for verbose=20
self.logger = logging.getLogger(__name__)
if verbose == 20:
  self.logger.setLevel(logging.INFO)
  • And then we are going to use self.logger.info() or self.logger.warning() or self.logger.error() when it is required to print some data.

  • If user doesn't want to see any logs, they can always override it as follows:

logger = logging.getLogger("kubeflow.training.api.training_client")
logger.setLevel(logging.NOTSET)

Or provide verbose=0 in the TrainingClient parameter:

client = TrainingClient(verbose=0)

WDYT @droctothorpe @johnugeorge @kuizhiqing ?

@droctothorpe
Copy link
Contributor

Take this with a grain of salt, but my understanding is that in Python the expectation is that this kind of log level configuration is typically handled out of band. Take a look at httpx's logging documentation, for example:

https://github.com/encode/httpx/blob/master/docs/logging.md#logging

They log a TON of data with info. Library consumers can adjust the level at will, so they don't need to provide a wrapper or alternative interface for that functionality.

@andreyvelich
Copy link
Member Author

Sorry for the late reply @droctothorpe.
Do you see any user limitations if we define the specific logger for the SDK (e.g. logging.getLogger(__name__)) and other libraries are going to re-use our SDK package ?

I just want to avoid complexity for users who just use our SDK for the first time and can't see information that we want to log for them (e.g. Experiment has been created) until they modify the log level manually.

@johnugeorge
Copy link
Member

@andreyvelich Is it really needed? Typically in SDK, logging is not generally required unless explictly enabled

@andreyvelich
Copy link
Member Author

@andreyvelich Is it really needed? Typically in SDK, logging is not generally required unless explictly enabled

In that case, should we convert all of our logs to DEBUG messages ?
E.g. when we print logs for Job pods: https://github.com/kubeflow/training-operator/blob/master/sdk/python/kubeflow/training/api/training_client.py#L856.

Kubernetes Python client logs some messages in debug mode: https://github.com/kubernetes-client/python/blob/master/kubernetes/client/rest.py#L235

@andreyvelich
Copy link
Member Author

@droctothorpe @johnugeorge Any comments on the above suggestion ?

@johnugeorge
Copy link
Member

So, when follow is true, how is it planned to print logs ? Do you mean that get_job_logs works only in DEBUG mode ?

@andreyvelich
Copy link
Member Author

So, when follow is true, how is it planned to print logs ? Do you mean that get_job_logs works only in DEBUG mode ?

Yes, that's right. So for users to see this they need to run the following:

logger = logging.getLogger("kubeflow.training.api.training_client")
logger.setLevel(logging.DEBUG)

Or, we can configure the default logger to be in DEBUG mode:

logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

@johnugeorge
Copy link
Member

Isn't it a bad UX? The user needs to get the result whenever get_job_logs is called. But if we set DEBUG by default, it will create a flood of log messages. I agree that the current solution is not ideal but we need to find out a better solution.

@andreyvelich
Copy link
Member Author

In that case, we can set the default logger to INFO and use logger.info() here to print pod logs..
Other prints we will keep as logger.debug() when it is not necessary to log any data.
What do you think @johnugeorge ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants