Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KED-1258] Allow logging directory to be specified #161

Closed
WaylonWalker opened this issue Nov 11, 2019 · 5 comments
Closed

[KED-1258] Allow logging directory to be specified #161

WaylonWalker opened this issue Nov 11, 2019 · 5 comments
Labels
Issue: Feature Request New feature or improvement to existing feature

Comments

@WaylonWalker
Copy link
Contributor

Description

It is very jarring for kedro to change directories on its users. It appears that this is done for logging purposes. Is there a way to specify the logging directory without changing the users directory? This is possibly related to #149.

https://github.com/quantumblacklabs/kedro/blob/c96e94dc46948b05d9d09e285894ba3ff03c2595/kedro/context/context.py#L518-L522

Context

It is very confusing to have a side effect of changing directories after running a function. This behavior is not typical with other python frameworks/libraries that I am familiar with.

Possible Implementation

set the logging directory and remove the os.chdir line.

logging.basicConfig(filename=str(project_path / log_filename), filemode='w')

Possible Alternatives

My project templates track the user's directory and changes it back to where the user intended to be. This works, but is less desirable.

@WaylonWalker WaylonWalker added the Issue: Feature Request New feature or improvement to existing feature label Nov 11, 2019
@tolomea
Copy link
Contributor

tolomea commented Nov 13, 2019

The dependency on current working directory is a known problem that we've been working to resolve for sometime, we've stomped out all the smaller problems but logging config is quite awkward to deal with. The catalog also needs some work but that's at least manageable as we control the data format and it's processing, unlike with logs.

@lorenabalan lorenabalan changed the title Allow logging directory to be specified [KED-1258] Allow logging directory to be specified Dec 4, 2019
@khdlim
Copy link

khdlim commented Jan 23, 2020

@WaylonWalker If you are OK with logging to a directory defined within the working directory, you can hack that functionality in by overloading _setup_logging in ProjectContext

https://github.com/quantumblacklabs/kedro/blob/2a71b784fc2940934363b38f3befb2bd52bffda5/kedro/context/context.py#L319-L322

What I have done below is to allow the specification of the optional key logging_dir in parameters.yml to output the info and error logs there as a subdirectory of my_logs. I'm not super comfortable with it since it silently overrides the output filename config in logging.yml but the alternative is to modify logging.yml manually every run to reflect that as it doesn't accept a dynamic output filename (e.g. dependent on execution timestamp/date).

    def _setup_logging(self) -> None:
        """Register logging specified in logging directory."""

        conf_logging = self.config_loader.get("logging*", "logging*/**")

        ######## Hack to change logging directory
        # Create logs directory and point info log file there

        if 'params:logging_dir' in self.io.list():
            # If logging_dir is specified in parameters.yml, use that
            self.logging_dir = './my_logs/%s' % self.io.load('params:logging_dir')

        else:
            # else create a new timestamped folder and log to it
            fts = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
            self.logging_dir = './my_logs/log_%s' % fts

        if not os.path.exists(self.logging_dir):
            os.makedirs(self.logging_dir)

        # Point info and error log file handlers there
        for handler_type in ['info','error']:
            (conf_logging['handlers']['%s_file_handler' % handler_type]
             ['filename']) = '%s/%s.log' % (self.logging_dir, handler_type)

        ######## 

        logging.config.dictConfig(conf_logging)

Incidentally, I also dump the parameters used during the run into the logging directory as a JSON file as well for future reference.

@limdauto
Copy link
Contributor

limdauto commented Mar 19, 2020

Hi @WaylonWalker @khdlim we have just landed a commit on the develop branch to fix the issue of load_context changing user's working directory. The commit message has the details of the fix if you are interested. The tl;dr is load_context no longer changes the user's current working directory while still allowing logging and dataset configuration to be specified as relative path.

I'm closing this issue as a result. Please let me know if you have any issue.

@WaylonWalker
Copy link
Contributor Author

Great Work!!!! Excited to see this in the next version. We have a quick hack implemented on our pipelines that simply saves the current directory, loads context, then changes back to where the user was.

It might be good to update run.py from the template to not rely on the user's current working directory.

https://github.com/quantumblacklabs/kedro/blob/develop/kedro/template/%7B%7B%20cookiecutter.repo_name%20%7D%7D/src/%7B%7B%20cookiecutter.python_package%20%7D%7D/run.py#L54

def run_package():
    # entry point for running pip-install projects
    # using `<project_package>` command
-   project_context = load_context(Path.cwd())
+   project_context = load_context(Path(__file__).parents[1]) # it might be a different layer of parents double check that
    project_context.run()

@limdauto
Copy link
Contributor

Great shout. Will definite make that change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Issue: Feature Request New feature or improvement to existing feature
Projects
None yet
Development

No branches or pull requests

4 participants