Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional TOML configs support #2457

Merged
merged 27 commits into from
Jul 28, 2018
Merged
Show file tree
Hide file tree
Changes from 24 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
d00f793
mv cfg parser
orsinium Jul 13, 2018
783ddeb
split cfg parser and config getter
orsinium Jul 13, 2018
fd4e30f
+TOML config parser
orsinium Jul 13, 2018
d712827
improve config getter
orsinium Jul 13, 2018
bcb59e0
+Apache License in new files
orsinium Jul 13, 2018
0242248
TOML config parser tests
orsinium Jul 13, 2018
debea65
override configs by keys
orsinium Jul 13, 2018
ddfb5be
Documentation for TOML config parser
orsinium Jul 13, 2018
d0caa72
Merge branch 'config'
orsinium Jul 13, 2018
3261393
FIX lexer error
orsinium Jul 13, 2018
edf41c8
FIX lexer error
orsinium Jul 13, 2018
1d0d110
FIX: no shared instance between different config parsers
orsinium Jul 14, 2018
6142834
rm unused os import
orsinium Jul 14, 2018
d86414a
Merge remote-tracking branch 'origin/config'
orsinium Jul 16, 2018
b88376a
+getitem for config
orsinium Jul 16, 2018
f4c128a
improve config paths
orsinium Jul 16, 2018
ba20f7b
add config function
orsinium Jul 16, 2018
22b7368
sort imports
orsinium Jul 16, 2018
95fa46b
make additional parsers optional
orsinium Jul 16, 2018
913fe19
improve add_config_path for default parser
orsinium Jul 16, 2018
ae050b0
FIX package installation
orsinium Jul 16, 2018
284cc26
FIX TOML parser raises KeyError, not NoSectionError
orsinium Jul 16, 2018
0d6943a
Merge branch 'improve-config' into config
orsinium Jul 16, 2018
a7047fa
some improvements after review.
orsinium Jul 16, 2018
1c9129b
Merge remote-tracking branch 'spotify/master' into config
orsinium Jul 26, 2018
3843761
dlstadther's improvements
orsinium Jul 26, 2018
09ace16
toml version pinning
orsinium Jul 27, 2018
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
40 changes: 34 additions & 6 deletions doc/configuration.rst
Original file line number Diff line number Diff line change
@@ -1,18 +1,35 @@
Configuration
=============

All configuration can be done by adding configuration files. They are looked for in:
All configuration can be done by adding configuration files.

* ``/etc/luigi/client.cfg``
* ``luigi.cfg`` (or its legacy name ``client.cfg``) in your current working directory
* ``LUIGI_CONFIG_PATH`` environment variable
Supported config parsers:
* ``cfg`` (default)
* ``toml``

in increasing order of preference. The order only matters in case of key conflicts (see docs for ConfigParser.read_). These files are meant for both the client and ``luigid``. If you decide to specify your own configuration you should make sure that both the client and ``luigid`` load it properly.
You can choose right parser via ``LUIGI_CONFIG_PARSER`` environment variable. For example, ``LUIGI_CONFIG_PARSER=toml``.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we instead require that the file must end in .toml? It's very little magic for more convenience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think about it, but this is very explicit while we have many possible places for configs. What if we have luigi.cfg and luigi.toml? We can't combine it into one config.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can add an improvement down the road to define a hierarchical level of config extensions. I.e. (.cfg, .toml, .yml, <etc.>). I'm fine with requiring an ENV variable for now


Default (cfg) parser are looked for in:

* ``/etc/luigi/client.cfg`` (deprecated)
* ``/etc/luigi/luigi.cfg``
* ``client.cfg`` (deprecated)
* ``luigi.cfg``
* ``LUIGI_CONFIG_PATH`` environment variable

TOML parser are looked for in:

* ``/etc/luigi/luigi.toml``
* ``luigi.toml``
* ``LUIGI_CONFIG_PATH`` environment variable

Both config lists reversordered by priotity (from low to high). The order only matters in case of key conflicts (see docs for ConfigParser.read_). These files are meant for both the client and ``luigid``. If you decide to specify your own configuration you should make sure that both the client and ``luigid`` load it properly.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reversordered is not a word. Perhaps we reword something to the effect of the below:

Both config lists increase in priority (from low to high).


.. _ConfigParser.read: https://docs.python.org/3.6/library/configparser.html#configparser.ConfigParser.read

The config file is broken into sections, each controlling a different part of the config. Example configuration file:
The config file is broken into sections, each controlling a different part of the config.

Example cfg config:

.. code:: ini

Expand All @@ -23,6 +40,17 @@ The config file is broken into sections, each controlling a different part of th
[core]
scheduler_host=luigi-host.mycompany.foo

Example toml config:

.. code:: python
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is python used here for code syntax highlighting?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because Pygments doesn't have available lexerfor TOML. This is already contributed in some Pygments versions:

But Pygments which used for travis checks doesn't have this lexer yet. I choose python as alternative lexer because toml syntax very similar to hybrid of python and ini :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And a TODO then please. :)


[hadoop]
version = "cdh4"
streaming-jar = "/usr/lib/hadoop-xyz/hadoop-streaming-xyz-123.jar"

[core]
scheduler_host = "luigi-host.mycompany.foo"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are single quotes acceptable too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, see TOML spec for more details about syntax

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe link to the spec in the docs too?



.. _ParamConfigIngestion:

Expand Down
27 changes: 27 additions & 0 deletions luigi/configuration/__init__.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these dates include 2018. Probably not that important

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just copied Apache license notice from original configuration.py. You can improve it, I don't now how it must be look in right way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I think this is fine ...

#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from .cfg_parser import LuigiConfigParser
from .core import get_config, add_config_path
from .toml_parser import LuigiTomlParser


__all__ = [
'add_config_path',
'get_config',
'LuigiConfigParser',
'LuigiTomlParser',
]
40 changes: 40 additions & 0 deletions luigi/configuration/base_parser.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging


# IMPORTANT: don't inherit from `object`!
# ConfigParser have some troubles in this case.
class BaseParser:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inherit from object?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No-no-no, don't do it! In Python2 cfg parser fails if BaseParser inherit from object. This is strange, yes. I will add comment for it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is explained better on SO

@classmethod
def instance(cls, *args, **kwargs):
""" Singleton getter """
if cls._instance is None:
cls._instance = cls(*args, **kwargs)
loaded = cls._instance.reload()
logging.getLogger('luigi-interface').info('Loaded %r', loaded)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

side note: am i correct that neither the client nor the server's logging config names are configurable? I've found that interesting - why statically set to luigi-interface. Perhaps we don't want that changed in general.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't change .instance() logic, just moved it from cfg parser to base.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was just raising an unrelated question - a topic for future possible PR (configurable logging names).

Nothing for you to do here.


return cls._instance

@classmethod
def add_config_path(cls, path):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for changing the order of these classmethods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What did you mean? I just moved this method from cfg parser to base without any changes.

cls._config_paths.append(path)
cls.reload()

@classmethod
def reload(cls):
return cls.instance().read(cls._config_paths)
34 changes: 4 additions & 30 deletions luigi/configuration.py → luigi/configuration/cfg_parser.py
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,6 @@
See :doc:`/configuration` for more info.
"""

import logging
import os
import warnings

Expand All @@ -38,37 +37,19 @@
except ImportError:
from configparser import ConfigParser, NoOptionError, NoSectionError

from .base_parser import BaseParser
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recall relative imports are discouraged, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? We already have from .tools import range in project, and it's works. Relative imports little faster and better recognized some tools like isort.


class LuigiConfigParser(ConfigParser):

class LuigiConfigParser(BaseParser, ConfigParser):
NO_DEFAULT = object()
enabled = True
_instance = None
_config_paths = [
'/etc/luigi/client.cfg', # Deprecated old-style global luigi config
'/etc/luigi/luigi.cfg',
'client.cfg', # Deprecated old-style local luigi config
'luigi.cfg',
]
if 'LUIGI_CONFIG_PATH' in os.environ:
config_file = os.environ['LUIGI_CONFIG_PATH']
if not os.path.isfile(config_file):
warnings.warn("LUIGI_CONFIG_PATH points to a file which does not exist. Invalid file: {path}".format(path=config_file))
else:
_config_paths.append(config_file)

@classmethod
def add_config_path(cls, path):
cls._config_paths.append(path)
cls.reload()

@classmethod
def instance(cls, *args, **kwargs):
""" Singleton getter """
if cls._instance is None:
cls._instance = cls(*args, **kwargs)
loaded = cls._instance.reload()
logging.getLogger('luigi-interface').info('Loaded %r', loaded)

return cls._instance

@classmethod
def reload(cls):
Expand Down Expand Up @@ -124,10 +105,3 @@ def set(self, section, option, value=None):
ConfigParser.add_section(self, section)

return ConfigParser.set(self, section, option, value)


def get_config():
"""
Convenience method (for backwards compatibility) for accessing config singleton.
"""
return LuigiConfigParser.instance()
79 changes: 79 additions & 0 deletions luigi/configuration/core.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging
import os
import warnings

from .cfg_parser import LuigiConfigParser
from .toml_parser import LuigiTomlParser


logger = logging.getLogger('luigi-interface')


PARSERS = {
'cfg': LuigiConfigParser,
'conf': LuigiConfigParser,
'ini': LuigiConfigParser,
'toml': LuigiTomlParser,
}

# select parser via env var
DEFAULT_PARSER = 'cfg'
PARSER = os.environ.get('LUIGI_CONFIG_PARSER', DEFAULT_PARSER)
if PARSER not in PARSERS:
warnings.warn("Invalid parser: {parser}".format(parser=PARSER))
PARSER = DEFAULT_PARSER


def get_config(parser=PARSER):
"""Get configs singleton for parser
"""

parser_class = PARSERS[parser]
if not parser_class.enabled:
logger.error((
"Parser not installed yet. "
"Please, install luigi with required parser:\n"
"pip install luigi[{parser}]"
).format(parser)
)

return parser_class.instance()


def add_config_path(path):
"""Select config parser by file extension and add path into parser.
"""
if not os.path.isfile(path):
warnings.warn("Config file does not exist: {path}".format(path=path))
return False

# select parser by file extension
_base, ext = os.path.splitext(path)
if ext and ext[1:] in PARSERS:
parser_class = PARSERS[ext[1:]]
else:
parser_class = PARSERS[PARSER]

# add config path to parser
parser_class.add_config_path(path)
return True


if 'LUIGI_CONFIG_PATH' in os.environ:
add_config_path(os.environ['LUIGI_CONFIG_PATH'])
82 changes: 82 additions & 0 deletions luigi/configuration/toml_parser.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,82 @@
# -*- coding: utf-8 -*-
#
# Copyright 2018 Cindicator Ltd.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import os.path
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why import both os and os.path? import os.path can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nearly. import os can be removed, not os.path. Fixed. Thank you.


try:
import toml
except ImportError:
toml = False
Copy link

@adamist521 adamist521 Aug 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@orsinium
This will end up following error if you have toml library not installed...
Works great after toml installation!

File "/Users/*****/.pyenv/versions/miniconda-latest/envs/ddhr-rjb/lib/python3.6/site-packages/luigi/configuration/toml_parser.py", line 53, in read
    self.data = self._update_data(self.data, toml.load(path))
AttributeError: 'bool' object has no attribute 'load'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh noes, @orsinium can you fix this asap?


from .base_parser import BaseParser


class LuigiTomlParser(BaseParser):
NO_DEFAULT = object()
enabled = bool(toml)
data = dict()
_instance = None
_config_paths = [
'/etc/luigi/luigi.toml',
'luigi.toml',
]

@staticmethod
def _update_data(data, new_data):
if not new_data:
return data
if not data:
return new_data
for section, content in new_data.items():
if section not in data:
data[section] = dict()
data[section].update(content)
return data

def read(self, config_paths):
self.data = dict()
for path in config_paths:
if os.path.isfile(path):
self.data = self._update_data(self.data, toml.load(path))
return self.data

def get(self, section, option, default=NO_DEFAULT, **kwargs):
try:
return self.data[section][option]
except KeyError:
if default is self.NO_DEFAULT:
raise
return default

def getboolean(self, section, option, default=NO_DEFAULT):
return self.get(section, option, default)

def getint(self, section, option, default=NO_DEFAULT):
return self.get(section, option, default)

def getfloat(self, section, option, default=NO_DEFAULT):
return self.get(section, option, default)

def getintdict(self, section):
return self.data.get(section, {})

def set(self, section, option, value=None):
if section not in self.data:
self.data[section] = {}
self.data[section][option] = value

def __getitem__(self, name):
return self.data[name]
2 changes: 1 addition & 1 deletion luigi/contrib/s3.py
Original file line number Diff line number Diff line change
Expand Up @@ -488,7 +488,7 @@ def _get_s3_config(self, key=None):
defaults = dict(configuration.get_config().defaults())
try:
config = dict(configuration.get_config().items('s3'))
except NoSectionError:
except (NoSectionError, KeyError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am i correct to assume that ConfigParser returns the NoSectionError and toml returns the KeyError when a config section heading is missing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I think, this is incorrect raise from toml parser any exceptions from ConfigParser. Maybe, we can make custom common exception:

class SectionError(NoSectionError, KeyError):
    pass

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with it as it is currently written; I was just verifying that they raised different errors for the same problem.

return {}
# So what ports etc can be read without us having to specify all dtypes
for k, v in six.iteritems(config):
Expand Down
2 changes: 1 addition & 1 deletion luigi/parameter.py
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,7 @@ def _get_value_from_config(self, section, name):

try:
value = conf.get(section, name)
except (NoSectionError, NoOptionError):
except (NoSectionError, NoOptionError, KeyError):
return _no_value

return self.parse(value)
Expand Down
4 changes: 4 additions & 0 deletions setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -58,6 +58,7 @@ def get_static_files(path):
license='Apache License 2.0',
packages=[
'luigi',
'luigi.configuration',
'luigi.contrib',
'luigi.contrib.hdfs',
'luigi.tools'
Expand All @@ -75,6 +76,9 @@ def get_static_files(path):
]
},
install_requires=install_requires,
extras_require={
'toml': ['toml'],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are your thoughts on limiting this to the current milestone version? i.e. 'toml<1.0' (since the current is 0.9.4) I just don't want version 1.0.0 to break this. I'd rather tread with caution, even with an optional configurable feature.

Copy link
Contributor Author

@orsinium orsinium Jul 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python toml package version related to toml spec version. TOML spec v1.0 will be backward compatible. Some quotes:

1.0.0 will be the first version to support TOML 1.0.0. TOML still has yet to hit TOML 1.0.0 so its release is postponed indefinitely.

As of version 0.5.0, TOML should be considered extremely stable. The goal is for version 1.0.0 to be backwards compatible (as much as humanly possible) with version 0.5.0. All implementations are strongly encouraged to become 0.5.0 compatible so that the transition to 1.0.0 will be simple when that happens.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, why don't we put 'toml<2.0'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok. This don't break anything. This fix added. Thank you for review :)

},
classifiers=[
'Development Status :: 5 - Production/Stable',
'Environment :: Console',
Expand Down
Loading