autoloading_settings #726

Merged
merged 7 commits into from
Jul 27, 2023

@bbeat2782 bbeat2782 commented Jul 12, 2023

Describe your changes

  • Running %load_ext sql now looks for a pyproject.toml file, searching the current directory and then its parent directories.
  • If the file is found, the user is notified. If not, nothing happens.
  • When a pyproject.toml is loaded, the settings it contains are displayed and applied to the configuration.
  • If a setting name contains a typo, a message hints at the name the user probably intended.
  • If there's a typo and difflib is unable to find a close match, a message with a link to the docs is displayed.
  • If an invalid value is provided for a setting, a message says the value is not valid and the default value is used instead.
  • Add documentation explaining how to set up pyproject.toml.
  • Add test functions.
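The parent-directory lookup described in the first bullet can be sketched roughly as follows. This is only an illustration under assumptions: the helper name `find_pyproject` is hypothetical (the tracebacks later in this thread show the PR using a function called `find_toml_path`, whose body is not shown here).

```python
from pathlib import Path
from typing import Optional


def find_pyproject(start: Path) -> Optional[Path]:
    """Return the nearest pyproject.toml at or above `start`, or None.

    Hypothetical helper for illustration; not the PR's actual code.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the root.
    for directory in (start, *start.parents):
        candidate = directory / "pyproject.toml"
        if candidate.is_file():
            return candidate
    return None
```

Returning None instead of raising matches the "if not, do nothing" behavior in the second bullet: the caller can simply skip loading when no file is found.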

Issue number

Closes #689


📚 Documentation preview 📚: https://jupysql--726.org.readthedocs.build/en/726/

@bbeat2782 bbeat2782 marked this pull request as ready for review July 12, 2023 20:28
@bbeat2782
Author

bbeat2782 commented Jul 12, 2023

This is the current output when there is a pyproject.toml in a parent directory of the notebook. From top to bottom, the messages appear:

  1. When the pyproject.toml file is found
  2. When there is a possible typo in a configuration name
  3. When an invalid input value is provided
  4. When an invalid configuration name is provided (one that difflib does not consider a typo)
  5. When a configuration is successfully set with the user-specified input
[Screenshots: Screen Shot 2023-07-12 at 1 29 00 PM, Screen Shot 2023-07-12 at 1 37 18 PM]

This is when there is no pyproject.toml file.
[Screenshot: Screen Shot 2023-07-12 at 1 29 17 PM]
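For readers unfamiliar with difflib, the typo handling in messages 2 and 4 might look something like this minimal sketch. The option list is a small subset taken from names mentioned in this thread, and both function names and the "Did you mean" wording are assumptions made for illustration.

```python
import difflib

# Small illustrative subset of option names mentioned in this thread.
KNOWN_OPTIONS = ["autopandas", "feedback", "displaylimit"]


def suggest_option(name, known=KNOWN_OPTIONS):
    """Return the closest known option name, or None if nothing is similar."""
    matches = difflib.get_close_matches(name, known, n=1)
    return matches[0] if matches else None


def describe_unknown_option(name):
    """Build a hint for an unrecognized option (message wording is assumed)."""
    suggestion = suggest_option(name)
    if suggestion:
        return f"'{name}' is an invalid configuration. Did you mean '{suggestion}'?"
    return f"'{name}' is an invalid configuration. Please review the configuration docs."
```

`get_close_matches` uses a default similarity cutoff of 0.6, so a name like `autopanda` gets a suggestion while an unrelated name like `x` falls through to the generic message.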

@bbeat2782 bbeat2782 marked this pull request as draft July 12, 2023 22:34
@bbeat2782 bbeat2782 marked this pull request as ready for review July 13, 2023 01:33

@yafimvo yafimvo left a comment

added some comments

@neelasha23

A few observations:

  1. I changed jupysql to jupySql and %load_ext sql still worked without any error (we can add some validation on the key).
[Screenshot: Screenshot 2023-07-13 at 4 29 11 PM]

  2. Here, "Setting 'feedback' into 'True'" should read "Setting 'feedback' to 'True'" or "Setting 'feedback' = 'True'". "Setting" is also repeated on every line. It would be good to display this as a table with two columns: Config and value.

  3. If there's a typo in one setting, do we allow setting the others? Currently it does.
[Screenshot: Screenshot 2023-07-13 at 4 33 41 PM]

Also, should this raise an error instead of printing a message?

  4. If duplicate keys are present, it shows the entire stack trace:
[tool.jupysql.SqlMagic]
feedback = true
autopandas = true
feedback = false

Stacktrace:


File ~/miniforge3/envs/jupysql/lib/python3.10/site-packages/sql/magic.py:664, in load_configs(ip)
    662 file_path = find_toml_path()
    663 if file_path:
--> 664     configs = get_configs(file_path)
    665     config_vars = get_config_variables(SqlMagic)
    666     for config, value in configs.items():

File ~/miniforge3/envs/jupysql/lib/python3.10/site-packages/sql/magic.py:645, in get_configs(file_path)
    643 with open(file_path, "r") as file:
    644     content = file.read()
--> 645     data = toml.loads(content)
    646 try:
    647     return data["tool"]["jupysql"]["SqlMagic"]

File ~/miniforge3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py:514, in loads(s, _dict, decoder)
    511     ret = decoder.load_line(line, currentlevel, multikey,
    512                             multibackslash)
    513 except ValueError as err:
--> 514     raise TomlDecodeError(str(err), original, pos)
    515 if ret is not None:
    516     multikey, multilinestr, multibackslash = ret

TomlDecodeError: Duplicate keys! (line 4 column 1 char 58)

We can add some validation here.

  5. We can add an error here as well:
[tool.jupysql.SqlMagic]
autopandas = TRUE
feedback = false

Stacktrace:

File ~/miniforge3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py:514, in loads(s, _dict, decoder)
    511     ret = decoder.load_line(line, currentlevel, multikey,
    512                             multibackslash)
    513 except ValueError as err:
--> 514     raise TomlDecodeError(str(err), original, pos)
    515 if ret is not None:
    516     multikey, multilinestr, multibackslash = ret

TomlDecodeError: Only all lowercase booleans allowed (line 2 column 1 char 24)
  6. Another one:
[tool.jupysql.SqlMagic]
autopandas = invalid
feedback = false

Stacktrace:

File ~/miniforge3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py:514, in loads(s, _dict, decoder)
    511     ret = decoder.load_line(line, currentlevel, multikey,
    512                             multibackslash)
    513 except ValueError as err:
--> 514     raise TomlDecodeError(str(err), original, pos)
    515 if ret is not None:
    516     multikey, multilinestr, multibackslash = ret

TomlDecodeError: invalid literal for int() with base 0: 'random' (line 2 column 1 char 24)



def test_loading_config(request, monkeypatch, ip, capsys):
    monkeypatch.chdir(request.fspath.dirname)

Please use fixtures for changing directories like here

Author

Used tmp_empty fixture for changing to an empty tmp directory.
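The `tmp_empty` fixture itself is not shown in this thread. As a rough idea of what "changing to an empty tmp directory" involves, here is a standard-library sketch written as a context manager; the name `empty_working_directory` is hypothetical, and a pytest fixture could wrap the same logic by yielding from it.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def empty_working_directory():
    """Temporarily switch the working directory to a new, empty temp dir.

    Hypothetical stand-in for illustration; not the project's fixture.
    """
    previous = os.getcwd()
    with tempfile.TemporaryDirectory() as tmp:
        os.chdir(tmp)
        try:
            yield tmp
        finally:
            # Restore the old directory before the temp dir is removed.
            os.chdir(previous)
```

Compared with `monkeypatch.chdir(request.fspath.dirname)`, this keeps the test isolated from whatever files happen to live next to the test module.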

@bbeat2782
Author

3. If there's typo in one setting, do we allow setting the others? Currently it does.

From the acceptance criteria, we agreed to use the default value if an invalid configuration setting is provided, but if we think raising an error is more appropriate, I will make the change.

4. If duplicate keys present it shows the entire stack trace: ... We can add some validation here.

@neelasha23 For bullet points 4, 5, and 6, do you mean catching toml.TomlDecodeError inside the code so that it doesn't show the entire stack trace?

@neelasha23

neelasha23 commented Jul 14, 2023

@neelasha23 For bullet points 4, 5, and 6, do you mean catching toml.TomlDecodeError inside the code so that it doesn't show the entire stack trace?

You can add a function to perform basic validations on the pyproject.toml file:

  1. Extract all the keys and, if any key is duplicated, raise an error such as: Duplicate key found: <key>
  2. For autopandas = TRUE, validate the values and raise an error such as: Invalid setting TRUE in autopandas = TRUE. Valid settings: true, false (please adjust the wording)
  3. Point 6 can be handled by generalizing the check in the point above.
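The validations suggested above could be sketched as a plain line scan that runs before the TOML parser does. This is a simplified illustration, not the PR's `load_toml`: the function name `validate_section_lines` is hypothetical, and the scan ignores TOML features such as inline comments, multi-line values, and dotted keys.

```python
def validate_section_lines(lines):
    """Check `key = value` lines; raise ValueError on the first problem.

    Hypothetical pre-parse check covering two cases from this thread:
    duplicate keys within a table and booleans that are not lowercase.
    """
    seen = set()
    for line in lines:
        stripped = line.strip()
        if stripped.startswith("["):
            # A new table header starts a fresh set of keys.
            seen = set()
            continue
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        key, value = (part.strip() for part in stripped.split("=", 1))
        if key in seen:
            raise ValueError(f"Duplicate key found: '{key}'")
        seen.add(key)
        if value.lower() in ("true", "false") and value not in ("true", "false"):
            raise ValueError(
                f"Invalid value '{value}' in '{key} = {value}'. "
                "Valid boolean values: true, false"
            )
```

Because the check raises a plain ValueError with a short message, the caller decides whether to print it, re-raise it as a custom error, or fall back to defaults.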

@bbeat2782
Author

@neelasha23

You can add a function to perform basic validations on the pyproject.toml file:

Added the validations in load_toml inside util.py.

@edublancas edublancas left a comment

the HTML table to show the changed settings is a nice touch, good job!

my feedback:

an empty pyproject.toml gives a long traceback with this error: KeyError: "'tool' not found from pyproject.toml. Please check the following section names: ." fix: an empty pyproject.toml should not give any errors.

also, if the pyproject has content but the "tool" section doesn't exist the same error is shown. but it shouldn't. if the "tool" section is missing, nothing should happen (do not load anything, just ignore the pyproject.toml)

if the "tool" section exists but an inner "jupysql" section doesn't I get: KeyError: "'jupysql' not found from pyproject.toml. Please check the following section names: 'pkgmt'." This shouldn't happen as adding the configuration via a pyproject.toml is optional.

if "SqlMagic" doesn't exist, I get: KeyError: "'SqlMagic' not found from pyproject.toml. Please check the following section names: 'x'." but this shouldn't happen (it should ignore the pyproject.toml since there is no SqlMagic configuration to apply)
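The four missing-section cases above (empty file, no "tool", no "jupysql", no "SqlMagic") can all be handled by one lookup that returns None instead of raising. A minimal sketch, assuming the parsed TOML is an ordinary nested dict; the function name `get_sqlmagic_section` is hypothetical.

```python
def get_sqlmagic_section(data):
    """Return the [tool.jupysql.SqlMagic] table, or None if any level is absent.

    Hypothetical helper: None means "nothing to load", not an error.
    """
    current = data
    for name in ("tool", "jupysql", "SqlMagic"):
        if not isinstance(current, dict) or name not in current:
            return None
        current = current[name]
    return current if isinstance(current, dict) else None
```

The caller can then treat None as "ignore this pyproject.toml", which keeps configuration via pyproject.toml optional.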

When loading a valid config I see "'x' is an invalid configuration. Please review our configuration guideline: https://jupysql.ploomber.io/en/latest/api/configuration.html#options." however, the link is not clickable; please render it as an HTML link so people can click on it

when passing an invalid type, I see: "'a' is an invalid value for 'autopandas'. Please use <class 'bool'> value instead.". no need to show "<class 'bool'>"; you can get the type name with __name__, for example bool.__name__, so the error shows "'a' is an invalid value for 'autopandas'. Please use a bool value instead.".
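To make the difference concrete, here is a small sketch of the message built with `__name__`. The function name `invalid_value_message` is hypothetical; the message text follows the wording quoted in the comment above.

```python
def invalid_value_message(value, option, expected_type):
    """Format the invalid-value hint using the type's short name."""
    return (
        f"'{value}' is an invalid value for '{option}'. "
        f"Please use a {expected_type.__name__} value instead."
    )


# str(bool) gives the verbose form; bool.__name__ gives just "bool".
print(str(bool))
print(invalid_value_message("a", "autopandas", bool))
```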

When I pass "autopandas = True", I get "ValueError: Invalid value 'True' in 'autopandas = True'. Valid boolean values: true, false". The error message is fine but you should raise a custom error (https://github.com/ploomber/jupysql/blob/master/src/sql/exceptions.py). create a ConfigurationError and raise that one. this will allow us to hide the traceback, which is too verbose

When loading the pyproject.toml, I get "Found pyproject.toml from '/Users/eduardo/Desktop/testing-autoloadsettings'", it'd be better to show it relative to the current directory. you can use pathlib.Path to convert that absolute path into a relative one
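One detail worth noting when making the path relative: because the file may live in a parent directory, `Path.relative_to` on its own raises ValueError (before Python 3.12 added its `walk_up` parameter), since the file is not inside the current directory. A sketch that works in both cases, using `os.path.relpath`; the function name `display_path` and the example paths are made up for illustration.

```python
import os
from pathlib import Path


def display_path(found, cwd):
    """Express `found` relative to `cwd`, using '..' when it is in a parent."""
    return Path(os.path.relpath(found, start=cwd))


# Illustrative paths only: a notebook two levels below the project root.
print(display_path(Path("/proj/pyproject.toml"), Path("/proj/notebooks/eda")))
print(display_path(Path("/proj/pyproject.toml"), Path("/proj")))
```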

please fix and add test cases for all of these scenarios

noqa added

update test to check configurations

update based on the first comments

lint

get_default_configs doc rephrase

draft

autoloading error fixed

lint fix

ci

changing toml into optional

update util.py

link fix

fix test functions and rename

ci

escape added
neelasha23 previously approved these changes Jul 25, 2023
yafimvo previously approved these changes Jul 25, 2023

@yafimvo yafimvo left a comment

looks good! nice work.

@edublancas edublancas left a comment

looks like an invalid TOML is still a problem:

[tool.jupysql.SqlMagic]
autopandas = False
displaylimit = 2

when I create something like this along with a notebook with a %load_ext sql cell, and then start JupyterLab, it breaks it.

Instead, I'd expect JupyterLab to work normally and %load_ext sql to output a message saying the TOML is invalid.

@bbeat2782
Author

looks like an invalid TOML is still a problem:

@edublancas

[Screenshot: Screen Shot 2023-07-25 at 4 44 17 PM]

This is currently what I get when I run %load_ext sql with

[tool.jupysql.SqlMagic]
autopandas = False
displaylimit = 2

After running %load_ext sql, I'm able to run %sql duckdb:// without any error but displaylimit = 2 isn't applied because the toml file isn't correctly formatted.

Are you getting something different or are you suggesting to display a message instead of the ConfigurationError?

@edublancas

@bbeat2782 try with JupyterLab instead of VSCode, as they behave differently.

In my case, when I start JupyterLab in the folder that has the notebook and the TOML, it breaks and I'm unable to open any files, plus, the jupyter console shows the TOML error.

@bbeat2782
Author

I think I replicated the error. I will work on it.

@bbeat2782
Author

@edublancas

I tried creating a new branch from the master and only added

[tool.jupysql.SqlMagic]
autopandas = False
displaylimit = 2

to the toml file, and it still shows the TOML error. And I'm unable to open any files.

Looking at the error log, I think it's because of how the jupyter lab package handles the TOML file parse error, not because of when we call load_ext sql. Thus, correctly structuring the toml file might be the only fix. (no duplicate keys, lowercase boolean).

@edublancas

can you share the full traceback?

i doubt JupyterLab is breaking because of the pyproject.toml, it might be that you have jupytext installed, check this: mwouts/jupytext#1103

@bbeat2782
Author

bbeat2782 commented Jul 26, 2023

it might be that you have jupytext installed

Yes, it's coming from jupytext.

This is the traceback I get when I run jupyter notebook

Server error: Traceback (most recent call last):
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py", line 511, in loads
    ret = decoder.load_line(line, currentlevel, multikey,
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py", line 778, in load_line
    value, vtype = self.load_value(pair[1], strictly_valid)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py", line 820, in load_value
    raise ValueError("Only all lowercase booleans allowed")
ValueError: Only all lowercase booleans allowed

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/tornado/web.py", line 1786, in _execute
    result = await result
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/tornado/gen.py", line 234, in wrapper
    yielded = ctx_run(next, result)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/notebook/services/contents/handlers.py", line 118, in get
    model = yield maybe_future(self.contents_manager.get(
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/jupytext/contentsmanager.py", line 199, in get
    return self.super.get(path, content, type, format)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/notebook/services/contents/filemanager.py", line 448, in get
    model = self._dir_model(path, content=content)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/notebook/services/contents/filemanager.py", line 343, in _dir_model
    self.get(path=f'{path}/{name}', content=False)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/jupytext/contentsmanager.py", line 201, in get
    config = self.get_config(path, use_cache=content is False)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/jupytext/contentsmanager.py", line 571, in get_config
    config_file = self.get_config_file(parent_dir)
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/jupytext/contentsmanager.py", line 518, in get_config_file
    doc = toml.loads(model["content"])
  File "/Users/sanggyuan/miniconda3/envs/jupysql/lib/python3.10/site-packages/toml/decoder.py", line 514, in loads
    raise TomlDecodeError(str(err), original, pos)
toml.decoder.TomlDecodeError: Only all lowercase booleans allowed (line 2 column 1 char 24)

@edublancas

the traceback is a bit hard to read but yeah, it's coming from jupytext. uninstall jupytext and it will be fixed, then you can test whether any changes to the PR are needed

@bbeat2782
Author

After uninstalling jupytext, I'm able to run the rest of the code without breaking it.

[Screenshot: Screen Shot 2023-07-26 at 12 07 58 PM]

@bbeat2782 bbeat2782 dismissed stale reviews from yafimvo and neelasha23 via ce75734 July 26, 2023 19:12
@edublancas

@bbeat2782 please fix merge conflicts and request a review when ready

@edublancas edublancas merged commit 7c1b5ea into ploomber:master Jul 27, 2023
@edublancas

great work @bbeat2782!
