Backslash roundtrip problem #404

Open
jeremysanders opened this issue Sep 5, 2022 · 5 comments

Comments

@jeremysanders

I'm having problems with Windows paths stored in toml files:

In [1]: import toml
In [2]: foo = {'a': 'C:\\hostedtoolcache\\windows\\Python\\3.9.13\\x64\\Lib\\site-packages\\PyQt5\\bindings'}
In [3]: d = toml.dumps(foo)
In [4]: d
Out[4]: 'a = "C:\\hostedtoolcache\\windows\\Python\\3.9.13\\x64\\\\Lib\\\\site-packages\\\\PyQt5\\\\bindings"\n'
In [5]: toml.loads(d)
/usr/lib/python3/dist-packages/toml/decoder.py in loads(s, _dict, decoder)
    512                                         multibackslash)
    513             except ValueError as err:
--> 514                 raise TomlDecodeError(str(err), original, pos)
    515             if ret is not None:
    516                 multikey, multilinestr, multibackslash = ret

TomlDecodeError: Reserved escape sequence used (line 1 column 1 char 0)

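It looks like dumps escapes only some of the backslashes properly: in the output above, every backslash up to and including the one before x64 is left single, while the ones after it are doubled. I tested this with toml from GitHub.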

@jeremysanders
Author

Ok, I think I've narrowed this down to the presence of \x in the string:

In [24]: toml.dumps({'a': r'\x43'})
Out[24]: 'a = "\\u0043"\n'

The line
v = v.split("\\x")
in the encoder is wrong: it splits on every \x, and does not skip the case where that backslash is itself escaped (\\x).
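To make the ambiguity concrete, here is a minimal sketch in plain Python. It does not import toml; split_like_encoder is a hypothetical, simplified stand-in for the encoder's first steps (repr, strip the quotes, split) and leaves out the library's quote handling. backslash_is_escaped is likewise a hypothetical helper showing one way the two cases could be told apart.

```python
def split_like_encoder(value):
    # Simplified stand-in (an assumption, not the library's code):
    # take repr(), strip the surrounding quotes, then split on the
    # two-character sequence backslash + "x".
    text = repr(value)[1:-1]
    return text.split("\\x")


def backslash_is_escaped(prefix):
    # Count the backslashes at the end of the piece before the split point.
    # An odd count means the backslash consumed by the split was the second
    # half of an escaped pair, so the "x" that follows is a literal letter.
    trailing = len(prefix) - len(prefix.rstrip("\\"))
    return trailing % 2 == 1


literal = split_like_encoder(r"\x43")  # four characters: backslash, x, 4, 3
control = split_like_encoder("\x02")   # one character: U+0002

print(literal)  # ['\\', '43']
print(control)  # ['', '02']

print(backslash_is_escaped(literal[0]))  # True: "x" is a literal letter
print(backslash_is_escaped(control[0]))  # False: a real \xNN escape from repr
```

Both inputs produce a split at \x, so the split alone cannot tell a literal backslash followed by x from a control character that repr rendered as \xNN. Only the parity of the preceding backslashes distinguishes them.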

jeremysanders added a commit to jeremysanders/toml that referenced this issue Sep 5, 2022
@jeremysanders
Author

I've created a pull request. However, strings containing control characters such as '\x02' also don't seem to work, and my pull request doesn't address that.
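Both cases raised in this thread, a literal backslash followed by x and a real control character such as '\x02', could be handled by escaping one character at a time instead of post-processing repr() output. The following is a minimal sketch under that assumption, not the fix from the pull request and not the library's API. escape_basic_string is a hypothetical name, and the escape table follows my reading of the TOML rules for basic strings.

```python
def escape_basic_string(value):
    """Return value as a quoted TOML basic string (hypothetical helper)."""
    # Short escapes that TOML defines for basic strings.
    short = {
        "\b": "\\b",
        "\t": "\\t",
        "\n": "\\n",
        "\f": "\\f",
        "\r": "\\r",
        '"': '\\"',
        "\\": "\\\\",
    }
    out = []
    for ch in value:
        if ch in short:
            out.append(short[ch])
        elif ord(ch) < 0x20 or ord(ch) == 0x7F:
            # Remaining control characters become \uXXXX escapes.
            out.append("\\u%04x" % ord(ch))
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


print(escape_basic_string(r"\x43"))  # "\\x43"
print(escape_basic_string("\x02"))   # "\u0002"
```

Because each backslash in the input is doubled independently, a path segment starting with x (such as \x64) needs no special case.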

@davidfokkema

Got bitten by this just now. I have a user whose name starts with an 'x' and saving their home directory path into a config file breaks my app. Not fun.

@davidfokkema

I'm switching to tomli (included in the standard library as tomllib since Python 3.11) in combination with tomli_w.

davidfokkema added a commit to davidfokkema/tailor that referenced this issue Sep 20, 2022
Fixes a nasty encoding bug, see uiri/toml#404.
@dimakuv

dimakuv commented Oct 11, 2022

We were also bitten by this:

>>> toml.dumps({'A': '\\x2d'})
'A = "\\u002d"\n'

As was already pointed out, this code is at fault:

toml/toml/encoder.py

Lines 99 to 113 in 59d83d0

while len(v) > 1:
    i = -1
    if not v[0]:
        v = v[1:]
    v[0] = v[0].replace("\\\\", "\\")
    # No, I don't know why != works and == breaks
    joinx = v[0][i] != "\\"
    while v[0][:i] and v[0][i] == "\\":
        joinx = not joinx
        i -= 1
    if joinx:
        joiner = "x"
    else:
        joiner = "u00"
    v = [v[0] + joiner + v[1]] + v[2:]

The code is extremely complicated and must be untangled in order to fix this bug. We didn't attempt it; instead we're planning on switching to tomli.
