dvc pull: re.error: redefinition of group name 'ps_d' as group 2; was group 1 at position 46 #8217

Closed
arthurkok2 opened this issue Aug 31, 2022 · 18 comments · Fixed by #8767
Labels
p1-important Important, aka current backlog of things to do

Comments

@arthurkok2

Bug Report

Description

Running dvc pull fails with an error raised by Python's re module: re.error: redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Reproduce

  1. run dvc pull

Expected

dvc pull should complete without throwing an error.

Environment information

Output of dvc doctor:

DVC version: 2.0.18 (pip)
---------------------------------
Platform: Python 3.9.12 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Supports: http, https, s3
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3
Workspace directory: 9p on drvfs
Repo: dvc (no_scm)

Additional Information (if any):

2022-08-31 12:34:03,876 ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46
------------------------------------------------------------
Traceback (most recent call last):
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/main.py", line 55, in main
    ret = cmd.run()
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/command/data_sync.py", line 29, in run
    stats = self.repo.pull(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/pull.py", line 29, in pull
    processed_files_count = self.fetch(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 49, in wrapper
    return f(repo, *args, **kwargs)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/fetch.py", line 43, in fetch
    used = self.used_cache(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 396, in used_cache
    for stage, filter_info in pairs:
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 389, in <genexpr>
    self.stage.collect_granular(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/stage.py", line 397, in collect_granular
    stages, file, _ = _collect_specific_target(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/repo/stage.py", line 91, in _collect_specific_target
    if not (recursive and loader.fs.isdir(target)):
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/fs/local.py", line 74, in isdir
    return not (use_dvcignore and self.dvcignore.is_ignored_dir(path_info))
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/funcy/objects.py", line 28, in __get__
    res = instance.__dict__[self.fget.__name__] = self.fget(instance)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/fs/local.py", line 42, in dvcignore
    return cls(self, root)
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/ignore.py", line 196, in __init__
    self.ignores_trie_fs[root_dir] = DvcIgnorePatterns(
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/ignore.py", line 43, in __init__
    self.ignore_spec = [
  File "/mnt/c/Dayforce/ideal/ml-services/venv/lib/python3.9/site-packages/dvc/ignore.py", line 44, in <listcomp>
    (ignore, re.compile("|".join(item[0] for item in group)))
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/re.py", line 252, in compile
    return _compile(pattern, flags)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/re.py", line 304, in _compile
    p = sre_compile.compile(pattern, flags)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_compile.py", line 764, in compile
    p = sre_parse.parse(p, flags)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_parse.py", line 950, in parse
    p = _parse_sub(source, state, flags & SRE_FLAG_VERBOSE, 0)
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_parse.py", line 443, in _parse_sub
    itemsappend(_parse(source, state, verbose, nested + 1,
  File "/home/arthur/.pyenv/versions/3.9.12/lib/python3.9/sre_parse.py", line 833, in _parse
    raise source.error(err.msg, len(name) + 1) from None
re.error: redefinition of group name 'ps_d' as group 2; was group 1 at position 46
------------------------------------------------------------
2022-08-31 12:34:04,803 DEBUG: Version info for developers:
DVC version: 2.0.18 (pip)
---------------------------------
Platform: Python 3.9.12 on Linux-5.10.102.1-microsoft-standard-WSL2-x86_64-with-glibc2.31
Supports: http, https, s3
Cache types: <https://error.dvc.org/no-dvc-cache>
Caches: local
Remotes: s3
Workspace directory: 9p on drvfs
Repo: dvc (no_scm)
@skshetry
Member

@arthurkok2, could you please share .dvcignore file or try removing certain items of it to see what is causing this?

@arthurkok2
Author

@arthurkok2, could you please share .dvcignore file or try removing certain items of it to see what is causing this?

My .dvcignore is empty except for the default comments:

# Add patterns of files dvc should ignore, which could improve
# the performance. Learn more at
# https://dvc.org/doc/user-guide/dvcignore

@rlamy
Contributor

rlamy commented Aug 31, 2022

@arthurkok2 Do you still have the issue if you upgrade dvc?

@efiop added the awaiting response label Aug 31, 2022
@eric-seppanen

eric-seppanen commented Aug 31, 2022

I am hitting the same error. Note that dvc doctor also fails.

$ dvc pull
ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
$ dvc doctor
ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!
$ dvc --version
2.23.0
$ uname -s -r -v -p
Linux 5.15.0-46-generic #49-Ubuntu SMP Thu Aug 4 18:03:25 UTC 2022 x86_64
$ python --version
Python 3.8.10

@eric-seppanen

The fact that two different people on very different versions hit this very specific error within a few hours makes me suspect that some dependency just released a broken version.

@eric-seppanen

This is caused by pathspec 0.10.0, which was released 2022-08-30.

pip install dvc explicitly warns about the incompatibility. I missed the warning because I didn't know that pip ignores errors, and because the output was buried deep in a docker build log.

Downgrading to pathspec 0.9.0 fixes this for me.
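
For anyone who wants to check their environment before running dvc, here is a minimal sketch that reports whether the installed pathspec predates 0.10.0. The helper names is_pathspec_compatible and installed_pathspec_version are made up for this illustration, and the version parsing is deliberately naive (it only looks at the first two numeric components).

```python
from importlib.metadata import PackageNotFoundError, version


def is_pathspec_compatible(ver: str) -> bool:
    """Return True if `ver` is below 0.10, the range reported to work in this thread."""
    # Naive parsing: only the first two dot-separated components are compared.
    major, minor = (int(part) for part in ver.split(".")[:2])
    return (major, minor) < (0, 10)


def installed_pathspec_version():
    """Return the installed pathspec version string, or None if it is absent."""
    try:
        return version("pathspec")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pathspec_version()
    if found is None:
        print("pathspec is not installed")
    elif is_pathspec_compatible(found):
        print(f"pathspec {found} predates 0.10.0")
    else:
        print(f"pathspec {found} is 0.10.0 or newer; affected dvc releases may fail")
```

If the check reports 0.10.0 or newer, pinning pathspec to 0.9.0 in the environment is the workaround described in this thread.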

@skshetry added the p0-critical label and removed the awaiting response label Sep 1, 2022
@arthurkok2
Author

@arthurkok2 Do you still have the issue if you upgrade dvc?

For me, no: the issue goes away after upgrading to 2.23.0. However, it seems like others are reporting the issue even on this version.

@raychinov

We were facing the same issue in the dvc pre-commit hook. Running pre-commit autoupdate before the hooks fixed it for us.

@dtrifiro
Contributor

dtrifiro commented Sep 2, 2022

The issue comes from this change (1c8c980) in pathspec 0.10.0, which starts using a named group ps_d to match directory markers (e.g. /). Since we concatenate regexes here:

dvc/dvc/ignore.py

Lines 42 to 48 in 660c17f

        self.ignore_spec = [
            (ignore, re.compile("|".join(item[0] for item in group)))
            for ignore, group in groupby(
                self.regex_pattern_list, lambda x: x[1]
            )
            if ignore is not None
        ]

we end up having multiple groups with the same name, which results in the re.compile error we see.
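
The failure can be reproduced without dvc or pathspec. The sketch below uses two hand-written patterns that each define a group named ps_d. They are stand-ins chosen for illustration, not the exact regexes pathspec generates.

```python
import re

# Illustrative stand-ins: each pattern defines its own group named "ps_d".
patterns = [
    r"^(?:.+/)?build(?P<ps_d>/).*$",
    r"^(?:.+/)?dist(?P<ps_d>/).*$",
]

# Each pattern compiles fine on its own.
for pattern in patterns:
    re.compile(pattern)

# Joining them with "|" puts two groups named "ps_d" into a single regex,
# which the re module rejects.
message = None
try:
    re.compile("|".join(patterns))
except re.error as exc:
    message = str(exc)
    print(message)
```

The printed message has the same form as the one in the traceback above; the reported position depends on the patterns being joined.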

@dtrifiro
Contributor

dtrifiro commented Sep 2, 2022

Also, as @eric-seppanen pointed out, pathspec is explicitly pinned to <0.10 in both dvc and scmrepo.

@skshetry
Member

skshetry commented Sep 2, 2022

Thanks @eric-seppanen, @dtrifiro. I did not notice that. I'm lifting the p0-critical label.

@skshetry added the p1-important label and removed the p0-critical label Sep 2, 2022
@Varungarg97

Varungarg97 commented Sep 6, 2022

Here I am getting the same error

dvc version : 2.3.0

python version: 3.7.9

please suggest...

@rlamy
Contributor

rlamy commented Sep 6, 2022

@Varungarg97 The workaround is to downgrade pathspec to 0.9.0.

@Varungarg97

@rlamy Thank you, it is working now.

pckhoi added a commit to ipno-llead/processing that referenced this issue Sep 13, 2022
hiroto7 added a commit to hiroto7/dvc that referenced this issue Dec 6, 2022
Remove regex concatenation that causes re.error
Fixes iterative#8217.
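
One way to read "remove regex concatenation" is to compile each pattern separately and test a path against each in turn. The sketch below illustrates that idea under that assumption; it is not taken from the referenced commit, and the patterns and the matches_any helper are hypothetical.

```python
import re

# Illustrative stand-ins for patterns that each define a group named "ps_d".
patterns = [
    r"^(?:.+/)?build(?P<ps_d>/).*$",
    r"^(?:.+/)?dist(?P<ps_d>/).*$",
]

# Compiling separately keeps each "ps_d" group in its own regex object,
# so the names never collide.
compiled = [re.compile(pattern) for pattern in patterns]


def matches_any(path: str) -> bool:
    """Return True if any of the separately compiled patterns matches `path`."""
    return any(regex.match(path) is not None for regex in compiled)


print(matches_any("build/out.o"))
print(matches_any("src/main.py"))
```

The trade-off is one match attempt per pattern instead of a single combined regex.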
dtrifiro pushed a commit to hiroto7/dvc that referenced this issue Dec 6, 2022
dtrifiro pushed a commit to hiroto7/dvc that referenced this issue Dec 13, 2022
@carlthome

carlthome commented Dec 23, 2022

Also got this error but from dvc init after nix shell nixpkgs#dvc-with-remotes

❯ dvc init
ERROR: unexpected error - redefinition of group name 'ps_d' as group 2; was group 1 at position 46

Having any troubles? Hit us up at https://dvc.org/support, we are always happy to help!

Datasets on  main [✘!+?] on ☁️  carlthome@gmail.com(europe-west4) 
❯ dvc --version
2.17.0
❯ nix flake metadata nixpkgs
Resolved URL:  github:NixOS/nixpkgs/nixos-22.11
Locked URL:    github:NixOS/nixpkgs/cbe419ed4c8f98bd82d169c321d339ea30904f1f
Description:   A collection of packages for the Nix package manager
Path:          /nix/store/d2flirhsd337gm8j8rxlqklslryx6g3q-source
Revision:      cbe419ed4c8f98bd82d169c321d339ea30904f1f
Last modified: 2022-12-20 09:36:45

@semaraugusto

semaraugusto commented Jan 4, 2023

I got the same issue described above with the standard dvc package on NixOS too.

Same logs and same dvc version as above.

dvc pull also fails with the same error.

karajan1001 added a commit to karajan1001/scmrepo that referenced this issue Jan 5, 2023
karajan1001 added a commit to karajan1001/dvc that referenced this issue Jan 5, 2023
fix: iterative#8217
1. Replace group name to avoid conflict.
2. Disable failed tests.
3. Bump pathspec to 0.10.3

Wait for scmrepo's update iterative/scmrepo#163
karajan1001 added a commit to iterative/scmrepo that referenced this issue Jan 6, 2023
karajan1001 added a commit that referenced this issue Jan 13, 2023
fix: #8217
1. Replace group name to avoid conflict.
2. Disable failed tests.
3. Bump pathspec to 0.10.3
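
As a rough illustration of step 1, a group can be given a per-pattern suffix before the patterns are joined, so the combined regex has no duplicate names. This is a sketch under assumptions, not the code from the commit: the patterns are hand-written stand-ins and rename_group is a hypothetical helper.

```python
import re

# Illustrative stand-ins for patterns that each define a group named "ps_d".
patterns = [
    r"^(?:.+/)?build(?P<ps_d>/).*$",
    r"^(?:.+/)?dist(?P<ps_d>/).*$",
]


def rename_group(pattern: str, index: int, name: str = "ps_d") -> str:
    """Append `index` to the named group so each pattern gets a unique name."""
    return pattern.replace(f"(?P<{name}>", f"(?P<{name}{index}>")


# After renaming, the groups are "ps_d0" and "ps_d1", so joining is safe.
combined = re.compile(
    "|".join(rename_group(pattern, i) for i, pattern in enumerate(patterns))
)

print(sorted(combined.groupindex))
print(combined.match("build/out.o") is not None)
```

Plain string replacement is enough here because the group name appears only once per pattern; a regex containing the literal text elsewhere would need more careful handling.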