Fix/python3.9 #46
- a change to the way scandir is used causes a KeyError in some code paths
- root cause: https://bugs.python.org/issue39916
- solution is to refactor scandir to use a class that can be used as a generator or context manager.
Codecov Report
@@            Coverage Diff             @@
##           master      #46      +/-   ##
==========================================
+ Coverage   89.09%   89.20%   +0.10%
==========================================
  Files           7        7
  Lines         899      926      +27
==========================================
+ Hits          801      826      +25
- Misses         98      100       +2
Continue to review full report at Codecov.
🎉 This PR is included in version 0.3.5 🎉
The release is available on GitHub release.
Your semantic-release bot 📦🚀
* refactor: construct bucket paths with service prefixes
  BREAKING CHANGES: GCSPaths must include a path scheme
  Before GCSPath would convert `/bucket-name/file` into `gs://bucket-name/file` internally. This leads to ambiguity *(thanks @honnibal) when dealing with regular file-system paths and bucket paths together. Does the path `/foo/bar` point to an absolute file path or a GCS bucket named foo?
  Now paths **must be** constructed with a path scheme. This allows GCSPath to deal with both types of paths, and is needed for the CLI apps to come.
* feat(cli): add pathy executable with cp and mv commands
  - add "pathy" entry point to the gcspath package
  - add Typer requirement
  - add typer CLI app for copying and moving files
  - basic test
* feat: add FluidPath and GCSPath.fluid method
  GCSPath wants to work with many kinds of paths, and it's not always clear upfront what kind of path a string represents. If you're on a local file system, the path "/usr/bin/something" may be totally valid, but as a GCSPath it isn't valid because there's no service scheme attached to it, e.g. "gs://bucket/usr/bin/something"
  FluidPath is a Union of pathlib.Path and GCSPath which allows type-checking of the paths without needing explicit knowledge of what kind of path it is, until that knowledge is needed.
  *note* I originally thought of using "UnionPath" instead of "FluidPath" but the intellisense for completing "GCSPath.union" was very crowded, and a helper should be easy to type with completion.
* chore: fix bad entry_point in setup.py
* test(cli): add tests for cp/mv files and folders
* feat(cli): add rm [path] command
  - removes files or folders
  - add tests
* feat(cli): add ls [path] command
  - prints the full paths of files found in the location
* feat(pathy): rename library to be more generic
  - it does more than just GCS at this point
* chore(release): 0.1.0
  # [0.1.0](v0.0.17...v0.1.0) (2020-04-24)
  ### Features
  * add FluidPath and GCSPath.fluid method ([3393226](3393226))
  * **cli:** add ls [path] command ([17cab1d](17cab1d))
  * **cli:** add pathy executable with cp and mv commands ([98760fc](98760fc))
  * **cli:** add rm [path] command ([31cea91](31cea91))
  * **pathy:** rename library to be more generic ([c62b14d](c62b14d))
* feat(cli): add -r and -v flags for safer usage
  - rm will fail if given a directory without the -r flag (similar to unix rm)
  - rm will print the removed files/folders when given the -v flag
* chore(release): 0.1.1
  ## [0.1.1](v0.1.0...v0.1.1) (2020-04-24)
  ### Features
  * **cli:** add -r and -v flags for safer usage ([a87e36f](a87e36f))
* fix: path.owner() can raise when using filesystem adapter
  - catch the error and return a None owner
* chore(release): 0.1.2
  ## [0.1.2](v0.1.1...v0.1.2) (2020-05-23)
  ### Bug Fixes
  * path.owner() can raise when using filesystem adapter ([2877b06](2877b06))
* feat: upgrade typer support
  - allow anything in range >=0.3.0,<1.0.0
* chore(release): 0.1.3
  ## [0.1.3](v0.1.2...v0.1.3) (2020-06-28)
  ### Features
  * upgrade typer support ([e481000](e481000))
* chore(deps): bump npm from 6.14.2 to 6.14.6
  Bumps [npm](https://github.com/npm/cli) from 6.14.2 to 6.14.6.
  - [Release notes](https://github.com/npm/cli/releases)
  - [Changelog](https://github.com/npm/cli/blob/latest/CHANGELOG.md)
  - [Commits](npm/cli@v6.14.2...v6.14.6)
  Signed-off-by: dependabot[bot] <support@github.com>
* chore(deps): bump lodash from 4.17.15 to 4.17.19
  Bumps [lodash](https://github.com/lodash/lodash) from 4.17.15 to 4.17.19.
  - [Release notes](https://github.com/lodash/lodash/releases)
  - [Commits](lodash/lodash@4.17.15...4.17.19)
  Signed-off-by: dependabot[bot] <support@github.com>
* refactor: rename PureGCSPath to PurePathy
  Be more consistent with the Pathy naming.
  BREAKING CHANGE: PureGCSPath is now PurePathy
* feat(README): generate API and CLI docs
* feat(build): use husky to auto update docs when code changes
* chore: update docs
* chore: add BucketStat to docs
* chore: fix ci build badge
* chore(release): 0.2.0
  # [0.2.0](v0.1.3...v0.2.0) (2020-08-22)
  ### Code Refactoring
  * rename PureGCSPath to PurePathy ([5632f26](5632f26))
  ### Features
  * **build:** use husky to auto update docs when code changes ([5a32357](5a32357))
  * **README:** generate API and CLI docs ([0213d2f](0213d2f))
  ### BREAKING CHANGES
  * PureGCSPath is now PurePathy
* chore: add codecov script
* chore: add coverage badge to readme
* chore: update readme
  restore the credits to s3path library
* chore: update docs
* refactor(pypi): move gcs dependencies into pathy[gcs] extras
* chore: add auto format and lint scripts
* feat: add get_client/register_client for supporting multiple services
  Adds a simple registry of known schemes and their mappings to BucketClient subclasses. There's a hardcoded list of built-in services, and (in theory) you can register more dynamically.
  I think I prefer to hardcode and include most of the known services, and lazily import them so you only need their packages when you actually use them. The hope is that this lets the strong typings flow through to the clients (because they can be statically inspected).
  If we can't get specific types flowing through nicely, maybe it's okay to do more of a dynamic import style registration.
  BREAKING CHANGE use_fs, get_fs_client, use_fs_cache, get_fs_cache, and clear_fs_cache moved from pathy.api to pathy.clients
* chore: add semantic PR title linting github action
* chore: fix tests without google-auth installed
* chore: drop github action for semantic pr titles
  - really the title is secondary to the commits in the PR. We'll continue to use the Semantic PR github app as long as it works
* chore: cleanup from review
* chore: fix extras in setup.py
* refactor: add BasePathy class to bind PathType var to
  BREAKING CHANGE: This renames the internal GCS/File adapter classes by removing the prefix Client.
  ClientBucketFS -> BucketFS
  ClientBlobFS -> BlobFS
  ClientBucketGCS -> BucketGCS
  ClientBlobGCS -> BlobGCS
* refactor: combine api/client with base.py
  - this makes the Pathy type accessible where it otherwise would not be for TypeVars.
* feat(GCS): print install command when using GCS without deps installed
  - make the assertion prettier 😎
* chore: fix the remaining mypy errors
  - in some cases the mypy errors are too uptight about subclasses and their types. When that happens we silence the error and provide the expected subclass type.
* chore: remove isort test from lint script
  - we black format last, so the order/indent could be changed.
* chore: misc cleanup
* chore: drop PathType variable
  - since consolidating the Pathy class in the base.py file, there's no need to TypeVars, we can just use a forward ref to Pathy itself 🎉
* feat(ci): add lint check before testing
* chore: fix travis lint invocation
* chore: use venv when linting
* docs: add section about semantic version to readme
* chore(release): 0.3.0
  # [0.3.0](v0.2.0...v0.3.0) (2020-09-04)
  ### Code Refactoring
  * add BasePathy class to bind PathType var to ([796dd40](796dd40))
  ### Features
  * add get_client/register_client for supporting multiple services ([747815b](747815b))
  * **ci:** add lint check before testing ([2633480](2633480))
  * **GCS:** print install command when using GCS without deps installed ([d8dbcd4](d8dbcd4))
  ### BREAKING CHANGES
  * This renames the internal GCS/File adapter classes by removing the prefix Client.
    ClientBucketFS -> BucketFS
    ClientBlobFS -> BlobFS
    ClientBucketGCS -> BucketGCS
    ClientBlobGCS -> BlobGCS
  * use_fs, get_fs_client, use_fs_cache, get_fs_cache, and clear_fs_cache moved from pathy.api to pathy.clients
* chore: add test for about.py to avoid failed codecov checks when releasing
  - if about.py changes and we don't test it, codecov is all like "wah, you didn't hit your diff targets because you went from 0% to 0% on about.py" 😅
* chore: lint
* chore(deps): bump node-fetch from 2.6.0 to 2.6.1
  Bumps [node-fetch](https://github.com/bitinn/node-fetch) from 2.6.0 to 2.6.1.
  - [Release notes](https://github.com/bitinn/node-fetch/releases)
  - [Changelog](https://github.com/node-fetch/node-fetch/blob/master/docs/CHANGELOG.md)
  - [Commits](node-fetch/node-fetch@v2.6.0...v2.6.1)
  Signed-off-by: dependabot[bot] <support@github.com>
* feat(ci): add pyright check to lint step
* chore: run with npx
* chore: fix pyright errors in gcs.py
* chore: take two at fixing pyright errors
  - this is pretty nice.
    If you don't have the packages installed, the types end up being Any everywhere, but if you do have them installed, you get all the correct types including documentation popups in the IDE and intellisense 🎉
* chore: fix issue where GCS installation was not found
  - storage var is outdated
* feat: update smart-open to 2.2.0 for minimal deps
  - to get GCS support, use `pip install pathy[gcs]`
* chore: fix fallback type for credentials error
* chore: npm audit fix
* chore(release): 0.3.1
  ## [0.3.1](v0.3.0...v0.3.1) (2020-09-26)
  ### Features
  * update smart-open to 2.2.0 for minimal deps ([4b3e959](4b3e959))
  * **ci:** add pyright check to lint step ([10ce34d](10ce34d))
* test: add a rglob + unlink test
  - it seems like a reasonably common pattern, make sure it works like rmdir
* chore: suppress ugly path open type error
  - the Path open method has a gross type that changes based on the python version. We'll use our specific type and deal with the consequences. 😎
* fix: upgrade smart-open to >=2.2.0,<4.0.0 (#36)
  ⬆️ Upgrade smart-open pin, to fix botocore requiring urllib3 < 1.26
* chore(release): 0.3.2
  ## [0.3.2](v0.3.1...v0.3.2) (2020-11-12)
  ### Bug Fixes
  * upgrade smart-open to >=2.2.0,<4.0.0 ([#36](#36)) ([fdf083e](fdf083e))
* chore: add BucketStat -> BlobStat to changelog
* fix: path.scheme would error with schemeless paths (#37)
  - return "" from file paths as expected rather than error
* chore(release): 0.3.3
  ## [0.3.3](v0.3.2...v0.3.3) (2020-11-12)
  ### Bug Fixes
  * path.scheme would error with schemeless paths ([#37](#37)) ([80f0036](80f0036))
* chore(deps-dev): bump semantic-release from 17.0.4 to 17.2.3 (#38)
  Bumps [semantic-release](https://github.com/semantic-release/semantic-release) from 17.0.4 to 17.2.3.
  - [Release notes](https://github.com/semantic-release/semantic-release/releases)
  - [Commits](semantic-release/semantic-release@v17.0.4...v17.2.3)
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat(clients): add set_client_params for specifying client-specific args (#39)
* feat(clients): add set_client_params for specifying client-specific args
  - useful for passing credentials and other args to the underlying bucket client library
* chore: fix lint
* test: add test for set_client_params
* chore: fix test
* test: run GCS tests during CI build
* chore: install all deps for testing
* chore: fix credentials detection
* chore: update docs
* test(clients): add recreate behavior test
* chore: use more generous wait in timestamp test
* chore: update readme snippets
* chore(release): 0.3.4
  ## [0.3.4](v0.3.3...v0.3.4) (2020-11-22)
  ### Features
  * **clients:** add set_client_params for specifying client-specific args ([#39](#39)) ([84b9987](84b9987))
* Feature/test doc snippets (#40)
* test(ci): add readme snippet test runner
  - gather up and execute the python snippets in the Readme to make sure they work.
* chore: update docs and mathy_pydoc version
* chore: update docs
* chore(deps): bump ini from 1.3.5 to 1.3.8 (#41)
  Bumps [ini](https://github.com/isaacs/ini) from 1.3.5 to 1.3.8.
  - [Release notes](https://github.com/isaacs/ini/releases)
  - [Commits](npm/ini@v1.3.5...v1.3.8)
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* fix: python 3.9 compatibility (#46)
* fix: python 3.9 compatibility
  - a change to the way scandir is used causes a KeyError in some code paths
  - root cause: https://bugs.python.org/issue39916
  - solution is to refactor scandir to use a class that can be used as a generator or context manager.
* chore: disable broken spacy test until thinc pr lands
* chore: make PathyScanDir iterable for py < 3.8
* chore: cleanup from review
* chore: drop old tox file
* chore: add codecov yml to disable patch coverage
* fix(pypi): add requirements.txt to distribution (#45)
* chore(release): 0.3.5
  ## [0.3.5](v0.3.4...v0.3.5) (2021-02-02)
  ### Bug Fixes
  * **pypi:** add requirements.txt to distribution ([#45](#45)) ([759cd86](759cd86))
  * python 3.9 compatibility ([#46](#46)) ([a965f40](a965f40))
Co-authored-by: repo-ranger[bot] <39074581+repo-ranger[bot]@users.noreply.github.com>
Co-authored-by: semantic-release-bot <semantic-release-bot@martynus.net>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sebastián Ramírez <tiangolo@gmail.com>
Co-authored-by: Nicholas Bollweg <nick.bollweg@gmail.com>
Python 3.9 changed the way scandir is used inside the pathlib.Path class. This causes a KeyError to be raised in some code paths, because Pathy's scandir function only works as a generator, not as a context manager.
This PR refactors the scandir functionality so that it works as both a context manager and a generator.
The root cause seems to be: https://bugs.python.org/issue39916
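The refactor described above can be illustrated with a minimal sketch of an object that supports both usage styles. This is not the code from this PR: the class name `ScanDirSketch`, its constructor, and the `closed` attribute are hypothetical, and it yields plain strings rather than real directory entries. It only shows how one class can be iterated directly and also used in a `with` block.

```python
from typing import Iterable, Iterator, Optional


class ScanDirSketch:
    """Hypothetical scandir-like result; not Pathy's actual PathyScanDir."""

    def __init__(self, names: Iterable[str]) -> None:
        self._names = list(names)
        self._iter: Optional[Iterator[str]] = None
        self.closed = False

    # Context manager protocol: `with ScanDirSketch(...) as entries:`
    def __enter__(self) -> "ScanDirSketch":
        return self

    def __exit__(self, exc_type, exc_value, traceback) -> None:
        self.close()

    def close(self) -> None:
        # A real implementation might release a listing handle here.
        self.closed = True

    # Iterator protocol: `for name in ScanDirSketch(...):`
    def __iter__(self) -> "ScanDirSketch":
        return self

    def __next__(self) -> str:
        if self._iter is None:
            # Start producing entries lazily on first use.
            self._iter = iter(self._names)
        return next(self._iter)


# Generator-style use: iterate the object directly.
names_a = list(ScanDirSketch(["a.txt", "b.txt"]))

# Context-manager use: enter the object, then iterate it.
with ScanDirSketch(["a.txt", "b.txt"]) as entries:
    names_b = list(entries)
```

Both call styles produce the same entries, so a caller that wraps the result in `with` and a caller that loops over it directly can share one implementation.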