[fs] support hfs.ls on a bucket (#14176)
Teaches `hfs.ls('gs://bucket/')` to list the files and directories at the top-level of the bucket.

In `main` that command raises because this line of `_ls_no_glob` raises:

```python3
maybe_sb_and_t, maybe_contents = await asyncio.gather(
    self._size_bytes_and_time_modified_or_none(path), ls_as_dir()
)
```

In particular, `statfile` raises a cloud-specific, esoteric error about a malformed URL or empty object names:

```python3
async def _size_bytes_and_time_modified_or_none(self, path: str) -> Optional[Tuple[int, float]]:
    try:
        # Hadoop semantics: creation time is used if the object has no notion of last modification time.
        file_status = await self.afs.statfile(path)
        return (await file_status.size(), file_status.time_modified().timestamp())
    except FileNotFoundError:
        return None
```

I decided to add a sub-class of `FileNotFoundError` which is self-describing: `IsABucketError`. I changed most methods to raise that error when given a bucket URL. The two interesting cases:

1. `isdir`. This raises an error but I could also see this returning `True`. A bucket is like a directory whose path/name is empty.
2. `isfile`. This returns False but I could also see this raising an error. This just seems convenient, we know the bucket is not a file so we should say so.

---

Apparently `hfs.ls` had no current tests because the globbing system doesn't work with Azure https:// URLs. I fixed it to use `AsyncFSURL.with_new_path_component` which is resilient to Azure https weirdness. However, I had to change `with_new_path_component` to treat an empty path in a special way. I wanted this to hold:

```
actual = str(afs.parse_url('gs://bucket').with_new_path_component('bar'))
expected = 'gs://bucket/bar'
assert actual == expected
```

But `with_new_path_component` interacts badly with `GoogleAsyncFSURL.__str__` to return this:

```
'gs://bucket//bar'
```
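The two ideas in this message can be illustrated with a small, self-contained sketch. It is not the code from this commit: `SketchBucketURL`, `OBJECTS`, and the module-level `parse_url`, `statfile`, and `isfile` are hypothetical stand-ins. Only the name `IsABucketError`, its `FileNotFoundError` parent, and the method name `with_new_path_component` come from the description above. The sketch assumes a URL type that stores the bucket and path separately and always prints a `/` between them.

```python
from dataclasses import dataclass


class IsABucketError(FileNotFoundError):
    """Raised when a file-oriented operation is given a bare bucket URL."""


@dataclass(frozen=True)
class SketchBucketURL:
    # Hypothetical stand-in for a cloud URL class; not Hail's GoogleAsyncFSURL.
    bucket: str
    path: str = ''

    def with_new_path_component(self, component: str) -> 'SketchBucketURL':
        # Special-case the empty path. Without this branch the result would be
        # '' + '/' + 'bar' == '/bar', which __str__ prints as 'gs://bucket//bar'.
        if self.path == '':
            return SketchBucketURL(self.bucket, component)
        return SketchBucketURL(self.bucket, self.path.rstrip('/') + '/' + component)

    def __str__(self) -> str:
        return f'gs://{self.bucket}/{self.path}'


def parse_url(url: str) -> SketchBucketURL:
    # Minimal parser for this sketch: handles only gs://bucket[/path].
    if not url.startswith('gs://'):
        raise ValueError(f'not a gs:// URL: {url}')
    bucket, _, path = url[len('gs://'):].partition('/')
    return SketchBucketURL(bucket, path)


# Hypothetical in-memory "bucket contents" so the sketch needs no cloud access.
OBJECTS = {'a.txt'}


def statfile(url: SketchBucketURL) -> str:
    # A bare bucket gets the self-describing error instead of an esoteric one.
    if url.path == '':
        raise IsABucketError(str(url))
    if url.path not in OBJECTS:
        raise FileNotFoundError(str(url))
    return url.path


def isfile(url: SketchBucketURL) -> bool:
    # IsABucketError is a FileNotFoundError, so this handler also covers
    # bucket URLs and reports False for them.
    try:
        statfile(url)
        return True
    except FileNotFoundError:
        return False
```

Because `IsABucketError` subclasses `FileNotFoundError`, callers that already catch `FileNotFoundError` keep working when handed a bucket URL, while callers that care can catch the narrower type.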
Showing 13 changed files with 230 additions and 85 deletions.