🐛 FIX: aiida/repository typing #4920

Merged
merged 19 commits into from
May 7, 2021

Conversation

chrisjsewell
Member

Fixes 54 errors and one identified bug (setting directory)

aiida/repository/backend/abstract.py:56: error: Missing return statement  [return]
aiida/repository/backend/abstract.py:73: error: Argument 1 to "put_object_from_filelike" of "AbstractRepositoryBackend" has incompatible type "BinaryIO"; expected "BufferedIOBase"  [arg-type]
aiida/repository/backend/abstract.py:84: error: Missing return statement  [return]
aiida/repository/backend/abstract.py:106: error: "bytes" has no attribute "read"  [attr-defined]
aiida/repository/backend/sandbox.py:23: error: Item "None" of "Optional[Any]" has no attribute "abspath"  [union-attr]
aiida/repository/backend/sandbox.py:101: error: The return type of a generator function should be "Generator" or one of its supertypes  [misc]
aiida/repository/backend/disk_object_store.py:81: error: The return type of a generator function should be "Generator" or one of its supertypes  [misc]
aiida/repository/backend/__init__.py:8: error: Name 'abstract' is not defined  [name-defined]
aiida/repository/backend/__init__.py:8: error: Name 'disk_object_store' is not defined  [name-defined]
aiida/repository/backend/__init__.py:8: error: Name 'sandbox' is not defined  [name-defined]
aiida/repository/repository.py:104: error: 'builtins.object' object is not iterable  [misc]
aiida/repository/repository.py:105: error: Cannot determine type of 'dirnames'  [has-type]
aiida/repository/repository.py:106: error: Cannot determine type of 'filenames'  [has-type]
aiida/repository/repository.py:107: error: Cannot determine type of 'root'  [has-type]
aiida/repository/repository.py:108: error: Cannot determine type of 'root'  [has-type]
aiida/repository/repository.py:108: error: "bytes" has no attribute "read"  [attr-defined]
aiida/repository/repository.py:142: error: Incompatible return value type (got "None", expected "AbstractRepositoryBackend")  [return-value]
aiida/repository/repository.py:164: error: Incompatible types in assignment (expression has type "Callable[[Union[str, Path, None]], File]", variable has type "File")  [assignment]
aiida/repository/repository.py:178: error: Incompatible types in assignment (expression has type "Optional[Path]", variable has type "Union[str, Path]")  [assignment]
aiida/repository/repository.py:181: error: Item "str" of "Union[str, Path]" has no attribute "parts"  [union-attr]
aiida/repository/repository.py:194: error: Need type annotation for 'hash_keys' (hint: "hash_keys: List[<type>] = ...")  [var-annotated]
aiida/repository/repository.py:210: error: Item "AbstractRepositoryBackend" of "Optional[AbstractRepositoryBackend]" has no attribute "count_objects"  [union-attr]
aiida/repository/repository.py:210: error: Item "None" of "Optional[AbstractRepositoryBackend]" has no attribute "count_objects"  [union-attr]
aiida/repository/repository.py:223: error: Item "None" of "Optional[Path]" has no attribute "parts"  [union-attr]
aiida/repository/repository.py:226: error: Item "None" of "Optional[Path]" has no attribute "parts"  [union-attr]
aiida/repository/repository.py:262: error: Incompatible types in assignment (expression has type "Optional[Path]", variable has type "Union[str, Path]")  [assignment]
aiida/repository/repository.py:301: error: Incompatible types in assignment (expression has type "Optional[Path]", variable has type "Union[str, Path]")  [assignment]
aiida/repository/repository.py:303: error: Argument 1 to "_insert_file" of "Repository" has incompatible type "Union[str, Path]"; expected "Path"  [arg-type]
aiida/repository/repository.py:313: error: Argument 1 to "put_object_from_filelike" of "Repository" has incompatible type "BinaryIO"; expected "BufferedReader"  [arg-type]
aiida/repository/repository.py:337: error: Item "None" of "Optional[Path]" has no attribute "parts"  [union-attr]
aiida/repository/repository.py:338: error: Argument 1 to "create_directory" of "Repository" has incompatible type "Optional[Path]"; expected "Union[str, Path]"  [arg-type]
aiida/repository/repository.py:342: error: Incompatible types in assignment (expression has type "Path", variable has type "str")  [assignment]
aiida/repository/repository.py:345: error: "str" has no attribute "relative_to"  [attr-defined]
aiida/repository/repository.py:348: error: Unsupported left operand type for / ("str")  [operator]
aiida/repository/repository.py:348: error: "str" has no attribute "relative_to"  [attr-defined]
aiida/repository/repository.py:372: error: The return type of a generator function should be "Generator" or one of its supertypes  [misc]
aiida/repository/repository.py:385: error: Argument 1 to "open" of "AbstractRepositoryBackend" has incompatible type "Optional[str]"; expected "str"  [arg-type]
aiida/repository/repository.py:397: error: Argument 1 to "get_object_content" of "AbstractRepositoryBackend" has incompatible type "Optional[str]"; expected "str"  [arg-type]
aiida/repository/repository.py:412: error: Incompatible types in assignment (expression has type "Optional[Path]", variable has type "Union[str, Path]")  [assignment]
aiida/repository/repository.py:419: error: Argument 1 to "delete_object" of "AbstractRepositoryBackend" has incompatible type "Optional[str]"; expected "str"  [arg-type]
aiida/repository/repository.py:421: error: Item "str" of "Union[str, Path]" has no attribute "parent"  [union-attr]
aiida/repository/repository.py:422: error: Item "str" of "Union[str, Path]" has no attribute "name"  [union-attr]
aiida/repository/repository.py:451: error: 'builtins.object' object is not iterable  [misc]
aiida/repository/repository.py:452: error: Cannot determine type of 'dirnames'  [has-type]
aiida/repository/repository.py:453: error: Cannot determine type of 'root'  [has-type]
aiida/repository/repository.py:454: error: Cannot determine type of 'filenames'  [has-type]
aiida/repository/repository.py:455: error: Cannot determine type of 'root'  [has-type]
aiida/repository/repository.py:456: error: Argument 1 to "put_object_from_filelike" of "Repository" has incompatible type "bytes"; expected "BufferedReader"  [arg-type]
aiida/repository/repository.py:456: error: Cannot determine type of 'root'  [has-type]
aiida/repository/repository.py:458: error: The return type of a generator function should be "Generator" or one of its supertypes  [misc]
aiida/repository/repository.py:477: error: Unsupported left operand type for / ("None")  [operator]
aiida/repository/repository.py:477: note: Left operand is of type "Optional[Path]"
aiida/repository/__init__.py:16: error: Name 'backend' is not defined  [name-defined]
aiida/repository/__init__.py:16: error: Name 'common' is not defined  [name-defined]
aiida/repository/__init__.py:16: error: Name 'repository' is not defined  [name-defined]
Found 54 errors in 6 files (checked 74 source files)

@sphuber it's kind of pointless adding types if you don't also type check them 😬

@chrisjsewell chrisjsewell requested a review from sphuber May 6, 2021 06:03
@@ -32,7 +32,7 @@ class keeps a reference of the virtual file hierarchy. This means that through t

     # pylint: disable=too-many-public-methods
 
-    _backend = None
+    _backend: Optional[AbstractRepositoryBackend] = None
Member Author

why does this need to be a class attribute and not a regular one?

Contributor

In principle it doesn't, we initialize it in the constructor anyway, so I think we can get rid of it.

Member Author

removed

@@ -161,7 +162,7 @@ def _insert_file(self, path: pathlib.Path, key: str):
         if path.parent:
             directory = self.create_directory(path.parent)
         else:
-            directory = self.get_directory
+            directory = self.get_directory()
Member Author

this was a definite bug
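
To illustrate why mypy flagged this line (the `[assignment]` error reported for `repository.py:164` above), here is a minimal sketch. The `Directory` and `Repo` classes are hypothetical stand-ins, not the actual AiiDA classes; the point is only that omitting the parentheses assigns the bound method rather than its return value.

```python
class Directory:
    """Hypothetical stand-in for the repository's virtual directory object."""

    def __init__(self, name: str) -> None:
        self.name = name


class Repo:
    """Hypothetical stand-in for ``Repository``."""

    def __init__(self) -> None:
        self._root = Directory('')

    def get_directory(self) -> Directory:
        return self._root


repo = Repo()

# Bug: without the parentheses this is the bound method, not a Directory.
wrong = repo.get_directory
# Fix: call the method to obtain the Directory itself.
right = repo.get_directory()

is_wrong_a_directory = isinstance(wrong, Directory)
is_right_a_directory = isinstance(right, Directory)
```

At runtime the mistake would only surface later, when an attribute of the supposed directory is accessed; a type checker reports it at the assignment.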

Contributor

should we add the missing test perhaps then?

Member Author

ok I'll have a look 👍

Member Author

@chrisjsewell chrisjsewell May 6, 2021

How was this intended to work? I think it's actually impossible to reach that code, given that Path().parent == Path(). Is that perhaps what the test should be?
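
The claim about `pathlib` can be checked with a short sketch. It uses `PurePosixPath` (an assumption made here so that the result does not depend on the operating system) and shows that `path.parent` is always truthy, which is why the `else` branch cannot be reached.

```python
import pathlib

# Path objects define neither __bool__ nor __len__, so they are always truthy,
# including the "empty" path and the parent of a top-level file.
for raw in ('', 'file.txt', 'sub/file.txt'):
    path = pathlib.PurePosixPath(raw)
    assert bool(path.parent)

empty_parent = pathlib.PurePosixPath('').parent
top_level_parent = pathlib.PurePosixPath('file.txt').parent
```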

Member Author

or shall I just remove the if/else?

Contributor

I think the conditional makes sense; it is just incorrect. What I think I intended here is to check whether the parent directory of the path exists. If not, then we create it first. Otherwise, we just get the directory.

That being said, "creating" directories here is all just virtual: no actual directories are explicitly created in the backend repository, only the virtual hierarchy. Since create_directory should only create the directory if it doesn't already exist, we can always just call it without checking first. So I guess we can indeed get rid of the entire conditional and just call self.create_directory(path.parent).
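
As a rough sketch of why an unconditional call is safe, the following shows an idempotent `create_directory` over a purely virtual hierarchy. `VirtualDirectory` and this free-standing `create_directory` function are hypothetical simplifications, not the AiiDA implementation.

```python
import pathlib
from typing import Dict


class VirtualDirectory:
    """Hypothetical in-memory directory node; not the AiiDA ``File`` class."""

    def __init__(self) -> None:
        self.children: Dict[str, 'VirtualDirectory'] = {}


def create_directory(root: VirtualDirectory, path: pathlib.PurePosixPath) -> VirtualDirectory:
    """Return the directory at ``path``, creating missing levels on the way."""
    current = root
    for part in path.parts:
        # setdefault only inserts a new node when the name is not yet present.
        current = current.children.setdefault(part, VirtualDirectory())
    return current


root = VirtualDirectory()
first = create_directory(root, pathlib.PurePosixPath('a/b'))
second = create_directory(root, pathlib.PurePosixPath('a/b'))
# The parent of a top-level file is '.', which has no parts, so the root is returned.
top = create_directory(root, pathlib.PurePosixPath('file.txt').parent)
```

Calling the function twice with the same path returns the same node, so no existence check is needed before the call.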

Member Author

yeh that's what I thought, removed

Contributor

sphuber commented May 6, 2021

> @sphuber it's kind of pointless adding types if you don't also type check them 😬

It is also kind of pointless to be complaining about PRs that you approved yourself

Member Author

> It is also kind of pointless to be complaining about PRs that you approved yourself

😆 I didn't notice that it was not added to the list (you were half way there), but yeh I do want to push that people do this, because there is a definite benefit

@@ -81,7 +89,7 @@ def has_object(self, key: str) -> bool:
"""

     @contextlib.contextmanager
-    def open(self, key: str) -> io.BufferedIOBase:
+    def open(self, key: str) -> typing.Iterator[typing.BinaryIO]:  # type: ignore[return]
Member Author

hmmm, these abstract methods, that you also need to call super() on, mypy is not a fan 😬
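
For context on the new annotation: a function decorated with `contextlib.contextmanager` is a generator, so its return type is annotated as an iterator of the yielded object rather than as the object itself. A minimal sketch follows, using a hypothetical `InMemoryBackend` that is not part of AiiDA.

```python
import contextlib
import io
from typing import BinaryIO, Dict, Iterator


class InMemoryBackend:
    """Hypothetical backend that keeps objects in a dict; for illustration only."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def put(self, key: str, content: bytes) -> None:
        self._objects[key] = content

    @contextlib.contextmanager
    def open(self, key: str) -> Iterator[BinaryIO]:
        # The function body is a generator, so it is annotated as an Iterator of
        # the yielded type; the decorator turns that into a context manager.
        handle = io.BytesIO(self._objects[key])
        try:
            yield handle
        finally:
            handle.close()


backend = InMemoryBackend()
backend.put('abc', b'content')
with backend.open('abc') as stream:
    data = stream.read()
```

Because this sketch has a concrete body that always yields, it needs no `# type: ignore[return]`; that comment is only required in the diff above because the abstract method's body does not yield on every path.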

        """Store the byte contents of a file in the repository.

        :param handle: filelike object with the byte content to be stored.
        :return: the generated fully qualified identifier for the object within the repository.
        """
        if not isinstance(handle, io.BytesIO) and not self.is_readable_byte_stream(handle):
Member Author

mypy didn't like this, and it feels like it goes a bit against the concept of an abstract method, so I moved it to a separate method. But it's not a deal breaker

Member Author

@chrisjsewell chrisjsewell May 6, 2021

I guess the best way to do this technically is something like:

    def put_object_from_filelike(self, handle: typing.BinaryIO) -> str:
        """Store the byte contents of a file in the repository.

        :param handle: filelike object with the byte content to be stored.
        :return: the generated fully qualified identifier for the object within the repository.
        """
        check_byte_stream(handle)  # validate once here, before dispatching to the backend
        return self._put_object_from_filelike(handle)

    @abc.abstractmethod
    def _put_object_from_filelike(self, handle: typing.BinaryIO) -> str:
        """Backend-specific storage; concrete subclasses implement this."""

then the concrete methods do not have to "remember" to call super() or any other initial code
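
A runnable sketch of this pattern is given below. `AbstractBackend`, `DictBackend` and the inline validation are hypothetical simplifications; the real validation in AiiDA may differ.

```python
import abc
import hashlib
import io
from typing import BinaryIO, Dict


class AbstractBackend(abc.ABC):
    """Hypothetical, simplified stand-in for ``AbstractRepositoryBackend``."""

    def put_object_from_filelike(self, handle: BinaryIO) -> str:
        # Shared validation lives in the public method, so subclasses cannot skip it.
        if not isinstance(handle.read(0), bytes):
            raise TypeError('handle does not seem to be a byte stream')
        return self._put_object_from_filelike(handle)

    @abc.abstractmethod
    def _put_object_from_filelike(self, handle: BinaryIO) -> str:
        """Store the content and return its key; implemented by each backend."""


class DictBackend(AbstractBackend):
    """Toy backend keyed by the SHA-256 digest of the content."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}

    def _put_object_from_filelike(self, handle: BinaryIO) -> str:
        content = handle.read()
        key = hashlib.sha256(content).hexdigest()
        self.objects[key] = content
        return key


backend = DictBackend()
key = backend.put_object_from_filelike(io.BytesIO(b'data'))

try:
    backend.put_object_from_filelike(io.StringIO('text'))  # type: ignore[arg-type]
    rejected = False
except TypeError:
    rejected = True
```

The subclass only implements the underscore-prefixed hook, and the text stream is rejected by the base class without the subclass having to call `super()`.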

Member Author

done


codecov bot commented May 6, 2021

Codecov Report

Merging #4920 (7aa3d05) into develop (597a4d0) will increase coverage by 0.01%.
The diff coverage is 98.34%.

@@             Coverage Diff             @@
##           develop    #4920      +/-   ##
===========================================
+ Coverage    80.05%   80.06%   +0.01%     
===========================================
  Files          515      515              
  Lines        36612    36622      +10     
===========================================
+ Hits         29306    29316      +10     
  Misses        7306     7306              
Flag Coverage Δ
django 74.51% <98.34%> (+0.04%) ⬆️
sqlalchemy 73.44% <98.34%> (+0.02%) ⬆️

Impacted Files Coverage Δ
aiida/repository/backend/abstract.py 95.66% <87.50%> (-1.84%) ⬇️
aiida/backends/general/migrations/utils.py 89.15% <100.00%> (ø)
aiida/orm/nodes/repository.py 93.59% <100.00%> (ø)
aiida/repository/__init__.py 100.00% <100.00%> (ø)
aiida/repository/backend/__init__.py 100.00% <100.00%> (ø)
aiida/repository/backend/disk_object_store.py 96.16% <100.00%> (+0.08%) ⬆️
aiida/repository/backend/sandbox.py 100.00% <100.00%> (+1.86%) ⬆️
aiida/repository/repository.py 98.45% <100.00%> (+0.54%) ⬆️
aiida/transports/plugins/local.py 81.54% <0.00%> (-0.25%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Contributor

@sphuber sphuber left a comment

Cheers @chrisjsewell, all good, except I would suggest adding a test for the __str__ method of initialized and uninitialized repos, because that was clearly a bug that went uncaught. And I have a question about the conditionals checking whether File.key is None.

Comment on lines 375 to 377
key = self.get_file(path).key
if key is None:
    raise TypeError('File key not set')
Contributor

So I see why mypy thinks it needs this, because technically File.key can indeed return str or None. However, if the File has file_type==FileType.File the key cannot be None because this is checked in the constructor. So we should never reach this TypeError, since if key is None then get_file will already have raised IsADirectoryError because it is not a file. So I am wondering whether this should simply be an AssertionError, if we really need to have this check here, because it would indicate an internal coding inconsistency.
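
A minimal sketch of the suggested approach: an `assert` both documents the invariant and lets mypy narrow `Optional[str]` to `str`. The `File` class and `get_object_key` function here are hypothetical simplifications, not the AiiDA classes.

```python
from typing import Optional


class File:
    """Hypothetical, simplified file node: directories have no key, files must."""

    def __init__(self, name: str, is_file: bool, key: Optional[str] = None) -> None:
        if is_file and key is None:
            raise ValueError('a file requires a key')
        self.name = name
        self.is_file = is_file
        self.key = key


def get_object_key(file: File) -> str:
    if not file.is_file:
        raise IsADirectoryError(f'{file.name} is a directory')
    # The constructor guarantees this; the assert documents the invariant and
    # lets mypy narrow Optional[str] to str.
    assert file.key is not None
    return file.key


key = get_object_key(File('data.txt', is_file=True, key='abc123'))
```

If the assertion ever fails, it signals an internal inconsistency rather than a user error, which is the distinction drawn in the comment above.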

Member Author

done 👍
yeh this is why I personally probably would have erred towards having File and Directory subclasses, but appreciate it is somewhat a matter of personal preference

Contributor

There might be a problem on my side, but I still see the TypeError. Did you change this to

assert self.get_file(path).key is not None

?

Member Author

Oh yeh, I changed it in delete_object, but not here

Member Author

done


     def __str__(self) -> str:
         """Return the string representation of this repository."""
-        return f'SandboxRepository: {self._sandbox.abspath}'
+        return f'SandboxRepository: {self._sandbox.abspath if self._sandbox else "null"}'
Contributor

This was clearly a bug and not caught by the tests. I think it would be good to add a quick test, both for the sandbox and diskobject backends.

Member Author

jeez, and yet you can't be bothered to add type checking to migrations 😛

Member Author

done; I adapted them slightly, to return SandboxRepository: <uninitialised>/DiskObjectStoreRepository: <uninitialised> before initialisation
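
A sketch of what such a `__str__` and its check might look like is shown below. `SandboxSketch`, its `initialise` method and the string path are assumptions made for illustration; only the `SandboxRepository: <uninitialised>` text is taken from the comment above.

```python
from typing import Optional


class SandboxSketch:
    """Hypothetical stand-in for the sandbox backend; the path is just a string here."""

    def __init__(self) -> None:
        self._sandbox_path: Optional[str] = None

    def initialise(self, path: str) -> None:
        self._sandbox_path = path

    @property
    def is_initialised(self) -> bool:
        return self._sandbox_path is not None

    def __str__(self) -> str:
        # Never dereference the sandbox before it exists.
        if self.is_initialised:
            return f'SandboxRepository: {self._sandbox_path}'
        return 'SandboxRepository: <uninitialised>'


repo = SandboxSketch()
before = str(repo)
repo.initialise('/tmp/sandbox')
after = str(repo)
```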

@chrisjsewell chrisjsewell requested a review from sphuber May 6, 2021 22:43
Contributor

@sphuber sphuber left a comment

Thanks @chrisjsewell almost there, just don't see the change in the TypeError raising, but your comment suggests you did change it.

@chrisjsewell chrisjsewell requested a review from sphuber May 7, 2021 08:24
@sphuber sphuber merged commit bff117a into develop May 7, 2021
@sphuber sphuber deleted the fix-repo-typing branch May 7, 2021 08:45
Member Author

chrisjsewell commented May 7, 2021

> the design of the put_object_from_filelike method in the
> abstract class that performed the validation such that subclasses would
> not have to, also didn't please the mypy gods.

Well yeh, the other way I would word it is that put_object_from_filelike was semantically incorrect; it wasn't labelled as an abstractmethod (which always requires implementation), and expecting concrete implementations to "remember" to call super() is just not a good idea 😜

Personally I would also consider changing the open method in a similar way, but I didn't want to delay this PR with that
