ADD: New repository CLI #4965
Current Draft State
This draft does not yet implement the simplified interface that I described in the OP. Instead, the interface is left "completely" transparent, in the sense that it makes it easy to enable or disable each of the following operations:
The goal is to gather some basic heuristics on how long the disk-objectstore takes to perform each process, and so to better figure out which ones can be grouped together without much penalty, especially within the subgroup that can run while the daemon is running (pack/transmit) and the subgroup that can't (clean/repack/vacuum). For this purpose I have also created this companion script (you can download and pip-install the full repository in an AiiDA environment, as the script makes use of other tools in there). If you want to test it yourself you can run it using the following:
Test Procedure
Preliminary results
You can see my full output below, but my summary of it would be:
Full output
I would like to try this with a lot of small files (see
Codecov Report
@@            Coverage Diff             @@
##           develop    #4965     +/-   ##
==========================================
+ Coverage    81.43%   81.46%   +0.03%
==========================================
  Files          529      530       +1
  Lines        37002    37113     +111
==========================================
+ Hits         30128    30229     +101
- Misses        6874     6884      +10
Heya, I also need list_objects for the archive, so just checking we are on the same page. I might end up extracting this into a separate PR/commit.
Outstanding Discussion: (writing...)
Applied the changes discussed earlier today @sphuber @chrisjsewell so this is ready for review. Of the previously mentioned problems I am still having this one:
Any ideas?
Thanks @ramirezfranciscof . I think the interface is a lot better now. Just some comments on bits of the new implementation
files_numb = self.container.count_objects()['packed']
files_size = self.container.get_total_size()['total_size_packfiles_on_disk'] * BYTES_TO_MB
This information is not really adding anything specific to the maintenance operation, is it? It just gives the current size, but that doesn't tell you what the size will be nor how much will be saved. Only the latter would be really interesting IMO
Well, it can help give you an idea of how long it might take to do the repacking. But ok, I can take it out if you prefer.
Co-authored-by: Sebastiaan Huber <mail@sphuber.net>
Hey @sphuber I think we might have counted differently but this is ready for another review. BTW related to the error, the only place where I added something related to the orm is here, and I need it only for the typing:
from aiida.orm.implementation import Backend

__all__ = ('MAINTAIN_LOGGER',)

MAINTAIN_LOGGER = AIIDA_LOGGER.getChild('maintain')

def repository_maintain(
    full: bool = False,
    dry_run: bool = False,
    backend: Optional[Backend] = None,
    **kwargs,
) -> dict:
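Since the import is said to be needed only for typing, one common Python pattern is to guard it with typing.TYPE_CHECKING so that it is never executed at runtime. The sketch below is only an illustration of that pattern and is not taken from the PR: the function name repository_maintain_sketch and the returned dictionary are hypothetical, and nothing here is claimed to resolve the error mentioned above.

```python
from typing import TYPE_CHECKING, Optional

if TYPE_CHECKING:
    # Evaluated only by static type checkers such as mypy; skipped at runtime,
    # so the orm package is not imported when this module is loaded.
    from aiida.orm.implementation import Backend


def repository_maintain_sketch(
    full: bool = False,
    dry_run: bool = False,
    backend: Optional['Backend'] = None,  # string annotation: resolved lazily
    **kwargs,
) -> dict:
    """Hypothetical stand-in that only reports the options it received."""
    return {
        'full': full,
        'dry_run': dry_run,
        'backend_given': backend is not None,
        'extra': dict(kwargs),
    }


print(repository_maintain_sketch(dry_run=True, do_repack=False))
```

The keyword do_repack in the usage line is an invented example of an extra option collected by **kwargs.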
Thanks @ramirezfranciscof . Looks good to be merged now; just spotted some docstrings you forgot to update after last changes. If you correct those, I will approve and merge.
All tests seem to be passing. I am now having again the problem I describe here, but I just committed ignoring the hooks. I would still like to know why this is happening, but I want this merged more. @sphuber all good now? (Notice that there is an outstanding comment in your previous review where I was waiting for confirmation on whether my comment convinced you or you still want the info taken out.)
Thanks @ramirezfranciscof . The ignore statement for mypy because of the different signatures is not a big deal, it is the same as that of pylint. Let's keep that for now and merge this.
This PR incorporates the new command line tool to control the maintenance tasks of the repository (fix 4321).
The main challenge of this task is reconciling (1) the need for specific control over the backend and the processes happening underneath and (2) a general frontend interface that is simple for the user and versatile enough to control many possibly different backends. This tension is common in software design, but in this particular case the specifics of the underlying processes are quite important, and hiding them behind a generic interface could render the whole feature unusable.
Current Implementation
There is a single command, verdi repository maintain, that performs a full maintenance procedure on the repository backend. This procedure might be time intensive and it requires the user to stop using the database to prevent any data corruption, so a proper warning is displayed asking for confirmation (there is also the possibility of guaranteeing the safety with profile locking, see below).
The command also has the flag --live to indicate that only maintenance tasks that can be done while still using AiiDA should be executed. Again, a warning lets the user know that this is not the full procedure and that they should run the full command once they have the time.
I think that this characteristic is essentially the critical minimal information the user needs to handle: can I do this while still using AiiDA or do I need to schedule "downtime"? Even considerations of performance control (doing a quick maintenance vs an in-depth one) are secondary, at least in the sense that they are more relevant for "downtime" maintenance than for the "live" option.
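The split between "live" and full maintenance can be pictured as a small dispatch table. The following is a minimal sketch under stated assumptions, not the implementation in this PR: the MaintenanceTask class, the TASKS table and the select_tasks function are hypothetical names, and only the grouping (pack can run while AiiDA is in use; clean, repack and vacuum cannot) follows the discussion in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MaintenanceTask:
    """Hypothetical description of one backend maintenance operation."""

    name: str
    live_safe: bool  # True if the task can run while AiiDA is still in use


# Assumed grouping, following the discussion in this PR.
TASKS = (
    MaintenanceTask('pack', live_safe=True),
    MaintenanceTask('clean', live_safe=False),
    MaintenanceTask('repack', live_safe=False),
    MaintenanceTask('vacuum', live_safe=False),
)


def select_tasks(live: bool) -> List[str]:
    """Return the task names to run: all of them, or only the live-safe ones."""
    return [task.name for task in TASKS if task.live_safe or not live]


print(select_tasks(live=True))   # only the tasks that are safe during use
print(select_tasks(live=False))  # the full procedure
```

Keeping the "safe while in use" property on each task means the frontend only needs the single live switch, while each backend decides which of its own operations qualify.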
Finally, I added a --pass_down option that accepts a string that gets sent to the backend and can be used for testing, performance analysis, or power-using specific backends. You can see how this is currently being used to have finer-grained control of the different stages of maintenance in the objectstore repository backend.
The case of deletion
One last thing that would be common to all repository backends is the "propagation of the deletion of files". The issue is that the deletion of data only takes direct effect in the AiiDA database (removing the reference from there) without affecting the content of the backend. Therefore it needs to be specifically propagated to the repository backend, a process which is currently performed every time the maintenance is run (with or without --live). This is not strictly linked to the underlying maintenance operations though, and in principle we could separate this command so that users can propagate the deletion to the backend independently.
I chose to initially have this be part of the maintenance so as to present a simpler interface, since it is not even guaranteed that this will have any beneficial effect on its own (for example, if the backend also does soft delete and keeps the unreferenced files around until a full maintenance is executed). This means, however, that it can't be controlled by users externally, since this is performed on the AiiDA side and is not part of the backend (and thus should not be influenced by things in pass_down). In principle, for the objectstore they can run only this propagation if they pass_down the options that cancel all backend operations, but (1) this is very backend specific and (2) they can't currently choose to skip it in this way. I'm considering adding a --skip-propagation flag for this purpose, but since it is an addition I think it is not critical to have it now (unless there is some specific performance issue with this part).
Tasks: