Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IMPROVE: Lock mechanism and migration to storage class for the maintain operation #5331

Merged
merged 5 commits into from
Feb 23, 2022

Conversation

ramirezfranciscof
Copy link
Member

Two relevant notes about the modifications:

  • Now repository_maintain receives a Manager instance instead
    of a Backend instance, because it needs to also get access to
    the Profile object to acquire the lock.

  • Where exactly to acquire the lock is not an obvious decission.
    I decided to do so inside the repository_maintain method since
    this is what would be called both from the CLI and from the ORM.
    Also, although technically the lock is not needed to do the
    deletion of unreferenced objects, I decided to put this inside
    of it to reduce the time between the call to the method and the
    acquisition of the lock to the minimum (users might feel lied to
    if the method warns that it will block the profile, run for a
    couple of minutes, and then fail because a profile was loaded
    while the deletion of unreferenced objects was taking place).

@ramirezfranciscof ramirezfranciscof force-pushed the incorporate_lock branch 3 times, most recently from 6eafc61 to ed06bac Compare January 26, 2022 10:15
@codecov
Copy link

codecov bot commented Jan 26, 2022

Codecov Report

Merging #5331 (101a2cc) into develop (250dd96) will increase coverage by 2.68%.
The diff coverage is 100.00%.

❗ Current head 101a2cc differs from pull request most recent head 71e0dc0. Consider uploading reports for the commit 71e0dc0 to get more accurate results

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5331      +/-   ##
===========================================
+ Coverage    79.45%   82.13%   +2.68%     
===========================================
  Files          513      533      +20     
  Lines        36747    38497    +1750     
===========================================
+ Hits         29195    31617    +2422     
+ Misses        7552     6880     -672     
Flag Coverage Δ
django 77.22% <100.00%> (?)
sqlalchemy 76.52% <100.00%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/backends/control.py 97.78% <100.00%> (ø)
aiida/cmdline/commands/cmd_storage.py 97.30% <100.00%> (+1.20%) ⬆️
aiida/cmdline/params/options/commands/setup.py 54.55% <0.00%> (-39.33%) ⬇️
aiida/cmdline/commands/cmd_setup.py 56.87% <0.00%> (-37.87%) ⬇️
aiida/manage/external/postgres.py 63.10% <0.00%> (-20.23%) ⬇️
aiida/cmdline/params/types/profile.py 64.45% <0.00%> (-6.66%) ⬇️
aiida/cmdline/params/types/strings.py 73.59% <0.00%> (-5.66%) ⬇️
aiida/orm/utils/log.py 76.67% <0.00%> (-4.81%) ⬇️
aiida/tools/archive/implementations/sqlite/main.py 87.18% <0.00%> (-4.24%) ⬇️
aiida/orm/implementation/querybuilder.py 90.20% <0.00%> (-3.42%) ⬇️
... and 378 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 250dd96...71e0dc0. Read the comment docs.

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ramirezfranciscof . Agree with your discussion on where to put the lock. Just have some minor suggestions on implementation.

aiida/backends/control.py Outdated Show resolved Hide resolved
Comment on lines 69 to 73
if full:
with ProfileAccessManager(profile).lock():
perform_tasks()
else:
perform_tasks()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this kind of paradigm where the contextmanager is conditional or optional, I personally prefer the following approach:

Suggested change
if full:
with ProfileAccessManager(profile).lock():
perform_tasks()
else:
perform_tasks()
from contextlib import nullcontext
if full:
context = ProfileAccessManager(profile).lock
else:
context = nullcontext
with context():
unreferenced_objects = get_unreferenced_keyset(aiida_backend=backend)
MAINTAIN_LOGGER.info(f'Deleting {len(unreferenced_objects)} unreferenced objects ...')
if not dry_run:
repository.delete_objects(list(unreferenced_objects))
MAINTAIN_LOGGER.info('Starting repository-specific operations ...')
repository.maintain(live=not full, dry_run=dry_run, **kwargs)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uhm, I didn't know this nullcontext option. That is neat, thanks!

tests/cmdline/commands/test_storage.py Outdated Show resolved Hide resolved
@chrisjsewell
Copy link
Member

Now repository_maintain receives a Manager instance instead of a Backend instance, because it needs to also get access to the Profile object to acquire the lock.

this seems slightly off, and maybe not fitting into #5172, please don't merge until we have can discuss a bit cheers

@chrisjsewell chrisjsewell added the pr/blocked PR is blocked by another PR that should be merged first label Jan 31, 2022
@ramirezfranciscof
Copy link
Member Author

Now repository_maintain receives a Manager instance instead of a Backend instance, because it needs to also get access to the Profile object to acquire the lock.

this seems slightly off, and maybe not fitting into #5172, please don't merge until we have can discuss a bit cheers

If I understand correctly, after #5172 the backend will have the link to its profile via the _profile attribute, right? At that point then passing the backend can suffice. The problem is that currently there is no direct links between profiles and backends, the only thing that holds a reference to both is the Manager, correct?

Also, now that I think of it, I am working under the assumption that the relationship between backends and profiles is 1:1. If a profile can have multiple backends I think all of this still works, but if a single backend instance can service multiple profiles then that could be problematic.

Do we need to schedule a zoom call to discuss this? @sphuber @chrisjsewell

@sphuber
Copy link
Contributor

sphuber commented Jan 31, 2022

Do we need to schedule a zoom call to discuss this? @sphuber @chrisjsewell

Probably a good idea

@chrisjsewell
Copy link
Member

Do we need to schedule a zoom call to discuss this? @sphuber @chrisjsewell

yep 👍
Lets wait until #4985 is ready though (hopefully by the end of today), as that will be very relevant

@sphuber
Copy link
Contributor

sphuber commented Jan 31, 2022

Let's schedule a meeting anyway and we can discuss both. If it is such a big PR as you are making it appear to be, probably it would be best to go through it in a call and review that way, rather than me going line by line. This will also be useful for @ramirezfranciscof to understand and we can then also discuss this PR right after. @ramirezfranciscof can you send out a doodle?

@chrisjsewell
Copy link
Member

chrisjsewell commented Jan 31, 2022

Let's schedule a meeting anyway and we can discuss both.

yep, just not today; no point trying to explain it when its half done

@ramirezfranciscof
Copy link
Member Author

Now repository_maintain receives a Manager instance instead of a Backend instance, because it needs to also get access to the Profile object to acquire the lock.

this seems slightly off, and maybe not fitting into #5172, please don't merge until we have can discuss a bit cheers

FYI @chrisjsewell I have changed this following the discussion with @sphuber here.

Now I pass the Profile instance around and get the corresponding Backend it uses by loading that profile and calling get_manager().get_backend().

Would that solve this particular incompatibility or is it still problematic with the other changes?

@chrisjsewell
Copy link
Member

chrisjsewell commented Jan 31, 2022

Now I pass the Profile instance around and get the corresponding Backend it uses by loading that profile and calling get_manager().get_backend().

ermm, this seems worse 😬 you could pass a profile, but then end up with a backend that does not actually correspond to that profile.
the main thing here, is that I'm trying to move away from "hard-coding" the use of global variable (which are the "evil"), i.e. PROFILE,BACKEND,ENGINE,SESSION_FACTORY, where possible you should be explicitly passing profile/backend and never calling get_profile(),get_manager(),get_backend() etc

@chrisjsewell
Copy link
Member

the main thing here, is that I'm trying to move away from "hard-coding" the use of global variable

To note, these are obviously still very helpful for users, i.e. that are just working with a globally loaded profile/backend, and don't want to have to worry about this kind of thing.
But in our "internal API", we should avoid it like the plague 😉

@ramirezfranciscof
Copy link
Member Author

ermm, this seems worse 😬 you could pass a profile, but then end up with a backend that does not actually correspond to that profile.

Mmm unless I misunderstood what @sphuber told me, after you use the load_profile with a given profile the Manager singleton will return the corresponding backend.

@chrisjsewell
Copy link
Member

chrisjsewell commented Jan 31, 2022

The other key thing here is that, after #4985, it will be perfectly possible to use multiple backends during the same python session, you don't have to load anything.
This is because, with django you have to "setup" a database connection, to a single database which can then never be undone (in that python session). But with sqlalchemy, you simply just do, e.g.

with create_sqlalchemy_engine(profile).connect() as connection:
    # do whatever I want to do with my database connection, then reliquish it
   ...

@chrisjsewell chrisjsewell removed the pr/blocked PR is blocked by another PR that should be merged first label Feb 17, 2022
@ramirezfranciscof
Copy link
Member Author

Hey @chrisjsewell, couple of questions:

  1. Is this still blocked or is it ok to proceed?
  2. Is the create_sqlalchemy_engine method intended to be the default way to access AiiDA profiles how you used it in the previous message or is that still WIP? (I ask because I only saw it in aiida/cmdline/commands/cmd_devel.py and in some internals of SQLA). I'm wondering if I should wrap the content of repository_maintain in a with create_sqlalchemy_engine statement instead of using load_profile, or if I'm misunderstanding the intended use of this context.

@chrisjsewell
Copy link
Member

Heya

Is this still blocked or is it ok to proceed?

I would say it should not be merged until after #5364 (that is just moving some things around rather than any logic change)

Is the create_sqlalchemy_engine method intended to be the default way to access AiiDA profiles

Very much no 😬; this is a method that is "private" to the psql_dos, any access to the profile should be storage backend agnostic.

I'll have to have a look through the code here, to understand the best way to implement this;
I'm sceptical that there should even be this "isolated" aiida/backends/control.py::repository_maintain function, and instead there should just be a maintain abstract method on the StorageBackend, that is then implemented for e.g. PsqlDosBackend

@sphuber
Copy link
Contributor

sphuber commented Feb 21, 2022

I agree that with the work to properly abstract the storage we should not start to leak implementation details again. So I would also vote for something like a generic Storage.maintain which for the PsqlDos implementation then includes the maintenance of the disk-object store.

@ramirezfranciscof
Copy link
Member Author

Ok, I can move the method to the PsqlDosBackend. Is there any logger currently being used for storage procedures or do I create a new one there?

@ramirezfranciscof ramirezfranciscof force-pushed the incorporate_lock branch 5 times, most recently from 2e78834 to c32328f Compare February 21, 2022 16:49
@ramirezfranciscof ramirezfranciscof force-pushed the incorporate_lock branch 4 times, most recently from abe9f80 to f8772ca Compare February 22, 2022 12:29
Moved the general purpose functions into methods of the storage class

Use the lock mechanism for the maintain operation
@ramirezfranciscof ramirezfranciscof changed the title IMPROVE: Use the lock mechanism for the maintain operation IMPROVE: Lock mechanism and migration to storage class for the maintain operation Feb 22, 2022
@ramirezfranciscof
Copy link
Member Author

I would say it should not be merged until after #5364 (that is just moving some things around rather than any logic change)

I see #5364 is now merged, does that mean we can proceed with this?

I already moved the methods to PsqlDosBackend class.

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ramirezfranciscof . Had a quick look and most of it seems fine to me.

aiida/cmdline/commands/cmd_storage.py Outdated Show resolved Hide resolved
aiida/orm/implementation/storage_backend.py Outdated Show resolved Hide resolved
aiida/orm/implementation/storage_backend.py Outdated Show resolved Hide resolved
aiida/orm/implementation/storage_backend.py Show resolved Hide resolved
tests/cmdline/commands/test_storage.py Outdated Show resolved Hide resolved
tests/storage/psql_dos/test_backend.py Show resolved Hide resolved
sphuber
sphuber previously approved these changes Feb 23, 2022
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright thanks @ramirezfranciscof looks good to me

Comment on lines 104 to 105
'database': get_database_summary(QueryBuilder, statistics),
'repository': get_repository_info(statistics=statistics),
'repository': storage.get_info(statistics=statistics),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't quite make sense 😬
get_info doctring: "Return general information on the storage.", i.e. not just about the repository

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realized this, but considered this out of the scope of this PR, which is just introducing the use of the locking mechanism. I would leave this for another PR, which then also moves get_database_summary into the StorageBackend.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we open an issue for this then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I wouldn't really say it is out of scope, because it is this PR that is introducing the "semantic" bug that was not there before.
But won't block this PR 👍

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see @sphuber already sumarized better than I can the problem with get_database_summary using more high level methods to get its information. I'll just mention that I see usefulness both in being able to get some statistics in a "backend" agnostic way, as well as being able to get more details specific to the backend. We might need to just think a bit if there is some structural design that allows to have both.

In the meantime, I generalized storage.get_info so now the docstring is correct, it returns the information of the storage: that is, both the repository and the database (it just happens so that the database is currently empty). Then the storage_info command just calls get_database_summary to add a summary key to the database after it is returned. I believe this is the way in which we keep all the prior information with minimal changes that still leaves the methods in the best baseline to later re-organize the content of storage.get_info and get_database_summary.

chrisjsewell
chrisjsewell previously approved these changes Feb 23, 2022
Comment on lines 274 to 275
a nested dict with the relevant information (with at least one key for `database`
and one for `repository`).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is the abstract class, it should not require that these keys are present. There could be some storage that doesn't have a separate database and repository

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm ok, true, but as it is right now I need at least database to add the summary entry. Options for this:

  1. We just accept this description for now and once we deal with get_database_summary we change it.
  2. I put the output of get_database_summary in a separateoutput_dict['database_summary'] instead of output_dict['database']['summary'].
  3. I put the output of get_database_summary in output_dict['database'], risking overwriting what could come in that key from storage.get_info (which is currently empty).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bah, there is also the option of adding the database key if it is not there. I will default to that now and push, let me know if you prefer one of the other 3.

Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine for me to merge. I'll take care of moving the database info logic in another PR.

@sphuber
Copy link
Contributor

sphuber commented Feb 23, 2022

@chrisjsewell you happy as well?

@sphuber sphuber merged commit f3ffacf into aiidateam:develop Feb 23, 2022
@sphuber sphuber deleted the incorporate_lock branch February 23, 2022 12:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants