Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add caching of object collections #2216

Closed
muhrin opened this issue Nov 19, 2018 · 4 comments
Closed

Add caching of object collections #2216

muhrin opened this issue Nov 19, 2018 · 4 comments
Assignees

Comments

@muhrin
Copy link
Contributor

muhrin commented Nov 19, 2018

Currently ORM entity collections are recreated each time .objects is called on an entity type e.g. User.objects. This should be cached as the cost of construction is likely to be significant and the number of times this is called could be very lage. What's more the underlying collection has no reason to be re-instantiated, it's the same collection each time.

A start point would be to add a lazy-caching to this function:

https://github.com/aiidateam/aiida_core/blob/aa7907260906f01d8d7bb04b1b1421b24ab00d3f/aiida/orm/entities.py#L116

@ltalirz
Copy link
Member

ltalirz commented Nov 20, 2018

@muhrin I see everything is already pretty much prepared for this, so I'm happy to add this myself.

The question remains what the logic should be to check whether the cache can be used?

Let me know whether you agree with this logic

    @classproperty
    def objects(cls, backend=None):  # pylint: disable=no-self-use, no-self-argument
        """
        Get an collection for objects of this type.

        :param backend: the optional backend to use (otherwise use default)
        :return: an object that can be used to access entites of this type
        """
        from . import backends

        # new_backend always holds an actual backend object
        new_backend = backend or cls._BACKEND or backends.construct_backend()

        if type(cls._BACKEND) != type(new_backend):
            # if the backend has changed, update cache
            cls._BACKEND = new_backend
            cls._OBJECTS = cls.Collection(cls._BACKEND, cls)
        else:
            # if not, use cache (if already populated)
            cls._OBJECTS = cls._OBJECTS or cls.Collection(cls._BACKEND, cls)

        return cls._OBJECTS

@muhrin
Copy link
Contributor Author

muhrin commented Nov 20, 2018

So I created a working version in my branch collection_and_user_caching: https://github.com/muhrin/aiida_core/blob/collection_and_user_caching/aiida/orm/entities.py

Sorry to not update you. It works, however it exposes one problem I haven't figured out how to deal with which is that the default User gets deleted at the end of a test, however, it is still cached and so points to a non-existent user.

To answer your question, IMHO the criterion should be the backend instance reference, this is what ties everything together. It's perfectly legitemate to load, e.g., two Django backends that point to different databases.

I would suggest that we proceed as follows:

  1. Merge my branch into yours.
  2. Figure out/decide what to do for the user deletion issues (and indeed deletion in general)

@ltalirz
Copy link
Member

ltalirz commented Nov 26, 2018

@muhrin
I've merged your branch and merged provenance_redesign back into it.
You have write access to my fork, so you can go ahead and continue directly there if you like
https://github.com/ltalirz/aiida_core/tree/speedup-node

Also updated the PR description to describe what it now contains
#2214
(of course the caching of the default user is broken now again)

@sphuber
Copy link
Contributor

sphuber commented Dec 3, 2018

Fixed in PR #2214

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants