Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Import code is too slow when there are a large number of albums #934

Closed
RhetTbull opened this issue Jan 18, 2023 · 1 comment
Closed

Import code is too slow when there are a large number of albums #934

RhetTbull opened this issue Jan 18, 2023 · 1 comment
Labels
bug Something isn't working

Comments

@RhetTbull
Copy link
Owner

  After reviewing the code, I'm almost certain the issue is going to be in this line:
 report_record.albums = [a.title for a in photo.albums]

The issue is that this line is retrieving the albums a photo belongs to and that's a very inefficient process that scales with the number of albums and folders in a user's library. Apple Photos does not provide a way to do this in the app itself, nor in the AppleScript interface nor in the native PhotoKit API. osxphotos is using PhotoScript which is a python wrapper I wrote around the Photos AppleScript interface. Because Photos doesn't provide a way to determine the albums a photo is in, the only method available is to get a list of all albums and query each one to see if the photo is contained in the album. In Mojave and earlier, Photos would return all albums in the Library when queried. In Catalina and later, Photos only returns top-level albums which means that for albums contained in folders, the code needs to walk the entire folder/album tree to find every album. In AppleScript, this is very slow, even on a modern M1 Mac.

For "static" commands like export, osxphotos reads the Photos SQL database to extract the album information but this isn't done in import because the osxphotos design requires reading the entire database and this would be horribly slow to do for every imported photo.

I can think of three alternatives to fix this issue:

  1. Don't report the album information (avoid the slow code) -- currently done only for reporting purposes.
  2. Create a dynamic SQL query that extracts just the specific album information for each photo as opposed to loading the entire database like osxphotos currently does. I have a design for a rewrite of the osxphotos core database code to do this but will take some time to implement.
  3. For osxphotos import specifically, keep track of the data for the imported photo so we don't need to inspect it later for reporting. This is probably the easiest to implement for this use case.

I guess the 4th option is to do nothing and live with the very slow import code. For now, I'll take a look at what can be done to implement #3 and continue to think about the design of #2 which has much broader uses beyond this use case. For example, it would allow you to query the Photos database on non-Mac devices as it would rely only on SQL queries unlike the current osxphotos code which is an ugly mix of SQL, python, AppleScript, and Objective-C.

Originally posted by @RhetTbull in #906 (reply in thread)

RhetTbull added a commit that referenced this issue Jan 18, 2023
@RhetTbull RhetTbull added the bug Something isn't working label Jan 20, 2023
RhetTbull added a commit that referenced this issue Jan 20, 2023
RhetTbull added a commit that referenced this issue Jan 21, 2023
* Working on slow import, #934

* Tests working for #934

* Removed unnecessary report record updates
@RhetTbull
Copy link
Owner Author

This is now fixed in v0.56.4.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant