-
-
Notifications
You must be signed in to change notification settings - Fork 4.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a metadata service to store file metadata #31839
Conversation
CarlSchwan
commented
Apr 4, 2022
•
edited
Loading
edited
- requires update icewind/searchdav to add api required for metadata feature 3rdparty#1023
- use new release instead of patch
7879192
to
9586920
Compare
Just curious, is searching for files by metadata something in scope? I could imagine this a useful scenario for future enhancements of the photos app to search by location for example, as with JSON-only metadata db field that would not scale. |
Unfortunately, this seems to be a bit tricky. If we want to search photos by location in an efficient way we need more than just the simple GPS coordinate but we need to map this to a geographic area. This is a bit tricky to implement but we can inspire outself from something similar I have in Koko: https://invent.kde.org/graphics/koko/-/blob/master/src/reversegeocoder.cpp#L30 (this requires to be done in a background job as loading the data is quite expensive already in pure C/C++ with an optimized kdtree data structure) This geographic area should probably be stored in a separate table e.g. https://invent.kde.org/graphics/koko/-/blob/master/src/imagestorage.cpp#L132 But the biggest issue is that while this is fairly simple to implement in the one user use-case, for a Nextcloud we need to retrieve the list of locations or the list of photos in one location on a per user basis and this is really tricky unless we limit ourself only to files stored in the user storage (so no group folders, no sharing, no circles ,...). @PVince81 @AndyScherzinger Would this is be ok? In that case, we could adopt the following DB structure for location information. CREATE TABLE oc_file_metadata (
id INTEGER,
group TEXT,
data TEXT, # nullable
foreign INTEGER, # nullable
UNIQUE(id, group)
)
CREATE TABLE oc_location (
id INTEGER PRIMARY KEY,
country TEXT,
state TEXT,
city TEXT,
UNIQUE(country, state, city)
) The PHP API needs to be expanded to describe if the data is directly retrievable in the JSON data column, or to be joined with another table (e.g for locations with the oc_location table). To be able only to retrieve the locations for one user, it should be possible to join with oc_storage and oc_filecache to get only the ids of files from the user. This might still be expensive when a user has a log of photos but I don't really see a better way to do it, unfortunately :( |
I think the location data needs a more detailed discussion, so from my perspective we can skip that for now. The only thing we "need" for 24 is height/width for the Clients to implement a different layout for their media views. (kinda like a gallery). So if there if leaving location out doesn't create a future blocker just skip it for now. Because this needs a lot more thinking from my perspective because with with/height it is just extraction+exposure, for anything else it is likely more than that, like "places" (like you also outlined), "people" only other simple info I can come up with would be certain exif data (like exposure, aperture, cam-model, lense) but since we also don't make use of that for now, this can also be skipped. So from my point of view for v24 - keep it simple by limiting it to width/height only. What do you think @PVince81 ? |
yes, keep it simple for now. if we need to cache/compute more data for GPS search later, it will be a separate task anyway |
had a quick chat with @CarlSchwan about the location data:
We both agreed that 2) is a better approach and will remove the need to add anything else to oc_file_metadata, thus deblocking this PR. |
@CarlSchwan can I test this somehow already? |
Just run the branch and you should be able to see the metadata with the PROPFIND/SEARCH parameter. Unfortunately, the metadata regeneration job is not implemented yet and is task for another PR (probably not 24.0.0) so you need to handle the case where the metadata are not available |
So it might be that clients receive e.g. 500 images, but only 200 images have a valid metadata. @jancborchardt ^ as we spoke in our call about it, but I have no idea how to display this hybrid view |
@CarlSchwan @PVince81 btw what is the reason for the metadata capability being optional? Customer requirement (don’t see that in the customer issue), not running on all setups, or something else? |
it's enabled by default. |
Also we want to support in the future the possibility for apps to provide their own metadata extractor and dynamically indicate in the capability that metadata is available for each mimetype |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
please see my one comment about the interface
1df6a0a
to
e5b79f1
Compare
@CarlSchwan if there is none metadata, "[ ]" is returned, but with metadata it is: Json parser gets a hiccup here, as it expects "{}" in case of empty. |
e5b79f1
to
3b1d4d1
Compare
Signed-off-by: Carl Schwan <carl@carlschwan.eu>
3b1d4d1
to
09bf9cf
Compare
@tobiasKaminsky this should be fixed. Strange that the json parser doesn't support [] :( |
09bf9cf
to
8cb3094
Compare
8cb3094
to
7fdad3f
Compare
And update autoloader Signed-off-by: Carl Schwan <carl@carlschwan.eu>
7fdad3f
to
1c7ecfc
Compare