Add oshash support #667

Merged
merged 24 commits into from
Aug 6, 2020

Conversation

WithoutPants
Collaborator

Adds an oshash column to the database and makes checksum nullable. Adds a check constraint to the scenes table to ensure that at least one of the two is not null.

Introduces two new settings calculateMD5 and useMD5.

calculateMD5 indicates whether or not to calculate the MD5 checksum during scanning. oshash will always be calculated for scenes where it is not already present. Similarly, if calculateMD5 is true and the checksum is empty, then the MD5 will be calculated.
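
As a rough illustration of the rule above (not the code in this PR), the scan decision could be sketched like this. `sceneHashes` and `hashesToCalculate` are hypothetical names, and an empty string stands in for a NULL column.

```go
package main

import "fmt"

// sceneHashes is a hypothetical stand-in for the two hash columns on a scene.
// An empty string represents a NULL value.
type sceneHashes struct {
	Checksum string // MD5 checksum
	OSHash   string
}

// hashesToCalculate reports which hashes a scan would need to compute:
// the oshash whenever it is missing, the MD5 only when calculateMD5 is
// enabled and the checksum is missing.
func hashesToCalculate(s sceneHashes, calculateMD5 bool) (needOSHash, needMD5 bool) {
	needOSHash = s.OSHash == ""
	needMD5 = calculateMD5 && s.Checksum == ""
	return needOSHash, needMD5
}

func main() {
	// A scene from an existing database: MD5 present, oshash not yet calculated.
	legacy := sceneHashes{Checksum: "d41d8cd98f00b204e9800998ecf8427e"}
	needOSHash, needMD5 := hashesToCalculate(legacy, true)
	fmt.Println(needOSHash, needMD5) // prints: true false
}
```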

useMD5 indicates whether to use the MD5 checksum or the oshash for generated file naming and import/export naming.
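
A minimal sketch of how the naming hash might be selected, assuming empty strings represent missing hashes. `namingHash` is a hypothetical helper, not a function from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// namingHash returns the hash that would be used to name generated and
// exported files for a scene. It fails if the selected hash is missing,
// since no file name could be derived from it.
func namingHash(checksum, oshash string, useMD5 bool) (string, error) {
	if useMD5 {
		if checksum == "" {
			return "", errors.New("scene has no MD5 checksum")
		}
		return checksum, nil
	}
	if oshash == "" {
		return "", errors.New("scene has no oshash")
	}
	return oshash, nil
}

func main() {
	name, err := namingHash("d41d8cd98f00b204e9800998ecf8427e", "0000000000020000", false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name) // prints: 0000000000020000
}
```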

On existing systems, both settings default to true after migration if the scenes table is not empty.

New systems have calculateMD5 and useMD5 set to false.

The system prevents changing useMD5 to false if there are any missing oshash values. Similarly, changing useMD5 to true will be prevented if there are any scenes without checksums. Finally, calculateMD5 is prevented from being set to false if useMD5 is true.
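
The three rules above can be sketched as a single validation step. This is an illustration under assumptions, not the PR's implementation: `hashSettings`, `validateChange` and the two counters are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// hashSettings is a hypothetical holder for the two new settings.
type hashSettings struct {
	CalculateMD5 bool
	UseMD5       bool
}

// validateChange checks a proposed settings change against the rules in the
// description. missingMD5 and missingOSHash are the number of scenes that
// lack the respective hash.
func validateChange(current, proposed hashSettings, missingMD5, missingOSHash int) error {
	// Switching naming to oshash requires every scene to have an oshash.
	if current.UseMD5 && !proposed.UseMD5 && missingOSHash > 0 {
		return errors.New("cannot disable useMD5: some scenes have no oshash")
	}
	// Switching naming to MD5 requires every scene to have a checksum.
	if !current.UseMD5 && proposed.UseMD5 && missingMD5 > 0 {
		return errors.New("cannot enable useMD5: some scenes have no checksum")
	}
	// MD5 must keep being calculated while it is used for naming.
	if proposed.UseMD5 && !proposed.CalculateMD5 {
		return errors.New("cannot disable calculateMD5 while useMD5 is true")
	}
	return nil
}

func main() {
	current := hashSettings{CalculateMD5: true, UseMD5: true}
	proposed := hashSettings{CalculateMD5: true, UseMD5: false}
	// Three scenes have not been rescanned yet, so the change is rejected.
	fmt.Println(validateChange(current, proposed, 0, 3))
}
```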

Adds hash to the file info panel in the UI. Missing hashes are not displayed.

Adds a manual migration task that renames existing generated files (not including exported metadata files) to the current naming format.
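
To give an idea of what such a rename involves, here is a hedged sketch that only computes the old-name to new-name mapping for one scene. It assumes generated file names begin with the scene's hash, and the file names in the example are invented for illustration. `renamePlan` is a hypothetical helper and performs no file system operations.

```go
package main

import (
	"fmt"
	"strings"
)

// renamePlan maps every name that starts with oldHash to the same name with
// that prefix replaced by newHash. Names that do not start with oldHash are
// left out of the plan.
func renamePlan(names []string, oldHash, newHash string) map[string]string {
	plan := make(map[string]string)
	if oldHash == "" || newHash == "" || oldHash == newHash {
		return plan // nothing to rename
	}
	for _, name := range names {
		if strings.HasPrefix(name, oldHash) {
			plan[name] = newHash + strings.TrimPrefix(name, oldHash)
		}
	}
	return plan
}

func main() {
	// Invented example names; the real generated file layout may differ.
	names := []string{"abc123.mp4", "abc123_thumbs.vtt", "unrelated.jpg"}
	for oldName, newName := range renamePlan(names, "abc123", "0000000000020000") {
		fmt.Printf("%s -> %s\n", oldName, newName)
	}
}
```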

The oshash algorithm has been verified manually using the sample files from the opensubtitles website.

Resolves #351

@WithoutPants added the feature label (Pull requests that add a new feature) Jul 14, 2020
@WithoutPants added this to the Version 0.3.0 milestone Jul 14, 2020
@WithoutPants
Collaborator Author

I think it's probably worth extending this to include galleries as well.

@bnkai
Collaborator

bnkai commented Jul 19, 2020

My only concern for galleries would be how unique the hash is if we use zip files, or files that can be smaller than 64k (a small zip file with a single image, or a plain image in the future). I think it was mainly tested with video files.

@WithoutPants
Collaborator Author

Yeah, you're right. Might be better leaving it off.

@bnkai
Collaborator

bnkai commented Jul 26, 2020

It seems to work ok (excluding the sprite image renaming) with both an empty and an existing db.
The initial oshash scan of my existing library (~4TB, 11K scenes, stored on local HDDs) took ~6 minutes.
The name migration took < 5s (NVMe SSD).

@WithoutPants
Collaborator Author

Merged in from the develop branch. There were quite a few conflicting changes, so this may need a retest.

Fixed the logging and status issues that were identified. Also fixed a nasty panic when running clean if a file is not accessible.

@bnkai
Collaborator

bnkai commented Aug 2, 2020

Retested, seems ok.

@bnkai
Collaborator

bnkai commented Aug 5, 2020

I finished retesting. It wasn't an exhaustive test though, more of a check for regressions due to the refactoring. Looks ok.

For users with existing databases, I think a step-by-step migration guide could be useful.
E.g.

  1. Scan library (updates the oshash for existing entries)
  2. Untick useMD5
  3. Choose oshash as the naming scheme
  4. Migrate

@WithoutPants merged commit 5992ff8 into stashapp:develop Aug 6, 2020
Tweeticoats pushed a commit to Tweeticoats/stash that referenced this pull request Feb 1, 2021
@WithoutPants deleted the oshash branch February 4, 2021 03:01
Development

Successfully merging this pull request may close these issues.

[RFC] Add an extra "fast" hash/checksum for the scenes. Make MD5 scene calculation optional.
2 participants