
nop-write for weak hashing algorithm (with verify) #16686

Open
imwhocodes opened this issue Oct 25, 2024 · 5 comments
Labels
Type: Feature Feature request or new feature

Comments

imwhocodes commented Oct 25, 2024

Describe the feature you would like to see added to OpenZFS

As referenced in the manual, nop-write is enabled only when a strong cryptographic algorithm is used for block hashing.
The rationale is sound: with a weak algorithm there is a higher probability of a hash collision, so a matching hash is not a strong indicator that the underlying data is the same.

What I'm proposing is to extend the module configuration with something like zfs_nopwrite_enabled=2 (value = 2, similar to MRUinL2), so that when a weak hash algorithm is used (like fletcher4, which is the default) the hash is checked and, if a collision is found, the actual data is compared; if both the hash and the data are the same, the write becomes a nop-write.

How will this feature improve OpenZFS?

This will allow space savings (and better write performance) when the same file is copied over on top of itself multiple times (the scope of this is different from BRT) and you have snapshots enabled (as many do).
Currently you can get this behaviour only if either dedup or a strong cryptographic hash is used, and both modes can become memory- and compute-expensive.

Additional context

As far as I understand, this would be a non-breaking change and should be fully backward and forward compatible.
As for the verify part, this already happens in the dedup pipeline when '<hash_algorithm>,verify' is used.

In my humble opinion this could yield major space savings and write performance gains (in the specific case of files being overwritten, of course) with a trivial, non-breaking implementation.

As for configurability, another option is to be symmetric with how dedup is configured, with ,verify appended to the chosen algorithm (checksum=<hash_algorithm>,verify), becoming a per-dataset parameter like dedup=<hash_algorithm>,verify. (This also implies that you could force data verification on strong hash algorithms, as dedup can do, but I don't know whether a new per-dataset parameter would change the on-disk format.)
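The two configuration interfaces proposed above would look roughly like this. Both are hypothetical: zfs_nopwrite_enabled exists today as a 0/1 module parameter, but the value 2 and the checksum ,verify suffix are the proposed extensions, not current OpenZFS features.

```shell
# HYPOTHETICAL interfaces -- neither exists in OpenZFS today.

# Option 1: extend the existing module tunable
# (2 = allow nop-write with a weak hash, verifying data on collision):
echo 2 > /sys/module/zfs/parameters/zfs_nopwrite_enabled

# Option 2: a per-dataset property mirroring dedup's ",verify" syntax:
zfs set checksum=fletcher4,verify pool/dataset
```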

And thank you for all the hard work you have put into this project over the last so-many years.

@imwhocodes imwhocodes added the Type: Feature Feature request or new feature label Oct 25, 2024
amotin (Member) commented Oct 25, 2024

I've never closely tested nop-write myself, but I am a little worried about the performance effect of the synchronous per-block reads used to verify that the data match. Sure, the benefits in case of a match are good, but the latency of that read on a busy HDD pool might be large, and for all that time the ZIO write thread would be blocked, and we have a very finite number of them. Maybe if zio_ddt_collision() could be made asynchronous... And if after that read we find a mismatch, we'll still have to do the original write.

@imwhocodes (Author)

Does this imply that dedup=<algo>,verify is doing a synchronous/blocking read for each write?
(I was of course expecting a read to happen, but I was thinking it would be pooled with all the other normal reads without stalling the writing thread.)

amotin (Member) commented Oct 25, 2024

@imwhocodes If verify is enabled, then yes. At least that is how I read the code. I haven't used verify myself. See in zio_ddt_collision():

			error = zio_wait(zio_read(NULL, spa, &blk, tmpabd,
			    psize, NULL, NULL, ZIO_PRIORITY_SYNC_READ,
			    ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE |
			    ZIO_FLAG_RAW, &zio->io_bookmark));

and

			error = arc_read(NULL, spa, &blk,
			    arc_getbuf_func, &abuf, ZIO_PRIORITY_SYNC_READ,
			    ZIO_FLAG_CANFAIL | ZIO_FLAG_SPECULATIVE,
			    &aflags, &zio->io_bookmark);

@imwhocodes (Author)

So the implementation is not a trivial task anymore, correct?
(Maybe moving the whole execution of the read, and the possible future write, to a work queue is possible?)

IMHO, if a blocking read is acceptable for verified dedup, it should also be acceptable for nop-write (in case of a collision with a weak algorithm).

@imwhocodes (Author)

@allanjude sorry to bother you.
Since you have worked extensively on the dedup feature (for the upcoming Fast Dedup feature):
Do you think it is possible to make zio_ddt_collision() async?

Thanks,
Luca
