Resolve QAT issues with incompressible data #7338
Conversation
@wli5 Would you mind taking a look at this when you get a chance?
I think this is not efficient, as the memory copy is always needed. A better way might be to allocate a temporary "addition" buffer (add_start) and pass it to qat_compress as a separate parameter, e.g. as sketched below. Please note, the dest buffer allocated by ZFS is 87.5% of the source buffer size.
So, if the data compresses to less than 87.5% of the source size, we return the compressed buffer; otherwise, we just delete the temp addition buffer and return the source buffer (no-compress).
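A minimal sketch of that flow, assuming a qat_compress() signature that takes the extra scratch buffer; the wrapper function name and the scratch-buffer sizing are illustrative, not the actual patch:

    /*
     * Sketch only: allocate a scratch "addition" buffer so the hardware
     * can spill past d_len without overflowing dst, then keep the result
     * only if it actually beat the 87.5% threshold.
     */
    static size_t
    gzip_compress_qat(char *src, char *dst, size_t s_len, size_t d_len)
    {
            size_t add_len = d_len; /* assumed sizing: dst + scratch ~= 1.75x src */
            char *add_start = zio_data_buf_alloc(add_len);
            size_t dstlen = s_len;

            if (qat_compress(QAT_COMPRESS, src, s_len, dst, d_len,
                add_start, add_len, &dstlen) == CPA_STATUS_SUCCESS &&
                dstlen < d_len) {
                    /* compressed below 87.5% of the source: keep dst */
                    zio_data_buf_free(add_start, add_len);
                    return (dstlen);
            }

            /* otherwise discard the scratch buffer; store uncompressed */
            zio_data_buf_free(add_start, add_len);
            return (s_len);
    }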
@wli5 The PR has been updated. Is this close to what you had in mind? Tested with compressible and incompressible data, and there was a slight improvement to average throughput for compressible data when compared to the previous iteration (210MB/s -> 214MB/s).
module/zfs/gzip.c (outdated):

    if (dstlen < d_len) {
            return ((size_t)dstlen);
    } else {
            if (d_len == s_len)
I think this check can be removed.
If QAT returns success but the output length (dstLen) is larger than or equal to d_len (87.5% of the source length), we just copy the source buffer to the dest buffer and return the source length, so ZFS knows the data is not compressed.
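A sketch of that convention (variable names are assumed from the surrounding diff, not the actual patch):

    /*
     * If the hardware output doesn't beat d_len, hand back the source
     * bytes and the source length so ZFS stores the block uncompressed.
     */
    if (dstLen >= d_len) {
            memcpy(dst, src, s_len);
            dstLen = s_len; /* caller sees "no compression gain" */
    }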
I did this to match the check below in the software implementation. When I didn't have it I ended up getting GPFs (general protection faults).
OK, it's a different option: we can do the overflow check inside the qat_compress function, which is what you've done, and that's fine.
The other option I suggested is to not check inside the function, but to check the output length on the calling side to decide whether we skip the compression (e.g., copy the source buffer to the dest).
Either can work. Please keep your current implementation if you like, thanks!
And, can you please add a comment in the function saying that if it overflows, the dest length will be set to the source length?
Sure. I'll add the comment.
@@ -379,6 +413,7 @@ qat_compress(qat_compress_dir_t dir, char *src, int src_len,

    compressed_sz = dc_results.produced;
    if (compressed_sz + hdr_sz + ZLIB_FOOT_SZ > dst_len) {
This check tries to make sure there is enough space in the dest buffer for the compressed data plus the gzip header and footer, so "add_len" should be added. Actually, I'm now thinking maybe we can remove this check safely: with the addition buffer, the dest size is large enough to contain all output in theory, as it is 1.75x the source buffer now!
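Concretely, the adjusted check might read as follows (assuming add_len is in scope and a fail label exists for cleanup):

    /* Account for the scratch buffer's capacity as well (sketch only). */
    if (compressed_sz + hdr_sz + ZLIB_FOOT_SZ > dst_len + add_len) {
            status = CPA_STATUS_FAIL; /* output overran even the scratch space */
            goto fail;
    }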
I'd rather keep this in, in case this code gets used in other contexts.
OK!
Please add add_len.
I didn't want to do that because in my comment I say that add_buf is a scratch buffer that can be discarded after the function is done, and if we leak into add_buf we know we're over the useful limit anyway.
module/zfs/qat_compress.c (outdated):

    QAT_PHYS_CONTIG_FREE(buffer_meta_src);
    QAT_PHYS_CONTIG_FREE(buffer_meta_dst);
    QAT_PHYS_CONTIG_FREE(buf_list_src);
    QAT_PHYS_CONTIG_FREE(buf_list_dst);

    - return (ret);
    + return (status);
"status" initial value is success:
CpaStatus status = CPA_STATUS_SUCCESS
some fail branch will return success.
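In other words, every failure branch needs to set status explicitly before jumping to cleanup, roughly as below (the failing step shown is hypothetical):

    CpaStatus status = CPA_STATUS_SUCCESS;
    /* ... */
    if (setup_step_failed) {                /* hypothetical failure branch */
            status = CPA_STATUS_FAIL;       /* without this, cleanup returns success */
            goto fail;
    }
    /* ... */
    fail:
            QAT_PHYS_CONTIG_FREE(buffer_meta_src);
            /* ... remaining cleanup ... */
            return (status);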
The idea here was that CPA_STATUS_FAIL would indicate a real hardware failure, whereas src_len == dst_len would indicate that compression simply wasn't worth it.
Previously the function could return -1, but the calling code would check that the return value was equal to CPA_STATUS_SUCCESS.
OK, I see. But for some failure cases, it will return SUCCESS?
I think it makes sense. Perhaps @behlendorf has a better idea of how we should convey this meaning?
module/zfs/gzip.c (outdated):

    void *add_buf = zio_data_buf_alloc(add_buf_len);

    ret = qat_compress(QAT_COMPRESS, s_start, s_len, d_start,
        d_len, add_buf, add_buf_len, &dstlen);
Since the caller never needs to access this additional space, it's awkward to force it to do the allocation. It really shouldn't need to worry about this when it can all be handled internally in the qat code. How about renaming qat_compress() to qat_compress_impl(), and then having qat_compress() be a wrapper function which handles allocating the needed memory for QAT_COMPRESS before calling qat_compress_impl()?
This would then allow you to move the more involved status checking about success/failure/compressed size all into the wrapper function. Callers of qat_compress() wouldn't then need to deal with it at all and could use the existing interface. A sketch of this shape is below.
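Something along these lines, assuming the scratch buffer is sized like the destination (a sketch, not the merged code):

    /*
     * Sketch of the suggested refactor: qat_compress() keeps its original
     * interface and owns the scratch buffer; qat_compress_impl() does the
     * actual work. The buffer sizing here is an assumption.
     */
    int
    qat_compress(qat_compress_dir_t dir, char *src, int src_len,
        char *dst, int dst_len, size_t *c_len)
    {
            int ret;
            size_t add_len = 0;
            void *add_buf = NULL;

            if (dir == QAT_COMPRESS) {
                    add_len = dst_len;      /* dst + scratch ~= 1.75x src */
                    add_buf = zio_data_buf_alloc(add_len);
            }

            ret = qat_compress_impl(dir, src, src_len, dst, dst_len,
                add_buf, add_len, c_len);

            if (add_buf != NULL)
                    zio_data_buf_free(add_buf, add_len);

            return (ret);
    }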
@wli5 and @behlendorf Changes have been implemented as requested.
The comment here is a little bit misleading. Actually, a d_len 12.5% smaller than the s_len is passed to qat_compress().
Thanks for refactoring this to make the code clearer. I can't test this locally but it looks right to me from inspection.
            return ((size_t)dstlen);
    /* if hardware compress fail, do it again with software */
    } else if (ret == CPA_STATUS_INCOMPRESSIBLE) {
            if (d_len != s_len)
This case is always true today because of how zio_compress_data calls the compression function (see the paraphrase below). But I agree we should keep it, so that if we make the shift configurable as a module option it will still work. The same dead conditional exists in the software compression case, so at least it's consistent.
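For reference, the call site does roughly the following, which is why d_len != s_len always holds today (paraphrased from zio_compress_data(), not quoted verbatim):

    /* Paraphrase: ZFS only accepts compression that saves at least 12.5%. */
    d_len = s_len - (s_len >> 3);   /* d_len is 87.5% of s_len */
    c_len = ci->ci_compress(src, dst, s_len, d_len, level);
    if (c_len > d_len)
            return (s_len);         /* store the block uncompressed */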
module/zfs/qat_compress.c (outdated):

        return (status);
    }

    /* Entry point for QAT accelerated compression / decompression. */
nit: function header block comments should all be multi-line, even if they fit on one line:
/*
* Entry point for QAT accelerated compression / decompression
*/
Looks good to me, thanks for making the changes!
Codecov Report
@@            Coverage Diff             @@
##           master    #7338      +/-   ##
==========================================
- Coverage   76.35%   76.31%    -0.05%
==========================================
  Files         329      329
  Lines      104191   104189        -2
==========================================
- Hits        79560    79510       -50
- Misses      24631    24679       +48
Continue to review full report at Codecov.
Currently, when ZFS wants to accelerate compression with QAT, it
passes a destination buffer of the same size as the source buffer.
Unfortunately, if the data is incompressible, QAT can actually
"compress" the data to be larger than the source buffer. When this
happens, the QAT driver will return a FAILED error code and print
warnings to dmesg. This patch fixes these issues by providing the
QAT driver with an additional buffer to work with so that even
completely incompressible source data will not cause an overflow.

This patch also resolves an error handling issue where incompressible
data attempts compression twice: once by QAT and once in software. To
fix this issue, a new (and fake) error code CPA_STATUS_INCOMPRESSIBLE
has been added so that the calling code can correctly account for the
difference between a hardware failure and data that simply cannot be
compressed.

Signed-off-by: Tom Caputi <tcaputi@datto.com>
Currently, when ZFS wants to accelerate compression with QAT, it
passes a destination buffer of the same size as the source buffer.
Unfortunately, if the data is incompressible, QAT can actually
"compress" the data to be larger than the source buffer. When this
happens, the QAT driver will return a FAILED error code and print
warnings to dmesg. This patch fixes this issue by allocating a
larger temporary buffer for QAT to use before copying the data to
the final destination if it was actually compressed.
Signed-off-by: Tom Caputi <tcaputi@datto.com>
How Has This Been Tested?
There were concerns about how the extra buffer copy would affect performance. I ran two tests, one with incompressible data (from /dev/urandom) and one with compressible data (from the yes executable). Each test wrote to a test dataset with compression=gzip and all other settings at their defaults.

Incompressible data

During this test, CPU usage for EACH of the z_wr_iss threads was routinely at about 70% without QAT and <1% with QAT.

Compressible data

During this test, CPU usage for each of the z_wr_iss threads was routinely at about 20% without QAT and <1% with QAT.

TL;DR: It looks like even with this change the benefits of QAT are still substantial.