incr.comp.: Don't encode Fingerprint values with leb128. #45875
Comments
I’d like to take that on, since it seems self-contained and at least I know the Fingerprint type :laugh: |
@Xanewok Sure, let's see:
Let me know if you need more info to get started. |
It would also be great to get some numbers on how this affects the size of the |
Hi @Xanewok, are you still working on this? |
Sorry, couldn't find more time to work on this. I'm afraid it's not going to change until the holidays, so @wesleywiser it's all yours if you want to work on this! |
@wesleywiser, how is it going? Did you make any progress? |
@michaelwoerister I made some progress over the weekend. I've added methods to the |
For these methods we assume that we already know how many bytes we want to read or write, so you don't have to encode the length. I suggest making the parameter for the write method
These methods will use the leb128 encoding, which is what we want to avoid (because we know that fingerprints are random values and thus won't profit from a variable-length encoding). |
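The raw-byte approach described above can be sketched in isolation. This is only an illustration, not the code from the linked branch: `ToyEncoder`, `ToyDecoder`, `encode_fingerprint` and `decode_fingerprint` are hypothetical names, the `Fingerprint` here is a stand-in made of two `u64` halves, and the little-endian byte order is an assumption of the sketch.

```rust
/// Hypothetical stand-in for rustc's `Fingerprint`: two 64-bit halves.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fingerprint(u64, u64);

/// Toy byte-buffer encoder; not rustc's `opaque::Encoder`.
struct ToyEncoder {
    data: Vec<u8>,
}

impl ToyEncoder {
    /// Appends the bytes verbatim: no length prefix, no leb128.
    fn write_raw_bytes(&mut self, bytes: &[u8]) {
        self.data.extend_from_slice(bytes);
    }
}

/// Toy decoder that tracks its own read position.
struct ToyDecoder<'a> {
    data: &'a [u8],
    position: usize,
}

impl<'a> ToyDecoder<'a> {
    /// Fills `out` from the current position and advances past the bytes read.
    fn read_raw_bytes(&mut self, out: &mut [u8]) -> Result<(), String> {
        let start = self.position;
        let end = start + out.len();
        if end > self.data.len() {
            return Err(format!("unexpected end of data at byte {}", start));
        }
        out.copy_from_slice(&self.data[start..end]);
        self.position = end;
        Ok(())
    }
}

/// Writes both halves as fixed-width bytes (little-endian is an assumption).
fn encode_fingerprint(enc: &mut ToyEncoder, fp: Fingerprint) {
    enc.write_raw_bytes(&fp.0.to_le_bytes());
    enc.write_raw_bytes(&fp.1.to_le_bytes());
}

/// Reads two fixed-width 8-byte halves back into a `Fingerprint`.
fn decode_fingerprint(dec: &mut ToyDecoder) -> Result<Fingerprint, String> {
    let mut first = [0u8; 8];
    let mut second = [0u8; 8];
    dec.read_raw_bytes(&mut first)?;
    dec.read_raw_bytes(&mut second)?;
    Ok(Fingerprint(u64::from_le_bytes(first), u64::from_le_bytes(second)))
}

fn main() {
    let fp = Fingerprint(0xDEAD_BEEF_0123_4567, 0x89AB_CDEF_FEED_FACE);
    let mut enc = ToyEncoder { data: Vec::new() };
    encode_fingerprint(&mut enc, fp);
    // The size is known up front, so no length needs to be stored.
    assert_eq!(enc.data.len(), 16);

    let mut dec = ToyDecoder { data: &enc.data, position: 0 };
    assert_eq!(decode_fingerprint(&mut dec), Ok(fp));
}
```

Because both sides agree on the width, the encoded form is always 16 bytes regardless of the fingerprint's value.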
Thanks @michaelwoerister, that's really helpful! I've implemented (Code is here and I'm invoking the compiler with |
@wesleywiser, try to remove |
Thanks @michaelwoerister, that did it. |
@michaelwoerister I'm getting test failures from the incremental tests with my changes. From what I can tell, the new encode functions are never being called during the tests, only the decode functions are. Since the encode functions aren't being called, the decode is failing since the serialized format has changed. Is there additional test data somewhere I need to update or am I missing something? (My code is here FYI) |
I spot a bug in the decoding function:
    pub fn read_raw_bytes(&mut self, s: &mut [u8]) -> Result<(), String> {
        let len = s.len();
        self.position += len;
        s.copy_from_slice(&self.data[0..len]);
        // ^^^ always reads from the beginning, not from `position`
        Ok(())
    }
Should be something like:
    pub fn read_raw_bytes(&mut self, s: &mut [u8]) -> Result<(), String> {
        let start = self.position;
        let end = self.position + s.len();
        s.copy_from_slice(&self.data[start .. end]);
        self.position = end;
        Ok(())
    }
Are you sure about that? The setup looks correct to me. |
Ah, yes, you're right! Good catch.
I'm fairly sure of that because I added some logging statements into those functions and then captured the test output. I can see no instances of the encode functions being called based on that. I'll fix the bug you found and re-run the tests though and see if this changes anything. |
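The position bug discussed above is easy to reproduce outside the compiler. The following is a minimal sketch under stated assumptions: `ToyDecoder`, `read_raw_bytes_buggy`, `read_raw_bytes_fixed` and `second_chunk` are hypothetical names for illustration, and error handling is omitted. It reads two consecutive chunks and compares what the second read returns.

```rust
/// Toy decoder for illustration; not rustc's `opaque::Decoder`.
struct ToyDecoder<'a> {
    data: &'a [u8],
    position: usize,
}

impl<'a> ToyDecoder<'a> {
    /// Buggy variant: advances `position` but always copies from offset 0.
    fn read_raw_bytes_buggy(&mut self, s: &mut [u8]) {
        let len = s.len();
        self.position += len;
        s.copy_from_slice(&self.data[0..len]);
    }

    /// Fixed variant: copies from `position`, then advances.
    fn read_raw_bytes_fixed(&mut self, s: &mut [u8]) {
        let start = self.position;
        let end = start + s.len();
        s.copy_from_slice(&self.data[start..end]);
        self.position = end;
    }
}

/// Reads two consecutive 2-byte chunks and returns the second one.
fn second_chunk(data: &[u8], buggy: bool) -> [u8; 2] {
    let mut dec = ToyDecoder { data, position: 0 };
    let mut chunk = [0u8; 2];
    for _ in 0..2 {
        if buggy {
            dec.read_raw_bytes_buggy(&mut chunk);
        } else {
            dec.read_raw_bytes_fixed(&mut chunk);
        }
    }
    chunk
}

fn main() {
    let data = [1u8, 2, 3, 4];
    // The buggy reader hands back the first chunk again.
    assert_eq!(second_chunk(&data, true), [1, 2]);
    // The fixed reader continues where the previous read stopped.
    assert_eq!(second_chunk(&data, false), [3, 4]);
}
```

A single read from a fresh decoder gives the same result in both variants, which is why the bug only shows up once a second value is decoded.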
Ok. It's still failing with the same issue. I've pushed a commit that shows the logging I added to the encoder functions. From the failing test output, I can see that nothing is being logged to |
I'll look into it. |
OK, I see what the problem is: We have two kinds of encoders that have to handle fingerprints, the one for writing crate metadata and the one for writing the incr. comp. on-disk-cache. Both of these use an In order to avoid this problem, we have to provide specializations for In order to make sure that we don't accidentally hit the |
Thanks! That did the trick! |
This saves the storage space used by about 32 bits per `Fingerprint`. On average, this reduces the size of the `/target/{mode}/incremental` folder by roughly 5%. Fixes rust-lang#45875
…elwoerister [incremental] Specialize encoding and decoding of Fingerprints This saves the storage space used by about 32 bits per `Fingerprint`. On average, this reduces the size of the `/target/{mode}/incremental` folder by roughly 5% [Full details here](https://gist.github.com/wesleywiser/264076314794fbd6a4c110d7c1adc43e). Fixes #45875 r? @michaelwoerister
Since `Fingerprint` values have roughly random distribution, most of them will not profit from being stored in a variable-length encoding: `Fingerprint` will take up around 160 bits, so we are even wasting space. We should not do that.
`UseSpecializedEncodable` and `UseSpecializedDecodable` might be the way to circumvent the standard encoding methods of the `opaque::Encoder`.
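The "around 160 bits" figure can be checked with a little arithmetic. The sketch below assumes a fingerprint is stored as two `u64` halves, each leb128-encoded on its own; `leb128_len` is a hypothetical helper written for this illustration, not a function from the compiler.

```rust
/// Number of bytes an unsigned LEB128 encoding of `value` occupies
/// (7 payload bits per byte, at least one byte).
fn leb128_len(mut value: u64) -> usize {
    let mut len = 1;
    while value >= 0x80 {
        value >>= 7;
        len += 1;
    }
    len
}

fn main() {
    // Small values are where a variable-length encoding pays off.
    assert_eq!(leb128_len(0), 1);
    assert_eq!(leb128_len(127), 1);
    assert_eq!(leb128_len(128), 2);

    // Any value with the top bit set needs 10 bytes (64 bits / 7 bits per
    // byte, rounded up). For uniformly random u64 values that is half of them.
    assert_eq!(leb128_len(1u64 << 63), 10);
    assert_eq!(leb128_len(u64::MAX), 10);
    // Values just below that still need 9 bytes.
    assert_eq!(leb128_len(1u64 << 56), 9);

    // Two halves per fingerprint (an assumption of this sketch).
    let leb128_bits = 2 * leb128_len(u64::MAX) * 8;
    let fixed_bits = 2 * 8 * 8;
    assert_eq!(leb128_bits, 160);
    assert_eq!(fixed_bits, 128);
    println!("leb128: up to {} bits, fixed-width: {} bits", leb128_bits, fixed_bits);
}
```

Under these assumptions a fixed-width encoding saves up to 32 bits per fingerprint, which is consistent with the figure quoted in the commit message above.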