Decryption failed - page zero has wrong checksum #5810
Hi @BlueCobold Can you send the Realm file to realm-help@mongodb.com so we can investigate? The latest version of Realm (10.25.1) contains a fix that should not let this happen again in the future. |
I submitted the file in question. |
@jedelbo successfully recovered the Realm file, @BlueCobold I have sent it to you via email. |
Super awesome! The customer will be very happy and so am I. I'll upgrade all app versions out there to realm 10.25.1 and hope for the issue to never return. Thanks! |
Forensic report:
Checksum failed: 0x90000
0x90000 expected: 0x93 actual: 0x92
Checksum failed: 0xa0000
Checksum failed: 0xa8000
Checksum failed: 0x138000
Restore old IV: 0x18c000
Restore old IV: 0x198000
Restore old IV: 0x1a0000
Restore old IV: 0x1a8000
Restore old IV: 0x1ac000
Although there were checksum errors, the content seemed to be consistent except for the 2 cases where a byte value was not as expected. After changing the values back, the file was consistent. |
@tgoyne the fact that it is the first byte in a 4k block that is modified, does it make us any wiser? And why does the checksum differ if the content apparently is ok? |
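To make the "first byte in a 4k block" observation concrete, here is a minimal sketch that checks where each offset from the forensic report falls inside its block. It assumes the reported values are byte offsets into the file and that the block size is 0x1000 (4k), as stated in the question; the helper name `block_position` is made up for this illustration.

```python
# Offsets copied from the forensic report in this thread.
CHECKSUM_FAILED = [0x90000, 0xA0000, 0xA8000, 0x138000]
IV_RESTORED = [0x18C000, 0x198000, 0x1A0000, 0x1A8000, 0x1AC000]

BLOCK_SIZE = 0x1000  # 4k block (assumption taken from the comment above)


def block_position(offset, block_size=BLOCK_SIZE):
    """Return (block index, byte position inside that block) for an offset."""
    return divmod(offset, block_size)


for off in CHECKSUM_FAILED + IV_RESTORED:
    index, within = block_position(off)
    print(f"{off:#x}: block {index}, byte {within} within the block")
```

Every offset in the report lands on byte 0 of its block under these assumptions, which is consistent with the question being asked.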
Could possibly be an out-of-bounds write somewhere? The first byte in a buffer is the thing that'll be overwritten if some other piece of code has an off-by-one error when writing to something that happens to land immediately before that buffer in memory. The hmac and actual page data are stored in separate blocks of memory so corrupting one but not the other wouldn't be hard to have happen. If that is actually the problem I'm not sure what action we can really take. Reread all the encryption code and hope to spot something suspicious that could be writing one past the end? I think the use of MAP_ANONYMOUS for the decrypted buffers unfortunately means that asan doesn't work for them, and it might not even be a bug in our code. |
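The off-by-one scenario described above can be simulated in a few lines. This is only a Python model of two adjacent buffers in one memory arena, not Realm code and not a claim about the actual cause; the buffer sizes and the function `write_bytes` are hypothetical, and the values 0x93 and 0x92 are borrowed from the forensic report purely for illustration.

```python
def write_bytes(arena, start, capacity, data):
    """Buggy copy: the bound check allows one byte more than the capacity."""
    if len(data) <= capacity + 1:  # off-by-one, should be `<= capacity`
        arena[start:start + len(data)] = data


# Two 8-byte buffers placed back to back in one 16-byte arena.
NEIGHBOUR_START, PAGE_START, SIZE = 0, 8, 8
arena = bytearray(16)

# The second buffer plays the role of the decrypted page.
arena[PAGE_START:PAGE_START + SIZE] = bytes([0x93] * SIZE)

# Some unrelated code writes 9 bytes into the 8-byte neighbour buffer.
write_bytes(arena, NEIGHBOUR_START, SIZE, bytes([0x92] * 9))

print(f"first byte of page: {arena[PAGE_START]:#x}")
print(f"rest of page:       {arena[PAGE_START + 1:].hex()}")
```

Only the first byte of the second buffer changes, while everything after it stays intact, which is the pattern an off-by-one write before the buffer would leave behind.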
The issue has returned. Again, I have a customer with a database that cannot be decrypted. Since this is on Android, I don't have a proper native stack trace and can only assume it is related to the same incorrect checksum in the native code both systems are based on. I can provide the realm-file, so you can check if it's the same problem. The customer's app version is using the latest Android-Realm implementation, which uses the same native code as Realm-Swift 10.25.1, from what I understand. No migration was involved when the realm file got corrupted. |
@BlueCobold it would be good if we could check the Realm file to see whether the corruption is similar to the first one. |
The customer stopped replying and stopped using my app. So I'm afraid, I cannot provide the file. |
@jedelbo I submitted another customer's realm file with the same symptoms to realm-help@mongodb.com for analysis. |
Using the decrypt-tool in the exec directory, I'm getting the following output: So it looks like the first block has issues. The resulting output file is unusable. I have no idea how to get the "actual" and "expected" values that @jedelbo printed in his report, or how to correct possibly faulty bytes to see if the remaining file would be operational. The ticket-bot also no longer seems to flag this bug report. @leemaguire |
In the meantime, I checked the decrypted content with a hex editor. Even the damaged first block contains readable strings and thus seems to be decrypted correctly. I imagine there's some header meta-data which is damaged and which makes the RealmBrowser/library believe the file to be still encrypted / unreadable. All other blocks after the first seem to be valid. There are a lot of blocks with readable strings and UUID-tables. I assume the file can be recovered, but I have not yet gathered enough understanding of the internal data structure to make that happen by myself. |
I have restored the header with a reference to the top_ref and table_names_ref, but it seems the data is partly scrambled. Some objects have invalid strings which crash Realm when trying to load these objects. Some have fields set to null, which cannot be null (like object-UUIDs for example), but seem to be ok, if I only read this column/field in sequence for the entire table. |
In further deeper data analysis, I realised some realm-object-keys to be huge. Like '3,402,167,040,181,607,100'. How come they grew so large? Is it possible there's an issue with keys and they spill over at some point or something? Still guessing what could be the reason for badly written pages and wrongly aligned arrays. |
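As a quick sanity check on the overflow question, the quoted key can be compared against the signed 64-bit range. This assumes object keys are stored as signed 64-bit integers, which nothing in this thread confirms, and it says nothing about how Realm generates keys; `fits_in_int64` is a helper made up for this sketch.

```python
INT64_MIN = -2**63
INT64_MAX = 2**63 - 1

# The object key quoted above, written without thousands separators.
key = 3_402_167_040_181_607_100


def fits_in_int64(value):
    """True if the value is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


print(fits_in_int64(key))
print(f"share of the positive int64 range used: {key / INT64_MAX:.0%}")
```

Under that assumption the key is large but still well inside the representable range, so the size alone does not demonstrate an overflow.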
@BlueCobold I have been away on holiday, and did not see this until now. I can see that you have sent another file for analysis, but I am not sure which key to use for decrypting. |
I thought so. I have replied via email to send you the decryption-key. Did you receive it? |
To which email address was the key sent? I have not received anything. |
Sorry, I thought there was a forwarded-reply feature on github-mails. It doesn't look like there is. I had sent the file and key to realm-help with my mail from 18.07., but I can send you another, including some findings so far - including the partly restored file-header. |
Great. To be sure that I receive it, you can also send it to jorgen.edelbo@mongodb.com |
The duplicated data starts originally at 0x98EC0, a valid array. And then is "duplicated" into the header, making the file unusable. |
Those are great findings. I am a bit embarrassed that I did not spot the zeroes. I hope it can help us further with this issue. It is very common to have duplicated data. Whenever some part of an array is modified, a new version of the array is created by copying the whole array. I will try to see if I can find the "true" top ref. |
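One way to look for this kind of duplication is to take a chunk of the damaged header page and search for the same bytes elsewhere in the decrypted file. The sketch below runs on a small synthetic blob rather than a real Realm file; `find_copies` and the blob layout are made up for this illustration.

```python
def find_copies(data, start, length):
    """Return every offset other than `start` where data[start:start+length] recurs."""
    needle = data[start:start + length]
    hits = []
    pos = data.find(needle)
    while pos != -1:
        if pos != start:
            hits.append(pos)
        pos = data.find(needle, pos + 1)  # step by one so overlaps are found too
    return hits


# Synthetic stand-in for a decrypted file: the same 32-byte pattern sits at
# offset 0 (the "header") and again at offset 100 (the "original array").
pattern = bytes(range(1, 33))
blob = pattern + bytes(68) + pattern + bytes(20)

print([hex(offset) for offset in find_copies(blob, 0, 32)])
```

On a real file the same function could be pointed at bytes read from disk. Since copy-on-write makes duplicated arrays common, a hit only shows where the bytes also live, not which copy is the current one.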
Yea, I figured that much. It makes sense from a transaction perspective.
That would be great. Also, if you don't mind, I pointed out the very large object-keys for many objects above. (a few objects have two-digit-keys which seem to be auto-increment style, so the big ones make me wonder what's going on) Is it normal for objects to have such large keys or does that indicate a problematic way of using Realm? Can keys accidentally overflow or does Realm auto-detect free keys during object creation when the max value is reached? |
I found the following cluster-tree, related to table #10 at offset 0x1192A0: It contains a lot of very suspicious refs like 03000000, 05000000 or 15000000 |
What you have found here is the table top array. It contains both refs and numbers. If the entry has the LSB set (like 0x15) it is a number. You get the value by shifting down one bit so in this case it is 10, which matches table number 10. |
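The rule described above (LSB set means a number, obtained by shifting down one bit) can be sketched as follows. It assumes the hex dumps quoted earlier, such as 15000000, are little-endian, so that the entry value is 0x15; `decode_entry` is a name made up for this illustration.

```python
def decode_entry(raw):
    """Classify an array entry as a tagged number or a ref."""
    if raw & 1:
        # LSB set: the entry is a number, stored shifted up by one bit.
        return ("number", raw >> 1)
    # LSB clear: the entry is a ref (an offset into the file).
    return ("ref", raw)


# The three "suspicious refs" quoted earlier, plus one ref from this thread.
for raw in (0x03, 0x05, 0x15, 0x088740):
    kind, value = decode_entry(raw)
    print(f"{raw:#x} -> {kind} {value:#x}")
```

With this reading, 0x15 decodes to the number 10, matching table number 10 as explained above.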
I am somewhat convinced that the first 24 bytes of the file should be
making the top ref 0x516c80 |
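For readers who want to inspect a header themselves, here is a rough sketch of reading the top ref from the first 24 bytes. The layout used (two little-endian 64-bit top refs, a 4-byte "T-DB" mnemonic, two file-format bytes, one reserved byte, and a flags byte whose lowest bit selects the active top ref) is an assumption about realm-core's file header that this thread does not spell out. The header built below is a placeholder, not the actual bytes proposed above.

```python
import struct

# Assumed layout: top_ref[0], top_ref[1], mnemonic, format[0], format[1], reserved, flags
HEADER = struct.Struct("<QQ4sBBBB")  # 24 bytes, little-endian, no padding


def read_top_ref(header_bytes):
    """Return (active top ref, mnemonic) from the first 24 bytes of a file."""
    top0, top1, mnemonic, _fmt0, _fmt1, _reserved, flags = HEADER.unpack(header_bytes[:24])
    selected = flags & 1  # assumption: bit 0 of flags picks the active top ref
    return (top1 if selected else top0), mnemonic


# Placeholder header: only the top ref value 0x516c80 comes from this thread.
example = HEADER.pack(0x516C80, 0, b"T-DB", 0, 0, 0, 0)
top_ref, mnemonic = read_top_ref(example)
print(f"top ref: {top_ref:#x}, mnemonic: {mnemonic!r}")
```

If the assumed layout is right, a damaged header would show up here as an implausible top ref or a mnemonic other than `b"T-DB"`.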
I am pretty sure that the problem is that the first 0x1000 bytes have been overwritten with a page that should have been written somewhere else. Unfortunately a lot of refs point into this area, so recreating meaningful data in this area would be a major puzzle. |
I am currently trying to solve this puzzle already by skipping invalid data. Table #10 seems to be majorly affected by it, but I could probably skip it. I "fixed" some other table entries already by detecting invalid string-offsets and nulling invalid references into the first 0x1000 bytes. It still means losing a lot of data that cannot be restored. My major concern is now to prevent this from happening again in the future by all means, because it affects not just one customer by now - I only have access to his file though, because the others didn't report to me, I just received their crash reports and bad customer feedback in the app/playstore. I don't know if I could accidentally have caused this myself, but from a developer perspective, using the API should never result in a corrupt file like this. |
Table #10 is set as: |
0x088740 seems to be ok. 0xAC790 (the column names) is linked from index 1. It will be hard to guess what the cluster that should be at 0xcd8 should look like. |
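The "nulling invalid references into the first 0x1000 bytes" approach mentioned earlier can be sketched like this. It uses the LSB rule from this thread to tell refs from numbers and assumes that a ref of 0 means "null", which the thread does not confirm; both function names are hypothetical.

```python
DAMAGED_END = 0x1000  # the overwritten area is the first 0x1000 bytes


def points_into_damaged_page(entry, damaged_end=DAMAGED_END):
    """True if the entry is a non-null ref that lands in the overwritten page."""
    is_ref = (entry & 1) == 0  # LSB clear means ref, LSB set means number
    return is_ref and 0 < entry < damaged_end


def null_damaged_refs(entries):
    """Replace refs into the damaged page with 0 (assumed to mean null)."""
    return [0 if points_into_damaged_page(e) else e for e in entries]


# Refs quoted in this thread, plus the tagged number 0x15 for contrast.
entries = [0x088740, 0xAC790, 0xCD8, 0x15]
print([hex(e) for e in null_damaged_refs(entries)])
```

Only 0xcd8 is dropped here. The tagged number 0x15 is below 0x1000 as well, but it is not a ref and is left alone.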
Oh, my bad. I think I found the array which contains the object-keys for table #10 at offset 0x11D8: |
I believe I also found the array which contains the "color" column of table #10 at offset 0x107BB8 Edit: Nope, I think this one is related to table #14, sadly. So maybe the 'color' column for table #10 is lost, because it should start with |
@jedelbo Another customer sent me a realm file which causes this when trying to write or delete a specific value to/from it and I worry it may be related: Cause: null pointer dereference backtrace: |
@BlueCobold It might be related, but the stack trace does not make us any wiser. |
@jedelbo Thought so, but I thought I would provide what I can. Do you want that file for analysis? (It doesn't need recovery, I made a re-import of its data into a fresh one, but I can offer it to you if it may help to identify bugs.) |
@BlueCobold All files are welcome. Maybe it contains that piece of information that can help us further. |
There seem to be two kinds of problems related to this issue. One is that some refs are not updated correctly. This is probably happening above the encryption layer. The other is that an encrypted page is written to the wrong location, so that the first page of the decrypted file contains data that should have been somewhere else. |
Sounds like some serious issue with multithreading then and/or with internal reference/pointer handling in realm_core. Doesn't it? |
@nicola-cab |
@nicola-cab @jedelbo |
How frequently does the bug occur?
Seen once
Description
A customer of my app reported suddenly being unable to launch my app. It terminates on first access of the database and it turns out that it is broken for some reason.
It might have broken during a realm migration, but this is uncertain. Newly created files work just fine. I might possibly be allowed to share the db file to a developer for analysis in private, but not in public. I tried to open it with Realm Studio as well and also tried upgrading to Realm 10.25.1, but the file still cannot be decrypted.
Stacktrace & log output
Can you reproduce the bug?
Yes, always
Reproduction Steps
The database file seems corrupted and cannot even be opened with Realm Studio. I cannot publicly share the file due to the user's privacy, but I might be able to send to a dev in private.
Version
10.10.0 (also tried 10.25.1)
What SDK flavour are you using?
Local Database only
Are you using encryption?
Yes, using encryption
Platform OS and version(s)
iOS 15.4.0, 15.4.1, 15.2.0, 15.2.1
Build environment
ProductName: macOS
ProductVersion: 12.0.1
BuildVersion: 21A559
/Applications/Xcode.app/Contents/Developer
Xcode 13.3.1
Build version 13E500a
/usr/local/bin/pod
1.10.0
Realm (10.10.0)
RealmSwift (10.10.0)
RealmSwift (= 10.10.0)
/bin/bash
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin21)
(not in use here)
/usr/local/bin/git
git version 2.26.0