-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix unsafe usage in Decode #1097
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✅ A review job has been created and sent to the PullRequest network.
@jarifibrahim you can click here to see the review status or cancel the code review job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I don't use arrays in go often, so it's interesting to see how you deal with them here.
Reviewed with ❤️ by PullRequest
I can confirm this fixes the issue for this test, unfortunately now it fails somewhere else. --- FAIL: TestTruncateVlogWithClose (0.06s)
panic: runtime error: unsafe pointer conversion [recovered]
panic: runtime error: unsafe pointer conversion
goroutine 53572 [running]:
testing.tRunner.func1(0xc0003a3e00)
/home/francesc/src/github.com/golang/go/src/testing/testing.go:881 +0x3a3
panic(0xa43860, 0xc088c4b760)
/home/francesc/src/github.com/golang/go/src/runtime/panic.go:679 +0x1b2
github.com/dgraph-io/badger.(*valuePointer).Decode(...)
/home/francesc/src/github.com/dgraph-io/badger/structs.go:42
github.com/dgraph-io/badger.Open(0xc087f24f80, 0x19, 0xc087f24f80, 0x19, 0x1, 0x2, 0x2, 0x1, 0x100, 0xb5e4c0, ...)
/home/francesc/src/github.com/dgraph-io/badger/db.go:350 +0x16c5
github.com/dgraph-io/badger.TestTruncateVlogWithClose(0xc0003a3e00)
/home/francesc/src/github.com/dgraph-io/badger/db2_test.go:72 +0x416
testing.tRunner(0xc0003a3e00, 0xab3c18)
/home/francesc/src/github.com/golang/go/src/testing/testing.go:916 +0xc9
created by testing.(*T).Run These two errors might be unrelated. If that's the case I'd say you should merge the PR and create a new issue for the new failure. |
@campoy the errors are related. I've fixed them both in this PR. Do you think we should add the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to check if the benchmarks are still in favor of this unsafe casting, because of the additional copy.
If benchmarks are good, then merge. Otherwise, keep looking for a solution. This is NOT a blocker for v2.0 release.
Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @campoy, @jarifibrahim, and @manishrjain)
structs.go, line 44 at r2 (raw file):
// tempBuf will be allocated on the stack and so it shouldn't // cause significant performance degradation. tempBuf := make([]byte, vptrSize)
var tmpBuf [vptrSize]byte
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I have a better solution that avoids the additional copy.
As far as I understand, the problem currently is that header
contains two uint16
both of which needs to 2 byte aligned (i.e. the start address of both uint16 should be such that address mod 2 = 0
). When we typecast the existing buffer into *header
, that may not be always true. And, hence the test fails. Our current solution does the following:
- It copies the data into a temporary buffer which will be 8 byte aligned on 64 bit (and hence, on typecast, both uint16 will be 2 byte aligned)
- Then it copies the temporary buffer at the header pointer
The direct copy won't work because are typecasting misaligned buffer into header
. To avoid the additional copy, we need to just avoid the typecast into *header
and goal is simply to copy the memory in buffer at the header pointer which is already aligned.
The solution here is to typecast the header pointer into an [4]byte
and then perform a direct copy from buffer to typecast header pointer, something like this:
func (h *header) Decode(buf []byte) {
copy((*(*[headerSize]byte)(unsafe.Pointer(h)))[:], buf)
}
I have tested this solution locally and the compiler doesn't complain any more.
Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @campoy, and @jarifibrahim)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The patch that works for me:
diff --git structs.go structs.go
index 2d396b5..773d452 100644
--- structs.go
+++ structs.go
@@ -13,6 +13,8 @@ type valuePointer struct {
Offset uint32
}
+const vptrSize = unsafe.Sizeof(valuePointer{})
+
func (p valuePointer) Less(o valuePointer) bool {
if p.Fid != o.Fid {
return p.Fid < o.Fid
@@ -27,8 +29,6 @@ func (p valuePointer) IsZero() bool {
return p.Fid == 0 && p.Offset == 0 && p.Len == 0
}
-const vptrSize = 12
-
// Encode encodes Pointer into byte buffer.
func (p valuePointer) Encode() []byte {
b := make([]byte, vptrSize)
@@ -39,7 +39,7 @@ func (p valuePointer) Encode() []byte {
// Decode decodes the value pointer into the provided byte buffer.
func (p *valuePointer) Decode(b []byte) {
- *p = *(*valuePointer)(unsafe.Pointer(&b[0]))
+ copy((*(*[vptrSize]byte)(unsafe.Pointer(p)))[:], b)
}
// header is used in value log as a header before Entry.
diff --git table/builder.go table/builder.go
index 134d58f..9b58a6f 100644
--- table/builder.go
+++ table/builder.go
@@ -44,6 +44,8 @@ type header struct {
diff uint16 // Length of the diff.
}
+const headerSize = uint16(unsafe.Sizeof(header{}))
+
// Encode encodes the header.
func (h header) Encode() []byte {
var b [4]byte
@@ -52,16 +54,10 @@ func (h header) Encode() []byte {
}
// Decode decodes the header.
-func (h *header) Decode(buf []byte) int {
- *h = *(*header)(unsafe.Pointer(&buf[0]))
- return h.Size()
+func (h *header) Decode(buf []byte) {
+ copy((*(*[headerSize]byte)(unsafe.Pointer(h)))[:], buf)
}
-const headerSize = 4
-
-// Size returns size of the header. Currently it's just a constant.
-func (h header) Size() int { return headerSize }
-
// Builder is used in building a table.
type Builder struct {
// Typically tens or hundreds of meg. This is for one single file.
diff --git table/iterator.go table/iterator.go
index 4bbbc89..118bd66 100644
--- table/iterator.go
+++ table/iterator.go
@@ -87,7 +87,7 @@ func (itr *blockIterator) setIdx(i int) {
itr.key = append(itr.key[:itr.prevOverlap], itr.baseKey[itr.prevOverlap:h.overlap]...)
}
itr.prevOverlap = h.overlap
- valueOff := headerSize + int(h.diff)
+ valueOff := headerSize + h.diff
diffKey := entryData[headerSize:valueOff]
itr.key = append(itr.key[:h.overlap], diffKey...)
itr.val = entryData[valueOff:]
Reviewable status: 0 of 3 files reviewed, 3 unresolved discussions (waiting on @ashish-goswami, @campoy, and @jarifibrahim)
I agree that @mangalaman93's solution seems more efficient (although quite ugly?) If that one fixes the warnings let's use it and avoid the performance hit. |
@campoy Such conversions are pretty common in badger as far as I know. Let me know which part of the solution you thought could be improved/avoided. |
@mangalaman93 Thank you so much. The patch seems to work 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 0 of 3 files reviewed, 4 unresolved discussions (waiting on @ashish-goswami, @campoy, and @jarifibrahim)
structs.go, line 42 at r3 (raw file):
// Decode decodes the value pointer into the provided byte buffer. func (p *valuePointer) Decode(b []byte) { copy((*[vptrSize]byte)(unsafe.Pointer(p))[:], b)
I see one dereference missing (* missing). I guess compiler will take care of it. Also, see if it makes sense to add a test where length of b is less than 4 and everything acceptable follows afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 3 of 3 files at r3.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @ashish-goswami, @campoy, and @jarifibrahim)
structs.go, line 42 at r3 (raw file):
Previously, mangalaman93 (Aman Mangal) wrote…
I see one dereference missing (* missing). I guess compiler will take care of it. Also, see if it makes sense to add a test where length of b is less than 4 and everything acceptable follows afterwards.
Also, add a comment on why we are doing it this way now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
✖️ This code review was cancelled. See Details
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dismissed @manishrjain and @pullrequest[bot] from 3 discussions.
Reviewable status: 1 of 3 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @campoy, and @mangalaman93)
structs.go, line 44 at r2 (raw file):
Previously, manishrjain (Manish R Jain) wrote…
var tmpBuf [vptrSize]byte
Done.
structs.go, line 42 at r3 (raw file):
I see one dereference missing (* missing). I guess compiler will take care of it.
@mangalaman93 We've already deferenced the pointer with the first *
, do we still need to second one?
Also, add a comment on why we are doing it this way now.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 3 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @campoy, @jarifibrahim, and @mangalaman93)
structs.go, line 42 at r3 (raw file):
Previously, jarifibrahim (Ibrahim Jarif) wrote…
I see one dereference missing (* missing). I guess compiler will take care of it.
@mangalaman93 We've already deferenced the pointer with the first
*
, do we still need to second one?Also, add a comment on why we are doing it this way now.
Done.
You are typecasting the pointer into the type *[4]byte
, and then converting the array pointer to slice using [:]
. There is no pointer dereference in the code. I'm not sure but because tests pass, that's probably because you could index an array using a pointer to array as well as using the array directly. Still, check that once. And add a bracket around all the typecast code before [:]
to make it clear what's going on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: 1 of 3 files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @campoy, and @mangalaman93)
structs.go, line 42 at r3 (raw file):
Yeah. Got it. I thought we've deferenced it.
I'm not sure but because tests pass, that's probably because you could index an array using a pointer to array as well as using the array directly. Still, check that once.
Yes, that's correct. I checked it using the following code.
func main() {
var b, c *[4]byte
a := []byte("1111")
b = new([4]byte)
copy(b[:], a)
c = new([4]byte)
copy((*c)[:], a)
fmt.Println(a, b, c)
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @ashish-goswami, @campoy, and @mangalaman93)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @ashish-goswami and @campoy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @campoy and @jarifibrahim)
structs.go, line 32 at r4 (raw file):
} const vptrSize = unsafe.Sizeof(valuePointer{})
Adding a comment explaining why we are calculating size this way (and not by simple addition) might make sense.
table/builder.go, line 47 at r4 (raw file):
} const headerSize = uint16(unsafe.Sizeof(header{}))
same here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be good to do some benchmarks just to prove to ourselves that the performance is the same.
Reviewed 1 of 3 files at r3, 2 of 2 files at r4.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @campoy and @jarifibrahim)
structs.go, line 59 at r4 (raw file):
func (p *valuePointer) Decode(b []byte) { // Copy over data from b into p. Using *p=unsafe.pointer(...) leads to // pointer alignment issues. See https://github.com/dgraph-io/badger/issues/1096
Ideally, another comment saying the solution is based on some link "...".
We use unsafe pointers to convert slices to structs. It might be possible that the struct and byte slice are not aligned in the same way. This PR fixes the alignment issues
The command
no longer complains about unsafe pointer conversion.
The changed implementation is as fast as the existing one.
Fixes #1096
This change is