Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix data race in regular expression processing #4065

Merged
merged 1 commit into from
Oct 2, 2019
Merged

Conversation

mangalaman93
Copy link
Contributor

@mangalaman93 mangalaman93 commented Sep 26, 2019

Fixes #4030


This change is Reviewable

Copy link

@pullrequest pullrequest bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ A review job has been created and sent to the PullRequest network.


@mangalaman93 you can click here to see the review status or cancel the code review job.

Copy link

@pullrequest pullrequest bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Before committing, please check out my comment about two-dimensional slice copying -- apologies if I misunderstood the bug fix.


Reviewed with ❤️ by PullRequest

query/query3_test.go Outdated Show resolved Hide resolved
worker/task.go Show resolved Hide resolved
Copy link
Contributor

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: except for a couple of comments.

  1. add a comment to explain why the fix is needed and what the issue is. Otherwise it's unclear why we are doing this instead of the much simpler uids = arg.q.UidList.
  2. Make the formatting changes in a separate PR.

Reviewed 2 of 3 files at r1.
Reviewable status: 2 of 3 files reviewed, 2 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @pawanrawal)


worker/task.go, line 964 at r1 (raw file):

	// If this is a filter eval, use the given uid list (good)
	case arg.q.UidList != nil:
		uids.Uids = append(arg.q.UidList.Uids[:0:0], arg.q.UidList.Uids...)

I would add a comment here explaining why we are doing this for future reference. Why we are doing this instead of just setting uids to arg.q.UidList will not be obvious to someone who lacks the context of this issue.

Copy link
Contributor

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 3 files at r1.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @pawanrawal)

Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @mangalaman93, @manishrjain, and @martinmr)


worker/task.go, line 964 at r1 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

I would add a comment here explaining why we are doing this for future reference. Why we are doing this instead of just setting uids to arg.q.UidList will not be obvious to someone who lacks the context of this issue.

I agree, please add a elaborate comment. It took me a while to understand the race here. Basically concurrent regex filters SubGraphs are sharing their SrcUid slice pointer which is passed to processTask in the form of UidList. In this case we mutate this pointer in algo.IntersectWith below and hence the race.

There might be other cases in this package where a similar thing might be happening which makes me wonder if we should just create a copy of SrcUids while passing to UidList in query.go and fix it all in one place.

Copy link
Contributor Author

@mangalaman93 mangalaman93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @martinmr, and @pullrequest[bot])


worker/task.go, line 964 at r1 (raw file):

Previously, pullrequest[bot] wrote…

I don't have context on this change or know much about the code, but I'm assuming the race condition has to do with args.q.UidList being mutated elsewhere and this storing a pointer to the slice, instead of the slice's item values. If this is the case and we care about the preserving the original values of the sub items (e.g. arg.q.UidList.Uids[1][2]), then we'll need to copy each individual slice in Uids. Also, I'd recommend using golang copy function to make it more clear with what is going on. The following code snippet demonstrates what I'm referring to. It includes three ways of "copying" a slice -- to clarify, I'm recommending using the way foo4 is copied.

package main

type foo struct {
	bar [][]string
}

func main() {
	foo1 := foo{
		bar: [][]string{[]string{"a"}},
	}
	foo2 := foo{}
	foo3 := foo{}
	foo4 := foo{}

	foo3.bar = foo1.bar
	
	foo2.bar = append(foo1.bar[:0:0], foo1.bar...)

	foo4.bar = make([][]string, len(foo1.bar))
	for i := range foo1.bar {
		foo4.bar[i] = make([]string, len(foo1.bar[i]))
		copy(foo4.bar[i], foo1.bar[i])
	}
	
	foo1.bar[0][0] = "z"
	
	println(foo1.bar[0][0]) // prints z
	println(foo2.bar[0][0]) // prints z
	println(foo3.bar[0][0]) // prints z
	println(foo4.bar[0][0]) // prints a
}

If my assumption about what we're fixing is wrong, sorry and just ignore me :P

The type of arg.q.UidList.Uids is `[]uint64=. Not sure if that was understood. I also like the idea of using copy function instead. It is more readable.


worker/task.go, line 964 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

I agree, please add a elaborate comment. It took me a while to understand the race here. Basically concurrent regex filters SubGraphs are sharing their SrcUid slice pointer which is passed to processTask in the form of UidList. In this case we mutate this pointer in algo.IntersectWith below and hence the race.

There might be other cases in this package where a similar thing might be happening which makes me wonder if we should just create a copy of SrcUids while passing to UidList in query.go and fix it all in one place.

Done.

I plan to work on adding support for filtering on predicates even when an index doesn't exist. I will write more test cases there to ensure that there are no more races in this code. #4077

Copy link

@pullrequest pullrequest bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Warning

PullRequest detected a force-push on this branch. This may have caused some information to be lost, and additional time may be required to complete review of the code. Read More

@mangalaman93 mangalaman93 force-pushed the aman/issue_4330 branch 2 times, most recently from 11a28eb to dc7ebf8 Compare September 28, 2019 05:42
Copy link
Contributor

@pawanrawal pawanrawal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 3 files at r1, 2 of 2 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @manishrjain, @martinmr, and @pullrequest[bot])


worker/task.go, line 964 at r1 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

The type of arg.q.UidList.Uids is `[]uint64=. Not sure if that was understood. I also like the idea of using copy function instead. It is more readable.

Are you going to change the code to use copy then?

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @mangalaman93, @martinmr, and @pullrequest[bot])


worker/task.go, line 968 at r2 (raw file):

		// a copy of the list to avoid the race conditions later.
		uids = &pb.List{}
		uids.Uids = append(arg.q.UidList.Uids[:0:0], arg.q.UidList.Uids...)

why is this a problem only for regular expressions. Seems like this should be a general enough problem applicable to other things as well.

Copy link
Contributor Author

@mangalaman93 mangalaman93 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @manishrjain, @martinmr, and @pullrequest[bot])


worker/task.go, line 964 at r1 (raw file):

Previously, pawanrawal (Pawan Rawal) wrote…

Are you going to change the code to use copy then?

I did, but then there are two things that I realized (and reverted the change) -

  • The code for copying is longer if I use
  • More likelihood for bug, for example I forgot to check for nil condition.

I have mentioned in the comment that the code below makes a copy.


worker/task.go, line 968 at r2 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

why is this a problem only for regular expressions. Seems like this should be a general enough problem applicable to other things as well.

I thought so too, I did a race condition test but didn't find any. My understanding is that because regexp is only function that is allowed in filter condition even when predicates do not have an index, and this was new code that was added to support the aforementioned case, the race condition is only in this part of the code.

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Expand on comments about why this is a problem affecting only this piece of code.

Reviewed 1 of 3 files at r1, 2 of 2 files at r2.
Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @mangalaman93, @martinmr, and @pullrequest[bot])


worker/task.go, line 968 at r2 (raw file):

Previously, mangalaman93 (Aman Mangal) wrote…

I thought so too, I did a race condition test but didn't find any. My understanding is that because regexp is only function that is allowed in filter condition even when predicates do not have an index, and this was new code that was added to support the aforementioned case, the race condition is only in this part of the code.

Add a comment with this explanation, so future readers know why you're doing extra work here, when not anywhere else. In general, smaller fixes need longer comments.

@mangalaman93 mangalaman93 merged commit df61d04 into master Oct 2, 2019
@mangalaman93 mangalaman93 deleted the aman/issue_4330 branch October 2, 2019 18:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

Query returns inconsistent results after upgrading to dgraph:v1.1.0
4 participants