Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stream the full set of predicates and types during a snapshot. #5444

Merged
merged 10 commits into from
May 15, 2020

Conversation

martinmr
Copy link
Contributor

@martinmr martinmr commented May 14, 2020

Internal ref:

  • DGRAPH-1365

Type and predicate keys always have a timestamp of 1 so they are not being picked up by every snapshot.

This PR adds logic to send the entire schema in the snapshot and to delete predicates and types
not present in the snapshot (since they were deleted in between snapshots).


This change is Reviewable

@martinmr martinmr requested review from manishrjain and a team as code owners May 14, 2020 23:29
@github-actions github-actions bot added the area/schema Issues related to the schema language and capabilities. label May 14, 2020
@martinmr
Copy link
Contributor Author

Still working on modifying the existing snapshot test to verify this but @danielmai has already manually tested the change.

@martinmr
Copy link
Contributor Author

Test fails sometimes with this error in badger:

alpha2 | panic: close of closed channel
alpha2 |
alpha2 | goroutine 303 [running]:
alpha2 | github.com/dgraph-io/badger/v2/y.(*Closer).Signal(...)
alpha2 | /home/martinmr/go/pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20200506081535-536fed1846d0/y/y.go:212
alpha2 | github.com/dgraph-io/badger/v2/y.(*Closer).SignalAndWait(...)
alpha2 | /home/martinmr/go/pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20200506081535-536fed1846d0/y/y.go:239
alpha2 | github.com/dgraph-io/badger/v2.(*DB).blockWrite(0xc000484800)
alpha2 | /home/martinmr/go/pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20200506081535-536fed1846d0/db.go:1460 +0x47
alpha2 | github.com/dgraph-io/badger/v2.(*DB).prepareToDrop(0xc000484800, 0xc023adc000)
alpha2 | /home/martinmr/go/pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20200506081535-536fed1846d0/db.go:1479 +0x3f
alpha2 | github.com/dgraph-io/badger/v2.(*DB).DropPrefix(0xc000484800, 0xc000493200, 0xa, 0xa, 0x0, 0x0)
alpha2 | /home/martinmr/go/pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20200506081535-536fed1846d0/db.go:1573 +0x6e
alpha2 | github.com/dgraph-io/dgraph/posting.DeletePredicate(0x1d0cfc0, 0xc00003a018, 0xc00078c8b0, 0x7, 0x2877840, 0x0)
alpha2 | /home/martinmr/go/src/github.com/dgraph-io/dgraph/posting/index.go:1185 +0xe6
alpha2 | github.com/dgraph-io/dgraph/worker.cleanupSchema(0x1d0cfc0, 0xc00003a018, 0xc023a944d0, 0x0, 0xc023af7598)
alpha2 | /home/martinmr/go/src/github.com/dgraph-io/dgraph/worker/snapshot.go:128 +0x4a7
alpha2 | github.com/dgraph-io/dgraph/worker.(*node).populateSnapshot(0xc00032e150, 0xc000327d60, 0x1ce, 0x1a9, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
alpha2 | /home/martinmr/go/src/github.com/dgraph-io/dgraph/worker/snapshot.go:83 +0x476
alpha2 | github.com/dgraph-io/dgraph/worker.(*node).retrieveSnapshot(0xc00032e150, 0xc000327d60, 0x1ce, 0x1a9, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
alpha2 | /home/martinmr/go/src/github.com/dgraph-io/dgraph/worker/draft.go:826 +0x19d
alpha2 | github.com/dgraph-io/dgraph/worker.(*node).Run(0xc00032e150)
alpha2 | /home/martinmr/go/src/github.com/dgraph-io/dgraph/worker/draft.go:1090 +0x1e7c
alpha2 | created by github.com/dgraph-io/dgraph/worker.(*node).InitAndStartNode
alpha2 | /home/martinmr/go/src/github.com/dgraph-io/dgraph/worker/draft.go:1630 +0x553
alpha2 | [Sentry] 2020/05/14 23:59:56 Sending fatal event [a0dc193b17ec436787c5b49e85d5620d] to o318308.ingest.sentry.io project: 5208688
alpha2 | [Sentry] 2020/05/14 23:59:57 Buffer flushed successfully.
alpha2 exited with code 1

@jarifibrahim can you take a look? I'll push the test soon. It's in the worker package which has a custom docker-compose.yml, FYI.

@martinmr
Copy link
Contributor Author

@jarifibrahim Actually, the failure is consistent. Daniel didn't see this but I suspect it might be because in my test the predicate I am trying to drop doesn't have any data. It should still work though but SignalAndWait is trying to close the channel more than once.

Ibrahim Jarif and others added 2 commits May 16, 2020 01:15
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Few comments. Please do address.

Reviewed 3 of 4 files at r1, 3 of 5 files at r4, 2 of 2 files at r6.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @martinmr and @vvbalaji-dgraph)


worker/snapshot.go, line 118 at r6 (raw file):

}

func cleanupSchema(ctx context.Context, kvs *pb.KVS) error {

deleteStalePreds


worker/snapshot.go, line 144 at r6 (raw file):

						"Cannot delete removed predicate %s after streaming snapshot: %v",
						pred, err)
					errors.Wrapf(err, "cannot delete removed predicate %s after streaming snapshot",

this looks wrong. errors.Wrapf isn't being used by anyone. Should this be returned?


worker/snapshot.go, line 161 at r6 (raw file):

		if _, ok := snapshotTypes[typ]; !ok {
			if err := schema.State().DeleteType(typ); err != nil {
				errors.Wrapf(err, "cannot delete removed type %s after streaming snapshot", typ)

you want to return these?


worker/snapshot.go, line 213 at r6 (raw file):

		}

		return item.Version() >= snap.SinceTs

This is cheap. I'll put that up.

if item.Version() >= snap.SinceTs { return true }

x.Parse logic here.

Technically, you could further optimize this to use the fact that schema is stored at TS=1. So, you could say: if item.Version() == 1, then do x.Parse.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 5 of 8 files reviewed, 4 unresolved discussions (waiting on @manishrjain and @vvbalaji-dgraph)


worker/snapshot.go, line 118 at r6 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

deleteStalePreds

Done.


worker/snapshot.go, line 144 at r6 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

this looks wrong. errors.Wrapf isn't being used by anyone. Should this be returned?

Done. Yes, it should be returned.


worker/snapshot.go, line 161 at r6 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

you want to return these?

Done.


worker/snapshot.go, line 213 at r6 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

This is cheap. I'll put that up.

if item.Version() >= snap.SinceTs { return true }

x.Parse logic here.

Technically, you could further optimize this to use the fact that schema is stored at TS=1. So, you could say: if item.Version() == 1, then do x.Parse.

Done.

@martinmr martinmr merged commit 4ed8963 into master May 15, 2020
@martinmr martinmr deleted the martinmr/snapshot-schema branch May 15, 2020 22:03
martinmr added a commit that referenced this pull request May 15, 2020
Type and predicate keys always have a timestamp of 1 so they are not being picked up by every snapshot.

This PR adds logic to send the entire schema in the snapshot and to delete predicates and types
not present in the snapshot (since they were deleted in between snapshots).
martinmr added a commit that referenced this pull request May 15, 2020
Type and predicate keys always have a timestamp of 1 so they are not being picked up by every snapshot.

This PR adds logic to send the entire schema in the snapshot and to delete predicates and types
not present in the snapshot (since they were deleted in between snapshots).
dna2github pushed a commit to dna2fork/dgraph that referenced this pull request Jul 18, 2020
…h-io#5444)

Type and predicate keys always have a timestamp of 1 so they are not being picked up by every snapshot.

This PR adds logic to send the entire schema in the snapshot and to delete predicates and types
not present in the snapshot (since they were deleted in between snapshots).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/schema Issues related to the schema language and capabilities.
Development

Successfully merging this pull request may close these issues.

2 participants