Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade badger dependency #12

Closed
francislavoie opened this issue Jun 8, 2021 · 21 comments
Closed

Upgrade badger dependency #12

francislavoie opened this issue Jun 8, 2021 · 21 comments
Assignees
Labels
enhancement New feature or request

Comments

@francislavoie
Copy link

On https://github.com/caddyserver/caddy, we've had trouble in the past with having to explicitly set CGO_ENABLED=0 to build Caddy, since we've had DataDog/zstd as an indirect dependency through smallstep/nosql -> dgraph-io/badger/v2 -> DataDog/zstd, because that zstd implementation is not pure-Go.

Just today, badger has merged a PR to replace DataDog/zstd with a pure-Go zstd in klauspost/compress, so this should no longer be an issue. See the discussion on dgraph-io/badger#1383, followed by the full replacement (merged today) in dgraph-io/badger#1709.

So it would be great if smallstep/nosql could update its dependency on badger to the latest, so that projects like Caddy that depend on it can get rid of this last non-pure-Go dependency.

@francislavoie francislavoie added enhancement New feature or request needs triage Waiting for discussion / prioritization by team labels Jun 8, 2021
@dopey dopey self-assigned this Jun 8, 2021
@dopey
Copy link
Contributor

dopey commented Jun 8, 2021

Hey @francislavoie in order to support this I think we'd need badgerv3 support in nosql. I just tried doing a simple upgrade and it seems that the API has changed in some breaking ways (which is to be expected). If someone who understands the new Badger options wanted to take a look to porting our codebase to work with the new API we'd be very grateful. Otherwise it will be some time until we have the cycles to explore ourselves.

Specific questions I have:

  • We want to make sure we don't break existing setups:
	// Set the ValueLogLoadingMode - default is MemoryMap. Low memory/RAM
	// systems may want to use FileIO.
	switch strings.ToLower(opts.BadgerFileLoadingMode) {
	case "", database.BadgerMemoryMap, "memorymap":
		bo.ValueLogLoadingMode = options.MemoryMap
	case database.BadgerFileIO:
		bo.ValueLogLoadingMode = options.FileIO
	default:
		return badger.ErrInvalidLoadingMode
	}

This is the current code. How should it be updated to work with the new options?

  • What new attributes should we be exposing in the ca.json?
  • How should we recommend that users update their DBs from v2 to v3? Link to badger's site or should we provide a conversion tool?

@dopey dopey removed the needs triage Waiting for discussion / prioritization by team label Jun 8, 2021
@francislavoie
Copy link
Author

francislavoie commented Jun 8, 2021

I reached out in https://discuss.dgraph.io/t/use-pure-go-zstd-implementation/8670/19 in case they're able to help.

@dopey
Copy link
Contributor

dopey commented Jun 8, 2021

Awesome! That would be great.

@maraino
Copy link
Collaborator

maraino commented Jun 9, 2021

Implementing this would help on 3rd party projects smallstep/certificates#358.
They can select which nosql packages want to compile into the binary, or even implement their own, although this is already supported.

@NamanJain8
Copy link

NamanJain8 commented Jun 13, 2021

Hey @dopey, badger v3 is not compatible with v2 as mentioned in the release notes . There are changes in the on-disk format.
Answering your queries:

  • ValueLogLoadingMode is no longer available. From v3 badger opens value log in memory-mapped mode.
  • Where should I look for ca.json file? I couldn't find it in the repo.
  • Currently, there does not exist any tool. We will need a small script that reads data from DB at v2 and writes it into DB at v3. I can help you with that.

@dopey
Copy link
Contributor

dopey commented Jun 28, 2021

Hey @NamanJain8 thanks for the quick reply and apologies for my radio silence.

  • ValueLogLoadingMode is no longer available. From v3 badger opens value log in memory-mapped mode.

I'm not sure if this will be an issue for us. We introduced a loadingmode attribute in ca.json to allow users to select FileIO (https://github.com/smallstep/nosql/blob/master/badger/v2/badger.go#L33-L42) because we found that RAM constrained environments were unable to run in memory-mapped mode. Is there a way to select a FileIO mode in badger v3?

That's the only attribute we explicitly map from our step-ca into the badger config, so once that's sorted we should be able to proceed.

With regards to a migration script - it's a nice to have, but not a blocker.

@NamanJain8
Copy link

Hey @dopey.

Is there a way to select a FileIO mode in badger v3?

No, we can't use FileIO mode in badger v3. The value log loading mode was removed from badger in dgraph-io/badger#1555. @jarifibrahim do you know why it was removed? I think it was because it was affecting the changed flow of how badger handles writes.

@maraino
Copy link
Collaborator

maraino commented Jun 29, 2021

@dopey @NamanJain8 instead of supporting multiple versions of badger at the same time with all the problems that it has, we might want to only support one and create a program that migrates v1, v2 to the supported version, v3 for example.

For example starting with 0.16 we can say that we only support v3, and perhaps in 0.19 we can use a v4.

@dopey
Copy link
Contributor

dopey commented Jun 29, 2021

No, we can't use FileIO mode in badger v3. The value log loading mode was removed from badger in dgraph-io/badger#1555. @jarifibrahim do you know why it was removed? I think it was because it was affecting the changed flow of how badger handles writes.

@NamanJain8 Do you know what this would mean for users who currently require this mode due to memory constrained environments? If badgerv3 won't support those environments, then @maraino we will need to keep badgerv2 around to support those users.

@manishrjain
Copy link

manishrjain commented Jun 29, 2021

I think we can just support an option in Badger to disable value log completely. Dgraph largely doesn't use value log anymore anyway, because the disk requirements with value log are unpredictable. That'd then allow people with resource constrained environments to run Badger.

Feel free to open an issue in Badger for this.

@francislavoie
Copy link
Author

Not to be a bother, but would it be possible to move this along?

Some users of Caddy are seeing issues build Caddy with specific plugins because of strange issues like this:

../../../../../../../pkg/mod/github.com/dgraph-io/badger/v2@v2.0.1-rc1.0.20201003150343-5d1bab4fc658/table/table.go:860:17: not enough arguments in call to z.Calloc
        have (int)
        want (int, string)

The PR #13 to upgrade to a more recent v2 would at minimum help move this in the right direction, fixing some of the issues we've seen.

Upgrading to badger v3 would be the ultimate fix, in that it would finally remove the CGO dependency.

@dopey
Copy link
Contributor

dopey commented Aug 18, 2021

I'm testing the V2 PR as we speak.

With regards to upgrading to V3, without the valuelogloading mode change in Badger this would be a backwards incompatible change for step-ca users. We'd prefer not to make such changes unless absolutely necessary.

@francislavoie
Copy link
Author

Fair enough.

In that case, @manishrjain is there any chance that the zstd changes could be backported to v2 of badger? I don't envision that would be too difficult?

@dopey
Copy link
Contributor

dopey commented Aug 18, 2021

Making a new release of smallstep/certificates now.

edit: Unless you'd prefer I wait on the zstd changes you mentioned. lmk

@francislavoie
Copy link
Author

Considering there's no PR for backporting zstd yet, I don't think we need to wait, should be okay to bump again if it does happen quickly, but if not this should at least move things along a bit.

@dopey
Copy link
Contributor

dopey commented Aug 18, 2021

v0.16.1 smallstep/certificates is building. As a dependency, you should already be able to reference it. Most of the release packages and binaries won't be available for another 15 minutes.

@francislavoie
Copy link
Author

Thanks @dopey! We'll update Caddy shortly.

I wrote a quick backport PR dgraph-io/badger#1736 since it looked pretty easy (combining the changes from two other PRs, the code hadn't diverged significantly between v2 and v3 when it came to zstd)

francislavoie added a commit to caddyserver/caddy that referenced this issue Aug 18, 2021
mholt pushed a commit to caddyserver/caddy that referenced this issue Aug 19, 2021
NamanJain8 pushed a commit to dgraph-io/badger that referenced this issue Aug 25, 2021
…std (#1736)

Backports the changes from #1706 and #1709 to the v2 branch of badger.

For context, https://github.com/caddyserver/caddy uses https://github.com/smallstep/certificates which uses https://github.com/smallstep/nosql which uses badger.

See smallstep/nosql#12 for the discussion. It's not currently viable for smallstep to upgrade from v2 to v3 of badger, so it would be nice to backport this zstd change to the v2 branch to remove the implicit (yet optional) CGO dependency on Caddy.

I targeted the release/v2.2007 branch, I couldn't really see a more relevant branch to target, so I hope that's okay.
@francislavoie
Copy link
Author

@dopey https://github.com/dgraph-io/badger/releases/tag/v2.2007.4 has been tagged, which backports the zstd change and a fix for building on Plan9. Could you bump the dependency in nosql/certificates again? Thank you!

@dopey
Copy link
Contributor

dopey commented Aug 25, 2021

@francislavoie Just tagged v0.16.3.

Edit 1: Actually, hold off. Forgot to bump the version of nosql when I did that :<. Gimme another minute.

Edit 2: Ok, v0.16.4 pushed.

@francislavoie
Copy link
Author

Awesome!!

I'll update Caddy now!

francislavoie added a commit to caddyserver/caddy that referenced this issue Aug 25, 2021
mholt pushed a commit to caddyserver/caddy that referenced this issue Aug 25, 2021
@francislavoie
Copy link
Author

For the purpose of Caddy, this is satisfied at this point! I'll close this I suppose.

Thanks for the help everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

5 participants