Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

services/horizon: Allow starting Horizon with minimal configuration #3783

Merged
merged 7 commits into from
Jul 30, 2021

Conversation

bartekn
Copy link
Contributor

@bartekn bartekn commented Jul 26, 2021

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

This commit adds multiple changes to config parameters:

  • --ingest is set to true by default,
  • --captive-core-config-path is generated with default values based on the network passphrase.

If the pubnet Horizon is started without custom config, it will be stopped with an error message after one hour the warning message will be printed to log every hour.

Close #3719.

Why

We want to make the time from Horizon installation to execution to be minimal. With less config values and reasonable defaults developers can start Horizon instance faster.

Known limitations

Quorum set default values for public network should be changed to to organisations developer really trust so Horizon will be stopped after one hour print warning messages if an admin does not provide custom config. However, service managers like systemd may restart the service making it worth permanently.

@bartekn bartekn marked this pull request as ready for review July 27, 2021 16:17
@bartekn bartekn requested a review from a team July 27, 2021 16:17
Copy link
Contributor

@Shaptic Shaptic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test seems like a great opportunity to add a new parameter test that confirms, for example, that the TOML file being used by Horizon matches the one we expect, or that it fails (i.e. FatalTestCase.Exits()) with a standalone network passphrase.

Whether or not this needs to be Horizon 3.0 (because of --ingest=true instead of --ingest=false by default), I'm still unsure about. You could argue that a breaking change is one that requires more parameters to get started rather than less...

Also we need to update the deployment guide alongside this.

// NewDefaultTestnetCaptiveCoreToml constructs a new CaptiveCoreToml instance
// based off the default testnet configuration.
func NewDefaultTestnetCaptiveCoreToml() *CaptiveCoreToml {
var captiveCoreToml CaptiveCoreToml
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be easier to maintain these defaults if we used raw strings?

For example, we could use this config for the testnet and just write it to / read it from an ioutil.TempFile. I think it'd be easier to maintain if the configs changed over time, since you could just re-copy-paste the config instead of trying to identify the delta and adapting it to our internal data structure.

Copy link
Contributor

@tamirms tamirms Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have the default pubnet and testnet validators represented in string form. Then we can implement NewDefaultTestnetCaptiveCoreToml() and NewDefaultPubnetCaptiveCoreToml() by calling newCaptiveCoreTomlFromFile(tomlFileContents string, params CaptiveCoreTomlParams) (*CaptiveCoreToml, error)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After @jacekn comment and given that this feature will be used probably only by users who use GH release maybe instead of keeping the file in the binary we could add testnet and pubnet config files in the packed .tar.gz release file. Then we can change the code to try to load the default file from disk if the value is not explicitly set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the file idea to the curl idea because curl'd endpoints can change or go away (or be compromised), so less reproducible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer the file idea to the curl idea because curl'd endpoints can change or go away (or be compromised), so less reproducible.

If you use curl it would be best to put the files in a location controlled by SDF, possibly in the go repo next to horizon source code. That way security profile won't change, we'd essentially get the same level of security as for the horizon source code itself.

Copy link
Contributor Author

@bartekn bartekn Jul 30, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just pushed a change adding two default config files to release archive here: 8b85c49.

services/horizon/internal/app.go Show resolved Hide resolved
services/horizon/internal/flags.go Show resolved Hide resolved
@tamirms
Copy link
Contributor

tamirms commented Jul 29, 2021

@jacekn what is your opinion on this?

Quorum set default values for public network should be changed to to organisations developer really trust so Horizon will be stopped after one hour if an admin does not provide custom config. However, service managers like systemd may restart the service making it worth permanently.

@tamirms tamirms requested a review from jacekn July 29, 2021 08:43
@jacekn
Copy link
Contributor

jacekn commented Jul 29, 2021

@jacekn what is your opinion on this?

Quorum set default values for public network should be changed to to organisations developer really trust so Horizon will be stopped after one hour if an admin does not provide custom config. However, service managers like systemd may restart the service making it worth permanently.

I think having hidden 1h delay after which horizon will stop will be confusing and, like you say, systemd might even hide it from operators making it worse. Personally I have never seen similar behaviour in linux software so I think it would be best to avoid it.

Horizon binary is likely wrong place to store default quorumset too. ubuntu packages we build ship with default configs already so I think it would be better to leverage that mechanism instead. We can do the same with docker images and we'll achieve the same goal. The benefit of doing that using package manager is that it's consistent with other linux software and operators should be familiar with the process. It also means that if quorumset changes are needed there no need to compile and rollout new horizon binary.

@bartekn
Copy link
Contributor Author

bartekn commented Jul 29, 2021

I think there's one misconception regarding this property. This is really for development only to allow devs to start Horizon for the first time quickly. Creating a custom captive core config is actually a huge barrier of entry especially for people new to Stellar and we want to mimic the behaviour of stellar/quickstart in which you only define which network you want to start. In production devs should switch to custom config and 1h restarts are there to make sure they notice that something is not working properly.

I think having hidden 1h delay after which horizon will stop will be confusing and, like you say, systemd might even hide it from operators making it worse. Personally I have never seen similar behaviour in linux software so I think it would be best to avoid it.

I agree that maybe it's not visible enough but I didn't want it to be hidden - we print level=error message on startup and after 1h right before terminating.

Horizon binary is likely wrong place to store default quorumset too. ubuntu packages we build ship with default configs already so I think it would be better to leverage that mechanism instead. We can do the same with docker images and we'll achieve the same goal. The benefit of doing that using package manager is that it's consistent with other linux software and operators should be familiar with the process. It also means that if quorumset changes are needed there no need to compile and rollout new horizon binary.

As explained in the first paragraph, feature is mostly for devs (often not using Linux) who download the binary from GH and want to start development quickly. Both ubuntu packages and docker images already have the default config (and the one here is actually copied from stellar/quickstart).

@stellar/horizon-committers let me know what you think about terminating Horizon with default config after 1h.

@jacekn
Copy link
Contributor

jacekn commented Jul 29, 2021

I think there's one misconception regarding this property. This is really for development only to allow devs to start Horizon for the first time quickly. Creating a custom captive core config is actually a huge barrier of entry especially for people new to Stellar and we want to mimic the behaviour of stellar/quickstart in which you only define which network you want to start. In production devs should switch to custom config and 1h restarts are there to make sure they notice that something is not working properly.

I see. Maybe there is a middle ground that could work for both use cases? Having hardcoded testnet config feels like less or a problem, especially if we don't stop after 1h.
With regards to pubnet maybe there is a good way to make it easy to generate the config for developers? Maybe an error message saying something similar to "Pubnet quorumset missing. You can download latest recommended config curl https://github.com/stellar/go/xxx" ? If we make the process simple and clear enough I think maybe it will be acceptable?

@bartekn
Copy link
Contributor Author

bartekn commented Jul 29, 2021

Regarding pubnet I also proposed this: #3783 (comment).

@Shaptic
Copy link
Contributor

Shaptic commented Jul 29, 2021

A comfortable middle ground (gentler nudging of the user) could be to do something like Warn() about default config use every hour.

@bartekn bartekn requested a review from tamirms July 30, 2021 13:07
@bartekn bartekn merged commit babac4f into stellar:master Jul 30, 2021
@bartekn bartekn deleted the horizon-minimal-config branch July 30, 2021 15:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Remove unnecessary config params required to start fully functional Horizon instance.
5 participants