
Automatic Module Updates #53

Closed · 9 tasks done · Tracked by #54
patrickwoodhead opened this issue Dec 13, 2023 · 8 comments · Fixed by CheckerNetwork/core#316

patrickwoodhead commented Dec 13, 2023

Currently, when we make an update to the Spark module, we need to ship a new version of Station. This is acceptable while we have one module running on Station, but becomes unacceptable at 2+ modules, as there would be an endless stream of new Station apps to download.

Station should listen for updates to individual modules, then fetch the latest version and start running it, without a new version of Station needing to be deployed.

E.g.

  • Station v1.7.0 is running Spark v1.9.0
  • Spark v1.10.0 is released
  • Station v1.7.0 fetches the latest release
  • Station v1.7.0 stops running Spark v1.9.0 and starts running Spark v1.10.0
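
A minimal sketch of this cycle, assuming a hypothetical releases endpoint and hypothetical `downloadModule`/`startModule` helpers (none of these names come from the actual Station Core code):

```ts
// Hypothetical sketch of the update cycle above. MODULE_RELEASES_URL,
// downloadModule and startModule are illustrative stand-ins, not
// Station Core's real API.
const MODULE_RELEASES_URL = 'https://example.com/spark/releases/latest'

interface ModuleRelease {
  version: string    // e.g. '1.10.0'
  tarballUrl: string // where to download the module from
}

// Stand-in for the real download logic (fetch and unpack the tarball).
async function downloadModule (url: string): Promise<string> {
  return `/tmp/module-from-${encodeURIComponent(url)}`
}

// Stand-in for spawning the runtime with the new module files.
async function startModule (path: string): Promise<void> {
  console.log(`starting module from ${path}`)
}

async function runUpdateCycle (current: {
  version: string
  stop: () => Promise<void>
}): Promise<void> {
  const res = await fetch(MODULE_RELEASES_URL)
  if (!res.ok) throw new Error(`Cannot fetch release info: HTTP ${res.status}`)
  const latest = (await res.json()) as ModuleRelease

  if (latest.version === current.version) return // already up to date

  const modulePath = await downloadModule(latest.tarballUrl)
  await current.stop()          // stop Spark v1.9.0
  await startModule(modulePath) // start Spark v1.10.0
}
```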


There's not going to be any downgrade mechanism for the first implementation. We always want users to run the latest versions of every module.


bajtos commented Dec 13, 2023

I would like zinniad to handle this in the longer term, but we are not there yet.

I think Station Core is the best place to implement this auto-update logic - this way, both Station Core and Station Desktop instances will pick up the updates.

We should think a bit more about the update sequence and edge cases:

  • What if we cannot fetch the new SPARK version?
  • What if the new SPARK version cannot start?

@juliangruber

Agree that this should be solved at Core level, and eventually move to Zinnia.

For the first version I think we can just always run the latest SPARK version that can be fetched. If it can't be fetched, retry in the next cycle. If it can't start, upgrade once a new version is available. We don't want people to run outdated versions anyway.
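
A sketch of that policy, reusing the hypothetical `runUpdateCycle` step from the sketch in the issue description, with an illustrative check interval:

```ts
// Hypothetical retry policy: keep running the current version, attempt
// an update every cycle, and swallow failures until the next attempt.
const UPDATE_CHECK_INTERVAL = 10 * 60 * 1000 // illustrative: 10 minutes

// The fetch-and-swap step (see the sketch in the issue description).
declare function runUpdateCycle (): Promise<void>

async function updateLoop (): Promise<never> {
  while (true) {
    try {
      await runUpdateCycle()
    } catch (err) {
      // Fetch or start failed: stay on the current version and retry in
      // the next cycle. No downgrade mechanism, by design.
      console.error('Module update failed, will retry:', err)
    }
    await new Promise(resolve => setTimeout(resolve, UPDATE_CHECK_INTERVAL))
  }
}
```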

We can then use community feedback to steer addition of new features.

@juliangruber

@bajtos I'm happy to take this one; if you want to take it, let's discuss

juliangruber self-assigned this Jan 10, 2024

bajtos commented Jan 10, 2024

@bajtos I'm happy to take this one; if you want to take it, let's discuss

Go for it! 🚀

Core downloads the latest available module version on startup

Do you want to keep shipping some version of a module bundled inside Station Core, or do you expect Station Core to download all modules when it's started for the first time?

If Station Core is shipped without modules, then we may need to add extra handling for the following error case:

  1. User installs Station Desktop (or Station Core)
  2. Station Core is not able to download the module
  3. There are no modules to run

A few more things to consider:

  1. Auto-updating zinniad and Bacalhau - is this in scope too?
  2. A new Spark version may require features available only in a new Zinnia version. How can we ensure that Station Core does not end up in a broken state with a new Spark version but an old Zinnia version the Spark cannot run on?
  3. How can a Zinnia module like Spark detect that it's running on a Zinnia version it does not support?

@juliangruber

Do you want to keep shipping some version of a module bundled inside Station Core, or do you expect Station Core to download all modules when it's started for the first time?

I'm thinking no. Since we always want to run the latest versions of modules, if it can't get the latest version we should consider the node not functional. A functional node is one that is able to keep itself updated.

If Station Core is shipped without modules, then we may need to add extra handling for the following error case:

User installs Station Desktop (or Station Core)
Station Core is not able to download the module
There are no modules to run

This could be an activity event! Just like when Core or Zinnia aren't able to run.
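
For illustration, a hypothetical shape of that event; the `ActivityLog` interface here is made up, not Core's actual activity API:

```ts
// Hypothetical: surface a failed first-start module download as an error
// activity, like the existing "Core/Zinnia cannot run" events.
interface ActivityLog {
  submit (activity: { type: 'info' | 'error', message: string }): void
}

async function downloadModulesOrReport (
  downloadAllModules: () => Promise<void>, // hypothetical download step
  activities: ActivityLog
): Promise<void> {
  try {
    await downloadAllModules()
  } catch (err) {
    activities.submit({
      type: 'error',
      message: `Failed to download modules: ${(err as Error).message}`
    })
  }
}
```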

A few more things to consider:

  1. Auto-updating zinniad and Bacalhau - is this in scope too?

Updating Zinnia or Bacalhau isn't a pain point atm, so I'm voting to add this later.

  2. A new Spark version may require features available only in a new Zinnia version. How can we ensure that Station Core does not end up in a broken state with a new Spark version but an old Zinnia version the Spark cannot run on?

It will end up in a broken state, yeah. I think this means we need to ship Desktop/Core updates with new Zinnia versions before updating the module, so that when their installations break we can tell users they need to update the version. I think this is fine - until we add auto updates for Zinnia too.

  3. How can a Zinnia module like Spark detect that it's running on a Zinnia version it does not support?

It will just fail. I think we need to ensure that Zinnia scripts throwing uncaught errors will create appropriate activity events, so that this error is visible to the user.

juliangruber moved this to 🏗 in progress in Space Meridian Jan 17, 2024

bajtos commented Jan 17, 2024

Do you want to keep shipping some version of a module bundled inside Station Core, or do you expect Station Core to download all modules when it's started for the first time?

I'm thinking no. Since we always want to run the latest versions of modules, if it can't get the latest version we should consider the node not functional. A functional node is one that is able to keep itself updated.

If Station Core is shipped without modules, then we may need to add extra handling for the following error case:

User installs Station Desktop (or Station Core)
Station Core is not able to download the module
There are no modules to run

This could be an activity event! Just like when Core or Zinnia aren't able to run.

SGTM 👍🏻

I just wanted to flag these edge cases as something to keep in mind during the implementation.

  2. A new Spark version may require features available only in a new Zinnia version. How can we ensure that Station Core does not end up in a broken state with a new Spark version but an old Zinnia version the Spark cannot run on?

It will end up in a broken state, yeah. I think this means we need to ship Desktop/Core updates with new Zinnia versions before updating the module, so that when their installations break we can tell users they need to update the version. I think this is fine - until we add auto updates for Zinnia too.

Unfortunately, Stations are slow to pick up new versions. We are likely to end up with a large fraction of our network in a broken state for weeks to months.

I am fine with that, but I'd like us to think through the consequences before we roll out the auto-update feature.

Today's data about Station versions for context:

  • Desktop version 1.2.1 was released two weeks ago and more than 50% of our network hasn't picked it up yet.
  • Core version 16.3.1 was released two weeks ago and, again, more than 50% of our network hasn't picked it up yet.

Station Desktop

version    pings
0.20.4         6
1.0.2          6
1.0.5         12
1.2.0        104
1.2.2          5
1.2.4         92
total        225

Station Core

[Screenshot: Station Core version distribution, 2024-01-17]

Ouch, so many different versions out there!

  3. How can a Zinnia module like Spark detect that it's running on a Zinnia version it does not support?

It will just fail. I think we need to ensure that Zinnia scripts throwing uncaught errors will create appropriate activity events, so that this error is visible to the user.

Your comments sparked an idea for how to approach this:

  • Modules should not depend on Zinnia version, but should detect Zinnia features instead. E.g., if Zinnia adds a persistence API, a module like Spark should check whether the API is available.
  • When the module detects that not all required APIs are available, it can log an activity event.

WDYT?

@juliangruber

Unfortunately, Stations are slow to pick up new versions. We are likely to end up with a large fraction of our network in a broken state for weeks to months.

Wouldn't you say that operators will notice (no earnings) and this will incentivise updates instead? I view this as a positive side effect.

Your comments sparked an idea how to approach this:

Modules should not depend on Zinnia version, but should detect Zinnia features instead. E.g., if Zinnia adds a persistence API, a module like Spark should check whether the API is available.
When the module detects that not all required APIs are available, it can log an activity event.
WDYT?

SGTM! Treat Zinnia like a browser, and perform feature detection.
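
A minimal sketch of what that feature detection could look like inside a module; the persistence API is the hypothetical example from this thread, and the `Zinnia.activity.error` shape should be checked against the current runtime docs:

```ts
// Feature-detection sketch for a Zinnia module, browser-style: check for
// the API instead of the runtime version. `persistence` is hypothetical.
declare const Zinnia: {
  activity: { error (message: string): void }
  persistence?: unknown // hypothetical future API
}

if (typeof Zinnia.persistence === 'undefined') {
  // Required feature missing: report it instead of failing obscurely.
  Zinnia.activity.error(
    'This module requires a newer Zinnia runtime (persistence API not ' +
      'found). Please update Station to the latest version.'
  )
} else {
  // All required features detected; safe to run the module's main loop.
}
```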


bajtos commented Jan 22, 2024

Unfortunately, Stations are slow to pick up new versions. We are likely to end up with a large fraction of our network in a broken state for weeks to months.

Wouldn't you say that operators will notice (no earnings) and this will incentivise updates instead? I view this as a positive side effect.

I would love to say that!

This is what I see in the data:

  • We upgraded Spark to v1.7.0 in Station Core v16.3.0, which was released on Dec 13th. The matching Desktop release: 1.2.0.
  • spark-api rejects HTTP requests submitting measurements with a spark_version field older than 1.7.0 (see here); the Spark module detects this error and logs an error activity to the Station logs/UI asking the user to upgrade. (A sketch of this version gate follows after this list.)
  • We have about 14% of Station Core instances running a version older than v16.3.0 and, therefore, not receiving any rewards.
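
For context, a sketch of how such a minimum-version gate can work; the real check lives in spark-api behind the link above, so this only restates the idea with a simplified semver comparison:

```ts
// Sketch of a minimum-version gate like the one in spark-api: accept a
// measurement only if its spark_version is at least 1.7.0.
const MIN_SPARK_VERSION = [1, 7, 0] as const

function isSupportedSparkVersion (version: string): boolean {
  const parts = version.split('.').map(Number)
  if (parts.length !== 3 || parts.some(Number.isNaN)) return false
  for (let i = 0; i < 3; i++) {
    if (parts[i] > MIN_SPARK_VERSION[i]) return true
    if (parts[i] < MIN_SPARK_VERSION[i]) return false
  }
  return true // exactly the minimum version
}

// isSupportedSparkVersion('1.6.2') === false -> reject the measurement;
// the Spark module detects the rejection and asks the user to upgrade.
// isSupportedSparkVersion('1.9.0') === true  -> accept the measurement.
```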

I guess you are right that people noticed they were not receiving new rewards and upgraded their Station Core to v16.3.0 to earn again.
