docs: Zinnia in Filecoin Station [ADR] #121

Merged
merged 9 commits into from
Mar 22, 2023
291 changes: 291 additions & 0 deletions docs/architecture-decision-records/2023-03-integration-with-station.md
@@ -0,0 +1,291 @@
# Zinnia in Filecoin Station

<!--
This is a minimal template. Feel free to add more sections as needed.

Please review also the Design Doc template and add any relevant sections to your ADR:
https://www.notion.so/pl-strflt/Writing-a-Design-Doc-aa6034be43c2434ba88a2fd844516e94
-->

> Status: ACCEPTED

<!--
PROPOSED, ACCEPTED, REJECTED, DEPRECATED, SUPERSEDED BY {link-to-ADR}
-->

## Context

Zinnia is a runtime for Filecoin Station modules. It's time to integrate Zinnia into the Station to
enable the deployment of the first modules.

Zinnia is designed to run multiple modules inside the same process, using V8 Isolates for keeping
modules separated and sandboxed. The integration with libp2p (and later IPFS) is designed to allow
multiple modules to share the same underlying set of network connections and block storage.

Zinnia has two primary modes of operation:

1. A developer tool for building a single module. We want to optimize for ease of use and a fast
feedback loop.

In this mode, Zinnia is typically started and stopped frequently, running for short periods at a
time, executing a single module only.

As a CLI tool, it reads the configuration from a config file (either user-level or
project-specific) or CLI arguments. There should be reasonable defaults tailored to developers
building a module - e.g. the state files should be stored in the project's working directory.

2. A deployment platform, running modules inside Filecoin Station instances operated by
non-technical users. We want to optimize for a smooth user experience for Station operators
first, and easy troubleshooting of operations by module developers second.

In this mode, Zinnia is a long-running process executing multiple modules. It needs to support
reloading individual modules when a new version of a module is deployed. It also needs to report
various information back to the Station, e.g. activity log and the number of jobs completed.

As a service, it should read the configuration from environment variables (see
[The Twelve-Factor App](https://12factor.net/config)) and allow embedders to improve security
e.g. by keeping modules in a different place from the runtime state.

### `zinniad`

For the reasons above, I propose to build a new binary called `zinniad` that will execute Zinnia
modules inside the Station.

_(This is a strawman proposal, subject to changes during implementation.)_

Configuration via environment variables:

- `FIL_WALLET_ADDRESS`: Address of Station's built-in wallet (required).

- `ROOT_DIR`: Directory for storing state files (optional). Defaults to a platform-specific
location, e.g. `$XDG_STATE_HOME/zinniad` on Linux.

Positional arguments:

- Positional arguments specify which modules to run, where each module is a single JS file. We don't
make any assumptions about the directory layout of modules. Paths are resolved relative to the
current working directory.

Example invocation:

```bash
cd /Applications/Filecoin\ Station.app/Contents/Resources/zinnia-modules

FIL_WALLET_ADDRESS=f1etc \
ROOT_DIR=$HOME/Library/Caches/Filecoin\ Station/zinnia \
zinniad \
saturn-l2/main.js \
ping.js \
retrieval-checker/dist/index.js
```
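The configuration rules above could be read along the following lines. This is only a sketch, not the `zinniad` implementation: the `ZinniadConfig` shape and the `parseConfig` helper are hypothetical names, and the default shown covers Linux only, assuming `XDG_STATE_HOME` falls back to `$HOME/.local/state` as the XDG Base Directory Specification describes.

```typescript
// Hypothetical shape of the parsed configuration; field names are illustrative.
interface ZinniadConfig {
  walletAddress: string;
  rootDir: string;
  modules: string[];
}

type Env = Record<string, string | undefined>;

// Build the configuration from environment variables and positional arguments.
function parseConfig(env: Env, positionalArgs: string[]): ZinniadConfig {
  const walletAddress = env.FIL_WALLET_ADDRESS;
  if (!walletAddress) {
    // FIL_WALLET_ADDRESS is the only required setting.
    throw new Error("FIL_WALLET_ADDRESS is required");
  }

  // Assumed Linux default: $XDG_STATE_HOME/zinniad, with XDG_STATE_HOME
  // falling back to $HOME/.local/state. Other platforms need their own default.
  const stateHome = env.XDG_STATE_HOME || `${env.HOME ?? ""}/.local/state`;
  const rootDir = env.ROOT_DIR || `${stateHome}/zinniad`;

  // Each positional argument is the path of a single JS module file.
  return { walletAddress, rootDir, modules: [...positionalArgs] };
}
```

Taking the environment and the arguments as parameters, rather than reading globals, keeps the function easy to test and leaves path resolution against the current working directory to the caller.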

### Communication with Station (Core, Desktop)

As explained above, I propose to use environment variables to pass configuration from the Station to
Zinnia.

For communicating information from Zinnia back to the Station, I propose that Zinnia prints
newline-delimited JSON entries to `stdout` and uses `stderr` for general/debug logging. This format
is easy to parse from Node.js code powering both Station variants.

Example messages:

- **Activity log - error**

`{"type": "activity:error", "module": "saturn", "message": "Cannot connect to the orchestrator."}`

_Note: `"module": "saturn"` describes which module emitted the log._

- **Activity log - info**

`{"type": "activity:info", "module": null, "message": "Zinnia is starting up..."}`

_Note: `"module": null` means the message comes from Zinnia runtime._

- **Number of jobs completed**

`{"type": "jobs-completed", "total": 123 }`

_Note: This message is emitted periodically, e.g. every 200ms._

In the future, we can easily extend this line to include per-module stats too:

`{"type": "jobs-completed", "total": 123, "modules": {"saturn": 100, "retrieval-checker": 23}}`

- Messages logged via Console APIs like `console.log` are printed to `stderr` without any
modifications.

```text
Pinging /dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa
RTT: 1252ms
Cannot record stats: Error: InfluxDB API error 401
{"code":"unauthorized","message":"unauthorized access"}
```
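On the Station side, consuming this output could look roughly like the sketch below. It is an illustration under assumptions, not the Station's actual code: `StationMessage`, `LineSplitter` and `parseStationMessage` are hypothetical names, and silently ignoring unrecognised lines is a design choice made for the example.

```typescript
// Message shapes taken from the examples above.
type ActivityType = "activity:info" | "activity:error";
type StationMessage =
  | { type: ActivityType; module: string | null; message: string }
  | { type: "jobs-completed"; total: number };

/** Buffers stdout chunks and returns only complete lines. */
class LineSplitter {
  private buffer = "";

  push(chunk: string): string[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last element is an incomplete line (or ""); keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    return lines;
  }
}

function toActivity(type: ActivityType, msg: Record<string, unknown>): StationMessage | undefined {
  if (typeof msg.message !== "string") return undefined;
  // `"module": null` means the message comes from the Zinnia runtime itself.
  const moduleId = typeof msg.module === "string" ? msg.module : null;
  return { type, module: moduleId, message: msg.message };
}

/** Parses one stdout line; returns undefined for blank, invalid or unknown lines. */
function parseStationMessage(line: string): StationMessage | undefined {
  if (line.trim() === "") return undefined;
  let data: unknown;
  try {
    data = JSON.parse(line);
  } catch {
    return undefined;
  }
  if (typeof data !== "object" || data === null) return undefined;
  const msg = data as Record<string, unknown>;
  switch (msg.type) {
    case "activity:info":
      return toActivity("activity:info", msg);
    case "activity:error":
      return toActivity("activity:error", msg);
    case "jobs-completed":
      return typeof msg.total === "number"
        ? { type: "jobs-completed", total: msg.total }
        : undefined;
    default:
      return undefined;
  }
}
```

The splitter matters because a chunk read from a child process's `stdout` can end in the middle of a JSON entry, so each line should be parsed only after its trailing newline has arrived.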

### Module identifiers

We need each module to have a unique identifier (a name) that we can use in the messages above. This
id must remain unchanged across module version upgrades.

For the initial version, these ids will be hard-coded human-readable names like `saturn-l2`.

In the future, when we move towards untrusted modules deployed in a decentralized manner, we will
need to find a different way to derive these unique module ids. That's out of scope for the
current work though. Adding new id types should be easy as long as our architecture supports
arbitrary string ids.

### Deploying and upgrading modules

The initial version will not implement any upgrade mechanisms for modules. Both `zinniad` and all
module sources will be bundled inside the Station. When a new module is added or an existing module
is upgraded to a new version, we will publish a new version of the Station.

### Zinnia API for module builders

_(This is a strawman proposal, subject to changes during implementation.)_

```ts
declare namespace Zinnia {
  // omitted: existing APIs like `peerId`

  /** Get the wallet address. This value is typically provided by the Station. */
  const walletAddress: string;

  /** Report activities to the Station. These messages are displayed in the main UI. */
  const activity: {
    /** Report an informative status update, e.g. "Connecting to the network." */
    info(message: string): void;

    /** Report an error, e.g. "Cannot connect to the orchestrator." */
    error(message: string): void;
  };

  /** Report completion of a single job. */
  function jobCompleted(): void;
}
```
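To illustrate how a module might use these APIs and how the calls could map onto the messages sent to the Station, here is a minimal sketch. Everything except the API member names is hypothetical: `ZinniaApi` restates the proposed surface as an interface, `createFakeZinnia` is an in-memory stand-in for the runtime that collects JSON lines instead of printing them, and `runModule` is a toy module.

```typescript
// The proposed API surface, restated as an interface for this example.
interface ZinniaApi {
  walletAddress: string;
  activity: {
    info(message: string): void;
    error(message: string): void;
  };
  jobCompleted(): void;
}

/** Hypothetical stand-in for the runtime, bound to a single module id. */
function createFakeZinnia(moduleId: string, walletAddress: string) {
  const lines: string[] = [];
  let total = 0;
  const emit = (msg: object): void => {
    lines.push(JSON.stringify(msg));
  };
  const api: ZinniaApi = {
    walletAddress,
    activity: {
      info: (message) => emit({ type: "activity:info", module: moduleId, message }),
      error: (message) => emit({ type: "activity:error", module: moduleId, message }),
    },
    // Completions are only counted here; the total is reported by flushStats().
    jobCompleted: () => {
      total += 1;
    },
  };
  // A real runtime would call this periodically rather than on demand.
  const flushStats = (): void => emit({ type: "jobs-completed", total });
  return { api, lines, flushStats };
}

/** Toy module: runs each job and reports the outcome through the API. */
function runModule(zinnia: ZinniaApi, jobs: Array<() => void>): void {
  zinnia.activity.info("Connecting to the network.");
  for (const job of jobs) {
    try {
      job();
      zinnia.jobCompleted();
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      zinnia.activity.error(`Job failed: ${reason}`);
    }
  }
}
```

Passing the API object into the module keeps the example self-contained; in the proposal itself the module would reach the same members through the global `Zinnia` namespace.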

### Dev-mode in `zinnia`

These APIs will behave differently when running a module via `zinnia` CLI in development.

- In the initial version, the wallet address is hardcoded to a dummy testnet address
`t1abjxfbp274xpdqcpuaykwkfb43omjotacm2p3za`. This value is taken from
[Filecoin Lotus docs](https://lotus.filecoin.io/lotus/manage/manage-fil/#public-key-address) with
the leading `"f"` replaced with `"t"`.

Later, we can implement reading of the wallet address from a configuration file, e.g.
`.zinnia/config.yaml` in the current working directory (typically the project root).

- Activity logs are printed to stdout with human-readable formatting.
```text
[10:30:20.000 INFO] Connecting to the network.
[10:30:21.000 ERROR] Cannot connect to the orchestrator.
```
- Job completions are printed to stdout but less frequently, e.g. every 500ms.

```text
[10:30:20.000 STATS] Jobs completed: 123
[10:30:20.500 STATS] Jobs completed: 134
[10:30:21.000 STATS] Jobs completed: 146
```
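The human-readable format shown above could be produced by a small helper like the one below. This is a sketch only: `formatLogLine` and `Level` are hypothetical names, and rendering the time in UTC is an assumption made to keep the example deterministic, since the proposal does not say which time zone is used.

```typescript
type Level = "INFO" | "ERROR" | "STATS";

/** Formats one dev-mode log line, e.g. "[10:30:20.000 INFO] Connecting to the network." */
function formatLogLine(time: Date, level: Level, text: string): string {
  // toISOString() returns "YYYY-MM-DDTHH:mm:ss.sssZ";
  // characters 11..22 are the "HH:mm:ss.sss" part (in UTC).
  const timestamp = time.toISOString().slice(11, 23);
  return `[${timestamp} ${level}] ${text}`;
}

/** Formats the periodic job counter using the same layout. */
function formatStatsLine(time: Date, total: number): string {
  return formatLogLine(time, "STATS", `Jobs completed: ${total}`);
}
```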

## Options Considered

1. Don't build a new binary, bundle the existing `zinnia` CLI inside the Station.

Pros:

- Less work to ship the first version. Less yak-shaving like setting up CI/CD workflows.

Cons:

- Zinnia modules cannot share libp2p & IPFS resources (network connection, peer address book,
block store).

- To meet the requirements of both module builders and the Station runtime, we would need to
implement extra configuration options to get different behaviour in different settings.

- Most of the Station work will be discarded later, once we need `zinniad` to allow running
multiple modules inside the same runtime process.

2. Don't push job stats via stdout, let the Station pull the stats via HTTP API. (The current
saturn-l2 module uses this model.)

Pros:

- We already have code in Station to deal with this.
- Less cluttered `stdout`

Cons:

- More complex implementation in Zinnia for little benefit. Since `zinniad` routes all
`console.log` messages to `stderr`, no humans should be reading `stdout`, therefore extra
clutter does not matter.

- More complex integration between Station and Zinnia: Zinnia needs to report the URL where the
stats API is available, and the Station needs to parse that URL from Zinnia's `stdout`.

3. Include a timestamp in the JSON messages printed to `stdout` for the Station. We decided this is
not needed now and can be easily added later if such a need arises.

Our current goal is integration with the Station. The Station (Core or Desktop) and the Zinnia
runtime will initially sit on the same machine. If the log consumer sits on the same machine, it
shouldn't matter who will attach the timestamp.

4. Use `XDG_STATE_HOME` to configure where `zinniad` should keep the state files.

The major difference between `XDG_STATE_HOME` and `ROOT_DIR` is that `XDG_STATE_HOME` provides a
system-wide (or user-wide) directory, so we need to append a Zinnia-specific segment to that path
to obtain `ROOT_DIR`.

I prefer to give the user full control over the location by providing the `ROOT_DIR` config
option.

This becomes relevant when Zinnia is running inside the Station. If we use `XDG_STATE_HOME`, then
we will keep the state in `XDG_STATE_HOME/zinnia`, a different place from where Station keeps its
files. We could make this path Station-specific, but that feels hacky to me and incorrect in the
situation when Zinnia runs outside of the Station.

However, I think it's a good idea to make `ROOT_DIR` an optional configuration option and use
`XDG_STATE_HOME/zinnia` as the default value.

A few more thoughts:

- `XDG_STATE_HOME` is specific to Linux. Windows uses something like `%LOCALAPPDATA%`, macOS
something else.

- We already provide `ROOT_DIR` for trusted modules, quoting from
[module interface docs](https://github.com/filecoin-station/desktop/blob/596b16704347c5c6e09c621f475cd20291938b77/docs/MODULE_INTERFACE.md?plain=1#L27-L29):

> `ROOT_DIR` The long-lived working directory on disk. The module must store all of its files
> inside (subdirectories of) this directory. The directory isn't expected to be backed up or
> shared across machines in any way.

<!--
What are the different options we considered? What are their pros & cons?
-->

## Decision

<!--
What is the change that we're proposing and/or doing?
-->

Build `zinniad` and add new `Zinnia` APIs as described above.

## Consequences

<!--
What becomes easier or more challenging to do because of this change?
-->

After we implement the architecture described above and integrate Zinnia into the Station, we will
have a solid foundation that's easily extensible and should not require major updates for a long
time.

<!--
## Links & References

Link to other ADRs, GitHub issues, documentation, etc.
-->