Prebuilt ffi workstream README #1

Open
wants to merge 5 commits into
base: main
115 changes: 115 additions & 0 deletions ffi_how_to.md
@@ -0,0 +1,115 @@
This document lays out a detailed work plan for shipping the `pre-built-ffi` workstream tracked in https://github.com/filecoin-project/filecoin-ffi/issues/209.
The high-level approach has already been documented at https://hackmd.io/@mvdan/Hy7iK0TEY.

### Changes to https://github.com/filecoin-project/filecoin-ffi

Going ahead, except for cases where users want to build `filecoin-ffi` from source, `filecoin-ffi` will essentially act as a "wrapper" that forwards all API calls to the corresponding `prebuilt-ffi` module. Since a `prebuilt-ffi` module contains platform-dependent assets (such as static C libraries), APIs in `filecoin-ffi` will need to delegate calls to the "platform-specific" `prebuilt-ffi` module.

Currently, `filecoin-ffi` supports multiple (GOOS + GOARCH) combinations. Therefore, we must modify every public API in `filecoin-ffi` to have a variant for each (GOOS + GOARCH) combination. These variants will delegate the call to the corresponding platform-specific `prebuilt-ffi` module instead of calling into the CGO bindings as they do today (except when users build `filecoin-ffi` from source).
Copy link

I don't know how important this is, but if it is not widely understood, do we want to list (or better - link to the canonical source) the set of combinations we support?

Copy link

https://github.com/filecoin-project/filecoin-ffi/blob/03b9503994a1cebe6e1bd333b9f566a7fdb24b4b/.circleci/config.yml#L187-L204 currently only 3 combinations.

Part of this work should be ensuring that there remains a well-documented pathway to building and using from source. I imagine that option is going to be more difficult than it is today: makefiles and bash scripts can determine for you whether there's a binary or not, but this new pathway is just going to result in `go install` failing if you're outside the common range of combinations we support.

Copy link
Owner Author

@rvagg
Build from source should continue working as expected because we will retain all the existing files as is, with their corresponding build tags. We will create new files/APIs that depend on the corresponding prebuilt-ffi-* modules, but those files will explicitly carry a !ffi_source build tag to tell the compiler not to use them when building from source.


Furthermore, it is crucial to ensure complete backward compatibility for current users of `filecoin-ffi`, including those who opt to compile `filecoin-ffi` from the source.

This objective can be achieved using Go build tags.
For illustration, consider the existing public [`ffi.Hash()` API in the `bls.go` file of `filecoin-ffi`](https://github.com/filecoin-project/filecoin-ffi/blob/master/cgo/bls.go#L11). Below is an outline of the necessary variants for this API:

```go
// File: prebuilt_bls_darwin_arm64.go

//go:build cgo && darwin && arm64 && !ffi_source
// +build cgo,darwin,arm64
// +build !ffi_source

package ffi

import (
	// When building for darwin/arm64, Go tooling automatically selects this file due to the specified build
	// tags. It then fetches the "prebuilt-ffi-darwin-arm64" module from the module proxy, using the version
	// specified in the `go.mod` file.
	prebuilt "fil.org/prebuilt-ffi-darwin-arm64"
)

// Hash computes the digest of a message
func Hash(message Message) Digest {
	return prebuilt.Hash(message)
}
```

```go
// File: prebuilt_bls_linux_amd64.go

//go:build cgo && linux && amd64 && !ffi_source
// +build cgo,linux,amd64
// +build !ffi_source

package ffi

import (
	prebuilt "fil.org/prebuilt-ffi-linux-amd64"
)

// Hash computes the digest of a message
func Hash(message Message) Digest {
	return prebuilt.Hash(message)
}
```

```go
// File: bls__source.go (build from source)

//go:build ffi_source
// +build ffi_source

// Same code as we have today
```

The same pattern repeats for each (GOOS + GOARCH) combination, for every file containing public APIs that currently call into the CGO bindings.
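Because the per-platform wrapper files differ only in their GOOS/GOARCH values, they could plausibly be generated rather than hand-written. The following is a minimal sketch of such a generator using `text/template`; the `platform` type, the helper names, and the platform list are hypothetical and only illustrate the idea for the single `Hash` wrapper shown above.

```go
package main

import (
	"fmt"
	"os"
	"strings"
	"text/template"
)

// platform is one supported (GOOS + GOARCH) combination (hypothetical type).
type platform struct {
	GOOS   string
	GOARCH string
}

// wrapperTmpl mirrors the hand-written prebuilt_bls_* wrapper files.
// The "fil.org/prebuilt-ffi-" module prefix follows this document's naming.
var wrapperTmpl = template.Must(template.New("wrapper").Parse(`//go:build cgo && {{.GOOS}} && {{.GOARCH}} && !ffi_source
// +build cgo,{{.GOOS}},{{.GOARCH}}
// +build !ffi_source

package ffi

import (
	prebuilt "fil.org/prebuilt-ffi-{{.GOOS}}-{{.GOARCH}}"
)

// Hash computes the digest of a message
func Hash(message Message) Digest {
	return prebuilt.Hash(message)
}
`))

// wrapperFileName returns the file name used for a platform's wrapper.
func wrapperFileName(p platform) string {
	return fmt.Sprintf("prebuilt_bls_%s_%s.go", p.GOOS, p.GOARCH)
}

// renderWrapper renders the wrapper source for one platform.
func renderWrapper(p platform) (string, error) {
	var sb strings.Builder
	if err := wrapperTmpl.Execute(&sb, p); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	// Hypothetical platform list; the real set is whatever CI builds.
	platforms := []platform{{"darwin", "arm64"}, {"linux", "amd64"}}
	for _, p := range platforms {
		src, err := renderWrapper(p)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("// ---- %s ----\n%s\n", wrapperFileName(p), src)
	}
}
```

A real generator would write each rendered file to disk and cover every public API, not just `Hash`; this sketch only prints the output.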

The logical next question to ask is how are the `prebuilt-ffi-{GOOS}-{GOARCH}` modules mentioned here created and where are they hosted?

### Building and publishing the prebuilt-ffi modules + CI changes
Copy link

I think it would also be good to be clear about the authoring experience. Let's say I have something I want to add to cgo/bls. Do I just add it to bls.go? Will there be codegen to create prebuilt_bls_darwin_arm64.go?

Copy link
Owner Author

Great question. I hadn't thought about this. This one is a bit hard to get right in a seamless way. One option is to add a layer of indirection by introducing a new filecoin-ffi-prebuilt repo that end users depend on and keeping filecoin-ffi completely unchanged so devs can continue hacking on it like they do now. The e2e flow then would look like:

  1. Dev changes bls.go in filecoin-ffi
  2. Cuts a new release of filecoin-ffi
  3. CI takes over and:
    • Clones this new filecoin-ffi release
    • Builds the prebuilt assets for each {GOARCH}-{GOOS} combination we need to support and persists them to the release assets
    • Uses codegen to generate the prebuilt-* files for the filecoin-ffi-prebuilt repo
    • Updates the go.mod for the filecoin-ffi-prebuilt repo to depend on the prebuilt-{GOARCH}-{GOOS} repos
    • Releases the filecoin-ffi-prebuilt repo

End users then depend on filecoin-ffi-prebuilt instead of filecoin-ffi.

But, as you can see, this is a fair bit of work.


For each release of `filecoin-ffi`, CI will build and publish the corresponding `prebuilt-ffi-{GOOS}-{GOARCH}` modules as "go mod compatible zip files" to the Github release assets page for `filecoin-ffi` for each supported combination of (GOOS + GOARCH).
Copy link

I assume these files aren't very big and that we won't hit any upload / storage limits with Github? (I'm not saying this will be a problem - just trying to anticipate any issues. Feel free to disregard if this is clearly not an issue.)

Copy link
Owner Author

@BigLep No, that won't be a problem. We already persist these build assets on GitHub releases today without problems:

https://github.com/filecoin-project/filecoin-ffi/releases/tag/e1e8d6082c7fcd4d


In addition to the pre-built zip modules, we will also need to publish the corresponding `go.mod` and "info" metadata for each pre-built module (Go tooling needs these; more details are in the `Go Module Proxy` section below). These can be generated synthetically and persisted in the GitHub release assets as well.
Copy link

"info" meta

This might be a Steve newbie question. Do we have an example of what this looks like? Where is the "info" meta stored?

Copy link
Owner Author

@BigLep

It's just a version tag and timestamp.

See https://proxy.golang.org/github.com/aarshkshah1992/prebuilt-ffi-darwin-arm64/@v/v0.0.1.info

The spec for it is at https://go.dev/ref/mod#goproxy-protocol in the $base/$module/@v/$version.info section.



We already have some flavour of this today. See the `Assets` section [example](https://github.com/filecoin-project/filecoin-ffi/releases/tag/ed08caaf8778e1b6).
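For reference, the GOPROXY protocol defines the "info" metadata as a small JSON object with a `Version` and a `Time` field. The sketch below shows one way CI could synthesize it; the function name `renderInfo` and the example timestamp are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"time"
)

// versionInfo matches the JSON shape served at $base/$module/@v/$version.info.
type versionInfo struct {
	Version string    // module version, e.g. "v0.0.1"
	Time    time.Time // release time, encoded as RFC 3339
}

// exampleRelease is a made-up release timestamp used only for illustration.
var exampleRelease = time.Date(2024, time.June, 18, 0, 0, 0, 0, time.UTC)

// renderInfo returns the contents of a .info file for the given version.
func renderInfo(version string, releasedAt time.Time) (string, error) {
	b, err := json.Marshal(versionInfo{Version: version, Time: releasedAt.UTC()})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := renderInfo("v0.0.1", exampleRelease)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(out) // prints {"Version":"v0.0.1","Time":"2024-06-18T00:00:00Z"}
}
```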


The high level steps to create these assets for each `prebuilt-ffi-{GOOS}-{GOARCH}` module are as follows.
For each release `vx.y.z` for `filecoin-ffi`:

1. Clone the `filecoin-ffi-vx.y.z` source on a machine running the target {GOOS} and {GOARCH}.
2. Remove all the `prebuilt_*` files
3. Build it from source to create the prebuilt assets
4. Remove all the transient build assets in `rust/target` dir
5. Create the `prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.info` and `prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.mod` files (the latter can be created by removing the existing `go.mod` file and running `go mod tidy` to generate a new one)
6. Zip it up using something like `https://github.com/aarshkshah1992/prebuilt-ffi-zipper` (the directories inside the zip just need to follow a specific hierarchy)
Copy link

I assume we'll have this build logic/tooling live within the filecoin-ffi project as it doesn't seem big enough to warrant extracting out into its own dependency.

Copy link
Owner Author

Yeah, it's just a few lines of code.

7. Publish the `prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.zip`, `prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.mod` and `prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.info` files to the GitHub release assets page for the `filecoin-ffi` repo
Copy link

Are there any of these automation steps here and above you don't know how to do? I expect IPDX can help here if so. We should give them the heads up this is coming.

Copy link
Owner Author

@aarshkshah1992 aarshkshah1992 Jun 18, 2024

@BigLep We can always try all of this on our own and ask IPDX for help as and when we get stuck. FWIW, IPDX did offer to help out/collaborate on this project. See https://filecoinproject.slack.com/archives/CP50PPW2X/p1714751948515629?thread_ts=1714537214.454689&cid=CP50PPW2X.

Copy link

Another dumb question:

  1. Does prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.zip contain prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.mod and prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.info, AND
  2. do both prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.mod and prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.info get published alongside prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.zip?

Taking prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.mod as example, this means there is one copy in prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.zip and one copy uploaded alongside prebuilt_bls_{GOOS}_{GOARCH}-{vx.y.z}.zip.
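On the "specific hierarchy" mentioned in the zip step: per the Go Modules Reference, every entry in a module zip must sit under a `<module>@<version>/` prefix. The sketch below builds such a zip in memory with the standard library only; the helper names and the placeholder file contents are hypothetical, and a real pipeline might prefer the checks provided by the `golang.org/x/mod/zip` package.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"os"
	"sort"
)

// zipPrefix returns the directory prefix every entry in a module zip must carry.
func zipPrefix(module, version string) string {
	return module + "@" + version + "/"
}

// buildModuleZip writes files (keyed by module-relative path) into an in-memory zip.
func buildModuleZip(module, version string, files map[string][]byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	for rel, data := range files {
		w, err := zw.Create(zipPrefix(module, version) + rel)
		if err != nil {
			return nil, err
		}
		if _, err := w.Write(data); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// entryNames lists the entry names of a zip in sorted order.
func entryNames(zipBytes []byte) ([]string, error) {
	zr, err := zip.NewReader(bytes.NewReader(zipBytes), int64(len(zipBytes)))
	if err != nil {
		return nil, err
	}
	names := make([]string, 0, len(zr.File))
	for _, f := range zr.File {
		names = append(names, f.Name)
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// Placeholder contents; the real module would hold the Go sources and prebuilt assets.
	files := map[string][]byte{
		"go.mod": []byte("module fil.org/prebuilt-ffi-linux-amd64\n"),
		"bls.go": []byte("package ffi\n"),
	}
	data, err := buildModuleZip("fil.org/prebuilt-ffi-linux-amd64", "v0.0.1", files)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	names, err := entryNames(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, n := range names {
		fmt.Println(n)
	}
}
```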



### Running a lightweight HTTPS server/module proxy to serve the prebuilt modules to Go tooling
A minimal HTTPS server (referred to as a "module proxy" in the Go world) must be run at `https://fil.org` to serve the `fil.org/prebuilt-ffi-{GOOS}-{GOARCH}` Go modules to Go tooling. This server could be implemented using a Cloudflare Worker or a custom-managed HTTPS server.
Copy link

Should we have more discussion about the domain? Things immediately coming to mind:

  1. Use a subdomain - I assume at the minimum we want a subdomain so that we're partitioned off from other actions happening at the top level domain.
  2. Why fil.org rather than filecoin.io ? (Are there others we should be considering?).
  3. Also, do we want to have multiple domains, as in the example `example.com/gopher mod https://modproxy.example.com`? IIUC, they use one domain under which the module is published and a separate domain for the modproxy.

Copy link
Owner Author

@masih @jennijuju -> Any thoughts on the domain here ?

Agree with using a subdomain such as module.fil.org.

Copy link
Owner Author

@aarshkshah1992 aarshkshah1992 Jun 18, 2024

@BigLep

Yeah, the module name can be something like "fil.org or filecoin.io / prebuilt-ffi-darwin-arm64" and the domain for the modproxy can be module.filecoin.io. The only requirement is that the top-level namespace in your module name must be able to serve a GET `?go-get=1` query that redirects Go tooling to the module proxy.

Copy link

I think we want to be mindful about the infrastructure we're standing up here even if it's simple. I want to make sure this doesn't deteriorate into a case where one or two people have tribal knowledge about it. I think at the minimum this should be infrastructure as code that is auto-deployed. With that lens, the toolchain to use probably depends on if/what other infrastructure services we're using. For example, if we're already doing AWS things, then let's use Lambda so we don't bring in another dependency. (I assume traffic is really low here, so cost isn't what we need to optimize for. We should optimize for simplicity and for developer time, both for development now and for future maintenance.)

Copy link
Owner Author

@aarshkshah1992 aarshkshah1992 Jun 18, 2024

I agree 100%. We don't want to be on the hook for maintaining any server/infra long term here. As it is, this custom module proxy should only be redirecting requests from Go tooling to the corresponding GitHub URLs/assets and as such will have minimal ingress/egress/compute costs.

Copy link

@BigLep BigLep Jun 17, 2024

If we are already in the AWS world, one possibility, given that the redirect logic is so simple, is to use S3 redirection rules: https://docs.aws.amazon.com/AmazonS3/latest/userguide/how-to-page-redirect.html . This would then mean no custom Go code for dynamically handling requests.

Copy link
Owner Author

I don't have any experience with it but can take a look. But there are some important requirements:

  1. We should be able to set the rules such that only requests for the prebuilt-ffi-* modules are entertained (no other modules should be served)

  2. The top-level domain name in the module (fil.org/filecoin.io) should be able to respond to a GET request with the `?go-get=1` query param (the response can be sent by redirecting to a static HTML page on GitHub Pages etc.)

But we should be able to get this to work.


This server is essential because the Google Go Module Proxy does not accommodate custom domains, and GOPROXY in `direct` mode cannot retrieve modules directly from `https://fil.org` since the modules and pre-built assets will be hosted on GitHub. To address this, we need a redirection mechanism from `https://fil.org` to the appropriate GitHub URLs/assets. Fortunately, Go tooling is capable of handling 3XX redirects, allowing all module requests to `https://fil.org` to be redirected to the respective GitHub URLs/assets. This redirection ensures that the server incurs minimal ingress/egress/compute costs, functioning primarily as a redirecting proxy.
Copy link

What is meant by "custom domains"?

Copy link
Owner Author

Basically everything other than the big, trusted VCS providers such as GitHub, GitLab, etc. I will fix the language here.


**This server will have to implement the following APIs** to satisfy the `GOPROXY` protocol:
Copy link

Aren't there other APIs we need to implement like list?

Copy link
Owner Author

Replied at https://github.com/aarshkshah1992/ffi-app/pull/1/files#r1644098068.

The list endpoint is only needed if the go tooling hasn't been given a specific version to depend on and that is not a use case we see the need for right now.

See https://go.dev/ref/mod#goproxy-protocol for more details.

1. GET https://fil.org/prebuilt-ffi-{GOOS}-{GOARCH}?go-get=1

This API should return the HTML below, which informs the Go tooling of the server URL that implements the `GOPROXY` protocol for the prebuilt modules.

```html
<meta name="go-import" content="fil.org/prebuilt-ffi-{GOOS}-{GOARCH} mod https://fil.org">
```
Go tooling will now use the URL specified in the above response and send the following API requests:

2. GET https://fil.org/fil.org/prebuilt-ffi-{GOOS}-{GOARCH}/@v/{$version}.info

Here `{$version}` refers to the go module semver.

The important point here is that this API can be implemented as a redirect to the corresponding `prebuilt-ffi-{GOOS}-{GOARCH}.info` file in the release assets for `filecoin-ffi`, based on the `{$version}` requested here.
Copy link

Per comment above, to make this really clear I think we should state or link to example .info file and where it's stored.

Copy link
Owner Author

Added an example.


3. GET https://fil.org/fil.org/prebuilt-ffi-{GOOS}-{GOARCH}/@v/{$version}.mod

This redirects to the `prebuilt-ffi-{GOOS}-{GOARCH}.mod` file in the release assets for `filecoin-ffi`, based on the `{$version}` requested here.

4. GET https://fil.org/fil.org/prebuilt-ffi-{GOOS}-{GOARCH}/@v/{$version}.zip

This redirects to the `prebuilt-ffi-{GOOS}-{GOARCH}.zip` file in the release assets for `filecoin-ffi`, based on the `{$version}` requested here.
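The four endpoints above can be sketched as a single redirecting handler. This is a rough illustration under stated assumptions, not a finished design: it assumes the release tag equals the requested module version, that assets are named `prebuilt-ffi-{GOOS}-{GOARCH}.{info|mod|zip}`, and that they are reachable through GitHub's standard `releases/download/<tag>/<asset>` URL. The helper names are hypothetical. Only `prebuilt-ffi-*` paths are matched, so no other modules are served.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"regexp"
)

const (
	proxyBase = "https://fil.org" // proxy URL used in this document
	// assetBase is an assumption about where CI publishes the release assets.
	assetBase = "https://github.com/filecoin-project/filecoin-ffi/releases/download"
)

var (
	// modulePath matches the discovery request path /prebuilt-ffi-{GOOS}-{GOARCH}.
	modulePath = regexp.MustCompile(`^/(prebuilt-ffi-[a-z0-9]+-[a-z0-9]+)$`)
	// versionPath matches /fil.org/prebuilt-ffi-{GOOS}-{GOARCH}/@v/{version}.{info|mod|zip}.
	versionPath = regexp.MustCompile(`^/fil\.org/(prebuilt-ffi-[a-z0-9]+-[a-z0-9]+)/@v/(v[0-9A-Za-z.+-]+)\.(info|mod|zip)$`)
)

// goImportMeta returns the go-import tag that points Go tooling at the proxy.
func goImportMeta(module string) string {
	return fmt.Sprintf(`<meta name="go-import" content="fil.org/%s mod %s">`, module, proxyBase)
}

// redirectTarget maps a GOPROXY request path to a release asset URL.
func redirectTarget(urlPath string) (string, bool) {
	m := versionPath.FindStringSubmatch(urlPath)
	if m == nil {
		return "", false
	}
	module, version, ext := m[1], m[2], m[3]
	return fmt.Sprintf("%s/%s/%s.%s", assetBase, version, module, ext), true
}

// handler answers ?go-get=1 discovery requests and redirects version requests.
func handler(w http.ResponseWriter, r *http.Request) {
	if m := modulePath.FindStringSubmatch(r.URL.Path); m != nil && r.URL.Query().Get("go-get") == "1" {
		w.Header().Set("Content-Type", "text/html; charset=utf-8")
		fmt.Fprintln(w, goImportMeta(m[1]))
		return
	}
	if target, ok := redirectTarget(r.URL.Path); ok {
		http.Redirect(w, r, target, http.StatusFound)
		return
	}
	http.NotFound(w, r)
}

func main() {
	// Exercise the handler in-process instead of opening a network port.
	for _, target := range []string{
		"/prebuilt-ffi-linux-amd64?go-get=1",
		"/fil.org/prebuilt-ffi-linux-amd64/@v/v1.2.3.info",
		"/fil.org/some-other-module/@v/v1.2.3.zip",
	} {
		rec := httptest.NewRecorder()
		handler(rec, httptest.NewRequest(http.MethodGet, target, nil))
		fmt.Println(target, "->", rec.Code, rec.Header().Get("Location"))
	}
}
```

To serve real traffic, the same `handler` could be passed to `http.ListenAndServe` behind TLS, or the mapping in `redirectTarget` could be ported to a Cloudflare Worker or static redirect rules.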

Note that one limitation of the above approach is that users will not be able to depend on unqualified/`latest` versions of prebuilt-ffi.
Copy link

Do we expect this to be very important in practice? When there is active development happening before a network upgrade, will developers be building from source?

Is there a "hack" around this we can do? For example, can we publish the latest from main/master to v9.9.9?

Also, I see in https://go.dev/ref/mod#goproxy-protocol there is $base/$module/@latest. Why can't we do that? (sorry, newbie question I'm sure).

Copy link
Owner Author

@aarshkshah1992 aarshkshah1992 Jun 18, 2024

@BigLep

The reason I've left out the list and latest endpoints:

  1. End users should only be depending directly on the filecoin-ffi module and not on the prebuilt-ffi module. They can continue depending on unqualified versions for filecoin-ffi.

  2. For active development on filecoin-ffi, developers can build from source to test things out and do not need to fetch unqualified prebuilt-ffi modules. If they want to hack around here, they can always depend on a local prebuilt-ffi module on their filesystem using the Go `replace` directive. See https://go.dev/ref/mod#go-mod-file-replace .

So, just leaving out these two endpoints for now unless there is a use case that mandates implementing them.