Replies: 4 comments
- At some point it was suggested that perhaps we could use a custom UI page which lets us paste a spec, render the form, and test spec changes that way. While this may work for testing the endpoint spec, there is a lot more to testing a connector than rendering the form, so I suggest we develop an approach that makes it possible to test all sorts of changes, fast.
- Look into Codespaces for multiple repos and how we can utilise that to have different components of the application running in the same codespace: https://github.blog/2022-04-20-codespaces-multi-repository-monorepo-scenarios/
- How do we allow the local instance of Supabase to automatically update when a new docker image is pushed, if webhooks can't be configured for localhost?
- How do we handle OAuth for connectors on a Codespace / local instance, given that the redirect_uri can be different?
Problem
Connectors receive a lot of updates as we find issues with their config, their collection schemas, and other behaviour. Their development is very much iterative and can be rapid at times. Testing new connector changes, however, is not easy when we want to test a connector end-to-end together with our UI (which also means testing how well the connector integrates with our UI, an important check).
This process is very time-consuming, and it is not easy to do locally either: building an image can take a considerable amount of time and does not allow for rapid iterative testing (more on this below), and setting up the whole project (flow as well as the connectors or airbyte repository) requires a compatible local environment. That setup changes over time, which means every person needs to stay up to date with the latest changes and maintain their local environment.
The problem is especially visible when testing minor changes to a connector, such as a small update to a connector's config: having to build the connector from scratch, ask the agent to re-fetch the latest spec, and then use the UI to verify the change is a lot of steps for such a small change.
Ideal
In an ideal world, we would be able to make changes to a connector's spec, collection schema, or actual behaviour, and plug the new connector into an instance of Flow instantly. This includes having the spec known to the dashboard updated instantly (instead of having to manually ask the agent to update the image). The test step might be checking how the UI interprets the connector's configuration, or how the data is being captured or materialized.
In an ideal world, setting up an instance of Flow that allows end-to-end testing is either straightforward and easy, or not necessary (e.g. an instance is provided that people can use for their tests).
Factors In Play
There are some factors that contribute to the testing process being time-consuming and hard to set up and maintain:
Airbyte connector builds are slow
Airbyte connector build times are slow, even when only the patch files change. This is despite the patch files being copied in the Dockerfile after the build steps, which means that when re-building the Docker image only the patch files need to be re-applied to create the new image. The slowness of a rebuild is not caused by Docker, but rather by the build system used by airbyte (gradle), which runs quite a few steps before it builds a connector, and that makes quick iterations harder.
Potential Solution:
This might be as easy as switching from the gradle task `connector:build` to `connector:airbyteDocker`, since we don't expect to make any code changes to our connectors anymore. This reduces the build time for a simple patch change from 45s down to 12s. We also need to avoid the initial pyenv setup if it has already been done; with that, we could hopefully reduce the build time to a few seconds.
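For reference, a rough sketch of what that switch looks like on the command line; the connector project path here is only an example and may differ in the airbyte repository:

```bash
# Faster: only build the connector's Docker image (example connector name).
./gradlew :airbyte-integrations:connectors:source-postgres:airbyteDocker

# Slower: the full build task, with all of gradle's pre-build steps.
./gradlew :airbyte-integrations:connectors:source-postgres:build
```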
Discovery requires a configuration step
When a connector's collection schema is changed, we need to run a Discover RPC to see the result of those changes. That is in contrast to a spec change, which we can easily check with a `flowctl-go spec` call. To run the Discover RPC we need to provide configuration to the connector, and we don't always have such configuration ready and easily available (e.g. credentials for accessing an endpoint). This makes it harder for multiple people to test connectors.
Potential Solution:
Have pre-made encrypted configurations ready for all connectors, and allow iterating on a connector using those configurations with `flowctl-admin discover`. For the UI, this might mean seeding our local databases with existing captures / materializations that can be "re-discovered" to test collection changes.
Another potential solution is using an auto-fill solution to automatically fill in the config for a connector in the UI from a store. I'm not sure if 1Password can do something like this for arbitrary fields: it apparently does!
The local stack has many moving parts
The local stack needs a lot of components to work: supabase, agent, data-plane-gateway, config-encryption, temp-data-plane, ui, oauth functions. We do have a great script that runs all of these, but there are inevitably times when updates to these components mean the people using them also need to be aware of those changes and update something to keep them working (e.g. think of the supabase CLI version pinning that we have had to do).
Potential Solution:
Have a Codespace / Devcontainer environment which runs all of these components and exposes the ports. The challenge with this solution is: how do we allow local docker images from the connectors / airbyte repos to be used by this environment? The airbyte repository at the moment uses `docker-in-docker` for its codespace, whereas I think `flow` uses `docker-from-docker`. The former means that the docker instance running inside the codespace is independent from the one running on the host, whereas in the latter the codespace uses the host's docker daemon.
Codespaces usually have a very fast internet connection, so pushing images might not be that much trouble (we can push images namespaced by `CODESPACE_NAME` to avoid conflicts, as sketched below). The latency, I think, comes mostly from us having to manually ask the agent to re-fetch an image. That brings us to the next point.
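A minimal sketch of that namespacing idea (the registry and image names are placeholders; `CODESPACE_NAME` is set automatically inside a codespace):

```bash
# Tag a locally built connector image with the codespace name so parallel
# codespaces don't overwrite each other's test images, then push it.
docker tag airbyte/source-postgres:dev \
  ghcr.io/estuary/source-postgres:dev-"$CODESPACE_NAME"
docker push ghcr.io/estuary/source-postgres:dev-"$CODESPACE_NAME"
```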
Manual updating of the spec when a docker image is updated
At the moment, if a docker image is updated, we need to ask the agent to manually update the connector spec and other properties. Ideally this would happen automatically; that would ease our work both in production and when developing locally.
Potential Solution:
This can be done using a webhook, or some similar method that lets the agent automatically find out that a docker image has been updated and go re-fetch it.
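If a registry webhook can't reach the agent (e.g. on localhost, as raised in the comments above), a polling fallback along these lines could work. This is only a sketch: the image name and the final "notify the agent" step are placeholders, not existing hooks.

```bash
# Poll the registry for manifest changes and nudge the agent when the image
# is updated. The first iteration fires once to establish a baseline.
IMAGE=ghcr.io/estuary/source-postgres:dev
LAST=""
while sleep 30; do
  MANIFEST=$(docker manifest inspect "$IMAGE" 2>/dev/null) || continue
  DIGEST=$(printf '%s' "$MANIFEST" | sha256sum | cut -d' ' -f1)
  if [ "$DIGEST" != "$LAST" ]; then
    LAST="$DIGEST"
    echo "$IMAGE changed; ask the agent to re-fetch its spec"
    # e.g. bump the connector's row in the database, or call an agent endpoint
  fi
done
```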
Proposed Picture
Notes