Prod Release 17/01/24 #513
Snyk has created this PR to upgrade eslint-config-next from 13.1.6 to 13.5.3. See this package in npm: See this project in Snyk: https://app.snyk.io/org/gabehamilton/project/f1490843-1830-4eb0-a957-99816aa5edcc?utm_source=github&utm_medium=referral&page=upgrade-pr
Logs are causing the machine to run out of space, leading to crashes. I've removed some logs that are no longer particularly helpful for debugging. These logs also tend to be written on nearly every iteration, bloating logs both on the machine and in GCP.
As part of implementing a new control plane to oversee QueryApi resources, Runner needed endpoints through which the control plane can send commands. This PR adds a new gRPC server with endpoints to start, stop, or list Runner executors. This gives the control plane full control over Runner and removes the need for bilateral decision making, which was a problem in the previous design. The implementation will be revised further as needs become more concrete, closer to integration/release.
…28f516705c [Snyk] Upgrade eslint-config-next from 13.1.6 to 13.5.3
# Conflicts: # frontend/yarn.lock
Feat/editor error logging
This PR adds the `created_at_block_height` and `updated_at_block_height` fields to `IndexerConfig` within the registry contract. The motivation is to give Coordinator V2 a way to compare the actual and desired states of the system, i.e. if there is a mismatch between the registry and the system, action should be taken. Without versions, there is no way of making this comparison.

## Compilation Errors

~~I ran into several issues trying to compile the `wasm32` binary, and have outlined all these issues in near/near-sdk-rs#1125, as well as in the `README.md` so that the fixes are documented. These fixes are a bit janky, but I've tested the deployed contract and all seems to be ok.~~

These have been resolved, see: #458 (comment)

## Account Roles Migration

I've also included `account_roles` in this migration as we have some incorrect accounts as `Owner`s (`pavelnear.near`). All owners will be wiped and re-written from the contract default state. All `User`s will remain.

## Coordinator V1

Coordinator V1 has been tested to ensure that it can still parse the registry after these new fields have been applied.
…nges on the indexerDetails object. - Show the schema even if it fails the validations
…api into fix/load-schema-from-registry # Conflicts: # frontend/src/components/Editor/Editor.jsx
… validation when user is changing the code
Fix loading a schema from registry
This PR creates a Rust-based gRPC client for Runner, for use within Coordinator V2. For now, this exists as its own crate within the top-level directory. The Coordinator PR was becoming quite large so I decided to separate this out.

Additionally, I've made a couple of changes to the Runner proto:
- Rename the package: `spec` -> `runner`
- Remove `executorId` from `StartExecutorRequest` - it should be deterministic and assigned internally
- Add `version` to the executor - this will be used to determine whether an executor should be restarted
This PR adds the initial Coordinator V2 service, which acts as the main driver for the Control Plane.

## Coordinator Overview

Coordinator V2 will exist as a new standalone service whose primary goal is to ensure the current registry configuration is mapped to the system. Its core logic is just an infinite loop which reads the registry and sends the necessary requests to the Block Streamer and Runner services to synchronise that config.

## Block Streamer Changes

Some changes have been made to Block Streamer to enable the above:
1. `version` and `redis_stream` have been added to the proto so that Coordinator can configure them.
2. Support for `ActionFunctionCall` has been added - Initially I thought only `ActionAny` was allowed, but the current registry has `ActionFunctionCall`, which therefore needs to be supported.
3. `last_published_block` is now written to Redis to enable "Start from interruption"

## What's not been done

I wanted to limit the scope of this PR as it was starting to get big. I'll address these tasks in follow-up PRs:
- Provisioning - Coordinator should check the status of provisioning, and act when the state isn't as expected. This can be done after #426 is implemented.
- Retry recoverable errors - Any error is currently propagated, causing the service to exit; this includes connection errors to the Block Streamer and Runner services. As these are very likely to occur (across restarts), we should retry these errors.
- Environment configuration - There are many hard-coded values (endpoints, registry contract, etc.) which should be configurable via the environment.
- Logging and Metrics
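The synchronisation step described above can be sketched as a pure function that compares desired and actual state. This is only an illustration in TypeScript, not the Coordinator's actual Rust code: the `IndexerConfig`, `RunningProcess` and `Command` shapes, the `reconcile` name, and the decision to stop processes that are no longer registered are all assumptions.

```typescript
// Hypothetical shapes; the real registry and service types may differ.
interface IndexerConfig {
  id: string;
  version: number; // assumed to change whenever the registry entry is updated
}

interface RunningProcess {
  id: string;
  version: number;
}

type Command =
  | { kind: "start"; id: string; version: number }
  | { kind: "restart"; id: string; version: number }
  | { kind: "stop"; id: string };

// Compare the desired state (registry) with the actual state (running
// processes) and return the commands needed to bring them in line.
function reconcile(desired: IndexerConfig[], actual: RunningProcess[]): Command[] {
  const running = new Map<string, RunningProcess>();
  for (const process of actual) {
    running.set(process.id, process);
  }

  const commands: Command[] = [];
  for (const config of desired) {
    const current = running.get(config.id);
    if (current === undefined) {
      // Registered but not running.
      commands.push({ kind: "start", id: config.id, version: config.version });
    } else if (current.version !== config.version) {
      // Running, but with an outdated version.
      commands.push({ kind: "restart", id: config.id, version: config.version });
    }
    running.delete(config.id);
  }

  // Anything left is running but no longer registered (assumed: stop it).
  running.forEach((_process, id) => {
    commands.push({ kind: "stop", id });
  });
  return commands;
}
```

In such a sketch, the infinite loop would call `reconcile` on each iteration and translate the returned commands into requests to the Block Streamer and Runner services.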
More info about this can be found in this [issue](#483)

Changes in this PR:
- Validate code & schema before registering the indexer

Logic: If formatting either the code or the schema fails, the `Publish` button is disabled. The `Publish` button is enabled only when no errors, or only type generation errors, are detected. More about this in this [discussion](#480 (comment))

https://github.com/near/queryapi/assets/15988846/68f89cb4-f561-4e8a-9fa7-81c1a38e548c

Additionally:
- Created a reusable Modal with a global context to manage it, so we can trigger it from any component to display some info
- Updated schema with granular error types for improved clarity
- Created a custom error class to filter errors by type. We can add more fields to it if needed

---------

Co-authored-by: Juan Luis Santana <juanluis@near.org>
Co-authored-by: Roshaan Siddiqui <siddiqui.roshaan@gmail.com>
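The `Publish` rule described above can be illustrated with a minimal sketch. The `ValidationError` class, the error type names, and the `isPublishEnabled` helper are hypothetical and only mirror the stated logic; the frontend's real names may differ.

```typescript
// Hypothetical error categories; the real type names in the frontend may differ.
type ValidationErrorType = "FORMATTING_ERROR" | "TYPES_GENERATION_ERROR";

// Custom error class carrying a `type` field so callers can filter by it.
class ValidationError extends Error {
  readonly type: ValidationErrorType;

  constructor(message: string, type: ValidationErrorType) {
    super(message);
    this.name = "ValidationError";
    this.type = type;
  }
}

// Publishing is blocked only by formatting failures; type generation
// errors (or no errors at all) leave the button enabled.
function isPublishEnabled(
  codeError: ValidationError | null,
  schemaError: ValidationError | null
): boolean {
  const errors: ValidationError[] = [];
  if (codeError !== null) errors.push(codeError);
  if (schemaError !== null) errors.push(schemaError);
  return errors.every((error) => error.type !== "FORMATTING_ERROR");
}
```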
Currently, the `shard`/`chunk` count is hard-coded to 4 so that we can fetch the block header and shards in parallel. This PR removes the hard-coded value by using the chunk count specified in the block header to fetch the relevant shards.
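As a rough illustration of deriving the shard list from the block rather than a constant, the sketch below builds one object key per chunk. The `BlockSummary` shape, the `shardKeys` helper, and the key layout are assumptions made for the example, not the project's actual types or storage paths.

```typescript
// Hypothetical, trimmed-down view of a fetched block; only the chunk list matters here.
interface BlockSummary {
  height: number;
  chunks: unknown[];
}

// Build one key per chunk in the block instead of assuming four shards.
// The key layout is illustrative only.
function shardKeys(block: BlockSummary): string[] {
  return block.chunks.map((_chunk, shardId) => `${block.height}/shard_${shardId}.json`);
}
```

One consequence of this approach is that the block has to be fetched before the shard requests can be issued, since the count is no longer known up front.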
`fetchShardsPromises` is used within `Promise.all` to fetch all shards simultaneously. The `async` modifier means this function returns a `Promise`. `Promise.all` complains since it's passed a `Promise` rather than an `Array`. This PR removes the unnecessary `async` modifier so it works with `Promise.all`
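A minimal sketch of the corrected shape, assuming a hypothetical `fetchShard` helper and `Shard` type: because `fetchShardsPromises` is not `async`, it returns the array itself, which is the iterable that `Promise.all` expects.

```typescript
interface Shard {
  blockHeight: number;
  shardId: number;
}

// Hypothetical stand-in for a single shard fetch.
async function fetchShard(blockHeight: number, shardId: number): Promise<Shard> {
  return { blockHeight, shardId };
}

// No `async` modifier: the return value is an array of promises.
// Marking this `async` would wrap the array in a Promise, which is not
// iterable and therefore not a valid argument for `Promise.all`.
function fetchShardsPromises(blockHeight: number, numberOfShards: number): Array<Promise<Shard>> {
  const promises: Array<Promise<Shard>> = [];
  for (let shardId = 0; shardId < numberOfShards; shardId++) {
    promises.push(fetchShard(blockHeight, shardId));
  }
  return promises;
}

// All shard fetches are started first, then awaited together.
async function fetchShards(blockHeight: number, numberOfShards: number): Promise<Shard[]> {
  return await Promise.all(fetchShardsPromises(blockHeight, numberOfShards));
}
```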
Runner will need the ability to manage both V1 indexers (polling Redis to start streams) and V2 indexers (starting streams upon receiving calls to the gRPC server). The rollout plan for V2 involves a rolling, automatic migration. To support this, the main thread needs a shared map of executors. The server should always be available, the list API should return both V1 and V2 indexers, and Stop should be able to terminate any indexer. To prevent stopped V1 indexers from being restarted, the stream must be removed from the streams set before Runner performs the stop. Finally, I addressed some leftover TODOs, mainly returning more information, tracking the status of executors, and returning all available information when listing executors.
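The shared map and the stop ordering can be sketched roughly as follows. Everything here is hypothetical: the `ExecutorRegistry` class, its method names, and the in-memory `Set` standing in for the Redis streams set are illustrations, not Runner's actual implementation.

```typescript
type ExecutorSource = "V1" | "V2";

interface ExecutorInfo {
  executorId: string;
  redisStream: string;
  source: ExecutorSource;
}

class ExecutorRegistry {
  // Single map shared by the V1 polling loop and the V2 gRPC handlers.
  private readonly executors = new Map<string, ExecutorInfo>();
  // In-memory stand-in for the Redis set of streams that the V1 loop polls.
  private readonly streamsSet: Set<string>;

  constructor(streamsSet: Set<string>) {
    this.streamsSet = streamsSet;
  }

  register(executorId: string, redisStream: string, source: ExecutorSource): void {
    this.executors.set(executorId, { executorId, redisStream, source });
  }

  // Lists every executor regardless of how it was started.
  list(): ExecutorInfo[] {
    const result: ExecutorInfo[] = [];
    this.executors.forEach((executor) => result.push(executor));
    return result;
  }

  // Returns false when the executor is unknown.
  stop(executorId: string): boolean {
    const executor = this.executors.get(executorId);
    if (executor === undefined) {
      return false;
    }
    if (executor.source === "V1") {
      // Remove the stream first so the V1 polling loop does not restart it.
      this.streamsSet.delete(executor.redisStream);
    }
    this.executors.delete(executorId);
    return true;
  }
}
```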
The build fails in Dev because the proto file cannot be found. Copying the file explicitly in the Dockerfile resolves the issue. The issue was reproducible using docker compose to build and run the image locally.
The two commits that I authored that got added here would require terraform changes in prod. We could either leave them out of this release, or deploy the terraform changes in prod (adding the env variable). I'd probably lean towards leaving them out? Otherwise, the changes look fine to me. I haven't managed to get to the bottom of the failure in the UI for showing logs though. But if it works for you, then it's definitely not a code issue.
Did we add start and created block heights to the prod contract as well?
Yes, it has been deployed to prod.
It's probably easier to apply the terraform changes than to revert those PRs and then revert the reverts. What failure in the UI are you referring to?
@darunrs yeah that's specific to the dev cert, I've raised an issue with SRE. I'll go ahead and merge this release.
- feat: Make log/state Hasura tables `backend_only` (#462)
- fix: Ensure array is returned to `Promise.all` (#504)
- refactor: Configure `coordinator`/`block-streamer` via environment (#503)