feat: Toggle Runner Version #488

darunrs · 2024-01-04T18:57:09Z

Runner will need the ability to start using V1 (Polling Redis to start streams) or V2 (Starting streams upon receiving calls to gRPC server) depending on an environment variable we set before deployment. This way, we can make the migration from V1 to V2 and roll that back without too much difficulty.

To do this, the main thread needs a separate map of stream handlers owned by the gRPC server. The server should be spun up only in V2 mode, and Redis polling should happen only in V1 mode. In addition, the worker thread should poll Redis stream storage for indexer config only in V1 mode. To achieve that, only V2 passes the config into the worker upon executor init. This also provides version control over indexer config.

Finally, I addressed some leftover TODOs: mainly returning more information, tracking the status of executors, and returning all available information when listing executors.

For now, I've defaulted the version to V1 if the env variable is not set.
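
A minimal sketch of the toggle described above, assuming the variable is RUNNER_VERSION (the name used in the manual tests) and accepts only the values V1 and V2. The names RunnerVersion and resolveRunnerVersion are hypothetical and not taken from the PR, and rejecting unknown values is an assumption rather than confirmed behavior.

```typescript
// Hypothetical helper, not the PR's actual code.
type RunnerVersion = 'V1' | 'V2';

function resolveRunnerVersion(raw: string | undefined): RunnerVersion {
  // Default to V1 when the variable is unset (or empty).
  if (raw === undefined || raw === '') {
    return 'V1';
  }
  if (raw === 'V1' || raw === 'V2') {
    return raw;
  }
  // Assumption: fail fast on an unrecognised value instead of guessing.
  throw new Error(`Unsupported RUNNER_VERSION: ${raw}`);
}
```

The main thread could call this once with process.env.RUNNER_VERSION and branch on the result: start Redis polling for V1, or spin up the gRPC server for V2.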

darunrs · 2024-01-04T19:23:50Z

Some tests I ran manually:

Run unset RUNNER_VERSION and verify Runner starts in V1.
Run export RUNNER_VERSION=V1 and verify V1 started. Tried calling server and received connection refused.
Run export RUNNER_VERSION=V2 and verify V2 started. Then, make the following calls:

list executors and verify empty list returned
start one executorA which runs
list executors and verify it shows up with running status
stop the executorA
list again and verify no executors show up
start executorA again
start another executorB with a hard coded flag which makes worker crash if the account matches this indexer
list executors immediately after executorB and verify both executors show up with running status
verify executorA continues to run while executorB stops running
list executors again and verify A has running status while B has stopped status
stop executorB
list executors and verify executorA is only one in list

runner/src/index.ts

darunrs · 2024-01-10T23:21:52Z

We have decided to pivot from a full release of V2 to a rolling release. As a result, this PR will instead focus on an approach where V1 and V2 share the same stream handlers, allowing Coordinator to modify the Redis streams set on its own and turn off any executor, V1 or not. This lets us later introduce an allow list to transition indexers to V2 while requiring only Coordinator to have visibility of the allow list.

darunrs · 2024-01-12T23:35:09Z

New use cases tested manually:
Redis set executor stopped by Server
Stopped executor restarted by server
Stopped executor restarted by redis
List shows both redis and non-redis streams

I believe that by returning a version of -1, we can make the change in Coordinator V2 simple (I might be wrong about this):
If we put an if statement for the synchronizing executors code for the allow list, then whenever something is added, the code will see an existing executor, see a mismatched version, and stop and restart it. We only need to remove the stream from Redis before restarting. This could be a check for if version is -1 or something. Just an idea.
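
A rough sketch of the idea above, written in TypeScript purely for illustration. The names ListedExecutor, SyncAction, and planSync are hypothetical, the version is modelled as a plain number rather than the Long returned by the list call, and none of this is taken from Coordinator's actual code.

```typescript
// Sentinel version reported for V1 (Redis-driven) executors, per the comment above.
const V1_VERSION_SENTINEL = -1;

// Hypothetical shape of one entry returned when listing executors.
interface ListedExecutor {
  executorId: string;
  version: number;
}

type SyncAction = 'start' | 'migrate-from-v1' | 'restart' | 'none';

// Decide what to do for an allow-listed indexer given what Runner reports.
function planSync(existing: ListedExecutor | undefined, desiredVersion: number): SyncAction {
  if (existing === undefined) {
    return 'start'; // nothing running yet
  }
  if (existing.version === V1_VERSION_SENTINEL) {
    // V1 executor: remove its stream from Redis first, then stop and start it.
    return 'migrate-from-v1';
  }
  if (existing.version !== desiredVersion) {
    return 'restart'; // ordinary version mismatch
  }
  return 'none';
}
```

Treating -1 as its own case keeps the Redis cleanup step separate from an ordinary restart on version mismatch.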

I've also added an env variable lock on the start and stop endpoints so we can deploy this to any environment without explicitly enabling V2 functionality until we are ready.

morgsmccauley

Great job! Just a few comments :)

runner/package-lock.json

runner/src/server/runner-server.ts

runner/src/server/runner-service.ts

morgsmccauley · 2024-01-15T02:34:37Z

runner/src/server/runner-service.ts

-            throw new Error(`Stream handler ${executorId} has no/invalid indexer config.`);
+        executors.forEach((handler, executorId) => {
+          let config = handler.getIndexerConfig();
+          if (config === undefined) {


Under what conditions is this undefined?

Oh because the config is read internally 🤔 So we'll get no information for all V1 indexers?

Correct. The config is passed in for V2 executors and is used instead of pulling config from Redis. So, aside from my hardcoded, obviously incorrect version number, the empty config should be another indicator that it's a V1 indexer.

We'll need the account ID/function name so we know which executor to stop, we can probably extract that from the redis stream key, which in this case would be the executorId?

Yep that's correct. The redis stream is the "executorId" for V1 indexers. In fact, list functions API returns the redis stream key as the executor Id for V1 indexers.

The executorId is guaranteed to be correct since it's populated from the loop through the map of stream handlers itself.

So in that case can we extract the account_id and function_name from the redis stream and populate that information here

I hadn't populated it before since I thought it wasn't needed. What's the information used for given executorId is all that's necessary to drop an indexer?

The executorId is all that is needed to stop the executor, but we need to know which indexer that ID corresponds to in order to stop the right one.

We could do the executorId/redis stream parsing on the Coordinator side, but I want to avoid making that assumption since it won't be true in V2.

Oh I see. Ok, sorry I understand what you're getting at now. Yeah, let me go ahead and do that parsing.

runner/src/stream-handler/stream-handler.ts

darunrs · 2024-01-17T00:39:36Z

Example listing of V1 and V2 indexers together. I manually populated streams with 2 and then added one more using the API.

start:  {
  executorId: '5fa3c3ee92d875791598e775738d1fd98a889a10979b600c154e40f9965892c8'
}
list:  {
  executors: [
    {
      executorId: 'flatirons.near/social_blockheight:real_time:stream',
      accountId: 'flatirons.near',
      functionName: 'social_blockheight',
      status: 'RUNNING',
      version: [Long]
    },
    {
      executorId: 'flatirons.near/sweat_blockheight:real_time:stream',
      accountId: 'flatirons.near',
      functionName: 'sweat_blockheight',
      status: 'RUNNING',
      version: [Long]
    },
    {
      executorId: '5fa3c3ee92d875791598e775738d1fd98a889a10979b600c154e40f9965892c8',
      accountId: 'darunrs.near',
      functionName: 'test_sweat_blockheight',
      status: 'RUNNING',
      version: [Long]
    }
  ]
}
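
For the V1 entries above, accountId and functionName can be recovered from the Redis stream key used as the executorId. The sketch below assumes keys shaped like `<accountId>/<functionName>:real_time:stream`, as in the listing. The names IndexerIdentity and parseStreamKey are hypothetical and not the PR's actual implementation.

```typescript
// Hypothetical result type for the parsed key.
interface IndexerIdentity {
  accountId: string;
  functionName: string;
}

// Split "<accountId>/<functionName>:<suffix...>" into its two identifying parts.
// Returns undefined when the key does not match that shape (e.g. a V2 hash ID).
function parseStreamKey(streamKey: string): IndexerIdentity | undefined {
  const slash = streamKey.indexOf('/');
  if (slash <= 0) {
    return undefined; // no account prefix present
  }
  const accountId = streamKey.slice(0, slash);
  const rest = streamKey.slice(slash + 1);
  const colon = rest.indexOf(':');
  // Everything before the first ':' is treated as the function name.
  const functionName = colon === -1 ? rest : rest.slice(0, colon);
  if (functionName === '') {
    return undefined;
  }
  return { accountId, functionName };
}
```

Doing this parsing in Runner rather than Coordinator matches the review discussion: a V2 executorId is an opaque hash, so Coordinator should not assume it can be split.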

morgsmccauley

Great work :)

darunrs marked this pull request as ready for review January 4, 2024 19:25

darunrs requested a review from a team as a code owner January 4, 2024 19:25

darunrs requested a review from morgsmccauley January 4, 2024 19:25

morgsmccauley reviewed Jan 5, 2024

darunrs changed the base branch from rustClient to main January 6, 2024 00:00

darunrs force-pushed the toggleRunnerVersion branch 2 times, most recently from 8da0e9b to 23dd326 Compare January 8, 2024 22:47

darunrs linked an issue Jan 10, 2024 that may be closed by this pull request

Enable Toggle between Endpoint and Redis Control of Executors #478

Closed

morgsmccauley reviewed Jan 15, 2024

darunrs added 12 commits January 16, 2024 10:43

Initial Toggle

db0a765

Address TODOs

69696cc

Add tests and perform manual tests

48373f8

Increase timeout

bb0a862

fix: Move while loop into V1 only and rebase main

c924629

Add back codegen on test

bf0ee8c

Add map to constructor of grpc service

ecedf03

Enable simultaneous handling of redis and grpc executors

a686e83

feat: Add toggle for start and stop endpoints

a5e34ff

fix: Set V2 for tests

ada66d5

Address PR Comments

2316824

fix: Update docker compose and remove test runner version setting

341e5e6

darunrs force-pushed the toggleRunnerVersion branch from 2e04c8b to 341e5e6 Compare January 16, 2024 18:50

darunrs added 2 commits January 16, 2024 11:43

Migrate Executor Status to its own Object

1f34e53

Return account ID and function name for V1 indexers

810a686

morgsmccauley approved these changes Jan 17, 2024

darunrs merged commit d403006 into main Jan 17, 2024
3 checks passed

darunrs deleted the toggleRunnerVersion branch January 17, 2024 17:56

morgsmccauley mentioned this pull request Apr 22, 2024

test stable branch git fix up #687

Closed
