feat: Introducing Logging Class (Disabled Usage of Logger) #608

Merged
merged 38 commits into from
Apr 4, 2024

Conversation

Kevin101Zhang
Contributor

@Kevin101Zhang Kevin101Zhang commented Mar 19, 2024

Added code for writing logs to new logs table through Postgres instead of Hasura.

https://www.loom.com/share/ff21d7099cac403d9152c905f7e4ddcc?sid=5828ae99-377b-4510-ac8c-76c02fd232f2

@Kevin101Zhang Kevin101Zhang requested a review from a team as a code owner March 19, 2024 20:10
@Kevin101Zhang
Contributor Author

Kevin101Zhang commented Mar 19, 2024

Remaining cases until completion:

  1. cron statements
  2. add cases where the user's indexer is already provisioned and we must provision only the logs table
  3. update logger.test & indexer test

Collaborator

@darunrs darunrs left a comment

Nice improvements. Left various comments. I think once you've addressed all of these, you can open up the PR to get full feedback. But, you also need to create a PR for the infra repo to add cron. And make sure that the cron works. Make sure you understand what dependency on infrastructure there is for this code change. If Infra changes need to be released first, don't merge this PR until the infra changes are up.

runner/src/hasura-client/hasura-client.ts
const pgClient = pgClientInstance ?? new PgClient({
user: databaseConnectionParameters.username,
password: databaseConnectionParameters.password,
host: process.env.PGHOST,
Collaborator

Suggested change
- host: process.env.PGHOST,
+ host: databaseConnectionParameters.host

With the talk of multiple database instances, there is potential that this could be wrong. It's unlikely, but this will avoid a headache in the future :)

morgsmccauley added a commit that referenced this pull request Mar 29, 2024
This PR expands provisioning to also schedule the cron jobs for
adding/deleting log partitions. It assumes:
1. The `cron` database exists and has `pg_cron` enabled
(near/near-ops#1665)
2. The `__logs` table exists and has the partition functions defined
(#608)

In relation to this flow, the high-level steps are:
1. Use an admin connection to the `cron` database to grant the required
access to the user
2. Use a user connection to the `cron` database to schedule the jobs

The cron job is executed under the user which schedules the job,
therefore the user _must_ schedule the job as they are the only ones who
have access to their schemas. If the admin were to schedule the job the
job itself would fail as it doesn't have the required access.

Merging this before 2. is fine, the jobs will just fail, but should
start to succeed after it has been implemented.
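
The two-step flow above could be sketched roughly as follows. This is a minimal sketch, not the code from the referenced commit: it assumes pg_cron's `cron.schedule_in_database` is available, and the helper name `buildCronSetup`, the partition function names (`fn_create_partition`, `fn_delete_partition`), the job names, and the schedules are all hypothetical. Depending on how pg_cron is installed, grants beyond schema usage may also be needed.

```typescript
// Sketch only: builds the SQL for the admin-grant / user-schedule flow.
// Function names, job names, and schedules below are hypothetical.

interface CronSetup {
  adminStatements: string[]; // run over an admin connection to the `cron` database
  userStatements: string[]; // run over the user's own connection to the `cron` database
}

// Minimal SQL quoting helpers for identifiers and string literals.
const quoteIdent = (name: string): string => `"${name.replace(/"/g, '""')}"`;
const quoteLiteral = (value: string): string => `'${value.replace(/'/g, "''")}'`;

// Builds one pg_cron scheduling call; the job will run as whoever executes it.
function scheduleJob(jobName: string, schedule: string, command: string, database: string): string {
  const args = [jobName, schedule, command, database].map((arg) => quoteLiteral(arg)).join(', ');
  return `SELECT cron.schedule_in_database(${args});`;
}

function buildCronSetup(userName: string, schemaName: string, databaseName: string): CronSetup {
  const logsTable = quoteLiteral(`${schemaName}.__logs`);
  return {
    // Step 1: the admin grants the user access to the cron schema.
    adminStatements: [`GRANT USAGE ON SCHEMA cron TO ${quoteIdent(userName)};`],
    // Step 2: the user schedules the jobs, so they execute with the user's access.
    userStatements: [
      scheduleJob(`${schemaName}_logs_create_partition`, '0 1 * * *', `SELECT fn_create_partition(${logsTable})`, databaseName),
      scheduleJob(`${schemaName}_logs_delete_partition`, '0 2 * * *', `SELECT fn_delete_partition(${logsTable})`, databaseName),
    ],
  };
}
```

Keeping the statements split by connection makes the ownership rule explicit: anything in `userStatements` must be executed by the user, otherwise the scheduled job would run without access to the user's schemas.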
Collaborator

@darunrs darunrs left a comment

Here's a new round of comments. It's definitely real close to being ready. I also unresolved some comments which were not actually addressed.

My suggestion is to resolve the comment only after you made that particular code change and committed it locally. Or resolve them after pushing changes up to the PR. This way, you don't lose track of them. Resolving is not necessary to merging a PR, so do what works to ensure they're handled.

Collaborator

@morgsmccauley morgsmccauley left a comment

There are a couple problems with this PR which I've commented inline.

This PR has become massive, and very confusing to track due to the number of comments. That's partly my fault, sorry. This definitely needs more work, but I think it would create more confusion if done here.

I'd recommend merging this with the main functionality commented out:

  • Provisioning of the logs table (runLogsSql() call in Provisioner)
  • Flushing the logs to the database (the writeLogs call in Promise.all in Indexer)

That way we're not running any new and potentially problematic code, and can make changes to it under the hood.

With the above disabled, functionality should be exactly as it is on main. Therefore indexer.test.ts and integration.test.ts can be reverted to their original state and should pass. That should give us a fair amount of confidence that what we are pushing is not going to break anything.

With all that done, we should collate a list of pending tasks which we can break up into future smaller PRs.

Let me know if you need help with any of this. I'm happy to sync up.

@@ -54,6 +54,32 @@ describe('Indexer integration', () => {
database: postgresContainer.getDatabase(),
});

const mockPgClient = {
Collaborator

The purpose of integration tests is to integrate with real components. Can you please remove these mocks?

You can use the hasuraClient to get the real DB credentials and inject those into IndexerLogger. Happy to sync up if you need help here.

Contributor Author

Yeah will go ahead and use the real instance here

- const indexer = new Indexer(defaultIndexerBehavior, { fetch: mockFetch as unknown as typeof fetch, DmlHandler: genericMockDmlHandler }, undefined, undefined, config);
- const context = indexer.buildContext(SIMPLE_SCHEMA, INDEXER_NAME, 1, HASURA_ROLE);
+ const indexer = new Indexer(defaultIndexerBehavior, { fetch: mockFetch as unknown as typeof fetch, DmlHandler: genericMockDmlHandler, IndexerLogger: genericMockIndexerLogger }, undefined, undefined, undefined, config);
+ const context = indexer.buildContext(SIMPLE_SCHEMA, INDEXER_NAME, 1, HASURA_ROLE, []);
Collaborator

Could you please add some tests for logging itself? We should assert that the logs themselves actually translate to calls to Postgres

Contributor Author

This is done in indexer-logger.test.ts

Collaborator

Ah, yes, you're right. But we're not testing that IndexerLogger is actually being used within Indexer, i.e. if I have an Indexer which logs a message, are we actually calling IndexerLogger? That flow is not covered.

Contributor Author

@Kevin101Zhang Kevin101Zhang Apr 4, 2024

It's being covered along with the log level. There's a specific test called "writeLog is respecting level" in indexer.test.ts. I believe we create an indexer and initialize the logger in runFunctions, then test that writeLog is called on the logger when the log level is at a certain level.

Collaborator

No, what he means is that we check how many times IndexerLogger is called in Indexer, and which logs it is given.

For example, some tests to verify:

We run an indexer which makes 5 log calls, some at debug level, with an info-level config. We then expect that IndexerLogger's writeLogs is called once, and that it was called with ALL FIVE LOGS, since the filtering is done internally.

In other words, we don't care if IndexerLogger does the right thing. But we do care if Indexer uses the logger class correctly. So, check scenarios like, the indexer throws an error but we still call writeLogs. Or we call it with all logs produced in various scenarios. Stuff like that.
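
The scenarios above could be sketched roughly like this. It is a minimal, self-contained sketch under stated assumptions: `runIndexer` is a hypothetical stand-in for Indexer (not the real class), and `RecordingLogger` is a hand-rolled recorder playing the role that a `jest.fn()`-based mock would play in indexer.test.ts.

```typescript
// Sketch only: `runIndexer` and `RecordingLogger` are hypothetical stand-ins.

type LogLevel = 'debug' | 'info' | 'error';

interface LogEntry {
  level: LogLevel;
  message: string;
}

interface LogWriter {
  writeLogs(entries: LogEntry[]): Promise<void>;
}

// Records every batch it receives; it performs no level filtering itself.
class RecordingLogger implements LogWriter {
  readonly calls: LogEntry[][] = [];

  async writeLogs(entries: LogEntry[]): Promise<void> {
    this.calls.push([...entries]);
  }
}

type LogFn = (level: LogLevel, message: string) => void;

// Buffers every log regardless of level and flushes once, even on failure.
async function runIndexer(logger: LogWriter, userCode: (log: LogFn) => void): Promise<void> {
  const buffer: LogEntry[] = [];
  try {
    userCode((level, message) => {
      buffer.push({ level, message });
    });
  } finally {
    await logger.writeLogs(buffer);
  }
}

async function demo(): Promise<{ happy: RecordingLogger; failing: RecordingLogger }> {
  // Scenario 1: five logs at mixed levels are handed over in a single call.
  const happy = new RecordingLogger();
  await runIndexer(happy, (log) => {
    log('debug', 'one');
    log('info', 'two');
    log('debug', 'three');
    log('error', 'four');
    log('info', 'five');
  });

  // Scenario 2: the indexer code throws, but the logs are still flushed.
  const failing = new RecordingLogger();
  await runIndexer(failing, (log) => {
    log('info', 'before failure');
    throw new Error('boom');
  }).catch(() => undefined); // the error propagates; it is swallowed only for this demo

  return { happy, failing };
}
```

The assertions of interest are on the recorder, not on the logger's behavior: one `writeLogs` call per run, containing every log produced, including when the indexer code fails.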

Contributor Author

#4

Collaborator

@darunrs darunrs left a comment

Super duper close. Some problems with the implementation of testing and some cleanup you can do with the Indexer code.

format: jest.fn().mockReturnValue('mock')
} as unknown as PgClient;

const indexerLoggerInstance: any = new IndexerLogger(
Collaborator

We don't want to actually test the implementation of indexerLogger in this suite of tests. If there is a bug in IndexerLogger, it'll show up here, which will be confusing. Please create a generic mock of indexerLogger with pass-through functions (every function is a jest.fn() that does the right thing).

@@ -225,7 +242,7 @@ CREATE TABLE
shards: {}
} as unknown as StreamerMessage) as unknown as Block;

- const indexer = new Indexer(defaultIndexerBehavior, { fetch: mockFetch as unknown as typeof fetch, provisioner: genericProvisioner, DmlHandler: genericMockDmlHandler }, undefined, undefined, config);
+ const indexer = new Indexer(defaultIndexerBehavior, { fetch: mockFetch as unknown as typeof fetch, provisioner: genericProvisioner, DmlHandler: genericMockDmlHandler, IndexerLogger: genericMockIndexerLogger }, undefined, undefined, indexerLoggerInstance, config);
Collaborator

The variable you pass into the { ... } is used differently than the one you pass in after the undefined.

The one in the {} is supposed to be used to initialize the class variable indexer_logger if it isn't already defined. You actually no longer need this as you never use it. I mentioned that later in the PR.

The indexerLoggerInstance should in fact be whatever you have genericMockIndexerLogger as.

@@ -273,9 +290,9 @@ CREATE TABLE
}
})
});
- const indexer = new Indexer(defaultIndexerBehavior, { fetch: mockFetch as unknown as typeof fetch, DmlHandler: genericMockDmlHandler }, undefined, undefined, config);
+ const indexer = new Indexer(defaultIndexerBehavior, { fetch: mockFetch as unknown as typeof fetch, DmlHandler: genericMockDmlHandler, IndexerLogger: genericMockIndexerLogger }, undefined, undefined, undefined, config);
Collaborator

So the behavior here is like this. For DmlHandler, there's genericMockDmlHandler in the deps object. And undefined passed as dmlHandler. We're forced to do this in order to pass in Config.

Later on, if runFunctions is called we will check if dmlHandler is not undefined. But we pass in undefined here. So, what would happen is we would then use deps to generate the object and set it. This is where we can inject our mock.

It's actually no longer necessary to have DmlHandlerClass passed into deps, since we can now avoid creating an actual object by passing a mock DmlHandler into the class directly.

Anyway, the reason I explain this all now is that you can use that to properly use IndexerLogger's mocks during testing. In this case, it's mocking the entire class, and we can probably remove it from deps.

Don't worry about changing how DmlHandler does it. I'll test a fix of that later myself.
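
To illustrate the two injection paths described in this comment, here is a minimal sketch. It is a simplification, not the real class: `SketchIndexer` takes only two arguments (the real constructor takes more, including config), and `describe()` is a hypothetical method used only to show which handler ended up in use.

```typescript
// Sketch only: a simplified two-argument stand-in for the Indexer constructor.

interface DmlHandler {
  describe(): string; // hypothetical method, used to identify the handler in use
}

class RealDmlHandler implements DmlHandler {
  describe(): string {
    return 'real';
  }
}

interface Deps {
  DmlHandler?: new () => DmlHandler; // class used only when no instance is injected
}

class SketchIndexer {
  constructor(
    private readonly deps: Deps = {},
    private dmlHandler?: DmlHandler,
  ) {}

  runFunctions(): string {
    // Prefer the injected instance; otherwise build one from deps (or the default class).
    const HandlerClass = this.deps.DmlHandler ?? RealDmlHandler;
    const handler = this.dmlHandler ?? new HandlerClass();
    this.dmlHandler = handler;
    return handler.describe();
  }
}

// Path 1: inject a ready-made mock instance directly.
const mockDmlHandler: DmlHandler = { describe: () => 'mock instance' };

// Path 2: inject a mock class through deps and let the indexer construct it.
class MockDmlHandlerClass implements DmlHandler {
  describe(): string {
    return 'mock class';
  }
}

const viaInstance = new SketchIndexer({}, mockDmlHandler).runFunctions();
const viaDeps = new SketchIndexer({ DmlHandler: MockDmlHandlerClass }).runFunctions();
const viaDefault = new SketchIndexer().runFunctions();
```

When both are supplied, the instance wins and the class in deps is never constructed, which is why the deps entry becomes redundant once tests pass a mock instance directly.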

@@ -54,6 +54,32 @@ describe('Indexer integration', () => {
database: postgresContainer.getDatabase(),
});

const mockPgClient = {
Collaborator

I don't think we're supposed to use mocks in an integration test. That defeats the point. That being said, I don't think this should be a blocker for you to get this PR out, as it's gonna take you more time to figure out the usage of this. So, it's ok for now I think. Maybe @morgsmccauley can chime in here.

@Kevin101Zhang Kevin101Zhang changed the title from "feat: prototyped new logs schema" to "Introducing Logging Class (Disabled Usaged of Logger)" Apr 4, 2024
@darunrs darunrs changed the title from "Introducing Logging Class (Disabled Usaged of Logger)" to "Introducing Logging Class (Disabled Usage of Logger)" Apr 4, 2024
@darunrs
Collaborator

darunrs commented Apr 4, 2024

There is lots of remaining work, so we are going to do a phased rollout of these changes.

  1. We commented out usage of the new IndexerLogger, but defined it here. All existing tests pass with few modifications and Runner works locally without issue.
  2. We are going to modify feat: Provision logs for existing users #636 to conditionally provision the logs table for all indexers. This ensures that we don't have to worry about it again, and we can revert the commit after.
  3. We re-enable usage of IndexerLogger, this time with fully provisioned logs tables. And we can verify the actual functionality works. And we can write up unit tests correctly for that.

@darunrs darunrs requested a review from morgsmccauley April 4, 2024 19:20
@darunrs darunrs dismissed morgsmccauley’s stale review April 4, 2024 19:21

Resolved Morgan's comments offline. Doing the requested phased rollout.

@Kevin101Zhang Kevin101Zhang merged commit 0e3f1f6 into main Apr 4, 2024
3 checks passed
@Kevin101Zhang Kevin101Zhang deleted the 299-implement-the-new-logs-schema branch April 4, 2024 19:22
@Kevin101Zhang
Contributor Author

4 PRs.

  1. Fix all unresolved comments in original PR
     • Prefix log indexes with __
     • Move creation of logs table to abstracted function
     • Update LogEntry to be a class
  2. Uncomment provisioning of logs, add unit tests in Provisioner, and ensure integration tests pass
  3. Ensure auto-provisioning of logs for existing users works, and merge feat: Provision logs for existing users #636
  4. Uncomment logging calls, update unit tests in Indexer, and ensure integration tests pass

Kevin101Zhang added a commit that referenced this pull request Apr 9, 2024
Feat: created logEntry class and test cases
Chore:  relocated createLogs to abstracted func
Chore: renamed schema idx to prefix with '__'
morgsmccauley added a commit that referenced this pull request Apr 10, 2024
The provisioning flow will not be run for existing Indexers, this PR
adds a separate provisioning check/step which sets up the partitioned
logs table for existing users.

I've opted for an in-code approach as a "manual" migration script
requires specific timing, i.e. we'd need to deploy the logs change,
ensuring all new Indexers are provisioned correctly, and then migrate all
existing users to ensure that no Indexers are missed. But since the logs
provisioning change is coupled with the logging itself, existing
Indexers would fail to log until the migration is complete.

My only concern for this approach is a "thundering herd". After this is
deployed, all Indexers will attempt to provision their logs table at the
same time - I'll monitor this in Dev.

As this code is temporary, I didn't bother adding
instrumentation/unit-tests, nor worry about the performance impact. It
will be removed promptly.

This is dependent on #608 and should be merged after.
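
An in-code provisioning check of the kind described above might look roughly like the following. This is a minimal sketch under stated assumptions, not the code from the referenced commit: the `LogsProvisioner` interface, its method names, and the in-process `confirmedSchemas` cache are all hypothetical.

```typescript
// Sketch only: `LogsProvisioner` and its methods are hypothetical names.

interface LogsProvisioner {
  isLogsProvisioned(schemaName: string): Promise<boolean>;
  provisionLogs(schemaName: string): Promise<void>;
}

// Schemas already confirmed in this process, so the check runs at most once each.
const confirmedSchemas = new Set<string>();

// Separate step for existing Indexers, which never re-run the normal provisioning flow.
async function ensureLogsProvisioned(provisioner: LogsProvisioner, schemaName: string): Promise<void> {
  if (confirmedSchemas.has(schemaName)) return;
  if (!(await provisioner.isLogsProvisioned(schemaName))) {
    await provisioner.provisionLogs(schemaName);
  }
  confirmedSchemas.add(schemaName);
}

// In-memory fake used to exercise the check without a database.
class FakeProvisioner implements LogsProvisioner {
  readonly provisioned = new Set<string>();
  checks = 0;
  provisionCalls = 0;

  async isLogsProvisioned(schemaName: string): Promise<boolean> {
    this.checks += 1;
    return this.provisioned.has(schemaName);
  }

  async provisionLogs(schemaName: string): Promise<void> {
    this.provisionCalls += 1;
    this.provisioned.add(schemaName);
  }
}

async function demoProvisioning(): Promise<FakeProvisioner> {
  const fake = new FakeProvisioner();
  await ensureLogsProvisioned(fake, 'alice_feed'); // first run: checks, then provisions
  await ensureLogsProvisioned(fake, 'alice_feed'); // later runs: skipped via the cache
  return fake;
}
```

Because the check is idempotent, it can run before logging on every Indexer without a coordinated migration window, at the cost of every Indexer hitting the database once after deployment.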
@Kevin101Zhang Kevin101Zhang changed the title from "Introducing Logging Class (Disabled Usage of Logger)" to "feat: Introducing Logging Class (Disabled Usage of Logger)" Jul 24, 2024
Development

Successfully merging this pull request may close these issues.

Implement Logging and Setting Status/BlockHeight for new Tables
5 participants