Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve logging and tracing #139

Closed
1 task done
bobbinth opened this issue Jan 11, 2024 · 4 comments
Closed
1 task done

Improve logging and tracing #139

bobbinth opened this issue Jan 11, 2024 · 4 comments
Assignees
Milestone

Comments

@bobbinth
Copy link
Contributor

bobbinth commented Jan 11, 2024

We already have some rudimentary logging in place, but we need to improve this quite a bit (I had to do a lot of manual logging when debugging #133). The things I think we should do are:

  1. Use logging much more generously. We don't need to log every single function call, but we should be able to see how things move through the pipeline. For things which are not super important, we can use trace log level.
  2. Log relevant data for various actions. For example, it would be great to log hash of the block that is being created (or maybe even entire block header).
  3. Add timing information for important events - e.g., how long it took to load the store, build a block, insert the new block into the DB etc.

Tasks

@hackaugusto
Copy link
Contributor

Log relevant data for various actions. For example, it would be great to log hash of the block that is being created (or maybe even entire block header).

We should have a single key that is used to correlate multiple lines, because of the async nature of the server we can't rely on the ordering of lines (it will be interleaved from multiple requests). IMO the key should be the request id used for the distributed tracing, since for example, the RPC does not have the block details to include.

Add timing information for important events - e.g., how long it took to load the store, build a block, insert the new block into the DB etc.

These should be done with spans from the tracing framework. It should allow us to have the timing information for the entire transaction, from the rpc down to the store.

@bobbinth
Copy link
Contributor Author

We should have a single key that is used to correlate multiple lines, because of the async nature of the server we can't rely on the ordering of lines (it will be interleaved from multiple requests). IMO the key should be the request id used for the distributed tracing, since for example, the RPC does not have the block details to include.

Agreed for requests that come through the RPC. For other things we may not have natural request IDs. For example:

  • When a batch is generated we should log some kind of batch ID and transaction IDs of transactions that went into this batch.
  • When the block producer picks batches to put into a block, we should also log that such and such batches got selected to go into a block.
  • When the block producer builds a block, we should log that a block with such and such hash containing such and such batches has been built.
  • When the store receives a block, we should log that a block with such and such hash has been received.
  • etc.

@polydez polydez moved this from Todo to In Progress in Builder's testnet Jan 16, 2024
@hackaugusto
Copy link
Contributor

hackaugusto commented Jan 16, 2024

Agreed for requests that come through the RPC. For other things we may not have natural request IDs.

We shouldn't use a natural ID, even if one exists. Given this is a distributed application retries will happen, and we must have different IDs for reach retry. To put this in other words, a request is not the same as the data it contains, and should not use the request's data to derive an ID.

@polydez polydez moved this from In Progress to Done in Builder's testnet Feb 1, 2024
@bobbinth
Copy link
Contributor Author

bobbinth commented Feb 5, 2024

Closed by #172, #189, #190 and others.

@polydez, could you create an issue for adding request IDs for tracking requests across different components?

@bobbinth bobbinth closed this as completed Feb 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

3 participants