Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Brittleness In Logs Processing With Source Detection #232

Open
jlewi opened this issue Sep 12, 2024 · 1 comment
Open

Fix Brittleness In Logs Processing With Source Detection #232

jlewi opened this issue Sep 12, 2024 · 1 comment

Comments

@jlewi
Copy link
Owner

jlewi commented Sep 12, 2024

Our logs processing pipeline is based on using the source of the logging statement to figure out how to parse/post process the logging line. This is brittle because if the code is refactored we change the source location. Its also brittle because its easy to get the function name wrong.

For example here the log is inside an anonymous function

_, span := tp.Start(ctx, "LogEvent", trace.WithAttributes(attribute.String("eventType", event.Type.String()), attribute.String("contextId", event.ContextId), attribute.String("selectedCellId", event.SelectedId)))

As a result the function name in the logging statement ends up being

github.com/jlewi/foyle/app/pkg/agent.(*Agent).LogEvents.func1

One approach we've taken to starting to deal with this is

  1. Defining constants for the function name
  2. Defining unittests that use reflection to get the name of the function and ensure its equal to the constant

https://github.com/jlewi/foyle/pull/230/files#diff-66abb051aad86c8af4c128d3fa1e65c47197c44e23d821a75346e59c833baa92R21

This is imperfect because it doesn't actually check that there is a logging statement there. For example, the test won't catch the case where the logging statement is nested inside an anonymous function.

I wonder if we could fix this by doing more source code analysis? For example, could we use the same tooling used by linters to find the lines associated with a log statement and get the function name associated with it?

@jlewi
Copy link
Owner Author

jlewi commented Sep 12, 2024

Some suggestions from Claude
https://gist.github.com/jlewi/6058026b421867bf72febdead72e70b3

jlewi added a commit that referenced this issue Sep 12, 2024
* Learning was broken because cell executions were not being properly
extracted from LogEvents and added to the BlockLog.
* The problem is that the logging of LogEvents happens inside an
anonymous function. Therefore the function name generated by GoLang has
a suffix like "func1"; therefore our code in the Analyzer was not
correctly matching it and therefore log events were not being processed.
* The fix is to match by using `HasPrefix`. This is still brittle. 
* Per #232 we'd like to start to fix this brittleness by using
reflection and source code analysis in unittests to ensure the
correctness of matchers for LogEntries.
* This PR introduces a matchers package which verifies correctness of
the function name but doesn't properly catch the anonymous function
which broke learning here

* Update the troubleshooting guide for learning
  * Move the troubleshooting guide into the docs page
* Update the instructions for fetching the blocklogs; we now use the
connect protocol

## Strip metadata from markdown

* Create a simple tool to strip cell ids from markdown.
* This is hack to make it easy to use RunMe markdown docs with Hugo; see
stateful/runme#663
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant