-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move AI Logging Of Execution Into VSCode Frontend and out of RunMe GRPC Server #211
Comments
This PR defines the protos and logging service. It doesn't update the Analyzer to learn from these events. Related to: #211
Yes, when we initially considered porting Foyle on top of Runme, I also considered a more UI-aware approach first. However, it was a good call to lean into strength to PoC the integration. I'll review stateful/vscode-runme#1589 for feedback. One terminology I mildly object to is the term "frontend," although it's no objection to the proposal. I find that thinking about an IDE, even if a stripped-down version that could run a backend-less "frontend," minimizes the IDE's role and its importance. While it's a longer-term goal of Runme to have the ability to run as a web extension (in a way that makes sense in browser-only), referring to it as "frontend," factually, it is not "just" a frontend app. My web development knee-jerk is over-indexing on its meaning in the web world. Anyways, just a side note, really. |
|
Usually use "extension" for the whole concept and UI/UX (notebook, editor, terminal, ...) for literally the chrome that's facing the user. |
This PR defines the protos and logging service. It doesn't update the Analyzer to learn from these events. Related to: #211
|
As described in jlewi/foyle#211 we will no longer rely on processing RunMe grpc logs to train the AI. This means we can simplify the Runme logging code and revert some of the changes in #585 * Related to stateful/vscode-runme#1589
…ents (#222) Now that we no longer need to process RunMe logs we can simplify log processing * For the watermark we can just keep track of a single file and its offset * Only the latest file will be active. Therefore we don't need to watch the filesystem for modifications we can just periodically scan the logs. ## Changes to BlockLog Proto * We can remove ExecTraceIds * Since the AI doesn't handle executions we don't have traces for execution * We should rely on the Frontend sending details of execution (e.g. execute code) which it currently isn't * Since LogEvents are processed in order we know that the most recent cell execution will be the final one reported by a LogEvent * Add a field to record suggestion status * Right now we only get log events for cells being accepted we don't record cells being rejected but we can reasonably infer that unaccepted cells have been rejected ## Config Changes * We can deprecate the logDirs field in learner since users should no longer need to configure it to monitor RunMe Logs * Simplifying the setup process of the learner is one of the main motivations of this PR. ## Analyzer Changes *. We no longer need CombineTraces functions for RunMe and Execute traces * We just accumulate LogEvents on the BlockLog object as we sequentially process the logs. ## Other changes * Add an RPC method to check the logs status. * This will report the watermark * Add ZapProto function to handle logging protos that include RunMe objects * We can't rely on the custom go plugin to generate the MarshalObject function because the RunMe protos aren't using that plugin Related to #211
Frontend should have released |
* This effectively reverts #1380 * Per jlewi/foyle#211 we are no longer relying on RunMe logs * Rather RunMe now uses Foyle APIs to send events to Foyle as necessary.
* This effectively reverts #1380 * Per jlewi/foyle#211 we are no longer relying on RunMe logs * Rather RunMe now uses Foyle APIs to send events to Foyle as necessary.
* The bug is described in #215 (comment) * The bug is that `Config.getTrainingDirs` returns no training directories if config.learner == nil * Prior to #211 config.learner would be non-nil because we had to set the path of the RunMe logs * However, now that we no longer depend on RunMe logs config.learner could be nil and this would return no training directories. In which case learner.Reconcile would not attempt to save any examples * The fix is to allow config.Learner to be nil in GetTrainingDirs and return a suitable default. We also need to ensure the training directory gets created if it doesn't exist. * This is fixed by jlewi/monogo#23 which updates LocalFileHelper to create the directory if it doesn't exist. * My suspicion is that I never hit this bug because i originally created my ~/.foyle/training directory using a version of the code which wasn't using FileHelper and explicitly checked and created the directory. I suspect when I refactored the code to support saving examples to GCS thats when the code to ensure the directory exists got dropped.
* This flag is deprecated and no longer used. * The frontend should have stopped using this flag starting in 3.7.5 * The frontend is now up to 3.8.5 * So we should be good to remove this flag because the frontend should no longer be setting it. * Related to jlewi/foyle#211
* This flag is deprecated and no longer used. * The frontend should have stopped using this flag starting in 3.7.5 * The frontend is now up to 3.8.5 * So we should be good to remove this flag because the frontend should no longer be setting it. * Related to jlewi/foyle#211
* This flag is deprecated and no longer used. * The frontend should have stopped using this flag starting in 3.7.5 * The frontend is now up to 3.8.5 * So we should be good to remove this flag because the frontend should no longer be setting it. * Related to jlewi/foyle#211
Background
Foyle relies on logs of cell execution for feedback and training. Cell execution is currently logged from the RunMe GRPC server and not the vscode frontend. See stateful/runme#585. I think the backend was chosen over the frontend because I know GoLang much better than type script.
Logging from the backend has a number of disadvantages.
User setup
Right now the user has to configure Foyle with the location of the RunMe logs (docs). This is brittle and friction. Now that learning is working pretty well and critical to Foyle's value proposition I'd like to remove this friction.
Can't collect additional types of feedback.
As noted in Tech Note 008 there are valuable events that we'd like to log Accepting/Rejecting AI suggested cells. These events would need to be logged from the front end and not the RunMe gRPC server.
Complicates event processing
Processing AI events for learning is complicated by the fact that Foyle needs to be aware of log management in RunMe. RunMe can launch multiple instances of the RunMe server each with their own log file. Foyle has to monitor all the log files and doesn't necessarily know when a log file is complete and will no longer have events being written to it.
Requires AI Service and Executor to be co-located
Since Foyle needs access to the RunMe executor logs the services need to be colocated.
Proposal
The proposed solution is to introduce an RPC into the Foyle AI service to LogEvents.
Events can then be logged from the frontend via RPC.
With this change we can turn on logging by default without requiring any additional configuration from users. The only thing they would need to configure would be the Foyle endpoint.
@sourishkrout what do you think? Any suggestions or concerns; particularly about the changes to the frontend?
The text was updated successfully, but these errors were encountered: