Sessionization and building an eval dataset #230

Merged
merged 19 commits into from
Sep 17, 2024

Conversation

jlewi
Owner

@jlewi jlewi commented Sep 7, 2024

  • Build an evaluation dataset from logs
  • Create a sessionization pipeline
  • Refer to tn011_eval_data for a tech note describing the PR

Use SQLite for Sessions

  • This PR introduces the use of SQLite to store sessions

  • We use SQLite rather than a KV store (e.g. pebbledb) because we want to be able to index the sessions along various dimensions and search for those sessions.

  • We use the cgo-free port of SQLite (modernc.org/sqlite) to avoid introducing cgo

  • We use sqlc to generate type-safe code to run various queries

    • This seems to be more flexible than introducing an ORM while still giving us well-structured code
  • The generated SQL files live in the pkg/analyze/fsql package

…eptance rate

* Add spans in the completer to track token usage
* Add spans for LogEvent
* Ensure http clients for OpenAI and Anthropic use the OTEL transport.

netlify bot commented Sep 7, 2024

Deploy Preview for foyle ready!

Name Link
🔨 Latest commit 9584c67
🔍 Latest deploy log https://app.netlify.com/sites/foyle/deploys/66e8ed2d720ea80008ee6d85
😎 Deploy Preview https://deploy-preview-230--foyle.netlify.app

jlewi added 17 commits September 7, 2024 03:37
* Add helper method to create a config object using a temporary directory for use in testing
* Start defining constants for function names of logging messages; use reflection to ensure correctness in a unittest
* We want to incrementally checkpoint log processing when processing very long files.
* Otherwise we could fail to make progress on very long files.
* As part of this processing loop we need to periodically check whether the application has been shut down and, if so, stop processing log entries.
* Define an EvalExample proto
* Define code to generate the EvalExamples from the sessions
Also start adding protos for ListSessions and GetSession
@jlewi jlewi marked this pull request as ready for review September 17, 2024 02:45
@jlewi jlewi enabled auto-merge (squash) September 17, 2024 02:45
@jlewi jlewi merged commit 6035823 into main Sep 17, 2024
5 checks passed
@jlewi jlewi deleted the jlewi/otel branch September 17, 2024 02:47