refactor: sqlite async session for graphql api #2784
        project_rowid := await session.scalar(
            text("SELECT rowid FROM projects WHERE name = :name;"),
            {"name": project_name},
        )
    ):
        project_rowid = await session.scalar(
            text("INSERT INTO projects(name) VALUES(:name) RETURNING rowid;"),
            {"name": project_name},
        )
This makes you hit the database on every insert. I think this is where we can add a bit of optimization via memoization: keep a set of projects you've already seen, and if you've seen one before you don't need to insert.
Yeah, that makes sense. We may have to cache it outside the handler, though.
async def _insert_span(session: AsyncSession, span: Span, project_name: str) -> None:
    if not (
        project_rowid := await session.scalar(
            text("SELECT rowid FROM projects WHERE name = :name;"),
Is this safe from SQL injection?
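Yes. The colon (:name) is one of several placeholder styles for parameter binding, like ?, but SQLAlchemy's text() only accepts the colon syntax. The value is bound separately and never interpolated into the SQL string.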
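To make the point about bound parameters concrete, here is a small sketch. It assumes the standard-library `sqlite3` module, which accepts the same `:name` placeholder style, instead of SQLAlchemy's `text()`; the table contents and the `hostile_name` value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE projects(name TEXT);")
conn.execute("INSERT INTO projects(name) VALUES(:name);", {"name": "default"})

# A hostile value that would change the statement if it were spliced into the SQL text.
hostile_name = "x' OR '1'='1"

# Bound parameter: the value travels separately from the SQL,
# so it is compared as one literal string and matches nothing.
bound_rows = conn.execute(
    "SELECT rowid FROM projects WHERE name = :name;",
    {"name": hostile_name},
).fetchall()

# Unsafe string formatting, shown only for contrast:
# the quote characters become part of the SQL and the WHERE clause is always true.
unsafe_rows = conn.execute(
    f"SELECT rowid FROM projects WHERE name = '{hostile_name}';"
).fetchall()
```

With the bound form the query returns no rows, while the formatted form returns every row in the table.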
        await session.execute(
            text(
                """
                SELECT
                    sum(cumulative_error_count),
                    sum(cumulative_llm_token_count_prompt),
                    sum(cumulative_llm_token_count_completion)
                FROM spans
                WHERE parent_span_id = :parent_span_id
                """
            ),  # noqa E501
            {"parent_span_id": span.context.span_id},
        )
    ).first():
This also feels a tad inefficient, since a read needs to be performed before you insert. If anything, you should be able to run all the precursor work in a single asyncio.gather call so that the DB calls are parallelized?
This is probably fine, assuming that the child spans are still in the buffer pool. On the other hand, parallelization doesn't really help SQLite because it locks the whole database.
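The locking point can be seen in a small experiment. This is a sketch under stated assumptions: it uses the standard-library `sqlite3` module with two plain connections to a temporary file in SQLite's default journal mode, not the PR's async session, and the `spans` table here has a single made-up column. While one connection holds a write transaction, a second writer cannot start one.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None lets us issue BEGIN/COMMIT ourselves; timeout=0 makes a
# blocked connection fail immediately instead of waiting for the lock.
writer_a = sqlite3.connect(path, timeout=0, isolation_level=None)
writer_b = sqlite3.connect(path, timeout=0, isolation_level=None)
writer_a.execute("CREATE TABLE spans(span_id TEXT);")

writer_a.execute("BEGIN IMMEDIATE;")  # takes the database-wide write lock
writer_a.execute("INSERT INTO spans(span_id) VALUES('a');")

try:
    writer_b.execute("BEGIN IMMEDIATE;")  # a second writer cannot start now
    second_writer_error = None
except sqlite3.OperationalError as exc:
    second_writer_error = str(exc)

writer_a.execute("COMMIT;")
writer_b.execute("BEGIN IMMEDIATE;")  # succeeds once the first writer has committed
writer_b.execute("INSERT INTO spans(span_id) VALUES('b');")
writer_b.execute("COMMIT;")
```

Because writes are serialized at the level of the whole database file, issuing several inserts concurrently would mostly have them wait on one another, which is consistent with the reply above.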
resolves #2753
Future TODOs