feat: span filtering by span evaluations #2923

Merged
merged 51 commits into from
Apr 24, 2024
Commits
c202fae
wip
RogerHYang Apr 17, 2024
098b7c0
Merge branch 'sql' into span-query-dsl-with-sql
RogerHYang Apr 17, 2024
bd518b2
change unixepoch to julianday
RogerHYang Apr 17, 2024
b90adf9
add sqlean for unit tests
RogerHYang Apr 17, 2024
fa0b1f6
add tests
RogerHYang Apr 17, 2024
a0df97d
clean up
RogerHYang Apr 17, 2024
c139761
clean up
RogerHYang Apr 17, 2024
ce7d9e1
add comments
RogerHYang Apr 17, 2024
aa7f0d6
fix cumulative token count
RogerHYang Apr 17, 2024
4ad058b
fix type-cast functions
RogerHYang Apr 17, 2024
b116600
add notebook
axiomofjoy Apr 18, 2024
0b645e9
add postgres
axiomofjoy Apr 18, 2024
6025925
setup it for postgres and sqlite
axiomofjoy Apr 19, 2024
7ce8114
stored parsed string in array
axiomofjoy Apr 19, 2024
6ddd8e2
Revert "stored parsed string in array"
axiomofjoy Apr 19, 2024
39e8b33
Merge remote-tracking branch 'origin/sql' into eval-filtering-2787
axiomofjoy Apr 19, 2024
29f65a5
sqlite
axiomofjoy Apr 21, 2024
b8f5a16
add integration test case and correct error
axiomofjoy Apr 21, 2024
931b6a1
postgres running on it
axiomofjoy Apr 21, 2024
94e1929
add integration test for !=
axiomofjoy Apr 21, 2024
c1ed543
more it
axiomofjoy Apr 21, 2024
017adb8
add sort_index to ensure the rows are similarly ordered for compariso…
axiomofjoy Apr 22, 2024
7e1483b
add back visit_Attribute method
axiomofjoy Apr 22, 2024
205bd3c
fix unit test
axiomofjoy Apr 22, 2024
328d826
Merge remote-tracking branch 'origin/sql' into eval-filtering-2787
axiomofjoy Apr 22, 2024
cb5a6b4
remove accidentally added files
axiomofjoy Apr 22, 2024
a53732d
refactor
axiomofjoy Apr 23, 2024
3e5b163
remove unnecessary check
axiomofjoy Apr 23, 2024
9fb110f
Merge remote-tracking branch 'origin/sql' into eval-filtering-2787
axiomofjoy Apr 23, 2024
de7158e
add unit test for regex and fix bug with greedy regex matching
axiomofjoy Apr 23, 2024
c77fb90
Merge remote-tracking branch 'origin/sql' into eval-filtering-2787
axiomofjoy Apr 23, 2024
5f4bee6
it notebook format
axiomofjoy Apr 23, 2024
5371477
rename variables
axiomofjoy Apr 23, 2024
4f02631
add docstrings and use frozenset
axiomofjoy Apr 23, 2024
f5f616d
add eval test case for filter
axiomofjoy Apr 23, 2024
5f8f218
use shorter random id
axiomofjoy Apr 23, 2024
54a8e5c
make test case cover python3.8
axiomofjoy Apr 23, 2024
7b4c844
fix error in test case
axiomofjoy Apr 23, 2024
092d52d
add word boundary to regex and corresponding tests
axiomofjoy Apr 23, 2024
098112b
compile regex
axiomofjoy Apr 23, 2024
e25b66b
simplify regex
axiomofjoy Apr 23, 2024
db5d250
improve variable naming
axiomofjoy Apr 24, 2024
a10fc8b
simplify types
axiomofjoy Apr 24, 2024
c1bea9d
use literal type and assert_never
axiomofjoy Apr 24, 2024
31cff06
change function name
axiomofjoy Apr 24, 2024
063fcc9
remove chain, move out regex
axiomofjoy Apr 24, 2024
eb26ce0
change variable name
axiomofjoy Apr 24, 2024
4f0d1d3
compute _aliased_annotation_attributes and store on SpanFilter
axiomofjoy Apr 24, 2024
e3a19b8
remove chain
axiomofjoy Apr 24, 2024
14e7233
Revert "remove chain"
axiomofjoy Apr 24, 2024
d684051
fix tests
axiomofjoy Apr 24, 2024
postgres running on it
axiomofjoy committed Apr 21, 2024

commit 931b6a1e10fccdd826a225fcc271650beca6ab81
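
For readers unfamiliar with the check this notebook performs, here is a minimal plain-Python sketch of the same idea: list the columns that appear in only one backend's result, then report cell-level differences over the shared columns. It is an illustration only, not the notebook's code. The notebook itself uses pandas (`DataFrame.compare`), and the helper names `column_differences` and `cell_differences`, the span ids, and the sample rows below are all hypothetical.

```python
from typing import Any, Dict, List, Set, Tuple

Row = Dict[str, Any]
Table = Dict[str, Row]  # hypothetical shape: span id -> {column name: value}


def column_differences(left: Table, right: Table) -> Tuple[Set[str], Set[str]]:
    """Return (columns only in left, columns only in right)."""
    left_cols = {col for row in left.values() for col in row}
    right_cols = {col for row in right.values() for col in row}
    return left_cols - right_cols, right_cols - left_cols


def cell_differences(
    left: Table, right: Table, columns: List[str]
) -> List[Tuple[str, str, Any, Any]]:
    """List (span id, column, left value, right value) for every mismatch."""
    if left.keys() != right.keys():
        raise ValueError("both results must contain the same span ids")
    diffs = []
    # Iterate in sorted id order so the report does not depend on the
    # order in which each backend happened to return its rows.
    for span_id in sorted(left):
        for col in columns:
            left_value = left[span_id].get(col)
            right_value = right[span_id].get(col)
            if left_value != right_value:
                diffs.append((span_id, col, left_value, right_value))
    return diffs


# Made-up sample data standing in for two backends' query results.
sqlite_rows = {
    "b2": {"name": "llm", "status_code": "OK"},
    "a1": {"name": "retriever", "status_code": "OK"},
}
original_rows = {
    "a1": {"name": "retriever", "status_code": "OK", "span_kind": "RETRIEVER"},
    "b2": {"name": "llm", "status_code": "ERROR", "span_kind": "LLM"},
}

only_sqlite, only_original = column_differences(sqlite_rows, original_rows)
diffs = cell_differences(sqlite_rows, original_rows, ["name", "status_code"])
print(f"{only_sqlite=}")
print(f"{only_original=}")
print(f"{diffs=}")
```

Keying rows by span id and walking them in sorted order plays the same role as sorting the DataFrame index before comparing: two backends that return identical rows in a different order should not be reported as different.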
51 changes: 39 additions & 12 deletions integration-tests/eval_query_testing.ipynb
@@ -61,12 +61,7 @@
" \"start_time\",\n",
" \"status_code\",\n",
" \"status_message\",\n",
"]\n",
"\n",
"\n",
"def get_spans_dataframe(endpoint: str, filter_condition: str):\n",
" df = px.Client(endpoint=endpoint).get_spans_dataframe(filter_condition)\n",
" return df.sort_index()"
"]"
]
},
{
@@ -76,9 +71,9 @@
"outputs": [],
"source": [
"filter_condition = \"evals['Q&A Correctness'].label == 'correct'\"\n",
"original_df = get_spans_dataframe(endpoint=original_endpoint, filter_condition=filter_condition)\n",
"postgres_df = get_spans_dataframe(endpoint=postgres_endpoint, filter_condition=filter_condition)\n",
"sqlite_df = get_spans_dataframe(endpoint=sqlite_endpoint, filter_condition=filter_condition)\n",
"original_df = px.Client(endpoint=original_endpoint).get_spans_dataframe(filter_condition)\n",
"postgres_df = px.Client(endpoint=postgres_endpoint).get_spans_dataframe(filter_condition)\n",
"sqlite_df = px.Client(endpoint=sqlite_endpoint).get_spans_dataframe(filter_condition)\n",
"print(f\"{original_df.shape=}\")\n",
"print(f\"{postgres_df.shape=}\")\n",
"print(f\"{sqlite_df.shape=}\")"
@@ -90,12 +85,28 @@
"metadata": {},
"outputs": [],
"source": [
"print(f\"{set(original_df.columns).difference(set(sqlite_df.columns))=}\")\n",
"print(f\"{set(sqlite_df.columns).difference(set(original_df.columns))=}\")\n",
"sqlite_df[COMMON_COLUMNS].compare(\n",
" original_df.rename(columns={\"span_kind\": \"attributes.openinference.span.kind\"})[COMMON_COLUMNS],\n",
" result_names=(\"sqlite\", \"original\"),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(f\"{set(original_df.columns).difference(set(postgres_df.columns))=}\")\n",
"print(f\"{set(postgres_df.columns).difference(set(original_df.columns))=}\")\n",
"postgres_df[COMMON_COLUMNS].compare(\n",
" original_df.rename(columns={\"span_kind\": \"attributes.openinference.span.kind\"})[COMMON_COLUMNS],\n",
" result_names=(\"postgres\", \"original\"),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -105,9 +116,9 @@
"filter_condition = (\n",
" \"\"\"evals['Q&A Correctness'].label == 'correct' and evals[\"Hallucination\"].score < 0.5\"\"\"\n",
")\n",
"original_df = get_spans_dataframe(endpoint=original_endpoint, filter_condition=filter_condition)\n",
"postgres_df = get_spans_dataframe(endpoint=postgres_endpoint, filter_condition=filter_condition)\n",
"sqlite_df = get_spans_dataframe(endpoint=sqlite_endpoint, filter_condition=filter_condition)\n",
"original_df = px.Client(endpoint=original_endpoint).get_spans_dataframe(filter_condition)\n",
"postgres_df = px.Client(endpoint=postgres_endpoint).get_spans_dataframe(filter_condition)\n",
"sqlite_df = px.Client(endpoint=sqlite_endpoint).get_spans_dataframe(filter_condition)\n",
"print(f\"{original_df.shape=}\")\n",
"print(f\"{postgres_df.shape=}\")\n",
"print(f\"{sqlite_df.shape=}\")"
@@ -119,11 +130,27 @@
"metadata": {},
"outputs": [],
"source": [
"print(f\"{set(original_df.columns).difference(set(sqlite_df.columns))=}\")\n",
"print(f\"{set(sqlite_df.columns).difference(set(original_df.columns))=}\")\n",
"sqlite_df[COMMON_COLUMNS].compare(\n",
" original_df.rename(columns={\"span_kind\": \"attributes.openinference.span.kind\"})[COMMON_COLUMNS],\n",
" result_names=(\"sqlite\", \"original\"),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(f\"{set(original_df.columns).difference(set(postgres_df.columns))=}\")\n",
"print(f\"{set(postgres_df.columns).difference(set(original_df.columns))=}\")\n",
"postgres_df[COMMON_COLUMNS].compare(\n",
" original_df.rename(columns={\"span_kind\": \"attributes.openinference.span.kind\"})[COMMON_COLUMNS],\n",
" result_names=(\"postgres\", \"original\"),\n",
")"
]
}
],
"metadata": {