
feat: plumb Dataset start/end times through graphQL #194

Merged (16 commits) Jan 27, 2023

Conversation

nate-mar (Contributor):
Resolves #172

@nate-mar nate-mar changed the title [feat] plumb Dataset start/end times through grpahQL [feat] plumb Dataset start/end times through graphQL Jan 26, 2023
src/phoenix/config.py (outdated, resolved)

- from phoenix.datasets import Dataset
- from phoenix.datasets.schema import EmbeddingFeatures
+ from phoenix.core.datasets import Dataset
+ from phoenix.core.datasets.schema import EmbeddingFeatures
Contributor:
I think I sort of wanted the code that the user interacts with to be unnested and not part of core, core being part of the application. I think voxel has a similar organization. It makes for easier discovery in the notebook, but we would dissuade people from importing core in the notebook.

nate-mar (Author):
Ah ok, gotcha -- so datasets is the API interface, so to speak, and users should never need to interact with anything in core.

Contributor:

Yeah, that was the idea: the user interacts with datasets and metrics, and phoenix serves core via an API.

@nate-mar nate-mar changed the title [feat] plumb Dataset start/end times through graphQL feat: plumb Dataset start/end times through graphQL Jan 26, 2023
Comment on lines 8 to 9
startTime: DateTime
endTime: DateTime
Contributor:
Suggested change:
- startTime: DateTime
- endTime: DateTime
+ startTime: DateTime!
+ endTime: DateTime!

Let's make these non-nullable.

Comment on lines 20 to 22
readonly endTime: any | null;
readonly name: string;
readonly startTime: any | null;
Contributor:

Update relay.config and these will become strings.

nate-mar (Author):

err, what do I update relay.config with? do I need to add something?

nate-mar (Author):

oh nm, got it.

@@ -7,5 +7,6 @@ module.exports = {
noFutureProofEnums: true,
customScalars: {
GlobalID: "String",
"DateTime": "string",
Contributor:

It's mapping to another base primitive, so it should be uppercase String.
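Applying that suggestion, the relevant fragment of relay.config.js would look like the following (a sketch of the final state, assuming no other custom scalars are configured):

```javascript
// relay.config.js (fragment) -- hypothetical final state after the review:
// map the custom DateTime scalar to the String primitive, matching the
// casing of the existing GlobalID mapping.
module.exports = {
  // ...
  noFutureProofEnums: true,
  customScalars: {
    GlobalID: "String",
    DateTime: "String",
  },
};
```

With this mapping, the generated TypeScript types for startTime/endTime become string instead of any.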

@property
def start_time(self) -> datetime:
"""Returns the datetime of the earliest inference in the dataset"""
ts_col_name: str = cast(str, self.schema.timestamp_column_name)
dt: datetime = self.__dataframe[ts_col_name].min()
Contributor:

Since this never changes you can use dynamic programming to compute once

nate-mar (Author):

I'm assuming you're referring to the caching aspect of dp? how about just using the @cached_property annotation? or do we have a preexisting convention for how we'd like to do this?

Contributor:

Oh that's great. Very cool
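The approach agreed on above can be sketched like this (a minimal sketch with a hypothetical constructor; the real implementation reads the timestamp column of the dataset's underlying dataframe):

```python
from datetime import datetime
from functools import cached_property


class Dataset:
    """Sketch of caching the dataset bookends with functools.cached_property.

    A plain list of timestamps stands in for the dataframe's timestamp
    column; each property is computed on first access, then cached on the
    instance so repeated lookups are free.
    """

    def __init__(self, timestamps: list[datetime]) -> None:
        self._timestamps = timestamps

    @cached_property
    def start_time(self) -> datetime:
        """Datetime of the earliest inference; computed once, then cached."""
        return min(self._timestamps)

    @cached_property
    def end_time(self) -> datetime:
        """Datetime of the latest inference; computed once, then cached."""
        return max(self._timestamps)
```

Unlike a bare @property, @cached_property stores the result in the instance's __dict__ after the first access, which is exactly the "compute once" behavior requested here since the dataframe never changes.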

@property
def end_time(self) -> datetime:
"""Returns the datetime of the latest inference in the dataset"""
ts_col_name: str = cast(str, self.schema.timestamp_column_name)
Contributor:

Prefer dynamic programming to compute once

"""Returns the datetime of the latest inference in the dataset"""
ts_col_name: str = cast(str, self.schema.timestamp_column_name)
dt: datetime = self.__dataframe[ts_col_name].max()
return dt
Contributor:

Been advocating for non-abbreviated variable names: https://youtu.be/-J3wNP6u5YU

nate-mar (Author):

👍 -- done

@nate-mar nate-mar marked this pull request as ready for review January 26, 2023 18:56
@nate-mar nate-mar merged commit dc6c88d into main Jan 27, 2023
@nate-mar nate-mar deleted the plumb-dataset-start-end branch January 27, 2023 00:16
Linked issue: Dataset bookends plumbed through to GQL