(feat): Sql registry search #105

Open
wants to merge 9 commits into
base: master
2 changes: 2 additions & 0 deletions go.mod
@@ -55,6 +55,8 @@ require (
github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f // indirect
github.com/goccy/go-json v0.10.2 // indirect
github.com/golang/snappy v0.0.4 // indirect
github.com/gonuts/commander v0.1.0 // indirect
Collaborator

Revert this.

github.com/gonuts/flag v0.1.0 // indirect
github.com/google/flatbuffers v2.0.8+incompatible // indirect
github.com/klauspost/asmfmt v1.3.2 // indirect
github.com/klauspost/compress v1.16.3 // indirect
4 changes: 4 additions & 0 deletions go.sum
@@ -173,6 +173,10 @@ github.com/golang/protobuf v1.5.3/go.mod h1:XVQd3VNwM+JqD3oG2Ue2ip4fOMUkwXdXDdiu
github.com/golang/snappy v0.0.0-20180518054509-2e65f85255db/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
github.com/golang/snappy v0.0.4 h1:yAGX7huGHXlcLOEtBnF4w7FQwA26wojNCwOYAEhLjQM=
github.com/golang/snappy v0.0.4/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
github.com/gonuts/commander v0.1.0 h1:EcDTiVw9oAVORFjQOEOuHQqcl6OXMyTgELocTq6zJ0I=
Collaborator

Revert this.

github.com/gonuts/commander v0.1.0/go.mod h1:qkb5mSlcWodYgo7vs8ulLnXhfinhZsZcm6+H/z1JjgY=
github.com/gonuts/flag v0.1.0 h1:fqMv/MZ+oNGu0i9gp0/IQ/ZaPIDoAZBOBaJoV7viCWM=
github.com/gonuts/flag v0.1.0/go.mod h1:ZTmTGtrSPejTo/SRNhCqwLTmiAgyBdCkLYhHrAoBdz4=
github.com/google/btree v0.0.0-20180813153112-4030bb1f1f0c/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/btree v1.0.0/go.mod h1:lNA+9X1NB3Zf8V7Ke586lFgjr2dZNuvo3lPJSGZ5JPQ=
github.com/google/flatbuffers v2.0.5+incompatible/go.mod h1:1AeVuKshWv4vARoZatz6mlQ0JxURH0Kv5+zNeJKJCa8=
77 changes: 77 additions & 0 deletions sdk/python/feast/infra/registry/sql.py
@@ -1,3 +1,5 @@
from __future__ import annotations

import concurrent.futures
import logging
import threading
@@ -919,6 +921,7 @@ def get_user_metadata(

def proto(self) -> RegistryProto:
    r = RegistryProto()

    # last_updated_timestamps = []

    def process_project(project):
@@ -1063,6 +1066,80 @@ def create_project_if_not_exists(self, project):
if new_project:
    self._set_last_updated_metadata(update_datetime, project)

def search(
Collaborator

I was under the impression search should be implemented using full text search. Is pagination implemented in this?

Collaborator

About full-text search: has that expectation been set with the Workbench team?
About pagination: we are planning to switch to the Feast Remote Registry. How does that affect the functionality?

Collaborator

The remote registry leverages the SQL registry, so we will have the functionality and don't need to reimplement it. A method definition in the remote registry should be enough until we contribute the functionality. I'm still looking for answers to my questions above.

Author

Full text search in this context means "feature view and project names".
That is, the names of all feature views and all projects should be searchable using this functionality. That's what's indicated by the UI design that has been given to us.
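
The two open questions in this thread (matching on names and pagination) can be sketched together. This is a minimal illustration only, not the PR's implementation: it uses the standard-library `sqlite3` module as a stand-in for the registry's SQLAlchemy engine, and the helper `search_feature_view_names`, its `page`/`page_size` parameters, and the single-column table layout are all hypothetical. It shows a case-insensitive substring match on `feature_view_name` pushed into SQL, with `LIMIT`/`OFFSET` paging.

```python
import sqlite3
from typing import List


def search_feature_view_names(
    conn: sqlite3.Connection, term: str, page: int = 0, page_size: int = 10
) -> List[str]:
    """Return one page of feature view names that contain `term`."""
    # Escape LIKE wildcards so the user's input is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    rows = conn.execute(
        "SELECT feature_view_name FROM feature_views "
        "WHERE feature_view_name LIKE ? ESCAPE '\\' "
        "ORDER BY feature_view_name LIMIT ? OFFSET ?",
        (f"%{escaped}%", page_size, page * page_size),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical in-memory table holding only the name column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feature_views (feature_view_name TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO feature_views VALUES (?)",
    [("driver_stats",), ("driver_trips",), ("rider_stats",), ("user_100%",)],
)

first_page = search_feature_view_names(conn, "driver", page=0, page_size=1)
second_page = search_feature_view_names(conn, "driver", page=1, page_size=1)
```

A stable `ORDER BY` is what makes offset paging deterministic. Filters that depend on the serialized proto (tags, timestamps) would still have to run in memory after this query, so page boundaries computed in SQL only hold when the name is the sole filter.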

    self,
    name="",
    application="",
    owning_team="",
    created_at=datetime.min,
    updated_at=datetime.min,
    # None means "do not filter on online status".
    online=None,
) -> List[Union[FeatureView, ProjectMetadataModel]]:
    """
    Search for feature views or projects based on the provided search
    parameters. Since the SQL database stores only metadata and protos, we
    have to pull all potentially matching objects and filter in memory.
    """
    fv_list = []
    with self.engine.connect() as conn:
        stmt = select(feature_views)
        if name:
            stmt = stmt.where(feature_views.c.feature_view_name == name)

        rows = conn.execute(stmt).all()

        if rows:
            fv_list = [
                FeatureView.from_proto(
                    FeatureViewProto.FromString(row["feature_view_proto"])
                )
                for row in rows
            ]

    # Views without a timestamp are kept rather than filtered out.
    fv_list = [
        view
        for view in fv_list
        if view.created_timestamp is None
        or view.created_timestamp >= created_at
    ]

    fv_list = [
        view
        for view in fv_list
        if view.last_updated_timestamp is None
        or view.last_updated_timestamp >= updated_at
    ]

    if owning_team:
        fv_list = [
            view for view in fv_list if view.tags.get("team") == owning_team
        ]
    if application:
        fv_list = [
            view
            for view in fv_list
            if view.tags.get("application") == application
        ]
    if online is not None:
        fv_list = [view for view in fv_list if view.online == online]

    project_list = self.get_all_project_metadata()

    # Filter projects by name only when a name was supplied, as for feature views.
    if name:
        project_list = [
            project for project in project_list if project.project_name == name
        ]
    project_list = [
        project
        for project in project_list
        if project.last_updated_timestamp is None
        or project.last_updated_timestamp >= updated_at
    ]

    final_list: List[FeatureView | ProjectMetadataModel] = []
    final_list.extend(project_list)
    final_list.extend(fv_list)
    return final_list
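
The in-memory filtering in `search` can be exercised in isolation. Below is a minimal sketch, assuming a hypothetical `View` dataclass in place of Feast's `FeatureView` and a hypothetical helper `filter_views`; neither is part of this PR. It illustrates one possible convention in which an empty string or `None` means "no filter", so a call with no arguments returns every view.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class View:
    """Hypothetical stand-in for a feature view; only the filtered fields."""

    name: str
    tags: Dict[str, str] = field(default_factory=dict)
    online: bool = True
    created_timestamp: Optional[datetime] = None


def filter_views(
    views: List[View],
    owning_team: str = "",
    application: str = "",
    created_at: Optional[datetime] = None,
    online: Optional[bool] = None,
) -> List[View]:
    """Apply each filter only when the caller actually supplied it."""
    result = []
    for view in views:
        if owning_team and view.tags.get("team") != owning_team:
            continue
        if application and view.tags.get("application") != application:
            continue
        # Views without a creation timestamp are kept, as in the diff above.
        if (
            created_at is not None
            and view.created_timestamp is not None
            and view.created_timestamp < created_at
        ):
            continue
        if online is not None and view.online != online:
            continue
        result.append(view)
    return result


views = [
    View("a", {"team": "ml"}, online=True, created_timestamp=datetime(2024, 1, 1)),
    View("b", {"team": "ml", "application": "ads"}, online=False),
    View("c", {"team": "data"}, online=True, created_timestamp=datetime(2023, 1, 1)),
]

offline_names = [v.name for v in filter_views(views, online=False)]
```

Using `None` as the "unset" sentinel for `online` lets callers distinguish "only offline views" (`online=False`) from "any view" (`online=None`), which a `False` default cannot express.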

def _delete_object(
    self,
    table: Table,
2 changes: 1 addition & 1 deletion sdk/python/requirements/py3.10-ci-requirements.txt
@@ -534,7 +534,7 @@ mypy-extensions==1.0.0
# mypy
mypy-protobuf==3.1.0
# via eg-feast (setup.py)
mysqlclient==2.2.0
mysqlclient==2.2.4
# via eg-feast (setup.py)
nbclient==0.8.0
# via nbconvert
8 changes: 4 additions & 4 deletions sdk/python/requirements/py3.8-ci-requirements.txt
@@ -216,7 +216,7 @@ executing==1.2.0
# via stack-data
fastapi==0.95.2
# via feast (setup.py)
fastavro==1.7.4
fastavro==1.8.2
# via
# feast (setup.py)
# pandavro
@@ -529,7 +529,7 @@ mypy-extensions==1.0.0
# mypy
mypy-protobuf==3.1
# via feast (setup.py)
mysqlclient==2.1.1
mysqlclient==2.2.4
# via feast (setup.py)
nbclassic==1.0.0
# via notebook
@@ -732,7 +732,7 @@ pyjwt[crypto]==2.7.0
# snowflake-connector-python
pymilvus==2.2.14
# via eg-feast (setup.py)
pymssql==2.2.7
pymssql==2.2.8
# via feast (setup.py)
pymysql==1.0.3
# via feast (setup.py)
@@ -798,7 +798,7 @@ pytz==2023.3
# pandas
# snowflake-connector-python
# trino
pyyaml==6.0
pyyaml==6.0.1
# via
# dask
# feast (setup.py)