feat: Switch from Feature to Field #2514

Merged: 18 commits, Apr 12, 2022

Commits
93953d5
Add type annotation
felixwang9817 Apr 6, 2022
7057016
Switch from Feature to Field for FeatureViewProjection
felixwang9817 Apr 8, 2022
8de4c6b
Switch from Feature to Field for BaseFeatureView
felixwang9817 Apr 8, 2022
b32bea6
Switch from Feature to Field for RequestFeatureView
felixwang9817 Apr 8, 2022
05f2514
Enforce kwargs for ODFVs and switch from Feature to Field and from `f…
felixwang9817 Apr 8, 2022
58d02c1
Switch from Feature to Field and from `features` to `schema` for Feat…
felixwang9817 Apr 8, 2022
ba0f12e
Fix references to `features` and Feature
felixwang9817 Apr 8, 2022
13c874b
Switch from `Feature` to `Field` for ODFV in tests
felixwang9817 Apr 8, 2022
1d1e603
Switch from Feature to Field in templates
felixwang9817 Apr 9, 2022
1dc0942
Switch from Feature to Field in example test repos
felixwang9817 Apr 9, 2022
02cf8af
Switch from Feature to Field in registration integration tests
felixwang9817 Apr 9, 2022
d5e391b
Switch from Feature to Field in non-registration integration tests
felixwang9817 Apr 9, 2022
57ce182
Switch from Feature to Field in random tests
felixwang9817 Apr 9, 2022
7f5526f
Switch from Feature to Field in ui
felixwang9817 Apr 9, 2022
901a639
Switch from Feature to Field in some universal feature views
felixwang9817 Apr 9, 2022
bb2fe9d
Switch from Feature to Field in docs
felixwang9817 Apr 9, 2022
f93c844
Add assertion that BaseFeatureView is always initialized with a name
felixwang9817 Apr 11, 2022
2267064
Fix imports in docs
felixwang9817 Apr 11, 2022
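
The commits above replace `Feature(name=..., dtype=ValueType.X)` entries under `features=[...]` with `Field(name=..., dtype=X)` entries under `schema=[...]`. As a rough illustration of that rename, here is a minimal sketch of a line-by-line text rewriter. The helper `migrate_feature_line` and the `VALUE_TYPE_TO_DTYPE` table are hypothetical and not part of Feast. The table covers only the type pairs that appear in this PR's doc changes, and the positional form `Feature("x", ValueType.DOUBLE)` is not handled.

```python
import re

# Type pairs taken from the doc changes in this PR; other ValueType members
# are deliberately left out rather than guessed.
VALUE_TYPE_TO_DTYPE = {
    "INT32": "Int32",
    "INT64": "Int64",
    "FLOAT": "Float32",
    "DOUBLE": "Float64",
    "STRING": "String",
}

# Matches the keyword form used in most of the docs, e.g.
#   Feature(name="trips_today", dtype=ValueType.INT64)
_FEATURE_CALL = re.compile(
    r'Feature\(name=(?P<name>"[^"]*"), dtype=ValueType\.(?P<vt>[A-Z0-9_]+)\)'
)


def migrate_feature_line(line: str) -> str:
    """Rewrite one line of a feature definition file (hypothetical helper)."""

    def _swap(match: re.Match) -> str:
        dtype = VALUE_TYPE_TO_DTYPE.get(match.group("vt"))
        if dtype is None:
            # Unknown type: leave the original call untouched.
            return match.group(0)
        return f"Field(name={match.group('name')}, dtype={dtype})"

    line = _FEATURE_CALL.sub(_swap, line)
    # The list argument is renamed as well: `features=[` becomes `schema=[`.
    return re.sub(r"\bfeatures=\[", "schema=[", line)


if __name__ == "__main__":
    print(migrate_feature_line('Feature(name="trips_today", dtype=ValueType.INT64),'))
    print(migrate_feature_line("features=["))
```

The `features=[` rename is only appropriate for `FeatureView` definitions; `BaseFeatureView` keeps its `features` argument in this PR, so a blanket rewrite of SDK internals would be wrong.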
26 changes: 17 additions & 9 deletions docs/getting-started/concepts/feature-view.md
@@ -7,12 +7,14 @@ A feature view is an object that represents a logical group of time-series featu
{% tabs %}
{% tab title="driver_trips_feature_view.py" %}
```python
from feast import BigQuerySource, FeatureView, Field, Float32, Int64

driver_stats_fv = FeatureView(
name="driver_activity",
entities=["driver"],
features=[
Feature(name="trips_today", dtype=ValueType.INT64),
Feature(name="rating", dtype=ValueType.FLOAT),
schema=[
Field(name="trips_today", dtype=Int64),
Field(name="rating", dtype=Float32),
],
batch_source=BigQuerySource(
table_ref="feast-oss.demo_data.driver_activity"
@@ -39,11 +41,13 @@ If a feature view contains features that are not related to a specific entity, t
{% tabs %}
{% tab title="global_stats.py" %}
```python
from feast import BigQuerySource, FeatureView, Field, Int64

global_stats_fv = FeatureView(
name="global_stats",
entities=[],
features=[
Feature(name="total_trips_today_by_all_drivers", dtype=ValueType.INT64),
schema=[
Field(name="total_trips_today_by_all_drivers", dtype=Int64),
],
batch_source=BigQuerySource(
table_ref="feast-oss.demo_data.global_stats"
@@ -70,13 +74,15 @@ It is suggested that you dynamically specify the new FeatureView name using `.wi
{% tabs %}
{% tab title="location_stats_feature_view.py" %}
```python
from feast import BigQuerySource, Entity, FeatureView, Field, Int32, ValueType

location = Entity(name="location", join_key="location_id", value_type=ValueType.INT64)

location_stats_fv= FeatureView(
name="location_stats",
entities=["location"],
features=[
Feature(name="temperature", dtype=ValueType.INT32)
schema=[
Field(name="temperature", dtype=Int32)
],
batch_source=BigQuerySource(
table_ref="feast-oss.demo_data.location_stats"
@@ -115,9 +121,11 @@ A feature is an individual measurable property. It is typically a property obser
Features are defined as part of feature views. Since Feast does not transform data, a feature is essentially a schema that only contains a name and a type:

```python
trips_today = Feature(
from feast import Field, Float32

trips_today = Field(
name="trips_today",
dtype=ValueType.FLOAT
dtype=Float32
)
```

8 changes: 5 additions & 3 deletions docs/getting-started/concepts/point-in-time-joins.md
@@ -7,12 +7,14 @@ Feature values in Feast are modeled as time-series records. Below is an example
The above table can be registered with Feast through the following feature view:

```python
from feast import FeatureView, Field, FileSource, Float32, Int64

driver_stats_fv = FeatureView(
name="driver_hourly_stats",
entities=["driver"],
features=[
Feature(name="trips_today", dtype=ValueType.INT64),
Feature(name="earnings_today", dtype=ValueType.FLOAT),
schema=[
Field(name="trips_today", dtype=Int64),
Field(name="earnings_today", dtype=Float32),
],
ttl=timedelta(hours=2),
batch_source=FileSource(
28 changes: 14 additions & 14 deletions docs/getting-started/quickstart.md
@@ -80,16 +80,16 @@ online_store:
```python
# This is an example feature definition file

from google.protobuf.duration_pb2 import Duration
from datetime import timedelta

from feast import Entity, Feature, FeatureView, FileSource, ValueType
from feast import Entity, FeatureView, Field, FileSource, Float32, Int64, ValueType
Review comment (Member): Should we import the types from a different namespace, to keep things better organized? I'm okay with things the way they are for now, but just floating the idea.

Reply (Collaborator Author): Ah, that's a good (non-blocking) idea. I think `from feast.typing import Float32, Int64` etc. would be much cleaner.

# Read data from parquet files. Parquet is convenient for local development mode. For
# production, you can use your favorite DWH, such as BigQuery. See Feast documentation
# for more info.
driver_hourly_stats = FileSource(
path="/content/feature_repo/data/driver_stats.parquet",
event_timestamp_column="event_timestamp",
timestamp_field="event_timestamp",
created_timestamp_column="created",
)

@@ -106,10 +106,10 @@ driver_hourly_stats_view = FeatureView(
name="driver_hourly_stats",
entities=["driver"], # reference entity by name
ttl=Duration(seconds=86400 * 1),
features=[
Feature(name="conv_rate", dtype=ValueType.FLOAT),
Feature(name="acc_rate", dtype=ValueType.FLOAT),
Feature(name="avg_daily_trips", dtype=ValueType.INT64),
schema=[
Field(name="conv_rate", dtype=Float32),
Field(name="acc_rate", dtype=Float32),
Field(name="avg_daily_trips", dtype=Int64),
],
online=True,
batch_source=driver_hourly_stats,
@@ -149,16 +149,16 @@ feast apply
```python
# This is an example feature definition file

from google.protobuf.duration_pb2 import Duration
from datetime import timedelta

from feast import Entity, Feature, FeatureView, FileSource, ValueType
from feast import Entity, FeatureView, Field, FileSource, Float32, Int64, ValueType

# Read data from parquet files. Parquet is convenient for local development mode. For
# production, you can use your favorite DWH, such as BigQuery. See Feast documentation
# for more info.
driver_hourly_stats = FileSource(
path="/content/feature_repo/data/driver_stats.parquet",
event_timestamp_column="event_timestamp",
timestamp_field="event_timestamp",
created_timestamp_column="created",
)

@@ -175,10 +175,10 @@ driver_hourly_stats_view = FeatureView(
name="driver_hourly_stats",
entities=["driver"], # reference entity by name
ttl=Duration(seconds=86400 * 1),
features=[
Feature(name="conv_rate", dtype=ValueType.FLOAT),
Feature(name="acc_rate", dtype=ValueType.FLOAT),
Feature(name="avg_daily_trips", dtype=ValueType.INT64),
schema=[
Field(name="conv_rate", dtype=Float32),
Field(name="acc_rate", dtype=Float32),
Field(name="avg_daily_trips", dtype=Int64),
],
online=True,
batch_source=driver_hourly_stats,
4 changes: 2 additions & 2 deletions docs/reference/data-sources/push.md
@@ -14,7 +14,7 @@ When using a PushSource as a stream source in the definition of a feature view,
### Defining a push source

```python
from feast import PushSource, ValueType, BigQuerySource, FeatureView, Feature
from feast import PushSource, ValueType, BigQuerySource, FeatureView, Feature, Field, Int64

push_source = PushSource(
name="push_source",
@@ -25,7 +25,7 @@ push_source = PushSource(
fv = FeatureView(
name="feature view",
entities=["user_id"],
features=[Feature(name="life_time_value", dtype=ValueType.INT64)],
schema=[Field(name="life_time_value", dtype=Int64)],
stream_source=push_source,
)
```
8 changes: 4 additions & 4 deletions docs/reference/feature-repository.md
@@ -89,7 +89,7 @@ A feature repository can also contain one or more Python files that contain feat
```python
from datetime import timedelta

from feast import BigQuerySource, Entity, Feature, FeatureView, ValueType
from feast import BigQuerySource, Entity, Feature, FeatureView, Field, Float32, String, ValueType

driver_locations_source = BigQuerySource(
table_ref="rh_prod.ride_hailing_co.drivers",
@@ -107,9 +107,9 @@ driver_locations = FeatureView(
name="driver_locations",
entities=["driver"],
ttl=timedelta(days=1),
features=[
Feature(name="lat", dtype=ValueType.FLOAT),
Feature(name="lon", dtype=ValueType.STRING),
schema=[
Field(name="lat", dtype=Float32),
Field(name="lon", dtype=String),
],
batch_source=driver_locations_source,
)
8 changes: 4 additions & 4 deletions docs/reference/feature-repository/README.md
@@ -94,7 +94,7 @@ A feature repository can also contain one or more Python files that contain feat
```python
from datetime import timedelta

from feast import BigQuerySource, Entity, Feature, FeatureView, ValueType
from feast import BigQuerySource, Entity, Feature, FeatureView, Field, Float32, String, ValueType

driver_locations_source = BigQuerySource(
table_ref="rh_prod.ride_hailing_co.drivers",
@@ -112,9 +112,9 @@ driver_locations = FeatureView(
name="driver_locations",
entities=["driver"],
ttl=timedelta(days=1),
features=[
Feature(name="lat", dtype=ValueType.FLOAT),
Feature(name="lon", dtype=ValueType.STRING),
schema=[
Field(name="lat", dtype=Float32),
Field(name="lon", dtype=String),
],
batch_source=driver_locations_source,
)
10 changes: 5 additions & 5 deletions docs/tutorials/validating-historical-features.md
@@ -107,7 +107,7 @@ pyarrow.parquet.write_table(entities_2019_table, "entities.parquet")
import pyarrow.parquet
import pandas as pd

from feast import Feature, FeatureView, Entity, FeatureStore
from feast import Feature, FeatureView, Entity, FeatureStore, Field, Float64, Int64
from feast.value_type import ValueType
from feast.data_format import ParquetFormat
from feast.on_demand_feature_view import on_demand_feature_view
@@ -137,10 +137,10 @@ trips_stats_fv = FeatureView(
name='trip_stats',
entities=['taxi'],
features=[
Feature("total_miles_travelled", ValueType.DOUBLE),
Feature("total_trip_seconds", ValueType.DOUBLE),
Feature("total_earned", ValueType.DOUBLE),
Feature("trip_count", ValueType.INT64),
Field(name="total_miles_travelled", dtype=Float64),
Field(name="total_trip_seconds", dtype=Float64),
Field(name="total_earned", dtype=Float64),
Field(name="trip_count", dtype=Int64),

],
ttl=Duration(seconds=86400),
2 changes: 1 addition & 1 deletion protos/feast/core/FeatureView.proto
@@ -48,7 +48,7 @@ message FeatureViewSpec {
// Feature View. Not updatable.
repeated string entities = 3;

// List of features specifications for each feature defined with this feature view.
// List of specifications for each field defined as part of this feature view.
repeated FeatureSpecV2 features = 4;

// Description of the feature view.
25 changes: 25 additions & 0 deletions sdk/python/feast/__init__.py
@@ -13,9 +13,22 @@
from .feature_service import FeatureService
from .feature_store import FeatureStore
from .feature_view import FeatureView
from .field import Field
from .on_demand_feature_view import OnDemandFeatureView
from .repo_config import RepoConfig
from .request_feature_view import RequestFeatureView
from .types import (
Array,
Bool,
Bytes,
Float32,
Float64,
Int32,
Int64,
Invalid,
String,
UnixTimestamp,
)
from .value_type import ValueType

logging.basicConfig(
@@ -35,6 +48,7 @@
"KafkaSource",
"KinesisSource",
"Feature",
"Field",
"FeatureService",
"FeatureStore",
"FeatureView",
@@ -48,4 +62,15 @@
"RequestFeatureView",
"SnowflakeSource",
"PushSource",
# Types
"Array",
"Invalid",
"Bytes",
"String",
"Bool",
"Int32",
"Int64",
"Float32",
"Float64",
"UnixTimestamp",
]
15 changes: 6 additions & 9 deletions sdk/python/feast/base_feature_view.py
@@ -18,8 +18,8 @@
from google.protobuf.json_format import MessageToJson
from proto import Message

from feast.feature import Feature
from feast.feature_view_projection import FeatureViewProjection
from feast.field import Field


class BaseFeatureView(ABC):
@@ -41,7 +41,7 @@ class BaseFeatureView(ABC):
"""

name: str
features: List[Feature]
features: List[Field]
description: str
tags: Dict[str, str]
owner: str
@@ -53,8 +53,8 @@ class BaseFeatureView(ABC):
def __init__(
self,
*,
name: Optional[str] = None,
features: Optional[List[Feature]] = None,
name: str,
features: Optional[List[Field]] = None,
description: str = "",
tags: Optional[Dict[str, str]] = None,
owner: str = "",
@@ -64,7 +64,7 @@ def __init__(

Args:
name: The unique name of the base feature view.
features: The list of features defined as part of this base feature view.
features (optional): The list of features defined as part of this base feature view.
Review comment (Member): Nit: list of fields? Fine with list of features.

Reply (Collaborator Author): Prefer list of features, since `features` here actually refers to features. For example, I would consider `schema` a list of fields, since it contains not just features but also join keys, but I consider `features` a list of features.
description (optional): A human-readable description.
tags (optional): A dictionary of key-value pairs to store arbitrary metadata.
owner (optional): The owner of the base feature view, typically the email of the
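
The thread above distinguishes `schema` (a list of fields that may include join keys) from `features` (features only). A minimal sketch of that distinction follows, using a stand-in `FieldSketch` dataclass rather than Feast's `Field`. The `split_schema` helper and the way join keys are identified are assumptions for illustration, not the SDK's implementation; the `location_id` and `temperature` names are borrowed from the `location_stats` example in this PR's docs.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class FieldSketch:
    """Stand-in for a (name, dtype) pair; this is not Feast's Field class."""

    name: str
    dtype: str


def split_schema(
    schema: List[FieldSketch], join_keys: Set[str]
) -> Tuple[List[FieldSketch], List[FieldSketch]]:
    """Separate join-key fields from feature fields (hypothetical helper)."""
    keys = [f for f in schema if f.name in join_keys]
    features = [f for f in schema if f.name not in join_keys]
    return keys, features


# A schema that mixes a join key with a feature.
schema = [
    FieldSketch("location_id", "Int64"),
    FieldSketch("temperature", "Int32"),
]
keys, features = split_schema(schema, join_keys={"location_id"})
```

Under this reading, `keys` holds only the `location_id` field and `features` holds only `temperature`, which matches the author's preference for describing `features` as a list of features.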
@@ -73,12 +73,9 @@ def __init__(
Raises:
ValueError: A field mapping conflicts with an Entity or a Feature.
"""
if not name:
raise ValueError("Name needs to be provided")
Review comment on lines -76 to -77 (Member): Is there a reason this was removed?

Reply (Collaborator Author): I think it's a bit unnecessary, since this initialization happens only internally. The check should still happen, so I'm going to switch to an assert (since we expect it to always pass).

assert name is not None
self.name = name

self.features = features or []

self.description = description
self.tags = tags or {}
self.owner = owner
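
The hunks above make `name` a required keyword-only argument and replace the explicit `ValueError` check with `assert name is not None`. A minimal sketch of how the two checks differ follows, using stand-in functions (`check_with_raise`, `check_with_assert`, and `outcome` are hypothetical names, not part of Feast). The truthiness check also rejects an empty string, while the assert only rejects `None` and is skipped entirely when Python runs with `-O`.

```python
from typing import Optional


def check_with_raise(*, name: Optional[str] = None) -> str:
    """Mirrors the removed check: rejects None and the empty string."""
    if not name:
        raise ValueError("Name needs to be provided")
    return name


def check_with_assert(*, name: str) -> str:
    """Mirrors the new check: rejects only None, and only without `python -O`."""
    assert name is not None
    return name


def outcome(func, **kwargs) -> str:
    """Return repr of func(**kwargs), or the name of the exception it raised."""
    try:
        return repr(func(**kwargs))
    except (ValueError, AssertionError, TypeError) as exc:
        return type(exc).__name__


if __name__ == "__main__":
    for label, func in [("raise", check_with_raise), ("assert", check_with_assert)]:
        # Compare an empty string, an explicit None, and a missing argument.
        print(label, outcome(func, name=""), outcome(func, name=None), outcome(func))
```

Because `name` has no default in the new signature, omitting it fails with a `TypeError` at call time, before the assert is reached. An empty string, however, passes the assert, so the new check is slightly weaker than the one it replaces.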