Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FeatureStore.materialize_incremental() does not initially materialize data for FeatureView with ttl=0 #2651

Closed
chasleslr opened this issue May 9, 2022 · 1 comment · Fixed by #2666

Comments

@chasleslr
Copy link

chasleslr commented May 9, 2022

Expected Behavior

When executing FeatureStore.materialize_incremental() on a FeatureView that has a ttl=0 and has never yet been materialized, I would expect all historical data to be materialized since a ttl=0 indicates that features live forever according to the docs.

For an instance of FeatureView that has not explicitly defined ttl, it currently defaults to timedelta(days=0), which hints that this is indeed an expected value for ttl.

Current Behavior

Currently, the start_date for the materialization defaults to datetime.utcnow() - ttl if the feature view has not yet been materialized (i.e. FeatureView.most_recent_end_time is None). This means that start_date = datetime.utcnow() - 0s, thus start_date = datetime.utcnow(), resulting in no data being materialized.

feast/sdk/python/feast/feature_store.py

    def materialize_incremental(
        self, end_date: datetime, feature_views: Optional[List[str]] = None,
    ) -> None:
        ...
        for feature_view in feature_views_to_materialize:
            start_date = feature_view.most_recent_end_time
            if start_date is None:
                if feature_view.ttl is None:
                    raise Exception(
                        f"No start time found for feature view {feature_view.name}. materialize_incremental() requires"
                        f" either a ttl to be set or for materialize() to have been run at least once."
                    )
                >>> start_date = datetime.utcnow() - feature_view.ttl <<<
        ...

Steps to reproduce

from feast import FeatureView, FeatureStore
from datetime import datetime 

my_feature_view = FeatureView(
    name="my_feature_view",
    entities=["my_entity"],
    schema=[
        Field(name="my_feature", dtype=types.Bool)
    ],
    # ttl=timedelta(seconds=0) ; if not defined, TTL defaults to 0s
    source=sources.my_source
)

fs = FeatureStore(".")
fs.materialize_incremental(end_date=datetime.utcnow(), views=["my_feature_view"])

Specifications

  • Version: 0.20.2
  • Platform: macOS
  • Subsystem:

Possible Solution

feast/sdk/python/feast/feature_store.py

    def materialize_incremental(
        self, end_date: datetime, feature_views: Optional[List[str]] = None,
    ) -> None:
        ...
        for feature_view in feature_views_to_materialize:
            start_date = feature_view.most_recent_end_time
            if start_date is None:
                if feature_view.ttl is None:
                    raise Exception(
                        f"No start time found for feature view {feature_view.name}. materialize_incremental() requires"
                        f" either a ttl to be set or for materialize() to have been run at least once."
                    )
                start_date = datetime.utcnow() - feature_view.ttl

                if feature_view.ttl == timedelta(days=0):
                    # what is the lower boundary for "forever" (as defined in the docs for ttl=0)?
                    # the smallest UNIX epoch timestamp could be a good candidate
                    start_date = datetime(1970,  1, 1)
        ...
@felixwang9817
Copy link
Collaborator

hey @chasleslr thanks for reporting this bug, and for the proposing the fix!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants