Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subscription configuration #390

Closed
woop opened this issue Dec 24, 2019 · 3 comments
Closed

Subscription configuration #390

woop opened this issue Dec 24, 2019 · 3 comments

Comments

@woop
Copy link
Member

woop commented Dec 24, 2019

Serving deployments that subscribe to feature sets define their subscriptions as follows

  message Subscription {    
    // Name of featureSet to subscribe to. One or more asterisks are supported in the name.
    // e.g. customer_* or *
    string name = 1;

    // Versions of the given featureSet that will be ingested into this store.
    // Valid options for version:
    //     latest: only subscribe to latest version of feature set
    //     [version number]: pin to a specific version
    //     >[version number]: subscribe to all versions larger than or equal to [version number]
    string version = 2;
  }

With the introduction of project name spacing, we'd need to include a project subscription as well. This would serve to differentiate feature sets across projects. As part of this change, we'd also recommend altering the subscription logic for versions, mostly to make the subscription more consistent with the project name and feature set name keys.

  message Subscription {    
    // Name of project(s) where subscribed feature sets can be found. All feature sets must be          
    // located within these projects. Valid strings are either * which subscribes to all projects, or     
    // a specific project name. Asterisks are not supported for wildcard matching.
    // e.g. "my_project" or "*"
    string project = 3;

    // Name of feature set to subscribe to. One or more asterisks are supported in the name.
    // These asterisks are used as wildcards for matching feature set names. Regex is not supported.
    // e.g. "customer_*" or "*"
    string name = 1;

    // Versions of the given feature set that will be ingested into this store.
    // Valid options for version:
    //     "latest": only subscribe to latest version
    //     [version number]: pin to a specific version
    //     "*": Subscribe to all versions 
    string version = 2;
  }
@woop woop changed the title Subscription logic Subscription configuration Dec 24, 2019
@ches
Copy link
Member

ches commented Dec 24, 2019

With the introduction of project name spacing, we'd need to include a project subscription as well.

I suppose it'll be common to want this, and it's a major use case that this issue will support:

{
  "project": "myproject",
  "name": "*",
  "version": "latest"
}

It seems less obvious to me to have two fields with the stipulation that "All feature sets must be located within these projects" than to require a qualified feature set reference, e.g. project-name/feature-set-name (although the field name name becomes strained).

A subscription config like this is also supported, I assume:

{
  "project": "*",
  "name": "customer_*",
  "version": "latest"
}

This is where it becomes slightly confusing / potentially more error-prone to me than a single value like */customer_*.

It's mostly cosmetic and subjective though I guess, and something users don't interact with very often. It could be a way to evolve without the breaking change of a new required field: a name with missing project could be implicitly treated as */, but maybe that isn't sane.

There's also probably some benefit to structured fields, although I have the feeling feature references will become a pervasive mini language and discrete fields to construct them will start to feel tiresome.

Version is particularly troublesome in the second (albeit contrived) example (probably only latest would ever make sense in practice), but I'll leave that out of scope for the moment as something for #386.

we'd also recommend altering the subscription logic for versions, mostly to make the subscription more consistent with the project name and feature set name keys.

I'm not certain what this part means, could you clarify?

zhilingc pushed a commit that referenced this issue Dec 25, 2019
* Project namespacing refactoring
* Update Feast Core API and data model to support project namespacing
* Update Python SDK to support project namespacing

* More refactoring of e2e tests and added uniqueness constraint for features

* Fix typo in comment of Field

* Update ListFeatureSetRequest to support new subscription logic defined
here: #390

* Update subscription logic in Feast Ingestion to support:
#390
@woop
Copy link
Member Author

woop commented Dec 25, 2019

With the introduction of project name spacing, we'd need to include a project subscription as well.

I suppose it'll be common to want this, and it's a major use case that this issue will support:

{
  "project": "myproject",
  "name": "*",
  "version": "latest"
}

It seems less obvious to me to have two fields with the stipulation that "All feature sets must be located within these projects" than to require a qualified feature set reference, e.g. project-name/feature-set-name (although the field name name becomes strained).

A subscription config like this is also supported, I assume:

{
  "project": "*",
  "name": "customer_*",
  "version": "latest"
}

This is where it becomes slightly confusing / potentially more error-prone to me than a single value like */customer_*.

It's mostly cosmetic and subjective though I guess, and something users don't interact with very often. It could be a way to evolve without the breaking change of a new required field: a name with missing project could be implicitly treated as */, but maybe that isn't sane.

There's also probably some benefit to structured fields, although I have the feeling feature references will become a pervasive mini language and discrete fields to construct them will start to feel tiresome.

Version is particularly troublesome in the second (albeit contrived) example (probably only latest would ever make sense in practice), but I'll leave that out of scope for the moment as something for #386.

we'd also recommend altering the subscription logic for versions, mostly to make the subscription more consistent with the project name and feature set name keys.

I'm not certain what this part means, could you clarify?

Thanks for the feedback @ches. I am a bit split on the reference. On the one hand I feel like creating a string reference makes it easier to understand, on the other hand we would be baking this reference into the code base and it would be harder to reverse later. I think that keeping it as individual fields for now makes it more flexible. Otherwise if we introduce new keys to the string it might break the parsers.

What I am thinking of settling on is

// Retrieves details for all versions of a specific feature set
message ListFeatureSetsRequest {
    Filter filter = 1;

    message Filter {
        // Name of project that the feature sets belongs to. This can be one of
        // - [project_name]
        // - *
        // If an asterisk is provided, filtering on projects will be disabled. All projects will
        // be matched. It is NOT possible to provide an asterisk with a string in order to do
        // pattern matching.
        string project = 3;

        // Name of the desired feature set. Asterisks can be used as wildcards in the name.
        // Matching on names is only permitted if a specific project is defined. It is disallowed
        // If the project name is set to "*"
        // e.g.
        // - * can be used to match all feature sets
        // - my-feature-set* can be used to match all features prefixed by "my-feature-set"
        // - my-feature-set-6 can be used to select a single feature set
        string feature_set_name = 1;


        // Versions of the given feature sets that will be returned.
        // Valid options for version:
        //     "latest": only the latest version is returned.
        //     "*": Subscribe to all versions
        //     [version number]: pin to a specific version. Project and feature set name must be
        //                       explicitly defined if a specific version is pinned.
        string feature_set_version = 2;
    }
}

Allowed

project: "*"
name: "*"
version: "*"

Allowed

project: "my_project"
name: "*"
version: "*"

Allowed

project: "my_project"
name: "my_feature_set_*"
version: "*"

Allowed

project: "my_project"
name: "my_feature_set_1"
version: "*"

Allowed

project: "my_project"
name: "my_feature_set_1"
version: "latest"

Allowed

project: "my_project"
name: "my_feature_set_1"
version: "5"

Allowed

project: "my_project"
name: "*"
version: "latest"

Not allowed (must define project and feature set explicitly)

project: "*"
name: "*"
version: "3"

Not allowed (must define project explicitly)

project: "*"
name: "my_feature_set_*"
version: "*"

Not allowed (must define feature set explicitly)

project: "m_project"
name: "my_feature_set_*"
version: "5"

What I meant by consistency is to remove the comparator >5 from versions. I dont see a strong use case for that. By using <number, latest, or * we seem to cover most bases. That is how project and name works.

@woop
Copy link
Member Author

woop commented Jan 26, 2020

Closing this since it was implemented in Feast 0.4

@woop woop closed this as completed Jan 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants