Add Feature Validation to Feast #172

Closed
kingkai620 opened this issue Apr 2, 2019 · 4 comments
Labels: area/job-management, kind/feature (New feature or request), priority/p0 (Highest priority)


@kingkai620

Once features have been ingested by Feast, I always need to check for missing values and generate distribution and coverage statistics for the features. It is also necessary to detect feature drift by looking at a series of feature values, and to detect things like training-serving skew. I think features should only continue to be ingested for training and serving if all of these checks pass.
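
To make the checks above concrete, here is a minimal sketch in plain Python of the kind of statistics meant: missing-value counts, coverage, and a crude drift signal between a training sample and a serving sample. None of this is a Feast API; the function names, the sample data, and the `DRIFT_THRESHOLD` value are assumptions chosen for illustration only.

```python
from statistics import mean, pstdev
from typing import Dict, List, Optional, Sequence

Value = Optional[float]


def coverage(values: Sequence[Value]) -> float:
    """Fraction of rows where the feature is present (not None)."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


def summarize(values: Sequence[Value]) -> Dict[str, Optional[float]]:
    """Basic distribution and coverage statistics for one feature."""
    present: List[float] = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "coverage": coverage(values),
        "mean": mean(present) if present else None,
        "stddev": pstdev(present) if present else None,
        "min": min(present) if present else None,
        "max": max(present) if present else None,
    }


def mean_shift(reference: Sequence[Value], current: Sequence[Value]) -> Optional[float]:
    """Distance between the two means, in reference standard deviations.

    A deliberately simple drift/skew signal; real systems often compare
    whole distributions instead of just the mean.
    """
    ref = [v for v in reference if v is not None]
    cur = [v for v in current if v is not None]
    if not ref or not cur:
        return None
    spread = pstdev(ref)
    diff = abs(mean(cur) - mean(ref))
    if spread == 0:
        return 0.0 if diff == 0 else float("inf")
    return diff / spread


# Hypothetical samples of one numeric feature.
training = [1.0, 2.0, 3.0, None, 2.0]
serving = [5.0, 6.0, None, None, 7.0]

DRIFT_THRESHOLD = 3.0  # assumed cutoff, not a recommended value

stats = summarize(serving)
shift = mean_shift(training, serving)
skew_detected = shift is not None and shift > DRIFT_THRESHOLD
```

Comparing `training` against `serving` this way is one possible reading of "training-serving skew"; comparing consecutive time windows of the same source would give a drift check with the same code.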

So, does Feast plan to support these features? If not, what would be the standard approach to performing this validation with Feast?

@tims
Contributor

tims commented Apr 3, 2019

Hi,

Thanks for bringing this up. We definitely want to add more useful validation of features. Currently, all we do is check that the type matches. We'd like to be able to check that values fall within ranges, that strings match regexes, and so on. If you could elaborate on the type of validation you require, that would help us get started.
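
As a rough illustration of the type, range, and regex checks mentioned above, the sketch below uses only the Python standard library. The `FeatureRule` class, its fields, and the example feature names are hypothetical and do not reflect Feast's actual validation code.

```python
import re
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class FeatureRule:
    """Hypothetical per-feature validation rule (not a Feast type)."""
    name: str
    expected_type: type
    min_value: Optional[float] = None
    max_value: Optional[float] = None
    pattern: Optional[str] = None  # regex the whole string must match


def validate(rule: FeatureRule, value: Any) -> List[str]:
    """Return a list of human-readable errors; empty means the value passed."""
    # Type check first: the later checks assume the right type.
    if not isinstance(value, rule.expected_type):
        return [
            f"{rule.name}: expected {rule.expected_type.__name__}, "
            f"got {type(value).__name__}"
        ]
    errors: List[str] = []
    if rule.min_value is not None and value < rule.min_value:
        errors.append(f"{rule.name}: {value} is below minimum {rule.min_value}")
    if rule.max_value is not None and value > rule.max_value:
        errors.append(f"{rule.name}: {value} is above maximum {rule.max_value}")
    if rule.pattern is not None and re.fullmatch(rule.pattern, value) is None:
        errors.append(f"{rule.name}: {value!r} does not match {rule.pattern!r}")
    return errors


# Example rules for two made-up features.
age_rule = FeatureRule(name="customer_age", expected_type=int, min_value=0, max_value=130)
code_rule = FeatureRule(name="country_code", expected_type=str, pattern=r"[A-Z]{2}")

ok = validate(age_rule, 42)
too_old = validate(age_rule, 200)
wrong_type = validate(age_rule, "42")
bad_code = validate(code_rule, "usa")
```

Returning a list of messages rather than raising on the first failure keeps all problems with a value visible at once, which matters if failed rows are only logged rather than rejected.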

One note: if a feature does not pass validation, it gets thrown into an errors pile, so there might be use cases where you only want to be warned about it and still have the data accepted. In some use cases, for example, people might want to keep accepting inputs if training-serving skew occurs, but would like to be able to monitor it or be notified.
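
The reject-versus-warn distinction could look something like the sketch below. It is only an illustration under assumed names: `OnFailure`, `ingest`, and the row format are made up here, and the "errors pile" is modeled as a plain list rather than whatever Feast actually uses.

```python
import logging
from enum import Enum
from typing import Any, Callable, Dict, Iterable, List, Tuple

logger = logging.getLogger("feature_validation_sketch")

Row = Dict[str, Any]


class OnFailure(Enum):
    """What to do with a row that fails validation (hypothetical policy)."""
    REJECT = "reject"  # route the row to the errors pile
    WARN = "warn"      # log a warning but still accept the row


def ingest(
    rows: Iterable[Row],
    check: Callable[[Row], bool],
    on_failure: OnFailure,
) -> Tuple[List[Row], List[Row]]:
    """Split rows into (accepted, errors) according to the failure policy."""
    accepted: List[Row] = []
    errors: List[Row] = []
    for row in rows:
        if check(row):
            accepted.append(row)
        elif on_failure is OnFailure.WARN:
            logger.warning("validation failed, accepting anyway: %r", row)
            accepted.append(row)
        else:
            errors.append(row)
    return accepted, errors


def age_is_valid(row: Row) -> bool:
    """Example check: age must be a non-negative number."""
    return row["age"] >= 0


rows = [{"age": 30}, {"age": -1}]

strict_accepted, strict_errors = ingest(rows, age_is_valid, OnFailure.REJECT)
lenient_accepted, lenient_errors = ingest(rows, age_is_valid, OnFailure.WARN)
```

Making the policy a per-check setting, rather than a global switch, would let hard constraints such as types keep rejecting data while softer signals such as skew only raise alerts.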

Generating statistics about features is another thing we'd like to add, and I think it can be treated as a separate issue. If you could help us build a list of the statistics you'd like for a feature, and how you would define them, that would also help us get started. It would also help to know which of these statistics are needed for the validation you'd like to do.

We can use this issue as a discussion place and spin off other issues later.

@woop
Member

woop commented Sep 1, 2019

This is a hotly requested feature for Feast. We are planning to pick this up in 0.4.0.

@woop changed the title from "How does feast handle feature validation?" to "Add feature validation to Feast" on Sep 1, 2019
@woop added the kind/feature (New feature or request) label on Sep 1, 2019
@woop
Member

woop commented Jan 25, 2020

An update on this issue. We have drafted the first RFC for feature statistics and validation. Please have a look and comment if you can!

@woop
Member

woop commented Jun 21, 2020

This is addressed in Feast 0.6 (code on master). It's now possible to produce batch statistics for validation. #612

@woop closed this as completed on Jun 21, 2020