Solution for skew/drift detection in distribution of numerical feature #113

12 changes: 12 additions & 0 deletions tensorflow_data_validation/api/validation_api.py
@@ -22,6 +22,7 @@
import logging
import apache_beam as beam
import pyarrow as pa
import pandas as pd
import tensorflow as tf
from tensorflow_data_validation import constants
from tensorflow_data_validation import types
@@ -52,6 +53,17 @@
anomalies_pb2.AnomalyInfo.NO_DATA_IN_SPAN,
])

def preprocess_numerical_to_categorical_by_own_quantiles(
    dataframe: pd.DataFrame,
):
  # TODO: refactor implementation from private project
  return dataframe

def preprocess_numerical_to_categorical_by_training_quantiles(
    dataframe: pd.DataFrame,
):
  # TODO: refactor implementation from private project
  return dataframe

def infer_schema(
statistics: statistics_pb2.DatasetFeatureStatisticsList,
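Both functions in this diff are still stubs that return the dataframe unchanged, so the following is only a rough sketch of what the two function names suggest, not the implementation from the private project. It assumes the intent is to replace each numerical column with a quantile bucket label, so that a comparison meant for categorical features can be applied to it. The helper names `compute_quantile_edges` and `apply_quantile_edges`, the `num_buckets` parameter, and the `bucket_<i>` label format are all hypothetical and do not appear in the PR.

```python
from typing import Dict, Optional, Sequence

import numpy as np
import pandas as pd


def compute_quantile_edges(
    dataframe: pd.DataFrame,
    columns: Optional[Sequence[str]] = None,
    num_buckets: int = 10,
) -> Dict[str, np.ndarray]:
  """Returns sorted, de-duplicated quantile edges for each numerical column."""
  if columns is None:
    # Assumption: every numeric column should be bucketized.
    columns = list(dataframe.select_dtypes(include="number").columns)
  probabilities = np.linspace(0.0, 1.0, num_buckets + 1)
  edges = {}
  for column in columns:
    quantiles = dataframe[column].quantile(probabilities).to_numpy()
    # Heavily repeated values can yield duplicate edges; np.unique drops them
    # and keeps the remaining edges sorted.
    edges[column] = np.unique(quantiles)
  return edges


def apply_quantile_edges(
    dataframe: pd.DataFrame,
    edges: Dict[str, np.ndarray],
) -> pd.DataFrame:
  """Replaces each listed column with a string bucket label; NaN stays NaN."""
  result = dataframe.copy()
  for column, column_edges in edges.items():
    # Keep only the interior edges and open both ends, so that values outside
    # the range seen when the edges were computed fall into the outer buckets.
    bins = np.concatenate(([-np.inf], column_edges[1:-1], [np.inf]))
    codes = pd.cut(result[column], bins=bins, labels=False)
    result[column] = codes.map(
        lambda code: f"bucket_{int(code)}", na_action="ignore")
  return result
```

Under the same assumption, the "own quantiles" variant would compute edges from the dataframe being transformed, while the "training quantiles" variant would reuse edges computed from the training data:

```python
training = pd.DataFrame({"x": [1, 2, 3, 4, 5, 6, 7, 8]})
serving = pd.DataFrame({"x": [100.0, -5.0, 5.0]})

training_edges = compute_quantile_edges(training, num_buckets=4)
training_binned = apply_quantile_edges(training, training_edges)  # own quantiles
serving_binned = apply_quantile_edges(serving, training_edges)    # training quantiles
```

Reusing the training edges for the serving data is what would make the two bucket distributions comparable: a shift in the serving values shows up as a change in bucket frequencies.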