Feed model engine #39
Conversation
Commits:
- simple bug fix self.training_dataset.training_dataset.schema
- adding pydoop to read from hdfs
- adding tfrecords schema
- temporary fix for create_tf_record_schema
A general comment upfront: this is a lot of logic to convert between Spark types and TensorFlow types, and that logic is currently duplicated in Java. I think we should find a solution that does not require this massive [...]
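To make the concern concrete, here is a minimal, purely illustrative sketch of the kind of Spark-to-TensorFlow type mapping the duplicated logic implements. The dictionary contents and function name are assumptions for illustration, not the actual hsfs code:

```python
# Illustrative mapping from Spark SQL type names to TensorFlow dtype names.
# Coverage and naming here are assumptions, not the hsfs implementation.
SPARK_TO_TF_DTYPE = {
    "int": "int64",
    "bigint": "int64",
    "float": "float32",
    "double": "float32",
    "boolean": "bool",
    "string": "string",
}


def tf_dtype_name(spark_type: str) -> str:
    """Return the TensorFlow dtype name for a Spark SQL type name."""
    try:
        return SPARK_TO_TF_DTYPE[spark_type.lower()]
    except KeyError:
        raise ValueError("unsupported Spark type: " + spark_type)
```

Every new Spark type (and every consumer language) has to extend a table like this, which is why keeping it in both Java and Python is fragile.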
@moritzmeister the schema can now be inferred directly from the tfrecord files; no Spark and no extra information is required to read them. I tested it on both TF1 and TF2. However, I didn't manage to install hsfs to properly test it: I get a certificate error when I try to connect to the feature store. Can you please check whether it looks fine in general? The API now looks like this:

```python
anomaly_data = fs.get_training_dataset("anomaly_data", 1)
train = anomaly_data.read("train")
train_input = anomaly_data.feed(target_names=['target'], split='train').tf_record_dataset()
```
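For context, schema inference straight from tfrecord data can be sketched roughly like this (a hedged illustration under my own assumptions, not the actual hsfs code; it handles scalar features only, matching the array-type limitation mentioned in this PR):

```python
import tensorflow as tf


def infer_feature_spec(serialized_example: bytes) -> dict:
    """Infer a parsing spec from one serialized tf.train.Example.

    No Spark involved: in practice the bytes would come from e.g.
    next(iter(tf.data.TFRecordDataset(path))).numpy(). Array-valued
    features are not handled in this sketch.
    """
    example = tf.train.Example.FromString(serialized_example)
    kind_to_dtype = {
        "int64_list": tf.int64,
        "float_list": tf.float32,
        "bytes_list": tf.string,
    }
    spec = {}
    for name, feature in example.features.feature.items():
        # Each feature stores exactly one of the three list kinds.
        kind = feature.WhichOneof("kind")
        spec[name] = tf.io.FixedLenFeature([], kind_to_dtype[kind])
    return spec
```

The resulting dict can be passed straight to `tf.io.parse_single_example`, which is what makes the "no extra information" reading possible.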
This reverts commit 4781152.
The work is not ready yet, but it will give you a good idea of what the FeedModel engine looks like.
At the moment only the tfrecords reader is implemented (it can't yet parse array data types).
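As a rough sketch of what the `feed(...).tf_record_dataset()` step could do internally (function and parameter names here are illustrative assumptions, not the actual FeedModel engine API):

```python
import tensorflow as tf


def tf_record_input(dataset, feature_spec, target_names):
    """Parse serialized Examples and split features from targets.

    Sketch only: maps each serialized record through
    tf.io.parse_single_example with an inferred spec, then pops the
    target columns into a separate dict so Keras can consume
    (inputs, targets) tuples directly.
    """
    def parse(serialized):
        parsed = tf.io.parse_single_example(serialized, feature_spec)
        targets = {name: parsed.pop(name) for name in target_names}
        return parsed, targets

    return dataset.map(parse)
```

Extending the reader to array data types would mean emitting `tf.io.FixedLenFeature([n], ...)` or `tf.io.VarLenFeature` entries in the spec instead of scalars.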