Support DataTable/DataView #141
Comments
Linked to #91. I would say try to avoid DataTable/DataView: Accord supports them, but they're a pain to work with. Rather, just use a sequence of any generic type, with either some mapper functions to pick out labels and features, or attributes, etc. Or arrays/lists of numbers.
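To make the "generic sequence plus mapper functions" idea concrete, here is a minimal sketch in Python. It is an analogy only, not ML.NET or Accord API: the `House` record and the `to_features_and_labels` helper are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


@dataclass
class House:
    # Hypothetical record type, used only for illustration.
    size_sqm: float
    rooms: int
    price: float


def to_features_and_labels(
    rows: Iterable[T],
    features: Callable[[T], Sequence[float]],
    label: Callable[[T], float],
) -> Tuple[List[List[float]], List[float]]:
    """Apply the two mapper functions to every row of a generic sequence."""
    xs: List[List[float]] = []
    ys: List[float] = []
    for row in rows:
        xs.append(list(features(row)))
        ys.append(label(row))
    return xs, ys


houses = [House(50.0, 2, 150_000.0), House(80.0, 3, 240_000.0)]

# The mappers decide which fields are features and which one is the label.
X, y = to_features_and_labels(
    houses,
    features=lambda h: (h.size_sqm, float(h.rooms)),
    label=lambda h: h.price,
)
```

The row type stays whatever the caller already has; only the two lambdas know about its fields.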
@isaac2004 #38 (comment) looks like what you're suggesting as well. There also seems to be a problem with locales, at least when reading and writing data; see also #109 (comment). I might have misunderstood something, but I would like to highlight this case from another perspective too.
When we work with data, we do a lot of feature engineering, i.e., encoding/transforming data. It is possible to use mapper functions to describe those steps. But without a generic data structure to hold the intermediate results, the encoding/transformation would have to be lazily evaluated every time. Those intermediate data structures cannot be internal. In pandas, we use .head(), .tail(), .info(), and .describe() to look at the data all the time.
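The difference between lazy re-evaluation and a materialized intermediate result can be sketched as follows. This is a toy illustration in plain Python under stated assumptions: `encode`, `lazy_pipeline`, and the `Table` class are hypothetical stand-ins, not a real DataFrame and not ML.NET API.

```python
import statistics
from typing import Dict, Iterator, List

Row = Dict[str, float]

raw_rows: List[Row] = [
    {"age": 20.0, "income": 30_000.0},
    {"age": 40.0, "income": 50_000.0},
    {"age": 60.0, "income": 70_000.0},
]

counter = {"encode": 0}  # counts how often the transformation actually runs


def encode(row: Row) -> Row:
    """Hypothetical feature-engineering step: rescale income to thousands."""
    counter["encode"] += 1
    return {"age": row["age"], "income_k": row["income"] / 1000.0}


def lazy_pipeline() -> Iterator[Row]:
    # Lazy: nothing is stored, so every consumer re-runs encode().
    return (encode(r) for r in raw_rows)


class Table:
    """Toy materialized container holding an intermediate result."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows

    def head(self, n: int = 5) -> List[Row]:
        return self.rows[:n]

    def describe(self) -> Dict[str, Dict[str, float]]:
        columns = list(self.rows[0].keys()) if self.rows else []
        return {
            c: {
                "mean": statistics.mean(r[c] for r in self.rows),
                "min": min(r[c] for r in self.rows),
                "max": max(r[c] for r in self.rows),
            }
            for c in columns
        }


# Two passes over the lazy pipeline run encode() once per row, per pass.
list(lazy_pipeline())
list(lazy_pipeline())
lazy_calls = counter["encode"]

# Materializing once lets head() and describe() reuse the stored rows.
counter["encode"] = 0
table = Table(list(lazy_pipeline()))
first = table.head(2)
summary = table.describe()
materialized_calls = counter["encode"]
```

With three rows, the two lazy passes invoke `encode` six times, while the materialized table invokes it three times no matter how often it is inspected afterwards.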
@aspcompiler Sounds like LINQ, or F#'s pipe operator (or streams: https://nessos.github.io/Streams/ and an interesting take with GPUs: https://devblogs.nvidia.com/jet-gpu-powered-fulfillment/). It'd be nice to compose streams.
I see that I've been assigned to this! For that reason I would like to understand this issue a bit more, since there are one or two points that are not clear to me. @aspcompiler if I were to rephrase your original concern about "it is difficult to use", do you mean the API in @isaacabraham I wonder if the change under consideration in PR 106 helps, at least as a convenience to map some
Please consider using the Python bindings for ML.NET: NimbusML.
Currently, we have to create a model that maps to the data. This is difficult to use from a scripting environment like F# Interactive or PowerShell. We would be able to use a scripting language if there were a general data container such as the DataFrame in pandas. DataTable/DataView is the closest thing in .NET.
Machine learning requires a lot of experimentation. A scripting language would make that much easier. Otherwise, we have to keep recompiling and rerunning the steps.
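As a rough illustration of the scripting workflow being asked for, the sketch below loads tabular data into a general container without declaring a type for it first. It uses Python's standard `csv` module purely as an analogy; the inline `CSV_TEXT` data and the `load_rows` helper are hypothetical, and nothing here is ML.NET API.

```python
import csv
import io
from typing import Dict, List

# Hypothetical inline data standing in for a file on disk.
CSV_TEXT = """sepal_length,sepal_width,species
5.1,3.5,setosa
7.0,3.2,versicolor
6.3,3.3,virginica
"""


def load_rows(text: str) -> List[Dict[str, str]]:
    """Read CSV rows into dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


rows = load_rows(CSV_TEXT)

# Column names come from the data itself, not from a predeclared class.
columns = list(rows[0].keys())

# Values arrive as strings; convert only the columns an experiment needs.
lengths = [float(r["sepal_length"]) for r in rows]
```

Because the schema is discovered at load time, each step can be rerun interactively without recompiling anything.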