Possible data process in dataset before cut by length limit #10
Comments
Hi @xpzhang, if I understand correctly, both are actually already supported in weaver (though with some limitations):
Let me know whether these two meet your needs :)
Hi @hqucms , Thanks for the reply. These functions are exactly what I need!
Then I tried to modify
to
I got another error:
Hi @xpzhang -- sorry for the slow response, it slipped my attention somehow...
Hi @hqucms , the new implementation works well, thanks a lot!
It could be avoided by not defining a two-step variable like this:
but in many cases, when variables like |
Hi @xpzhang -- indeed, that's something I have been meaning to improve for a while. It should work with pull request #11, so you can give it a try. I'd like to run a few more tests to make sure it does not break other things before merging it. However, just to add a note here about |
Hi @hqucms, I've tested this PR and it works as expected. I really appreciate all your hard work on this. Although it seems that the |
Hi @xpzhang -- Nice catch! I just updated the pull request to fix that :)
Thanks for the quick fix! :)
Hi, Huilin
To my understanding, the number of data points fed to training is limited by pf_points.length or pf_features.length, and any points exceeding these limits are discarded. But in some applications, the original data are prepared in a specific order that is not suitable for a direct cut. It may be desirable to have some preprocessing step to eliminate the resulting bias. Two possible approaches are:
It would be nice if new items could be added to the data config file for these processing options. I'm not sure whether I've made myself clear, or whether this new feature is easy to implement. Thanks a lot.
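To make the concern about a direct cut concrete, here is a minimal pure-Python sketch. It is not weaver's implementation and does not use weaver's config syntax; `cut_to_length`, `sort_then_cut`, and `shuffle_then_cut` are hypothetical helper names, and sorting and shuffling are only illustrative kinds of preprocessing, not necessarily the two approaches meant above.

```python
import random


def cut_to_length(points, length, pad_value=0.0):
    """Keep the first `length` points; pad with constant rows if there are fewer."""
    width = len(points[0]) if points else 0
    kept = [list(p) for p in points[:length]]
    padding = [[pad_value] * width for _ in range(length - len(kept))]
    return kept + padding


def sort_then_cut(points, length, key_column=0):
    """Hypothetical preprocessing: rank points by one column (descending), then cut."""
    ranked = sorted(points, key=lambda p: p[key_column], reverse=True)
    return cut_to_length(ranked, length)


def shuffle_then_cut(points, length, seed=None):
    """Hypothetical preprocessing: randomize the stored order, then cut."""
    shuffled = list(points)
    random.Random(seed).shuffle(shuffled)
    return cut_to_length(shuffled, length)


# Toy event with four points and two features per point (made-up numbers).
event = [[1.0, 0.1], [5.0, 0.2], [3.0, 0.3], [9.0, 0.4]]

direct = cut_to_length(event, 2)          # keeps whatever happens to be stored first
ranked = sort_then_cut(event, 2)          # keeps the two largest values of column 0
mixed = shuffle_then_cut(event, 2, seed=0)  # keeps a random pair of points
padded = cut_to_length(event, 6)          # shorter inputs are padded up to the limit
```

With a direct cut, the point with the largest first feature (`[9.0, 0.4]`) is dropped simply because it was stored last; reordering before the cut changes which points survive.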