Auto ML assets #25466
5a2d3b5 to 6f56cfe
Errors :(
cf3d1c1 to 93ef028
Rebased to account for Flask 2.2 errors fixed yesterday.
fdb3996 to 7f0305d
There is a csv file of 45K char. Is that normal?
Yeah, 45K lines of .csv file is NOT something we want. Few options:
This .csv is needed for training an AutoML model. In order to start the training, the .csv must contain more than 1000 rows. For our test I can reduce the file to 2100 rows. @potiuk what do you think about reducing the file size?
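For illustration, trimming the dataset to a fixed number of rows could look roughly like the sketch below. It assumes the file has a single header line followed by data rows; `truncate_csv` is a hypothetical helper, not something that exists in this PR.

```python
import csv
import io


def truncate_csv(src_text: str, max_rows: int) -> str:
    """Return CSV text with the header plus at most `max_rows` data rows.

    Assumption: the first line of the input is a header row.
    """
    reader = csv.reader(io.StringIO(src_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for index, row in enumerate(reader):
        # index 0 is the header, indices 1..max_rows are data rows
        if index > max_rows:
            break
        writer.writerow(row)
    return out.getvalue()


if __name__ == "__main__":
    sample = "a,b\n" + "".join(f"{i},{i * 2}\n" for i in range(5000))
    trimmed = truncate_csv(sample, 2100)
    print(len(trimmed.splitlines()))  # header line plus data rows
```

With `max_rows=2100` the result stays above the 1000-row minimum mentioned above.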
107f390 to 89c2f7c
@potiuk Catching attention :) I think 2100 is okayish (not the best but certainly better than 50k). Please comment if you still think it should be stored in the external storage.
Can we compress it (and dynamically decompress it during the test)? Just zipping it gives 20K instead of 160K. This file is unlikely to ever change, and it is completely uninteresting to see what's in it when you review the code, so there is no particular reason to keep a text file in Git. It's not only the size that matters in this case. Keeping it as plain text has a really nasty effect: when you search for something in the source code in your IDE, you will likely find some matching words here, so keeping the file uncompressed makes it very prone to falling victim to search & replace.
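A minimal sketch of the "compress in Git, decompress during the test" idea, using only the standard library. The helper names and the `data.csv` member name are placeholders for illustration, not names from this PR.

```python
import io
import zipfile


def compress_csv(csv_bytes: bytes, member_name: str = "data.csv") -> bytes:
    """Pack CSV bytes into an in-memory zip archive using DEFLATE."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member_name, csv_bytes)
    return buf.getvalue()


def decompress_csv(zip_bytes: bytes, member_name: str = "data.csv") -> bytes:
    """Read the CSV back out of the archive, e.g. at the start of a test."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return zf.read(member_name)


if __name__ == "__main__":
    original = b"id,value\n" + b"".join(b"%d,%d\n" % (i, i % 7) for i in range(2100))
    archive = compress_csv(original)
    assert decompress_csv(archive) == original
    print(len(original), len(archive))
```

The test would then only ever see the decompressed bytes, while the repository holds the binary archive, which an IDE search or search & replace will not match against.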
89c2f7c to f498f06
@potiuk I have done it.
Sorry for the delay, I've been a bit busy. No, it's not compressed. It's just bundled in a .tar now, not zipped (.tar-ing a single file kind of makes no sense). It still takes 170K instead of 20K (and this PR needs a rebase anyway).
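To make the difference concrete: a plain tar archive only bundles files and adds headers and padding, while `tar.gz` actually compresses. A rough sketch with the standard library (`make_tar` is a hypothetical helper and the payload is made-up sample data):

```python
import io
import tarfile


def make_tar(payload: bytes, name: str, mode: str) -> bytes:
    """Build an in-memory tar archive holding one file.

    mode "w" writes an uncompressed tar, "w:gz" a gzip-compressed one.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode=mode) as tf:
        info = tarfile.TarInfo(name=name)
        info.size = len(payload)
        tf.addfile(info, io.BytesIO(payload))
    return buf.getvalue()


if __name__ == "__main__":
    payload = b"id,value\n" + b"".join(b"%d,%d\n" % (i, i % 7) for i in range(2100))
    plain = make_tar(payload, "data.csv", "w")
    gzipped = make_tar(payload, "data.csv", "w:gz")
    # The plain tar is no smaller than the payload; the gzipped one is.
    print(len(payload), len(plain), len(gzipped))
```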
6adb962 to 076d91a
Conflicts need to be resolved after string normalisation.
b1dbfa0 to 3616fd2
3616fd2 to a0c5ee8
Rebased to rebuild.
cfb9f8b to 9b863c6
9b863c6 to 4c1abb2
Tests failing.
c1bc806 to 0c94ca8
Static check failures.
This reverts commit 7f0305d80ad162ee4e17a85870e88bdad5f27b18.
Rebased. Static checks are fixed in main (a mysql python connector release was breaking mypy).
@potiuk I think this PR can be merged. I can't do that myself because I am not the author of the PR and I don't have write access.
I have created links and updated system tests for Auto ML operators.
Co-authored-by: Wojciech Januszek <januszek@google.com>
Co-authored-by: Lukasz Wyszomirski <wyszomirski@google.com>
Co-authored-by: Maksim Yermakou <maksimy@google.com>