Test summary with previous PyTorch/TensorFlow versions #18181
cc @LysandreJik @sgugger @patrickvonplaten @Rocketknight1 @gante @anton-l @NielsRogge @amyeroberts @alaradirik @stas00 @hollance for your comments.
TF 2.3 is quite old by now, and I wouldn't make a special effort to support it. Several nice TF features (like the NumPy-like API) only arrived in TF 2.4, and we're likely to use those a lot in the future.
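(For reference, the NumPy-like API mentioned here is `tf.experimental.numpy`, which shipped in TF 2.4; a minimal sketch of why code relying on it cannot run on TF 2.3:

```python
# Minimal sketch: tf.experimental.numpy only exists from TF 2.4 onward,
# so this import fails with a ModuleNotFoundError on TF 2.3.
import tensorflow.experimental.numpy as tnp

x = tnp.asarray([[1.0, 2.0], [3.0, 4.0]])
print(tnp.mean(x * 2 + 1))  # NumPy-style ops backed by TF tensors
```
)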
Hey @ydshieh, would you have a summary of the failing tests handy? I'm curious to see the reason why there are so many failures for PyTorch as soon as we leave the latest version. I'm quite confident that it's an issue in our tests rather than in our internal code, so seeing the failures would help. Thanks!
@LysandreJik I will re-run it. The previous run(s) have huge tables in the reports, and sending them to Slack failed (3001-character limit). I finally ran it by disabling those blocks. Before re-running it, I need an approval for #17921.
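(For context, Slack's Block Kit rejects section text of 3001+ characters, so oversized report blocks have to be dropped or truncated. A hedged sketch of that kind of truncation; the helper name, channel, and env var are placeholders, not the actual CI code:

```python
# Hypothetical sketch: truncate a CI report so its Slack section block
# stays under the 3001-character limit before posting.
import os
from slack_sdk import WebClient

MAX_BLOCK_TEXT = 3000  # Slack rejects section text of 3001+ characters

def post_report(report: str, channel: str = "#ci-reports") -> None:
    if len(report) > MAX_BLOCK_TEXT:
        report = report[: MAX_BLOCK_TEXT - 20] + "\n... (truncated)"
    client = WebClient(token=os.environ["SLACK_BOT_TOKEN"])
    client.chat_postMessage(
        channel=channel,
        blocks=[{"type": "section", "text": {"type": "mrkdwn", "text": report}}],
    )
```
)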
I ran the past CI again, which returns more information. Looking at the report, there is one error occurring in almost all models:
Another one also occurs a lot (torchscript tests)
An error occurs specifically for vision models (probably due to the convolution layers)
Others
Thanks for the report! Taking a look at the PyTorch versions, here are the dates at which they were released:
From a first look, I'd propose dropping support for all PyTorch versions older than 1.6, as these were released more than two years ago. Do you have a link to a job containing all these failures? I'd be interested in seeing whether the 2342 errors in PyTorch 1.6 are solvable simply or whether they will require a significant refactor.
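(A version floor like this is typically enforced with a small check at import time; a sketch under that assumption, not the library's actual code:

```python
# Illustrative sketch of enforcing a minimum supported PyTorch version;
# the 1.6 floor follows the proposal above.
import torch
from packaging import version

MIN_TORCH = "1.6.0"

if version.parse(torch.__version__) < version.parse(MIN_TORCH):
    raise ImportError(
        f"PyTorch >= {MIN_TORCH} is required, but {torch.__version__} is installed."
    )
```
)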
The link is here. But since it contains too many jobs (all models x all versions ~= 3200 jobs), the page doesn't render them all. I can re-run specifically for PyTorch 1.6 only, and will post a link later.
I second that. While we are at it, do we want to establish an official shifting window for how far back we support PyTorch versions? As in, at a minimum we support at least 2 years of PyTorch? If it's easy to support longer, we would, but it'd be easy to cut off if need be. The user always has the older `transformers` releases to fall back on.
Yes, that would work fine with me. If I understand correctly, that's how libraries in the PyData ecosystem (scikit-learn, NumPy) manage the support of Python versions: they drop support for versions older than 2 years (scikit-learn/scikit-learn#20965, scikit-learn/scikit-learn#20084, the SciPy toolchain roadmap, scipy/scipy#14655). Dropping support for PyTorch/Flax/TensorFlow versions that were released more than two years ago sounds good to me. That is somewhat already the case (see the failing tests), but we're just not aware of it.
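(A rolling two-year window is easy to express in code; a sketch, with approximate PyTorch release dates used purely as illustration:

```python
# Illustrative sketch of a rolling two-year support window;
# release dates are approximate and only for demonstration.
from datetime import date, timedelta

RELEASE_DATES = {
    "1.4.0": date(2020, 1, 16),
    "1.5.0": date(2020, 4, 21),
    "1.6.0": date(2020, 7, 28),
}

def supported_versions(today: date, window: timedelta = timedelta(days=730)) -> list:
    cutoff = today - window
    return [v for v, released in RELEASE_DATES.items() if released >= cutoff]

print(supported_versions(date(2022, 7, 18)))  # -> ['1.6.0']
```
)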
Hi, I am wondering what it means to support a PyTorch/TensorFlow version.
Ideally it should mean that all models work and all tests pass, apart from functionality explicitly gated on versions (like CUDA bfloat16 or torch FX, where we test against a specific PyTorch version).
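(In test code, that kind of version gating usually looks like a skip marker; a sketch where the 1.10 floor is illustrative, not the repository's actual threshold:

```python
# Sketch of a version-gated test: it only runs when the installed PyTorch
# is recent enough for the feature under test (torch.fx here).
import pytest
import torch
from packaging import version

requires_recent_torch = pytest.mark.skipif(
    version.parse(torch.__version__) < version.parse("1.10"),
    reason="torch.fx tracing tested against PyTorch >= 1.10 in this sketch",
)

@requires_recent_torch
def test_fx_trace():
    import torch.fx

    def f(x):
        return torch.relu(x) + 1

    traced = torch.fx.symbolic_trace(f)
    assert torch.equal(traced(torch.zeros(3)), torch.ones(3))
```
)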
Initiated by @LysandreJik, we ran the tests with previous PyTorch/TensorFlow versions. The goal is to determine if we should drop (some) earlier PyTorch/TensorFlow versions.
Note that some failures come from the test environment (`torch-scatter`, `accelerate` not installed, etc.). Here are the results (running on ~June 20, 2022):
It looks like the number of failures in TensorFlow testing doesn't increase much.
So far my thoughts:
Questions
Should we keep supporting TF 2.3?