Array + Diagonal failure #165

Closed
vchuravy opened this issue Oct 26, 2021 · 2 comments · Fixed by #166

Comments

@vchuravy
Member

vchuravy commented Oct 26, 2021

using LinearAlgebra
using AMDGPU

AT = ROCArray
n = 128
A = AT(rand(Float32, (n,n)))
d = AT(rand(Float32, n))
D = Diagonal(d)
B = A + D
collect(B) ≈ collect(A) + collect(D)
julia> collect(B) .≈ collect(A) + collect(D)
128×128 BitMatrix:
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  …  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  …  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1  0  1  0  1
 ⋮              ⋮              ⋮              ⋮              ⋮              ⋱        ⋮              ⋮              ⋮              ⋮              ⋮     
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  …  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1     1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1  0  1
julia> A + D
128×128 ROCMatrix{Float32}:
 1.02106     0.0921212  0.279667    0.266254  0.985505   0.174462  0.740154   …  -1.71465f38   -1.71465f38   -1.71465f38   -1.71465f38   0.644539
 0.00559413  1.56396    0.658443    0.456375  0.0258112  0.208886  0.537066      -1.71465f38   -1.71465f38   -1.71465f38   -1.71465f38   0.163299
 0.958081    0.841537   0.994774    0.200046  0.808281   0.113412  0.748186       0.0           0.0           0.300136      0.0          0.632184
 0.672491    0.97533    0.125371    0.71167   0.586824   0.630625  0.566512       0.0           0.0           0.279454      0.0          0.729033
 0.138161    0.348633   0.959647    0.120322  0.831747   0.52287   0.677146       0.76792       0.0           0.541541      0.0          0.44386
 0.81552     0.554076   0.601219    0.583099  0.406613   0.50956   0.0149934  …   0.59062       0.0           0.211465      0.0          0.425578
 0.585995    0.618082   0.875589    0.689613  0.342014   0.951317  1.47807        0.996348      0.0           0.472449      0.0          0.810768
 0.648932    0.869043   0.00508428  0.481415  0.594282   0.954457  0.594401       0.804497      0.0           0.706352      0.0          0.227551
 0.950201    0.798405   0.751419    0.431937  0.392388   0.167172  0.996166       0.437932     -5.1515f-9     0.563016      5.89816f-34  0.889226
 0.966268    0.604634   0.464085    0.72118   0.0115883  0.173688  0.0464908      0.193675     -4.392f-19     0.599506      6.0f-45      0.3995
 ⋮                                                       ⋮                    ⋱                               ⋮                          
 0.463826    0.301494   0.426812    0.231694  0.49323    0.658562  0.627726       0.398807      9.49213f-37   0.491222    NaN            0.764548
 0.111097    0.168921   0.435184    0.803009  0.22741    0.325752  0.650371   …   0.312124     -2.26569f38    0.870685    NaN            0.277152
 0.0783372   0.802934   0.973153    0.457144  0.892521   0.739615  0.0400927      0.768561     -1.9459f38     0.712432    NaN            0.238672
 0.235187    0.72732    0.214835    0.183458  0.343981   0.182465  0.526841       0.807266     -1.9459f38     0.279783    NaN            0.584277
 0.788909    0.325859   0.229556    0.362396  0.110689   0.535112  0.878508       0.990009    NaN             0.754234    NaN            0.720826
 0.460831    0.487539   0.191096    0.74637   0.925338   0.238311  0.499134       0.610513     -2.29238f38    0.726175    NaN            0.858912
 0.972706    0.259302   0.480788    0.25185   0.677029   0.156736  0.14716    …   0.339408    NaN             1.05132     NaN            0.330685
 0.446682    0.439278   0.630035    0.893114  0.324872   0.422373  0.0394168      0.957419    NaN             0.442314    NaN            0.733302
 0.0891961   0.112863   0.858945    0.131734  0.760471   0.675587  0.446861       0.923527    NaN             0.538234    NaN            0.867137

This is on #163
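
To narrow down where the result goes wrong, one option is to compare the two CPU copies column by column. The following is a minimal sketch using only Base Julia; `mismatched_columns` is a hypothetical helper, not part of AMDGPU.jl or GPUArrays.jl, and the small matrices are made-up stand-ins for `collect(B)` and `collect(A) + collect(D)`.

```julia
# Hypothetical helper (not part of AMDGPU.jl): return the indices of all
# columns that contain at least one entry where `got` and `expected` differ.
function mismatched_columns(got::AbstractMatrix, expected::AbstractMatrix)
    size(got) == size(expected) || throw(DimensionMismatch("sizes differ"))
    ok = isapprox.(got, expected)   # elementwise; NaN never compares as approx equal
    return [j for j in axes(ok, 2) if !all(view(ok, :, j))]
end

# Made-up data: a 4×4 reference and a copy with two corrupted entries.
expected = reshape(Float32.(1:16), 4, 4)
got = copy(expected)
got[2, 3] = NaN32   # corrupt column 3
got[1, 4] = 0f0     # corrupt column 4

bad = mismatched_columns(got, expected)   # [3, 4]
```

With the objects from the report, the call would presumably be `mismatched_columns(collect(B), collect(A) + collect(D))`.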

@vchuravy
Member Author

Looks like this comes from the grid-stride broadcast, JuliaGPU/GPUArrays.jl#367
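
For readers unfamiliar with the term, a grid-stride loop lets a fixed number of threads cover an array of any length: each thread starts at its own index and advances by the total number of launched threads. The sketch below simulates that on the CPU in plain Julia. It is not the GPUArrays.jl implementation; `grid_stride_fill!` and `nthreads` are hypothetical names chosen for illustration.

```julia
# CPU simulation of a grid-stride loop. `nthreads` stands in for the total
# number of launched work-items; all names here are hypothetical.
function grid_stride_fill!(dest::AbstractArray, value, nthreads::Integer)
    n = length(dest)
    for tid in 1:nthreads        # on a GPU these iterations would run in parallel
        i = tid
        while i <= n
            dest[i] = value
            i += nthreads        # the stride must equal the total thread count
        end
    end
    return dest
end

b_sim = fill(-1f0, 128, 128)          # sentinel value marks unwritten entries
grid_stride_fill!(b_sim, 0f0, 8192)   # 8192 simulated threads, 16384 elements
covered = all(iszero, b_sim)          # true: every element was written
```

The loop only covers the whole array when the stride matches the number of threads that actually run, so a mismatch between the two would be one possible way for entries to be skipped.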

@vchuravy
Member Author

julia> b .= 0
(groupsize, gridsize) = (threads, blocks * threads) = (256, 8192)
128×128 ROCMatrix{Float32}:
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  -0.372549  -0.372549  -0.372549  -0.372549  -0.372549  -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     -0.372549  -0.372549  -0.372549  -0.372549  -0.372549  -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     -0.372549  -0.372549  -0.372549  -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     -0.372549  -0.372549  -0.372549  -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0     -0.372549  -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  -0.372549  -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 ⋮                        ⋮                        ⋮                        ⋱                                               ⋮                    
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …   0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …   0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0
 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0      0.0       -0.372549   0.0       -0.372549   0.0       -0.372549  0.0

julia> 128*128
16384

So yeah, something is not quite right: the launch reports a gridsize of 8192, but the array has 128*128 = 16384 elements.
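
The arithmetic behind that observation can be spelled out in plain Julia. This is a rough sketch under the assumption that the printed gridsize of 8192 is the total number of threads; `one_element_per_thread!` is a hypothetical function, and the simulation does not reproduce the exact pattern in the output above.

```julia
n_elements = 128 * 128                      # 16384
n_threads = 8192                            # gridsize printed above (assumed total threads)
per_thread = cld(n_elements, n_threads)     # 2 elements per thread are needed

# Hypothetical illustration: each simulated thread writes exactly one element,
# with no striding, so anything past index `nthreads` is left untouched.
function one_element_per_thread!(dest::AbstractArray, value, nthreads::Integer)
    for tid in 1:min(nthreads, length(dest))
        dest[tid] = value
    end
    return dest
end

b_partial = fill(-1f0, 128, 128)            # sentinel value marks unwritten entries
one_element_per_thread!(b_partial, 0f0, n_threads)

# Julia arrays are column-major, so the first 8192 linear indices are the
# first 64 columns of a 128×128 matrix.
full_cols = count(j -> all(iszero, view(b_partial, :, j)), axes(b_partial, 2))   # 64
```

So unless every thread handles two elements, half of the matrix would keep whatever was in memory before the kernel ran.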
