The same code raises an error in OneFlow. #7127

Closed
kaijieshi7 opened this issue Dec 28, 2021 · 2 comments · Fixed by #7164

Summary

The same training script runs under PyTorch but fails under OneFlow: calling backward() on the loss raises an IndexError.

Code to reproduce bug

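# The error below comes from the OneFlow imports. To compare against PyTorch,
# comment them out and enable the torch imports instead.
import oneflow as flow
import oneflow.nn as nn
# import torch as flow
# import torch.nn as nn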

class N1(nn.Module):
    def __init__(self):
        super(N1, self).__init__()
        self.reduction = nn.Linear(96, 96, bias=False)
        self.norm = nn.LayerNorm(96)

    def forward(self, x):
        x = self.norm(x)
        x = self.reduction(x)
        return x


n_flow = N1().cuda()
x = flow.rand(2, 96).cuda()
label = flow.rand(2, 96).cuda()
loss_fn = nn.MSELoss()
# loss_fn = nn.BCELoss()
optimizer = flow.optim.SGD(n_flow.parameters(), lr=0.001, momentum=0.9, weight_decay=0.05)
for i in range(2):
    optimizer.zero_grad()
    y = n_flow(x)
    loss_fn(y, label).backward()
    optimizer.step()
    print(y)

Error message

Traceback (most recent call last):
  File "/home/kaijie/Documents/code/of/large_scale_training/large_scale_training/swin_transformer/bug3.py", line 27, in <module>
    loss_fn(y, label).backward()
  File "/home/kaijie/anaconda3/envs/torch_cuda10_1/lib/python3.8/site-packages/oneflow/framework/tensor.py", line 80, in _backward
    flow.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/kaijie/anaconda3/envs/torch_cuda10_1/lib/python3.8/site-packages/oneflow/autograd/autograd.py", line 48, in backward
    backward_api(
IndexError: vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)

System Information

  • What is your OneFlow installation (pip, source, dockerhub):
  • OS:
  • OneFlow version (run python3 -m oneflow --doctor):
  • Python version:
  • CUDA driver version:
  • GPU models:
  • Other info:

MARD1NO (Contributor) commented Dec 28, 2021

Applying norm first and then linear triggers the problem.

Applying linear first and then norm does not trigger it.

TODO
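
To make the ordering observation above easy to check, here is a minimal sketch of the reordered variant. The class name `N2` and the helper `run_steps` are hypothetical and do not appear in the original report. The sketch uses the PyTorch imports so that it runs without OneFlow installed, and it picks the device automatically instead of calling `.cuda()` directly. Swapping in the OneFlow imports should exercise the same path in OneFlow, though the device-selection call may need adjusting there.

```python
import torch as flow
import torch.nn as nn


class N2(nn.Module):
    """Hypothetical variant of N1: same layers, but Linear runs before LayerNorm."""

    def __init__(self):
        super().__init__()
        self.reduction = nn.Linear(96, 96, bias=False)
        self.norm = nn.LayerNorm(96)

    def forward(self, x):
        x = self.reduction(x)  # linear first
        x = self.norm(x)       # then norm
        return x


def run_steps(model, steps=2):
    """Run a few SGD steps as in the report and return the loss per step."""
    # Assumption: fall back to CPU when no GPU is available.
    device = "cuda" if flow.cuda.is_available() else "cpu"
    model = model.to(device)
    x = flow.rand(2, 96, device=device)
    label = flow.rand(2, 96, device=device)
    loss_fn = nn.MSELoss()
    optimizer = flow.optim.SGD(
        model.parameters(), lr=0.001, momentum=0.9, weight_decay=0.05
    )
    losses = []
    for _ in range(steps):
        optimizer.zero_grad()
        loss = loss_fn(model(x), label)
        loss.backward()  # the call that fails in the report for the norm-first order
        optimizer.step()
        losses.append(loss.item())
    return losses


losses = run_steps(N2())
print(losses)
```

Running the same helper on the original `N1` gives a side-by-side comparison of the two orderings.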

wyg1997 added the bug label Dec 28, 2021

strint (Contributor) commented Dec 28, 2021

IndexError: vector::_M_range_check: __n (which is 0) >= this->size() (which is 0)

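Which piece of C++ code raises this error? It looks like a check needs to be added there: because the index goes out of range, the error is reported with no error stack at all.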

liufengwei0103 linked a pull request Dec 31, 2021 that will close this issue