RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward. #228
I have already gone through the solutions available on the forum. I'm using the torchxrayvision and torchcam libraries: I need the pretrained DenseNet121 weights and torchcam to generate Grad-CAM for the model. The solutions available on the PyTorch forum don't look like they apply to my case, so please help me with this. The minimal reproducible example I can give you is this:
rescaled_output is just an image stored as a Tensor of shape (1, 244, 244).
Hi @karnikkanojia 👋

Thanks for reporting this! This looks like a simple problem of cell execution in a notebook. For me to help, I'd need a minimal reproducible snippet. The one you provided is not complete (it is missing the imports, the cam definition and the model definition).

With my limited knowledge of the setup right now, I think what's happening is that your enumerate is going through the outputs of a single model. You're using a gradient-based CAM method, so each call in your list comprehension runs a backprop. I'd suggest looping so that the gradients are reset before the CAM computation for each pathology:

    preds = model(rescaled_output.unsqueeze(0))
    cam_outputs = []
    for idx in range(len(model.pathologies)):
        model.zero_grad()
        preds.zero_grad()
        cam_outputs.append(cam(class_idx=idx, scores=preds))

But to confirm this, please share a complete minimal reproducible snippet 🙏

Cheers!
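The mechanism described in this reply (several backward passes that all start from the output of one forward pass) can be reproduced without torchcam. The following is a minimal sketch in plain PyTorch. The small two-layer network, the input shape and the helper name `per_class_grads` are hypothetical stand-ins, not the DenseNet121 setup from the question. It runs one backward pass per output index and shows that this only works when the graph is retained.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the classifier (NOT the torchxrayvision model).
model = nn.Sequential(nn.Linear(4, 8), nn.Tanh(), nn.Linear(8, 3))
x = torch.randn(1, 4)


def per_class_grads(retain_graph: bool):
    """Run one forward pass, then one backward pass per output index."""
    preds = model(x)  # a single graph shared by every class index
    grads = []
    for idx in range(preds.shape[1]):
        model.zero_grad()
        # Without retain_graph=True, the saved tensors of the shared graph
        # are freed by the first backward call, so the second call fails.
        preds[0, idx].backward(retain_graph=retain_graph)
        grads.append(model[0].weight.grad.clone())
    return grads


try:
    per_class_grads(retain_graph=False)
    failed = False
except RuntimeError:
    failed = True

grads = per_class_grads(retain_graph=True)
```

In this sketch, `failed` ends up `True` because the second backward call hits the freed graph, while the `retain_graph=True` run returns one gradient tensor per output index. Retaining the graph keeps the saved intermediate values in memory until `preds` is released.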
Hey, I am experiencing the same RuntimeError. In my case, I did not use retain_graph at first and got the error; however, when I use retain_graph=True or retain_graph=retain_graph, I still get the same error. (I am using snnTorch.) Here is the code snippet where the error occurs:

    # Hyperparameters
    input_size = inputs.shape[2]
    hidden_size = 50
    num_epochs = 100
    learning_rate = 0.001

    inputs_tensor = torch.tensor(inputs, dtype=torch.float32)
    labels_tensor = torch.tensor(joint_labels, dtype=torch.float32)

    model = ContactEstimationSNN(input_size, hidden_size)
    criterion = nn.BCEWithLogitsLoss()
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)

    # Training loop
    for epoch in range(num_epochs):
        model.train()
        optimizer.zero_grad()
        # Forward pass
        outputs = model(inputs_tensor)
        loss = criterion(outputs[:, -1, 0], labels_tensor)  # Use last time step for prediction
        loss.backward(retain_graph=True)  # Do not retain the graph unless needed
        optimizer.step()
        if (epoch + 1) % 10 == 0:
            print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}')

    # Evaluation
    # test_inputs_tensor = torch.tensor(test_inputs, dtype=torch.float32)
    # test_labels_tensor = torch.tensor(test_labels, dtype=torch.float32)
    model.eval()
    with torch.no_grad():
        test_outputs = model(test_inputs_tensor)
        test_predictions = torch.sigmoid(test_outputs[:, -1, 0])  # Get the predicted probabilities
        predicted_labels = (test_predictions > 0.5).float()  # Binarize predictions
        accuracy = (predicted_labels == labels_tensor).float().mean()
        print(f'Test Accuracy: {accuracy.item():.4f}')
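One situation in which this error can show up in an ordinary training loop is a stateful model whose internal state from the previous iteration is reused without being detached. Whether that is what happens in the snippet above cannot be determined from it, since the model definition is not shown. The sketch below uses plain PyTorch rather than snnTorch; the layer, the shapes and the helper name `train` are hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical recurrent update: new state = tanh(W [x, h]).
lin = nn.Linear(3 + 5, 5)
optimizer = torch.optim.SGD(lin.parameters(), lr=0.1)
x = torch.randn(2, 3)


def train(detach_state: bool, steps: int = 3) -> float:
    h = torch.zeros(2, 5)  # state that persists across iterations
    for _ in range(steps):
        optimizer.zero_grad()
        h = torch.tanh(lin(torch.cat([x, h], dim=1)))
        loss = h.pow(2).mean()
        # If h still points into the previous iteration's graph, this call
        # walks back into a graph whose saved tensors were already freed.
        loss.backward()
        optimizer.step()
        if detach_state:
            h = h.detach()  # cut the link to the previous graph
    return loss.item()


try:
    train(detach_state=False)
    failed = False
except RuntimeError:
    failed = True

final_loss = train(detach_state=True)
```

Here the run without detaching fails on the second iteration, and the run that detaches the state after each step completes. Detaching truncates backpropagation at the iteration boundary, which is a design choice rather than a free fix: gradients no longer flow into earlier iterations.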
Thanks! So the error mentioned can be avoided as the library allows low-level PyTorch options:
This piece of code doesn't crash on my end 👍
It might be a bit slow, as it will perform a backprop for each pathology (18 apparently). One option that would use more RAM…