
RuntimeError: expected a non-empty list of Tensors #95

Closed · youth123 opened this issue May 10, 2020 · 7 comments
Labels: bug (Something isn't working), fixed in dev branch

Comments


youth123 commented May 10, 2020

When I followed the cascaded example, I just removed the DataParallel code and ran into this issue.

INFO:root:TRAINING EPOCH 1
total_loss=43.34527:   2%|▍ | 84/3750 [00:14<10:26, 5.85it/s]
Traceback (most recent call last):
  File "/root/toyer/cur_ali_img_retrieval_v6/CascadedEmbeddings.py", line 187, in <module>
    trainer.train(num_epochs=num_epochs)
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/trainers/base_trainer.py", line 83, in train
    self.forward_and_backward()
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/trainers/base_trainer.py", line 110, in forward_and_backward
    self.calculate_loss(self.get_batch())
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/trainers/cascaded_embeddings.py", line 26, in calculate_loss
    self.losses[curr_loss_name] += self.maybe_get_metric_loss(e, labels, indices_tuple, curr_loss_name)
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/trainers/cascaded_embeddings.py", line 38, in maybe_get_metric_loss
    return self.loss_funcs[curr_loss_name](embeddings, labels, indices_tuple)
  File "/root/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 547, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/losses/base_metric_loss_function.py", line 53, in forward
    loss = self.compute_loss(embeddings, labels, indices_tuple)
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/losses/circle_loss.py", line 43, in compute_loss
    indices_tuple = lmu.convert_to_triplets(indices_tuple, labels, t_per_anchor=self.triplets_per_anchor)
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/utils/loss_and_miner_utils.py", line 204, in convert_to_triplets
    return [torch.cat(x, dim=0) for x in [a_out, p_out, n_out]]
  File "/root/anaconda3/lib/python3.7/site-packages/pytorch_metric_learning/utils/loss_and_miner_utils.py", line 204, in <listcomp>
    return [torch.cat(x, dim=0) for x in [a_out, p_out, n_out]]
RuntimeError: expected a non-empty list of Tensors

KevinMusgrave added the bug (Something isn't working) label on May 10, 2020

KevinMusgrave (Owner) commented May 10, 2020

Did you change any other parameters, like batch size, dataset, etc.? And does the bug occur every time you run it, or does it seem random?

youth123 (Author) commented May 10, 2020

sampler = samplers.MPerClassSampler(train_dataset.targets, m=4, length_before_new_iter=len(train_dataset))
batch_size = 8
num_epochs = 4
tester = testers.GlobalEmbeddingSpaceTester(
    end_of_testing_hook=hooks.end_of_testing_hook,
    visualizer=umap.UMAP(),
    visualizer_hook=visualizer_hook,
    dataloader_num_workers=32,
)
end_of_epoch_hook = hooks.end_of_epoch_hook(
    tester, dataset_dict, model_folder, test_interval=1, patience=1
)
trainer = trainers.CascadedEmbeddings(
    models=cascaded_model,
    optimizers=optimizers,
    batch_size=batch_size,
    loss_funcs=loss_funcs,
    mining_funcs=mining_funcs,
    dataset=train_dataset,
    sampler=sampler,
    dataloader_num_workers=8,
    end_of_iteration_hook=hooks.end_of_iteration_hook,
    end_of_epoch_hook=end_of_epoch_hook,
    embedding_sizes=[64, 64, 64],
)
trainer.train(num_epochs=num_epochs)

This is the code that triggers the error, and I see the problem every time.

KevinMusgrave (Owner) commented May 10, 2020

Ah, yes, I think it's related to the batch size and the MPerClassSampler. I have a few ideas for how to improve/fix this bug. But in the meantime, try m=2 in the MPerClassSampler.
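Applied to the snippet above, only the sampler line changes; a minimal sketch of that workaround:

from pytorch_metric_learning import samplers

sampler = samplers.MPerClassSampler(
    train_dataset.targets,
    m=2,  # 2 samples per class, so batch_size=8 spans 4 distinct classes
    length_before_new_iter=len(train_dataset),
)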

youth123 (Author) commented

When I debugged the error, I found that it was caused by the function convert_to_triplets in loss_and_miner_utils.py.

def convert_to_triplets(indices_tuple, labels, t_per_anchor=100):
    .....
    a_out, p_out, n_out = [], [], []
    a1, p, a2, n = indices_tuple
    if len(a1) == 0 or len(a2) == 0:
        return [torch.tensor([]).to(labels.device)] * 3
    for i in range(len(labels)):
        pos_idx = (a1 == i).nonzero().flatten()
        neg_idx = (a2 == i).nonzero().flatten()
        if len(pos_idx) > 0 and len(neg_idx) > 0:
            p_idx = p[pos_idx]
            n_idx = n[neg_idx]
            p_idx, n_idx = matched_size_indices(p_idx, n_idx)
            a_idx = torch.ones_like(c_f.longest_list([p_idx, n_idx])) * i
            a_out.append(a_idx)
            p_out.append(p_idx)
            n_out.append(n_idx)
    return [torch.cat(x, dim=0) for x in [a_out, p_out, n_out]]

When the list a1 is disjoint from a2, the lists a_out, p_out, and n_out stay empty, so the final torch.cat raises the error. I am confused about the meaning of the list a1 and where it comes from.
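To make the failure concrete, here is a minimal repro using only stock PyTorch; torch.cat on an empty list raises the exact message from the traceback:

import torch

a_out, p_out, n_out = [], [], []  # all three stay empty when a1 and a2 are disjoint
[torch.cat(x, dim=0) for x in [a_out, p_out, n_out]]
# RuntimeError: expected a non-empty list of Tensors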

KevinMusgrave (Owner) commented

(a1, p) represents positive pairs, and (a2, n) represents negative pairs. The code is trying to convert pairs to triplets, and in order to do that, it has to find the positive and negative pairs that share a common "anchor". In other words, it needs to find the a1 and a2 that are equal to each other.
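For instance, with hypothetical toy indices (not actual miner output):

import torch

a1 = torch.tensor([0, 0])  # positive pairs: (0, 1) and (0, 2)
p = torch.tensor([1, 2])
a2 = torch.tensor([0])     # negative pair: (0, 3)
n = torch.tensor([3])
# Anchor 0 appears in both a1 and a2, so the triplets (0, 1, 3) and
# (0, 2, 3) can be formed. If a1 were [0] and a2 were [5] instead, no
# anchor would be shared and no triplet could be built.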

The most likely reason why a1 and a2 are disjoint is because your batch size is small, which results in a small number of possible pairs. The small number of possible pairs is reduced even further by the MultiSimilarityMiner and HDCMiner.

In addition to that, if your MPerClassSampler has m = (batch size // 2), then it's possible for a batch to contain only 1 label. This is due to a weakness in the implementation of MPerClassSampler.

So there are really 2 things for me to improve here, but I might not get around to them for a couple of days. If you need to get your code running soon, I suggest either using a larger batch size, or using miners that are directly compatible with the losses. That means if a loss expects triplets, then use a triplet miner. So, for example, you could replace HDCMiner with TripletMarginMiner.
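A sketch of that swap (the margin value here is an assumption; tune it to your loss):

from pytorch_metric_learning import miners

# A triplet miner returns (anchor, positive, negative) indices directly,
# so no pair-to-triplet conversion happens downstream:
mining_func = miners.TripletMarginMiner(margin=0.2)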

KevinMusgrave (Owner) commented

I finally fixed the bug in convert_to_triplets. The fix is in 0.9.87.dev4. It will return empty tensors instead of raising an exception. I haven't worked on MPerClassSampler yet though, and I might create a separate issue for that.
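The actual diff isn't shown in this thread; a plausible sketch of such a guard at the end of convert_to_triplets (an assumption, not the real 0.9.87.dev4 code) would be:

# Hypothetical sketch, not the actual library change:
if len(a_out) == 0:
    empty = torch.tensor([], dtype=torch.long, device=labels.device)
    return [empty, empty.clone(), empty.clone()]
return [torch.cat(x, dim=0) for x in [a_out, p_out, n_out]]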

KevinMusgrave (Owner) commented

I made a separate issue for improving MPerClassSampler: #124
