VSR Layout Recognition Test Issues #144

Schneipi · 2023-03-01T13:34:55Z

I am hitting the same exception as #109 (ValueError: too many values to unpack (expected 4)), which is indeed resolved by adding

img = img[0]
gt_bboxes = gt_bboxes[0]

as the first lines in forward() in bertgrid_embedding.py.

Next, there is a NumPy incompatibility: the np.int alias was removed in NumPy 1.24, so this line (same file)
w_start, h_start, w_end, h_end = gt_bboxes_arr[iter_b_l].round().astype(np.int).tolist()
has to be changed to
w_start, h_start, w_end, h_end = gt_bboxes_arr[iter_b_l].round().astype(np.int64).tolist()
to avoid an AttributeError.
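For reference, the cast can be checked in isolation. This is a minimal sketch, assuming a made-up box array in place of the real gt_bboxes_arr; only the round().astype(np.int64).tolist() chain is taken from the line above.

```python
import numpy as np

# Hypothetical stand-in for gt_bboxes_arr: one row per box as
# (w_start, h_start, w_end, h_end) in floating-point pixel coordinates.
gt_bboxes_arr = np.array([[10.4, 20.6, 110.2, 220.7]])

# np.int was deprecated in NumPy 1.20 and removed in 1.24, so an explicit
# fixed-width integer type is used for the cast instead.
w_start, h_start, w_end, h_end = (
    gt_bboxes_arr[0].round().astype(np.int64).tolist()
)

print(w_start, h_start, w_end, h_end)
```

Because .tolist() converts the values to plain Python ints, the results can be used directly as slice indices. The builtin int should work equally well as the dtype argument here.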

After those two modifications, PyTorch now raises:

File "/opt/homebrew/Caskroom/miniforge/base/lib/python3.9/site-packages/torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Expected 3D (unbatched) or 4D (batched) input to conv2d, but got input of size: [1, 1, 3, 800, 608]
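The reported size [1, 1, 3, 800, 608] has five dimensions, one more than the batched (N, C, H, W) layout that conv2d accepts. The sketch below only illustrates that shape mismatch; it uses a NumPy array as a stand-in for the tensor, and the helper name drop_leading_singletons is hypothetical, not part of DAVAR-Lab-OCR or PyTorch. Whether dropping the extra axis is the right fix depends on where the extra wrapping is introduced in the test pipeline.

```python
import numpy as np


def drop_leading_singletons(batch, target_ndim=4):
    """Strip leading size-1 axes until `batch` has `target_ndim` dimensions.

    Hypothetical helper for illustration only.
    """
    while batch.ndim > target_ndim and batch.shape[0] == 1:
        batch = batch[0]
    if batch.ndim != target_ndim:
        raise ValueError(
            f"expected {target_ndim} dimensions, got shape {batch.shape}"
        )
    return batch


# Same shape as in the error message: an extra leading axis of size 1.
img = np.zeros((1, 1, 3, 800, 608), dtype=np.float32)
fixed = drop_leading_singletons(img)

print(img.ndim, fixed.shape)
```

Printing the shape of img just before the failing convolution should show which call site still passes the five-dimensional input.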

Perhaps the suggestion in #109 does not address everything after all, or even breaks something? Or maybe there is a compatibility issue with the versions I am using, torch==1.13.1 and numpy==1.24.2? I couldn't find any information on the expected versions (other than the lower bounds) for the DAVAR-Lab-OCR project.

I'm trying to run DAVAR-Lab-OCR/demo/text_layout/VSR/DocBank/test.sh and have correctly prepared the models and adjusted config/docbank_x101.

Any suggestions? Thanks.
