Unable to increase batch_size #34

Open
ktiays opened this issue Apr 17, 2023 · 3 comments

ktiays commented Apr 17, 2023
ktiays commented Apr 17, 2023

I noticed that your code is written with the assumption that batch_size = 1, and when I increase the batch_size I get dimension errors. I would like to know why batch_size is limited to 1.
If it cannot be increased, I cannot make efficient use of my device's resources.

import numpy as np
import torch

def custom_collate_fn(data):
    # Stack images and voxel labels across the batch dimension.
    img2stack = np.stack([d[0] for d in data]).astype(np.float32)
    meta2stack = [d[1] for d in data]
    label2stack = np.stack([d[2] for d in data]).astype(np.int64)
    # Because a batch size of 1 is used, these per-point tensors can be stacked directly.
    # (np.int / np.float are removed in recent NumPy; explicit dtypes are used here.)
    grid_ind_stack = np.stack([d[3] for d in data]).astype(np.float64)
    point_label = np.stack([d[4] for d in data]).astype(np.int64)
    return torch.from_numpy(img2stack), \
        meta2stack, \
        torch.from_numpy(label2stack), \
        torch.from_numpy(grid_ind_stack), \
        torch.from_numpy(point_label)
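
For what it's worth, the dimension error with batch_size > 1 appears to come from np.stack, which requires every array in the list to have exactly the same shape; different samples contain different numbers of LiDAR points, so the per-point arrays cannot be stacked. A minimal sketch (with hypothetical point counts) that reproduces the failure:

import numpy as np

# Two hypothetical point clouds with different numbers of points.
pc_a = np.zeros((120000, 3), dtype=np.float32)
pc_b = np.zeros((115000, 3), dtype=np.float32)

np.stack([pc_a])              # batch_size = 1: works, shape (1, 120000, 3)
try:
    np.stack([pc_a, pc_b])    # batch_size = 2: shapes differ
except ValueError as err:
    print(err)                # "all input arrays must have the same shape"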

@shadow2469

I'm having the same problem, and I think it is unreasonable to put a limit on batch_size.

@huang-yh
Collaborator

Apart from the GPU memory constraint, the batch size is set to one because of 1) the varying lengths of the point cloud data, and 2) the for-loop we use here to filter out unnecessary sampling locations (same as BEVFormer).
For the point cloud lengths, you can simply sample a fixed number of points from each point cloud, which would fix the error posted above.
For the for-loop, you can add another for-loop to take the batch size into account.
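
As a rough illustration of the first suggestion, the collate function could sample a fixed number of points from every point cloud before stacking. This is only a sketch under the assumption that d[3] and d[4] are the per-point grid indices and labels as in the snippet above; NUM_POINTS and sample_points are hypothetical names that would need to be adapted to the actual dataset:

import numpy as np
import torch

NUM_POINTS = 100000  # hypothetical fixed point budget per sample

def sample_points(grid_ind, point_label, num_points=NUM_POINTS):
    # Randomly pick a fixed number of point indices (with replacement
    # only when the cloud has fewer points than the budget).
    n = grid_ind.shape[0]
    idx = np.random.choice(n, num_points, replace=(n < num_points))
    return grid_ind[idx], point_label[idx]

def batched_collate_fn(data):
    img2stack = np.stack([d[0] for d in data]).astype(np.float32)
    meta2stack = [d[1] for d in data]
    label2stack = np.stack([d[2] for d in data]).astype(np.int64)
    # Sample every point cloud to the same length so np.stack works for batch_size > 1.
    sampled = [sample_points(d[3], d[4]) for d in data]
    grid_ind_stack = np.stack([s[0] for s in sampled]).astype(np.float64)
    point_label = np.stack([s[1] for s in sampled]).astype(np.int64)
    return torch.from_numpy(img2stack), \
        meta2stack, \
        torch.from_numpy(label2stack), \
        torch.from_numpy(grid_ind_stack), \
        torch.from_numpy(point_label)

Note that sampling changes which points are evaluated, so for validation or testing it may be safer to keep batch_size = 1 (or pad instead of sample) so every point keeps its original label.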

@amundra15

amundra15 commented Oct 26, 2023

I am running into the same issue. Was anyone able to resolve it?

In addition:

  1. I noticed that the paper says "All models are trained for 24 epochs with a batch size of 8 on 8 A100 GPUs".
  2. The issue pointed out above by the author would come up only for LiDAR segmentation and not occupancy prediction, right?
