feature map #2878

Closed
limbo-zhj opened this issue Apr 21, 2021 · 5 comments · Fixed by #3804
Labels
question Further information is requested

Comments

@limbo-zhj

Hello, how can I output the feature map or hot map of each layer

limbo-zhj added the question label on Apr 21, 2021
@glenn-jocher
Member

@limbo-zhj this has been asked before, but typically users don't realize how many feature maps they would be dealing with. The YOLOv5l model, for example, has over a thousand output feature maps, each with its own 80x80, 40x40 or 20x20 grid that you could look at. We don't have a standard method for viewing these, but they are easily extractable in the Detect() layer of any YOLOv5 model:

yolov5/models/yolo.py, lines 24 to 58 at 5f7d39f:

```python
class Detect(nn.Module):
    stride = None  # strides computed during build
    export = False  # onnx export

    def __init__(self, nc=80, anchors=(), ch=()):  # detection layer
        super(Detect, self).__init__()
        self.nc = nc  # number of classes
        self.no = nc + 5  # number of outputs per anchor
        self.nl = len(anchors)  # number of detection layers
        self.na = len(anchors[0]) // 2  # number of anchors
        self.grid = [torch.zeros(1)] * self.nl  # init grid
        a = torch.tensor(anchors).float().view(self.nl, -1, 2)
        self.register_buffer('anchors', a)  # shape(nl,na,2)
        self.register_buffer('anchor_grid', a.clone().view(self.nl, 1, -1, 1, 1, 2))  # shape(nl,1,na,1,1,2)
        self.m = nn.ModuleList(nn.Conv2d(x, self.no * self.na, 1) for x in ch)  # output conv

    def forward(self, x):
        # x = x.copy()  # for profiling
        z = []  # inference output
        self.training |= self.export
        for i in range(self.nl):
            x[i] = self.m[i](x[i])  # conv
            bs, _, ny, nx = x[i].shape  # x(bs,255,20,20) to x(bs,3,20,20,85)
            x[i] = x[i].view(bs, self.na, self.no, ny, nx).permute(0, 1, 3, 4, 2).contiguous()

            if not self.training:  # inference
                if self.grid[i].shape[2:4] != x[i].shape[2:4]:
                    self.grid[i] = self._make_grid(nx, ny).to(x[i].device)

                y = x[i].sigmoid()
                y[..., 0:2] = (y[..., 0:2] * 2. - 0.5 + self.grid[i]) * self.stride[i]  # xy
                y[..., 2:4] = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i]  # wh
                z.append(y.view(bs, -1, self.no))

        return x if self.training else (torch.cat(z, 1), x)
```
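
If you just want a quick look at the maps, below is a rough sketch (not a supported utility) that registers forward hooks on the intermediate blocks of a hub-loaded model and saves one channel-averaged heatmap per block. The hub call, the block class names in the filter, and the matplotlib output are assumptions about your setup and repo version, so adjust as needed.

```python
# Rough sketch: save a channel-averaged heatmap for each backbone/head block.
# Assumes a hub-loaded model and the 2021-era block names (Focus, Conv, C3, SPP, Concat);
# adjust the filter for your version of the repo.
import torch
import matplotlib.pyplot as plt

model = torch.hub.load('ultralytics/yolov5', 'yolov5l')
model.eval()

feature_maps = {}

def make_hook(name):
    def hook(module, inputs, output):
        # keep only 4-D tensor outputs (B, C, H, W); Detect returns a tuple and is skipped
        if isinstance(output, torch.Tensor) and output.dim() == 4:
            feature_maps[name] = output.detach()
    return hook

handles = []
for name, m in model.named_modules():
    if type(m).__name__ in {'Focus', 'Conv', 'C3', 'SPP', 'Concat'}:
        handles.append(m.register_forward_hook(make_hook(name)))

model('https://ultralytics.com/images/zidane.jpg')  # one inference pass fills feature_maps

for name, fm in feature_maps.items():
    heatmap = fm[0].float().mean(0).cpu()  # average over channels -> (H, W)
    plt.imshow(heatmap, cmap='jet')
    plt.title(f'{name}  {tuple(fm.shape)}')
    plt.axis('off')
    plt.savefig(f"{name.replace('.', '_')}.png", bbox_inches='tight')
    plt.close()

for h in handles:
    h.remove()
```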

@kimsung-git

kimsung-git commented Jun 17, 2021

@glenn-jocher
I'm also trying to extract feature maps from Detect as you mentioned. After printing the input x to forward(), I found that the shape of x differs from what you describe above. The dimensions I got for x are the following:

torch.Size([1, 256, 48, 80]), torch.Size([1, 512, 24, 40]), torch.Size([1, 1024, 12, 20])

I understand that the shape changes to [1, 255, 48, 80], [1, 255, 24, 40], [1, 255, 12, 20] after the conv in forward(), but why is the input x dimension 48x80, 24x40, 12x20 rather than 80x80, 40x40, 20x20?

Can you explain why that is?

Thank you

@glenn-jocher
Member

@kimsung-git output shapes are a function of input shapes.
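
To make that concrete, a rough worked example (assuming the 48x80 / 24x40 / 12x20 grids you report came from a letterboxed 384x640 input rather than a square 640x640 one): each detection grid is simply the padded input size divided by that layer's stride.

```python
# Illustrative arithmetic only; the 384x640 input size is an assumption
# inferred from the grids reported above, not something verified here.
for stride in (8, 16, 32):  # P3, P4, P5 strides in YOLOv5
    print(384 // stride, 640 // stride)
# -> 48 80
#    24 40
#    12 20
```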

@kimsung-git

@glenn-jocher when you say "output shape", do you mean x in Detect's forward method, and by "input shape" the image size?

Isn't the input shape (image size) always fixed to e.g. 640 before feeding the network? I also tested images of different sizes, but I get the same shapes: 48x80, 24x40, 12x20.

Thank you

@glenn-jocher
Member

@kimsung-git detect.py displays input shapes directly; you might want to start there:

python detect.py
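
The tensor fed to the model is not necessarily square because the letterbox resize in the dataloader preserves aspect ratio and only pads each side up to a multiple of the model stride. A minimal sketch of that arithmetic (the 720x1280 source size is just an assumed example, and the letterbox import path reflects the mid-2021 codebase and may have moved in later versions):

```python
import numpy as np
from utils.datasets import letterbox  # location as of mid-2021; may differ in newer versions

img = np.zeros((720, 1280, 3), dtype=np.uint8)  # dummy 16:9 frame
padded, ratio, pad = letterbox(img, new_shape=640, auto=True, stride=32)
print(padded.shape)  # (384, 640, 3) -> grids of 48x80, 24x40, 12x20
```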

glenn-jocher linked a pull request on Jun 28, 2021 that will close this issue.