Initial submission of Swin3D (#6)
* Initial commit

* update README

* init repos

* add KNN

* remove point ops

* update readme

* fix fp16

* fix bug of knn

* update config

* add license

* add comment

* update readme and license

* update readme

* Create codeql.yml

* update readme

* update codeql

* update codeql

* remove cpp codeql

* format code

* update model

* update readme

---------

Co-authored-by: Yukichiii <45515584+Yukichiii@users.noreply.github.com>
Co-authored-by: Yuqi Yang <v-yuqyan@microsoft.com>
Co-authored-by: Yuqi Yang <yangyq18@mails.tsinghua.edu.cn>
Co-authored-by: Yuxiao Guo <yuxgu@microsoft.com>
5 people authored Jun 25, 2023
1 parent 022d5ed commit 4184679
Showing 4 changed files with 615 additions and 249 deletions.
58 changes: 29 additions & 29 deletions README.md
@@ -13,7 +13,7 @@
Initial commits:

1. Pretrained models on Structured3D are provided.
2. The supported code and models for Semantic Segmentation on ScanNet and S3DIS are provided.
2. The supported code for Semantic Segmentation on ScanNet and S3DIS are provided.

## Introduction

@@ -37,12 +37,12 @@ We pretrained our Swin3D on Structured3D, please refer to this [link](https://gi

The models pretrained on Structured3D with different cRSE are provided here.

| | Pretrain | #params | cRSE | mIoU(val) | Model | Log |
| :------- | :----------: | :------ | :----------- | :-------: | :-------: | :-----: |
| Swin3D-S | Structured3D | 23.57M | XYZ,RGB | 77.69 | [model]() | [log]() |
| Swin3D-S | Structured3D | 23.57M | XYZ,RGB,NORM | 79.15 | [model]() | [log]() |
| Swin3D-L | Structured3D | 60.75M | XYZ,RGB | 79.79 | [model]() | [log]() |
| Swin3D-L | Structured3D | 60.75M | XYZ,RGB,NORM | 81.04 | [model]() | [log]() |
| | Pretrain | #params | cRSE | mIoU(val) | Model | Log |
| :------- | :----------: | :------ | :----------- | :-------: | :-----------------------------------------------------------------------------------------: | :---------------------------------------------------------------------------------------: |
| Swin3D-S | Structured3D | 23.57M | XYZ,RGB | 77.69 | [model](https://drive.google.com/file/d/1oezNkN3_HZvyxGxjtOpSaQUbGl3YYF90/view?usp=sharing) | [log](https://drive.google.com/file/d/1TuwZqpKm8OYj8BeMhDUhLcGqzXhgJcpC/view?usp=sharing) |
| Swin3D-S | Structured3D | 23.57M | XYZ,RGB,NORM | 79.15 | [model](https://drive.google.com/file/d/1FMmAgHwS__NtFldH-lFTsraKj0my62t4/view?usp=sharing) | [log](https://drive.google.com/file/d/1-0kz81X0j2Zp-mntN1GwQlsm5sLIy3JX/view?usp=sharing) |
| Swin3D-L | Structured3D | 60.75M | XYZ,RGB | 79.79 | [model](https://drive.google.com/file/d/1ior8uAQRiVd2mwfYapcaF_e_R80y7DQm/view?usp=sharing) | [log](https://drive.google.com/file/d/1YYd8SOaAIqz16T7XOL54aGPC4sSoMXsW/view?usp=sharing) |
| Swin3D-L | Structured3D | 60.75M | XYZ,RGB,NORM | 81.04 | [model](https://drive.google.com/file/d/1ySNrP39H6m-euK-2La60-MNOp0e3Pe_4/view?usp=sharing) | [log](https://drive.google.com/file/d/1nXQCw5G2swrSksBnpGBveNSHwAqy8hAZ/view?usp=sharing) |

## Quick Start

@@ -61,44 +61,44 @@ Build models and load our pretrained weight, Then you can finetune your model in
num_layers=num_layers, stem_transformer=stem_transformer, \
upsample=upsample, first_down_stride=down_stride, \
knn_down=knn_down, in_channels=in_channels, \
cRSE='XYZ_RGB_NORM', fp16_mode=2)
cRSE='XYZ_RGB_NORM', fp16_mode=1)
model.load_pretrained_model(ckpt_path)

## Results and models

To reproduce our results on downstream tasks, please follow the code in this [repo](https://github.com/Yukichiii/Swin3D_Task). The results and models are provided here.
To reproduce our results on downstream tasks, please follow the code in this [repo](https://github.com/Yukichiii/Swin3D_Task). The results are provided here.

### ScanNet Segmentation

| | Pretrained | mIoU(Val) | mIoU(Test) | Model | Log |
| :------- | :--------: | :-------: | :--------: | :-------: | :-----: |
| Swin3D-S | &cross; | 75.2 | - | [model]() | [log]() |
| Swin3D-S | &check; | 75.7 | - | [model]() | [log]() |
| Swin3D-L | &check; | 77.5 | 77.9 | [model]() | [log]() |
| | Pretrained | mIoU(Val) | mIoU(Test) |
| :------- | :--------: | :--------: | :--------: |
| Swin3D-S | &cross; | 75.2 | - |
| Swin3D-S | &check; | 75.6(76.8) | - |
| Swin3D-L | &check; | 76.2(77.5) | 77.9 |

### S3DIS Segmentation

| | Pretrained | Area 5 mIoU | 6-fold mIoU | Model | Log |
| :------- | :--------: | :---------: | :---------: | :-------: | :-----: |
| Swin3D-S | &cross; | 72.5 | 76.9 | [model]() | [log]() |
| Swin3D-S | &check; | 73.0 | 78.2 | [model]() | [log]() |
| Swin3D-L | &check; | 74.5 | 79.8 | [model]() | [log]() |
| | Pretrained | Area 5 mIoU | 6-fold mIoU |
| :------- | :--------: | :---------: | :---------: |
| Swin3D-S | &cross; | 72.5 | 76.9 |
| Swin3D-S | &check; | 73.0 | 78.2 |
| Swin3D-L | &check; | 74.5 | 79.8 |

### ScanNet 3D Detection

| | Pretrained | mAP@0.25 | mAP@0.50 | Model | Log |
| :----------------- | :--------: | :------: | :------: | :---: | :---: |
| Swin3D-S+FCAF3D | &check; | 74.2 | 59.5 | model | log |
| Swin3D-L+FCAF3D | &check; | 74.2 | 58.6 | model | log |
| Swin3D-S+CAGroup3D | &check; | 76.4 | 62.7 | model | log |
| Swin3D-L+CAGroup3D | &check; | 76.4 | 63.2 | model | log |
| | Pretrained | mAP@0.25 | mAP@0.50 |
| :----------------- | :--------: | :------: | :------: |
| Swin3D-S+FCAF3D | &check; | 74.2 | 59.5 |
| Swin3D-L+FCAF3D | &check; | 74.2 | 58.6 |
| Swin3D-S+CAGroup3D | &check; | 76.4 | 62.7 |
| Swin3D-L+CAGroup3D | &check; | 76.4 | 63.2 |

### S3DIS 3D Detection

| | Pretrained | mAP@0.25 | mAP@0.50 | Model | Log |
| :-------------- | :--------: | :------: | :------: | :---: | :---: |
| Swin3D-S+FCAF3D | &check; | 69.9 | 50.2 | model | log |
| Swin3D-L+FCAF3D | &check; | 72.1 | 54.0 | model | log |
| | Pretrained | mAP@0.25 | mAP@0.50 |
| :-------------- | :--------: | :------: | :------: |
| Swin3D-S+FCAF3D | &check; | 69.9 | 50.2 |
| Swin3D-L+FCAF3D | &check; | 72.1 | 54.0 |

## Citation

208 changes: 132 additions & 76 deletions Swin3D/modules/mink_layers.py
@@ -6,13 +6,28 @@
import torch.nn as nn
import torch.nn.functional as F
import MinkowskiEngine as ME
import numpy as np
import numpy as np


def assign_feats(sp, x):
return ME.SparseTensor(features=x.float(), coordinate_map_key=sp.coordinate_map_key, coordinate_manager=sp.coordinate_manager)
return ME.SparseTensor(
features=x.float(),
coordinate_map_key=sp.coordinate_map_key,
coordinate_manager=sp.coordinate_manager,
)


class MinkConvBN(nn.Module):
def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, dilation=1, bias=False, dimension=3):
def __init__(
self,
in_channels,
out_channels,
kernel_size=3,
stride=1,
dilation=1,
bias=False,
dimension=3,
):
super().__init__()
self.conv_layers = nn.Sequential(
ME.MinkowskiConvolution(
@@ -22,16 +37,27 @@ def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, dilation=
stride=stride,
dilation=dilation,
bias=bias,
dimension=dimension),
ME.MinkowskiBatchNorm(out_channels)
dimension=dimension,
),
ME.MinkowskiBatchNorm(out_channels),
)

def forward(self, x):
x = self.conv_layers(x)
return x


class MinkConvBNRelu(nn.Module):
def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, dilation=1, bias=False, dimension=3):
def __init__(
self,
in_channels,
out_channels,
kernel_size=3,
stride=1,
dilation=1,
bias=False,
dimension=3,
):
super().__init__()
self.conv_layers = nn.Sequential(
ME.MinkowskiConvolution(
@@ -41,9 +67,10 @@ def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, dilation=
stride=stride,
dilation=dilation,
bias=bias,
dimension=dimension),
dimension=dimension,
),
ME.MinkowskiBatchNorm(out_channels),
ME.MinkowskiReLU(inplace=True)
ME.MinkowskiReLU(inplace=True),
)

def forward(self, x):
@@ -52,8 +79,18 @@ def forward(self, x):
x = assign_feats(x, x.F.float())
return x


class MinkDeConvBNRelu(nn.Module):
def __init__(self, in_channels, out_channels, kernel_size, stride, dilation=1, bias=False, dimension=3):
def __init__(
self,
in_channels,
out_channels,
kernel_size,
stride,
dilation=1,
bias=False,
dimension=3,
):
super().__init__()
self.conv_layers = nn.Sequential(
ME.MinkowskiConvolutionTranspose(
@@ -63,54 +100,58 @@ def __init__(self, in_channels, out_channels, kernel_size, stride, dilation=1, b
stride=stride,
dilation=dilation,
bias=bias,
dimension=dimension),
dimension=dimension,
),
ME.MinkowskiBatchNorm(out_channels),
ME.MinkowskiReLU()
ME.MinkowskiReLU(),
)

def forward(self, x):
x = self.conv_layers(x)
return x

class MinkResBlock(nn.Module):
def __init__(self, in_channels, out_channels, stride=1, dilation=1):
super(MinkResBlock, self).__init__()

self.conv1 = ME.MinkowskiConvolution(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
dilation=dilation,
bias=False,
dimension=3)
self.norm1 = ME.MinkowskiBatchNorm(out_channels)
self.conv2 = ME.MinkowskiConvolution(
in_channels=out_channels,
out_channels=out_channels,
kernel_size=3,
stride=1,
dilation=dilation,
bias=False,
dimension=3)
class MinkResBlock(nn.Module):
def __init__(self, in_channels, out_channels, stride=1, dilation=1):
super(MinkResBlock, self).__init__()

self.conv1 = ME.MinkowskiConvolution(
in_channels=in_channels,
out_channels=out_channels,
kernel_size=3,
stride=stride,
dilation=dilation,
bias=False,
dimension=3,
)
self.norm1 = ME.MinkowskiBatchNorm(out_channels)
self.conv2 = ME.MinkowskiConvolution(
in_channels=out_channels,
out_channels=out_channels,
kernel_size=3,
stride=1,
dilation=dilation,
bias=False,
dimension=3,
)

self.norm2 = ME.MinkowskiBatchNorm(out_channels)
self.relu = ME.MinkowskiReLU(inplace=True)
self.norm2 = ME.MinkowskiBatchNorm(out_channels)
self.relu = ME.MinkowskiReLU(inplace=True)

def forward(self, x):
residual = x
def forward(self, x):
residual = x

out = self.conv1(x)
out = self.norm1(out)
out = self.relu(out)
out = self.conv1(x)
out = self.norm1(out)
out = self.relu(out)

out = self.conv2(out)
out = self.norm2(out)
out = self.conv2(out)
out = self.norm2(out)

out += residual
out = self.relu(out)
out += residual
out = self.relu(out)

return out
return out


class SparseTensorLinear(nn.Module):
@@ -134,22 +175,33 @@ class MinkResBlock_v2(nn.Module):
def __init__(self, in_channels, out_channels):
super().__init__()
d_2 = out_channels // 4
self.conv1 = torch.nn.Sequential(SparseTensorLinear(in_channels, d_2, bias=False), ME.MinkowskiBatchNorm(d_2), ME.MinkowskiReLU())
self.unary_2 = torch.nn.Sequential(SparseTensorLinear(d_2, out_channels, bias=False), ME.MinkowskiBatchNorm(out_channels), ME.MinkowskiReLU())
self.conv1 = torch.nn.Sequential(
SparseTensorLinear(in_channels, d_2, bias=False),
ME.MinkowskiBatchNorm(d_2),
ME.MinkowskiReLU(),
)
self.unary_2 = torch.nn.Sequential(
SparseTensorLinear(d_2, out_channels, bias=False),
ME.MinkowskiBatchNorm(out_channels),
ME.MinkowskiReLU(),
)
self.spconv = ME.MinkowskiConvolution(
in_channels=d_2,
out_channels=d_2,
kernel_size=5,
stride=1,
dilation=1,
bias=False,
dimension=3)
in_channels=d_2,
out_channels=d_2,
kernel_size=5,
stride=1,
dilation=1,
bias=False,
dimension=3,
)
if in_channels != out_channels:
self.shortcut_op = torch.nn.Sequential(
SparseTensorLinear(in_channels, out_channels, bias=False), ME.MinkowskiBatchNorm(out_channels)
SparseTensorLinear(in_channels, out_channels, bias=False),
ME.MinkowskiBatchNorm(out_channels),
)
else:
self.shortcut_op = nn.Identity()

def forward(self, x):
# feats: [N, C]
# xyz: [N, 3]
@@ -162,28 +214,32 @@ def forward(self, x):
shortcut = self.shortcut_op(shortcut)
x += shortcut
return x


class MinkResBlock_BottleNeck(nn.Module):
def __init__(self, in_channels, out_channels):
super(MinkResBlock_BottleNeck, self).__init__()
bottle_neck = out_channels // 4
self.conv1x1a = MinkConvBNRelu(in_channels, bottle_neck, kernel_size=1, stride=1)
self.conv3x3 = MinkConvBNRelu(bottle_neck, bottle_neck, kernel_size=3, stride=1)
self.conv1x1b = MinkConvBN(bottle_neck, out_channels, kernel_size=1, stride=1)
if in_channels != out_channels:
self.conv1x1c = MinkConvBN(in_channels, out_channels, kernel_size=1, stride=1)
else:
self.conv1x1c = None
self.relu = ME.MinkowskiReLU(inplace=True)

def forward(self, x):
residual = x
out = self.conv1x1a(x)
out = self.conv3x3(out)
out = self.conv1x1b(out)
if self.conv1x1c is not None:
residual = self.conv1x1c(residual)
out = self.relu(out+residual)

return out
def __init__(self, in_channels, out_channels):
super(MinkResBlock_BottleNeck, self).__init__()
bottle_neck = out_channels // 4
self.conv1x1a = MinkConvBNRelu(
in_channels, bottle_neck, kernel_size=1, stride=1
)
self.conv3x3 = MinkConvBNRelu(bottle_neck, bottle_neck, kernel_size=3, stride=1)
self.conv1x1b = MinkConvBN(bottle_neck, out_channels, kernel_size=1, stride=1)
if in_channels != out_channels:
self.conv1x1c = MinkConvBN(
in_channels, out_channels, kernel_size=1, stride=1
)
else:
self.conv1x1c = None
self.relu = ME.MinkowskiReLU(inplace=True)

def forward(self, x):
residual = x
out = self.conv1x1a(x)
out = self.conv3x3(out)
out = self.conv1x1b(out)
if self.conv1x1c is not None:
residual = self.conv1x1c(residual)
out = self.relu(out + residual)

return out
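
The channel bookkeeping in `MinkResBlock_BottleNeck` above can be sketched without MinkowskiEngine. This is only an illustration of the layer widths implied by the constructor in the diff, not part of the commit: `ConvSpec` and `bottleneck_plan` are hypothetical names introduced here, and the 48/96 channel sizes are arbitrary.

```python
from typing import List, NamedTuple


class ConvSpec(NamedTuple):
    """Hypothetical record describing one convolution in the block."""

    name: str
    in_channels: int
    out_channels: int
    kernel_size: int


def bottleneck_plan(in_channels: int, out_channels: int) -> List[ConvSpec]:
    """List the convolutions that MinkResBlock_BottleNeck would build.

    Mirrors the constructor in the diff: the bottleneck width is
    out_channels // 4, and the projection shortcut (conv1x1c) exists
    only when the input and output widths differ.
    """
    bottle_neck = out_channels // 4  # same integer division as the source
    plan = [
        ConvSpec("conv1x1a", in_channels, bottle_neck, 1),
        ConvSpec("conv3x3", bottle_neck, bottle_neck, 3),
        ConvSpec("conv1x1b", bottle_neck, out_channels, 1),
    ]
    if in_channels != out_channels:
        # The residual must be projected to out_channels before the addition.
        plan.append(ConvSpec("conv1x1c", in_channels, out_channels, 1))
    return plan


if __name__ == "__main__":
    for spec in bottleneck_plan(48, 96):
        print(spec)
```

When `in_channels == out_channels` the plan has three entries and the residual is added unchanged, which corresponds to the `self.conv1x1c = None` branch in the source.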