
The relationship ? #12

Open
995667874 opened this issue Aug 8, 2022 · 15 comments

Comments

@995667874

I would like to ask how the multi-scale anomaly score map can be compared directly against the ground truth to compute AUROC. What is the relationship between the anomalies represented by the feature-similarity output of the network layers and the anomalies shown in the annotations?

@995667874 changed the title from "Therelationship" to "The relationship ?" on Aug 8, 2022
@hq-deng (Owner) commented Aug 8, 2022

Hello,

The anomaly score is a continuous value in [0.0, 1.0], while the ground truth is binary (0 or 1). AUROC is a threshold-based metric: the anomaly result should be a score, and we apply different thresholds to the anomaly scores to generate different evaluation results, then compute the area under the resulting ROC curve. Please refer to https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
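A minimal sketch of that comparison, with made-up labels and scores (not data from the repo): `roc_auc_score` accepts continuous scores directly, sweeping all thresholds internally, so no binarization of the scores is needed.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical example: continuous anomaly scores vs. binary ground truth.
gt = np.array([0, 0, 1, 1, 0, 1])                  # 1 = anomalous
scores = np.array([0.1, 0.3, 0.8, 0.6, 0.2, 0.9])  # model's anomaly scores

# roc_auc_score sweeps every threshold internally and integrates the ROC curve.
auroc = roc_auc_score(gt, scores)
print(auroc)  # 1.0 here, since every anomalous score exceeds every normal one
```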

@995667874 (Author)


Thank you for your reply. I may not have expressed my question clearly. I am not clear about the meaning of adding up the anomaly score maps obtained from the outputs of different network layers. Why does a region with a high anomaly score correspond to the defect region in the ground truth?

@hq-deng (Owner) commented Aug 8, 2022


Different scales mean we are detecting at different receptive fields, because anomalies occur at different scales: some are small and some are large. There are two operations for accumulating the anomaly scores: addition and multiplication. Addition can introduce some noise into the final anomaly score, because some scales may detect normal noise and add it to the result; on the other hand, it is sensitive to anomalies. Multiplication is insensitive to noise, because if one scale mistakes normal noise for an anomaly, it is suppressed by the near-zero factors from the other scales; however, it is also less sensitive to anomalies. We obtain better results with addition because it is more sensitive to anomalies.
Now I see that you mean the sample-level ground truth. For defect detection, the anomaly occupies a local region of the image, so a high score anywhere in the pixel-level anomaly score map suggests there may be a defect; we therefore use the highest pixel score as the final image-level anomaly score. For novelty detection, the whole image is anomalous, so we sum all pixel-level anomaly scores to obtain the final anomaly score.
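The two accumulation rules and the two image-level readouts described above can be sketched as follows (hypothetical random maps, not the repository's code):

```python
import numpy as np

# Three per-scale anomaly maps, assumed already upsampled to the same H x W.
rng = np.random.default_rng(0)
maps = [rng.random((4, 4)) for _ in range(3)]

score_add = np.sum(maps, axis=0)   # sensitive to anomalies, but also to noise
score_mul = np.prod(maps, axis=0)  # noise-robust: one low scale suppresses a pixel

# Image-level score: max for defect detection (anomaly is local),
# sum for novelty detection (the whole image is anomalous).
defect_score = score_add.max()
novelty_score = score_add.sum()
```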

@995667874 (Author)


Thank you for your patient and careful reply. I still have a few questions. First, the anomaly score maps at each scale are computed from the similarity of feature tensors; how do they localize and display the anomaly on the original image, and how are the heat maps derived? Second, in the code I do not see a 1x1 convolutional layer added after the MFF module in the OCBE module to adjust the number of channels. Finally, the OCBE module maps the high-scale features into a low-dimensional space; how should I understand this?

@hq-deng (Owner) commented Aug 8, 2022

We upsample each map at each scale to the image's size and then add them together.

This is at line 402 in resnet.py; I modified self._make_layer().

We ensemble the information from each scale (from low to high, i.e., from resblock1 to resblock3) to replace the information from resblock3 alone. Unlike U-Net and other pyramid networks, the multi-scale information in our model cannot be injected into the decoder at different decoder layers, so we simply ensemble it at the bottleneck layer.
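The upsample-then-add step can be sketched like this. The repository itself upsamples with bilinear interpolation in PyTorch; this is a nearest-neighbour stand-in with invented sizes, just to show the shapes involved:

```python
import numpy as np

def upsample_nearest(score_map, out_size):
    """Nearest-neighbour upsample of a 2D score map to out_size
    (sketch; the actual code uses bilinear interpolation)."""
    h, w = score_map.shape
    H, W = out_size
    rows = np.repeat(score_map, H // h, axis=0)
    return np.repeat(rows, W // w, axis=1)

# Hypothetical per-scale maps at feature resolutions 8x8, 16x16, 32x32
# for a 64x64 input image.
rng = np.random.default_rng(0)
maps = [rng.random((s, s)) for s in (8, 16, 32)]
anomaly_map = sum(upsample_nearest(m, (64, 64)) for m in maps)
print(anomaly_map.shape)  # (64, 64): one score per input pixel
```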

@995667874 (Author)


1. Is the heat map produced by directly overlaying the anomaly score map on the original image?
2. The OCE layer is defined in _make_layer in the code, and no 1x1 convolutional layer is set after line 454 in resnet.py.
3. In my understanding, the OCBE module performs feature fusion and then passes through a resblock, so it seems to play only a feature-fusion role; I do not understand the "compact embedding" part.

@hq-deng (Owner) commented Aug 8, 2022

1. It is upsampled to the original image size, so we obtain pixel-level anomaly scores.

2. Line 433.

3. The multi-scale channels come from [resblock1, resblock2, resblock3], which is more than from [resblock3] alone; there are more input channels but the same number of output channels, so the embedding is compact.
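Point 3 can be illustrated with a toy calculation (hypothetical channel counts, not taken from the repo): a 1x1 convolution is simply a per-pixel linear map over channels, so it can squeeze the concatenated multi-scale features back to the channel count of resblock3 alone — that is the sense in which the embedding is "compact".

```python
import numpy as np

c1, c2, c3 = 64, 128, 256                  # hypothetical channels per resblock
h = w = 8
rng = np.random.default_rng(0)
fused = rng.random((c1 + c2 + c3, h, w))   # concatenated features: 448 channels

# A 1x1 conv kernel is just an (out_channels, in_channels) matrix applied
# independently at every spatial position.
weight = rng.random((c3, c1 + c2 + c3))
compact = np.einsum('oc,chw->ohw', weight, fused)
print(compact.shape)  # (256, 8, 8): same spatial size, fewer channels
```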

@995667874 (Author)


Thank you for your patient reply. Line 433 defines a downsampling function, which also appears in the original ResNet code. My understanding is that it adjusts the channels on the residual branch in resblock4, rather than being a 1x1 convolution that adjusts the channels of the input after the MFF module. In addition, when the anomaly score map is upsampled to the size of the original image, will the anomalous region still have a high anomaly score?

@hq-deng (Owner) commented Aug 8, 2022

Sure. I modified the input size of the conv1x1, which is why I mentioned it in the paper. We don't need another 1x1 convolution, because the only purpose of the conv1x1 is to adjust the channel count.

The highest anomaly score is unrelated to the upsampling operation, because upsampling does not change the highest anomaly score. We upsample the anomaly score maps so that they all have the same size as the image, which gives us pixel-level anomaly scores for anomaly localization.

@995667874 (Author)


Yes, I see what you mean. But how does anomaly localization work? Both the anomaly score and the similarity are computed on feature tensors, so why do they represent anomalies on the original image?

@zhagao12138


Hello, could we exchange contact details and discuss this together? There are things I don't understand either. -.- Thanks!!!

@hq-deng (Owner) commented Aug 10, 2022


For anomalous regions, the features from the student differ from the features from the teacher, so those regions receive a high anomaly score. The student learns only normal representations from the teacher, so anomalies, which are unseen during training, produce a feature discrepancy.
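The teacher–student discrepancy described above can be sketched as a per-pixel score of one minus the cosine similarity between the two feature vectors at each spatial location (an illustrative sketch, not the repository's exact implementation):

```python
import numpy as np

def anomaly_map(f_teacher, f_student, eps=1e-8):
    """Per-pixel anomaly score: 1 - cosine similarity over the channel axis.
    f_teacher, f_student: (C, H, W) features from the same layer."""
    num = (f_teacher * f_student).sum(axis=0)
    den = np.linalg.norm(f_teacher, axis=0) * np.linalg.norm(f_student, axis=0)
    return 1.0 - num / (den + eps)

C, H, W = 16, 8, 8
rng = np.random.default_rng(0)
t = rng.random((C, H, W))
# Identical features (as on normal regions) give near-zero scores everywhere.
print(anomaly_map(t, t).max() < 1e-6)  # True
```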

@995667874 (Author)


QQ: 995667874

@995667874 (Author)


I understand what you said, but here is what I don't understand: the similarity and the anomaly score are computed on the feature tensors output by the network. Even if the anomaly score map is upsampled to the original image size, each of its pixels carries only the meaning of the features extracted by the network; that meaning differs from that of each pixel in the original image or the ground truth (where each pixel is simply either defective or normal).

@zhagao12138


Hello, the QQ friend request isn't going through due to a restriction..
