This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

MultiBoxDetection cannot pass consistency check #10316

Open
TaoLv opened this issue Mar 29, 2018 · 5 comments

Comments

@TaoLv

TaoLv commented Mar 29, 2018

I am trying to add a unit test for MultiBoxDetection that checks consistency between the CPU and GPU implementations, but it does not pass on my machine (Skylake + P100) with the latest master branch.

code:

def test_multibox_detection():
    ctx_list = [{'ctx': mx.cpu(0),
                 'detection_cls_prob': (1, 21, 6132),
                 'detection_loc_pred': (1, 24528),
                 'detection_anchor': (1, 6132, 4),
                 'type_dict': {'detection_cls_prob': np.float32,
                               'detection_loc_pred': np.float32,
                               'detection_anchor': np.float32}},
                {'ctx': mx.gpu(0),
                 'detection_cls_prob': (1, 21, 6132),
                 'detection_loc_pred': (1, 24528),
                 'detection_anchor': (1, 6132, 4),
                 'type_dict': {'detection_cls_prob': np.float32,
                               'detection_loc_pred': np.float32,
                               'detection_anchor': np.float32}},]
    sym = mx.symbol.contrib.MultiBoxDetection(name='detection', nms_threshold=0.5, force_suppress=False,
                                              variances=(0.1, 0.1, 0.1, 0.1), nms_topk=400)
    check_consistency(sym, ctx_list)

output:

======================================================================
FAIL: test_operator_gpu.test_multibox_detection
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/lvtao/Workspace/mxnet-official/tests/python/gpu/../unittest/common.py", line 157, in test_new
    orig_test(*args, **kwargs)
  File "/home/lvtao/Workspace/mxnet-official/tests/python/gpu/test_operator_gpu.py", line 1801, in test_multibox_detection
    check_consistency(sym, ctx_list)
  File "/home/lvtao/Workspace/mxnet-official/python/mxnet/test_utils.py", line 1319, in check_consistency
    raise e
AssertionError:
Items are not equal:
Error 1232.413086 exceeds tolerance rtol=0.001000, atol=0.001000.  Location of maximum error:(0, 1619, 1), a=2.172774, b=0.421231
 a: array([[[ 18.        ,   4.28585577,   0.        ,   0.        ,
           0.6093576 ,   0.        ],
        [  8.        ,   4.2417717 ,   1.        ,   0.71296334,...
 b: array([[[ 18.        ,   4.28585577,   0.        ,   0.        ,
           0.6093576 ,   0.        ],
        [  8.        ,   4.2417717 ,   1.        ,   0.7129634 ,...
-------------------- >> begin captured stdout << ---------------------
Predict Err: ctx 1 vs ctx 0 at detection_output

The input shapes and parameters are all taken from the SSD example.
@zhreshold I noticed that MultiBoxDetection was first committed by you. Do you have any suggestions about this?
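
As a reading aid for the failure message above: the reported error of 1232.413086 is consistent with a scaled violation of the form |a - b| / (atol + rtol * |b|), evaluated at the location of maximum error. That check_consistency uses exactly this metric is an assumption here, and the helper name `violation` is hypothetical. The sketch only reproduces the arithmetic from the values printed in the log.

```python
def violation(a, b, rtol, atol):
    """Scaled error: values above 1.0 exceed the tolerance (assumed metric)."""
    return abs(a - b) / (atol + rtol * abs(b))


# Values copied from the "Location of maximum error" line in the log.
a, b = 2.172774, 0.421231
ratio = violation(a, b, rtol=1e-3, atol=1e-3)
print(ratio)  # lands close to the 1232.41 reported in the log
```

A ratio this far above 1 suggests that the two outputs hold different detections in that row, not the same detection with rounding noise.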

@zhreshold

You might want to disable NMS first, because the sorting algorithms used on CPU and GPU are different.
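
To illustrate why differing sort implementations matter, here is a minimal sketch in plain Python. It does not use the MXNet sort code. It only shows that two equally valid descending sorts can order tied scores differently, which changes which box NMS visits first. The score values are made up for illustration.

```python
# Four hypothetical confidence scores; indices 1 and 2 are tied.
scores = [0.9, 0.5, 0.5, 0.1]

# Descending order, variant A: stable sort on the negated score.
order_a = sorted(range(len(scores)), key=lambda i: -scores[i])

# Descending order, variant B: stable ascending sort, then reversed.
order_b = sorted(range(len(scores)), key=lambda i: scores[i])[::-1]

# Both orderings give the same sorted scores...
sorted_a = [scores[i] for i in order_a]
sorted_b = [scores[i] for i in order_b]

# ...but the tied entries appear in a different index order.
print(order_a, order_b)
```

Both variants are correct sorts, so an element-wise comparison of outputs that depend on the index order can fail even when neither implementation has a bug.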

@TaoLv

TaoLv commented Apr 1, 2018

@zhreshold Still can't pass the test with nms_threshold=0.

@TaoLv

TaoLv commented Apr 3, 2018

There is an atomicAdd in the CUDA implementation of MultiBoxDetection. It produces unstable results that are not reproducible from run to run, so I cannot check consistency between the CPU and GPU computation. I am not sure whether this is expected behavior, but I think you should be aware of it. @zhreshold @piiswrong
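
One way an atomic counter can make output non-reproducible is when threads use it to claim the next free output slot, so the row order depends on thread scheduling. Whether the kernel uses atomicAdd this way is an assumption, not something confirmed in this thread. The sketch below simulates the effect in plain Python with a hypothetical `claim_slots` helper and made-up detection rows, and shows an order-insensitive comparison as one possible workaround for a test.

```python
def claim_slots(detections, arrival_order):
    """Simulate threads that each claim the next output slot from a shared counter."""
    out = [None] * len(detections)
    counter = 0
    for thread_id in arrival_order:
        slot = counter          # value a hypothetical atomic add would return
        counter += 1
        out[slot] = detections[thread_id]
    return out


# Illustrative rows in (class_id, score, xmin, ymin, xmax, ymax) layout.
detections = [
    (18.0, 4.29, 0.0, 0.0, 0.61, 0.0),
    (8.0, 4.24, 1.0, 0.71, 1.0, 1.0),
    (3.0, 2.17, 0.2, 0.1, 0.9, 0.8),
]

run_1 = claim_slots(detections, [0, 1, 2])  # one possible thread schedule
run_2 = claim_slots(detections, [0, 2, 1])  # another possible schedule

same_order = run_1 == run_2                 # element-wise comparison
same_rows = sorted(run_1) == sorted(run_2)  # compare after a canonical sort
print(same_order, same_rows)
```

Sorting rows into a canonical order before comparing only helps if the rows themselves match exactly. With small floating-point differences between devices, a tolerance-based row matching would be needed instead.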

@wkcn

wkcn commented Apr 29, 2018

I tried to add a consistency test for the Proposal operator, but it failed.
#9939

I think the reason is that there are two anchors whose overlap is close to the threshold. The difference in floating-point precision between CPU and GPU then causes the two devices to make different suppression decisions.
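
A minimal sketch of that failure mode, assuming corner-format boxes (xmin, ymin, xmax, ymax) and a suppression rule of the form "overlap > threshold". The helper names `iou` and `is_fragile` and the tolerance `eps` are hypothetical and not part of MXNet. The idea is to flag box pairs whose overlap sits so close to the threshold that a rounding difference could flip the decision.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    iw = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    ih = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = iw * ih
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_fragile(overlap, threshold, eps=1e-6):
    """True if a tiny numeric difference could flip 'overlap > threshold'."""
    return abs(overlap - threshold) <= eps


threshold = 0.5
borderline = iou((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 1.0))  # overlap of exactly 0.5
clear_cut = iou((0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0))   # identical boxes
print(is_fragile(borderline, threshold), is_fragile(clear_cut, threshold))
```

A test that generates random inputs could use a check like this to skip or regenerate cases with fragile pairs, so that a consistency failure points to a real discrepancy.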

@anirudhacharya

@nswamy @sandeep-krishnamurthy Please label this - "Operator", "Test"
