
Using Attention-based Sampler (AttSampler) in TASN without the need of rebuilding MXNet #7

wkcn opened this issue Jul 24, 2019 · 17 comments



wkcn commented Jul 24, 2019

Hi there.
I wrote a project that makes it possible to use the attention-based sampler (AttSampler) of TASN without rebuilding MXNet.
The project is available at https://github.com/wkcn/AttentionSampler

It supports both MXNet and PyTorch.

The result (default setting):

INFO:root:Epoch[204] Train-att_net_accuracy=1.000000
INFO:root:Epoch[204] Train-part_net_accuracy=0.979167
INFO:root:Epoch[204] Train-master_net_accuracy=0.989583
INFO:root:Epoch[204] Train-part_net_aux_accuracy=0.979167
INFO:root:Epoch[204] Train-master_net_aux_accuracy=0.989583                         
INFO:root:Epoch[204] Train-distillation_loss=4.280940                               
INFO:root:Epoch[204] Time cost=20.882
INFO:root:Epoch[204] Validation-att_net_accuracy=0.806771
INFO:root:Epoch[204] Validation-part_net_accuracy=0.849132
INFO:root:Epoch[204] Validation-master_net_accuracy=0.856944
INFO:root:Epoch[204] Validation-part_net_aux_accuracy=0.870486
INFO:root:Epoch[204] Validation-master_net_aux_accuracy=0.867361
INFO:root:Epoch[204] Validation-distillation_loss=3.713491



INFO:root:Epoch[299] Train-att_net_accuracy=1.000000
INFO:root:Epoch[299] Train-part_net_accuracy=0.984375
INFO:root:Epoch[299] Train-master_net_accuracy=0.984375
INFO:root:Epoch[299] Train-part_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-master_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-distillation_loss=4.100089
INFO:root:Epoch[299] Time cost=20.978
INFO:root:Saved checkpoint to "./model/tasn-0300.params"
INFO:root:Epoch[299] Validation-att_net_accuracy=0.804986
INFO:root:Epoch[299] Validation-part_net_accuracy=0.856728
INFO:root:Epoch[299] Validation-master_net_accuracy=0.860485
INFO:root:Epoch[299] Validation-part_net_aux_accuracy=0.864754
INFO:root:Epoch[299] Validation-master_net_aux_accuracy=0.869023
INFO:root:Epoch[299] Validation-distillation_loss=3.620270

I hope it will be helpful for you!

@tiancity-bytedance

Hi, I tried your method, but got this error:
File "./AttentionSampler/attention_sampler/attention_sampler.py", line 26, in forward
    self.F.broadcast_minimum(threshold, attx, out=attx)
File "<string>", line 48, in broadcast_minimum
File "/usr/local/lib/python3.6/site-packages/mxnet-1.3.1-py3.6.egg/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
    ctypes.byref(out_stypes)))
File "/usr/local/lib/python3.6/site-packages/mxnet-1.3.1-py3.6.egg/mxnet/base.py", line 253, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [10:27:32] src/operator/tensor/./elemwise_binary_broadcast_op.h:68: Check failed: l == 1 || r == 1 operands could not be broadcast together with shapes [12,1] [12,512,1]
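For context, here is a small NumPy sketch of why these two shapes cannot broadcast (NumPy follows the same trailing-dimension broadcasting rule as MXNet) and how reshaping the smaller operand would make them compatible. The variable names are illustrative, not the repository's actual code.

```python
import numpy as np

threshold = np.zeros((12, 1))    # one threshold per sample
attx = np.zeros((12, 512, 1))    # per-sample attention values

# Broadcasting aligns trailing dimensions: (12, 1) is padded to
# (1, 12, 1), and then 12 vs 512 cannot be matched, so the op fails.
try:
    np.minimum(threshold, attx)
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Reshaping the threshold to (12, 1, 1) aligns the batch axis, so the
# size-1 dimensions expand cleanly against (12, 512, 1).
fixed = np.minimum(threshold.reshape(12, 1, 1), attx)
```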


wkcn commented Jul 25, 2019

@tiancity-bytedance Thank you for the report. I will check it.

@tiancity-bytedance

Thanks! Looking forward to your response.

@tiancity-bytedance

Hello, can you tell me what the function mobula.func.cumsum means? I get this error:

File "./AttentionSampler/attention_sampler/attention_sampler.py", line 53, in forward
    mobula.func.cumsum(N, attx, attxi, att_size)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/func.py", line 184, in __call__
    using_async=using_async)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/func.py", line 89, in __call__
    func = self.loader(self, arg_types, ctx, **self.loader_kwargs)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/op/loader.py", line 539, in op_loader
    _build_lib(cpp_fname, code_buffer, ctx, dll_fname)
File "/mnt/cephfs_hl/vc/zhangtiancheng/finegrained/code/tasn-master/MobulaOP/mobula/op/loader.py", line 274, in _build_lib
    build_path_ctx = os.path.join(build_path, ctx)
File "/usr/local/lib/python3.6/posixpath.py", line 94, in join
    genericpath._check_arg_types('join', a, *p)
File "/usr/local/lib/python3.6/genericpath.py", line 149, in _check_arg_types
    (funcname, s.__class__.__name__)) from None
TypeError: join() argument must be str or bytes, not 'NoneType'


wkcn commented Jul 25, 2019

@tiancity-bytedance
mobula.func.cumsum is similar to np.cumsum


tiancity-bytedance commented Jul 26, 2019

@tiancity-bytedance
mobula.func.cumsum is similar to np.cumsum

Thanks for your response, but np.cumsum takes just one array parameter. What do the four parameters of mobula.func.cumsum mean?


wkcn commented Jul 26, 2019

@tiancity-bytedance
cumsum_kernel(const int N, const T* X, T* I, const int att_size)

The four parameters are the batch size, the input, the output, and the number of elements per batch, respectively.
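As a reference for the semantics wkcn describes, here is a minimal NumPy sketch of a batched cumsum with that four-argument contract (batch size, flat input, flat output, elements per batch). It mirrors what the kernel computes but is not the actual Mobula kernel code.

```python
import numpy as np

def cumsum_batched(n, x, out, att_size):
    # n: batch size; x: flat input buffer; out: flat output buffer;
    # att_size: number of elements per batch.
    # Each batch gets an independent prefix sum, like np.cumsum per row.
    for i in range(n):
        s = 0.0
        for j in range(att_size):
            s += x[i * att_size + j]
            out[i * att_size + j] = s

x = np.arange(1.0, 7.0)          # two batches of three elements each
out = np.empty_like(x)
cumsum_batched(2, x, out, 3)
# out is now [1, 3, 6, 4, 9, 15]: the prefix sum restarts at each batch
```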


tiancity-bytedance commented Jul 26, 2019

Thanks! It works for me! Have you tested the accuracy?


wkcn commented Jul 26, 2019

Sorry, I have not tested it yet; I have been busy recently.
I will train it.


wkcn commented Oct 1, 2019

Hi @tiancity-bytedance. I have tested it and got 86~87% accuracy on CUB-200-2011.

Setting:
Number of GPUs: 4
Batch Size: 48
MobulaOP/mobula/config.yaml : USING_ASYNC_EXEC: 0


wkcn commented Jan 2, 2020

INFO:root:Epoch[299] Train-att_net_accuracy=1.000000
INFO:root:Epoch[299] Train-part_net_accuracy=0.984375
INFO:root:Epoch[299] Train-master_net_accuracy=0.984375
INFO:root:Epoch[299] Train-part_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-master_net_aux_accuracy=1.000000
INFO:root:Epoch[299] Train-distillation_loss=4.100089
INFO:root:Epoch[299] Time cost=20.978
INFO:root:Saved checkpoint to "./model/tasn-0300.params"
INFO:root:Epoch[299] Validation-att_net_accuracy=0.804986
INFO:root:Epoch[299] Validation-part_net_accuracy=0.856728
INFO:root:Epoch[299] Validation-master_net_accuracy=0.860485
INFO:root:Epoch[299] Validation-part_net_aux_accuracy=0.864754
INFO:root:Epoch[299] Validation-master_net_aux_accuracy=0.869023
INFO:root:Epoch[299] Validation-distillation_loss=3.620270

@vb123er951

Hi, can you help me with this error?

Error in CustomOp.forward: Traceback (most recent call last):
File "C:\Users\DIT\Anaconda3\envs\mxnet\lib\site-packages\mxnet\operator.py", line 987, in forward_entry
    aux=tensors[4])
File "d:\software\mobulaop\mobula\glue\mx.py", line 103, in forward
    out = self._forward(*in_data)
File "./AttentionSampler/attention_sampler\attention_sampler.py", line 44, in forward
    attxi = F.cumsum(attx, 1)
AttributeError: module 'mxnet.ndarray' has no attribute 'cumsum'


wkcn commented Jun 5, 2020

Hi @vb123er951, the function mx.nd.cumsum is not supported in older versions of MXNet.
Please use a newer version such as MXNet 1.6 : )

@vb123er951

Hi @wkcn, thank you for the reply.
I am using Windows, but MXNet 1.6 does not seem to be supported on Windows... Is there any other solution?


wkcn commented Jun 8, 2020

Hi @vb123er951, I have updated the code; it now supports older versions of MXNet without cumsum.
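One possible way to emulate cumsum on frameworks that lack it is a lower-triangular matrix multiply. The sketch below shows the idea in NumPy; wkcn's actual workaround in the updated code may differ.

```python
import numpy as np

def cumsum_fallback(x, axis=1):
    # Prefix sum along `axis` without calling a cumsum primitive:
    # multiply by a lower-triangular ones matrix, so that
    # out[..., i] = sum of x[..., j] over all j <= i.
    x = np.asarray(x, dtype=np.float64)
    n = x.shape[axis]
    tri = np.tril(np.ones((n, n)))     # tri[i, j] = 1 for j <= i
    moved = np.moveaxis(x, axis, -1)   # put the target axis last
    out = moved @ tri.T
    return np.moveaxis(out, -1, axis)

a = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
# cumsum_fallback(a, 1) matches np.cumsum(a, 1): [[1, 3, 6], [4, 9, 15]]
```

This costs O(n^2) per row instead of O(n), so it is only a stopgap for short sequences, but it uses only basic ops available in every MXNet release.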

@vb123er951

@wkcn Thank you very much!
The cumsum problem is solved now; trying to solve other problems...


YAOSL98 commented Jun 30, 2021

I have this problem:

AttributeError: module 'mobula.op' has no attribute 'AttSamplerGrid'

Can anyone help me? Thanks a lot.
