
Hi @highway007, #75

Open
ranjitswami opened this issue Nov 18, 2019 · 2 comments

Comments

@ranjitswami

Hi @highway007,

I have some questions below; can you please help me with answers?

I tried with 5 classes (shoplifting, normal, stealing, robbery, burglary). For training I used 30 videos for shopping, 30 videos for shoplifting, 15 videos for stealing, 15 videos for robbery, and 10 videos for burglary.

My process is as follows: I'm using Google Colab for training, with 12.72 GB of RAM. I created CSV files for training, test, validation, and labels; they look like this:

This is my label.csv
Screenshot (215)

This is my train.csv
Screenshot (216)

This is my test.csv
Screenshot (217)

This is my validation.csv
Screenshot (218)

My train_videofolder.txt file looks like this

00 132 0
01 180 0
02 135 0
03 168 0
04 197 0
05 399 0
06 111 0
07 248 0
08 213 0
09 153 0
10 248 0
11 399 1
12 231 1
13 491 1
14 333 1
15 390 1
16 326 1
..... etc
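For context, each line of these TSN/TRN-style list files appears to have three space-separated fields: the video folder name, the number of extracted frames, and the numeric label index (matching the line order of category.txt). A minimal sketch of a parser for this format (the format is inferred from the listing above, not taken from the repo's dataset code):

```python
# Sketch: parse a TRN-style video list file where each line is
# "<folder> <num_frames> <label>", as in train_videofolder.txt above.

def parse_video_list(lines):
    records = []
    for line in lines:
        folder, num_frames, label = line.split()
        records.append((folder, int(num_frames), int(label)))
    return records

sample = ["00 132 0", "11 399 1"]
print(parse_video_list(sample))  # [('00', 132, 0), ('11', 399, 1)]
```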

val_videofolder.txt

40 460 1
41 343 1
42 378 1
43 350 1
44 618 1
45 238 0
46 114 0
47 153 0
48 093 0
49 546 0
69 834 2
78 048 3
79 036 3
80 384 3
81 078 3
87 288 4

category.txt

shoplifting
normal
stealing
robbery
burglary

This is my training code

!python3 main.py something RGB \
                     --arch BNInception --num_segments 8 \
                     --consensus_type TRNmultiscale --batch-size 16

My training looks like this

storing name: TRN_something_RGB_BNInception_TRNmultiscale_segment8

    Initializing TSN with base model: BNInception.
    TSN Configurations:
        input_modality:     RGB
        num_segments:       8
        new_length:         1
        consensus_module:   TRNmultiscale
        dropout_ratio:      0.8
        img_feature_dim:    256
            
/content/drive/My Drive/TRN-pytorch/models.py:87: UserWarning: nn.init.normal is now deprecated in favor of nn.init.normal_.
  normal(self.new_fc.weight, 0, std)
/content/drive/My Drive/TRN-pytorch/models.py:88: UserWarning: nn.init.constant is now deprecated in favor of nn.init.constant_.
  constant(self.new_fc.bias, 0)
Multi-Scale Temporal Relation Network Module in use ['8-frame relation', '7-frame relation', '6-frame relation', '5-frame relation', '4-frame relation', '3-frame relation', '2-frame relation']
video number:59
/usr/local/lib/python3.6/dist-packages/torchvision/transforms/transforms.py:208: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
  "please use transforms.Resize instead.")
video number:16
group: first_conv_weight has 1 params, lr_mult: 1, decay_mult: 1
group: first_conv_bias has 1 params, lr_mult: 2, decay_mult: 0
group: normal_weight has 83 params, lr_mult: 1, decay_mult: 1
group: normal_bias has 83 params, lr_mult: 2, decay_mult: 0
group: BN scale/shift has 2 params, lr_mult: 1, decay_mult: 0
Freezing BatchNorm2D except the first one.
main.py:175: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input.size(0))
main.py:176: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top1.update(prec1[0], input.size(0))
main.py:177: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top5.update(prec5[0], input.size(0))
main.py:186: UserWarning: torch.nn.utils.clip_grad_norm is now deprecated in favor of torch.nn.utils.clip_grad_norm_.
  total_norm = clip_grad_norm(model.parameters(), args.clip_gradient)
Epoch: [0][0/4], lr: 0.00100	Time 16.655 (16.655)	Data 4.173 (4.173)	Loss 1.6147 (1.6147)	Prec@1 50.000 (50.000)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [1][0/4], lr: 0.00100	Time 5.917 (5.917)	Data 4.407 (4.407)	Loss 1.6116 (1.6116)	Prec@1 18.750 (18.750)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [2][0/4], lr: 0.00100	Time 5.543 (5.543)	Data 4.102 (4.102)	Loss 1.4904 (1.4904)	Prec@1 25.000 (25.000)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [3][0/4], lr: 0.00100	Time 6.334 (6.334)	Data 4.920 (4.920)	Loss 1.4325 (1.4325)	Prec@1 25.000 (25.000)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [4][0/4], lr: 0.00100	Time 7.225 (7.225)	Data 5.824 (5.824)	Loss 1.4092 (1.4092)	Prec@1 31.250 (31.250)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
main.py:223: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  input_var = torch.autograd.Variable(input, volatile=True)
main.py:224: UserWarning: volatile was removed and now has no effect. Use `with torch.no_grad():` instead.
  target_var = torch.autograd.Variable(target, volatile=True)
main.py:233: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  losses.update(loss.data[0], input.size(0))
main.py:234: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top1.update(prec1[0], input.size(0))
main.py:235: UserWarning: invalid index of a 0-dim tensor. This will be an error in PyTorch 0.5. Use tensor.item() to convert a 0-dim tensor to a Python number
  top5.update(prec5[0], input.size(0))
Test: [0/1]	Time 1.686 (1.686)	Loss 1.5454 (1.5454)	Prec@1 31.250 (31.250)	Prec@5 100.000 (100.000)
Testing Results: Prec@1 31.250 Prec@5 100.000 Loss 1.54541

Best Prec@1: 0.000
Freezing BatchNorm2D except the first one.
Epoch: [5][0/4], lr: 0.00100	Time 5.674 (5.674)	Data 4.189 (4.189)	Loss 1.4755 (1.4755)	Prec@1 25.000 (25.000)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [6][0/4], lr: 0.00100	Time 4.719 (4.719)	Data 3.302 (3.302)	Loss 1.5275 (1.5275)	Prec@1 31.250 (31.250)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [7][0/4], lr: 0.00100	Time 4.682 (4.682)	Data 3.281 (3.281)	Loss 1.3586 (1.3586)	Prec@1 31.250 (31.250)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [8][0/4], lr: 0.00100	Time 6.715 (6.715)	Data 5.315 (5.315)	Loss 1.2957 (1.2957)	Prec@1 43.750 (43.750)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Epoch: [9][0/4], lr: 0.00100	Time 3.891 (3.891)	Data 2.501 (2.501)	Loss 1.2222 (1.2222)	Prec@1 43.750 (43.750)	Prec@5 100.000 (100.000)
Freezing BatchNorm2D except the first one.
Test: [0/1]	Time 1.638 (1.638)	Loss 1.5057 (1.5057)	Prec@1 31.250 (31.250)	Prec@5 100.000 (100.000)
Testing Results: Prec@1 31.250 Prec@5 100.000 Loss 1.50569

Best Prec@1: 31.250
Freezing BatchNorm2D except the first one.
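The deprecation warnings in the log above come from running older TRN-pytorch code on a newer PyTorch; they are harmless here, but the modern equivalents the warnings themselves suggest look roughly like this (a sketch of the replacements, not the actual main.py):

```python
import torch

# Old: input_var = torch.autograd.Variable(input, volatile=True)
# New: wrap evaluation in torch.no_grad() instead of using volatile.
x = torch.randn(2, 3)
with torch.no_grad():
    y = (x * 2).sum()

# Old: losses.update(loss.data[0], input.size(0))
# New: use .item() to convert a 0-dim tensor to a Python number.
loss_value = y.item()
print(type(loss_value))  # <class 'float'>
```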

After completing my training, I get the same result for every input video (the accuracy and labels are always the same). This is the result I got for each and every input video:

0.328 -> normal
0.324 -> shoplifting
0.161 -> stealing
0.148 -> robbery
0.039 -> burglary

I have some questions:

  1. Am I following the right process?
  2. In the training log, what does Epoch: [5][0/4] mean? Also, in my training the [0/4] never increases till the end, but in your training I see the following:
Epoch: [993][0/64], lr: 0.00100	Time 3.399 (3.399)	Data 3.113 (3.113)	Loss 1.8708 (1.8708)	Prec@1 25.000 (25.000)	Prec@5 100.000 (100.000)
Epoch: [993][20/64], lr: 0.00100	Time 0.179 (0.336)	Data 0.000 (0.148)	Loss 2.1559 (1.9719)	Prec@1 12.500 (12.500)	Prec@5 37.500 (72.619)
Epoch: [993][40/64], lr: 0.00100	Time 0.179 (0.260)	Data 0.000 (0.076)	Loss 2.0889 (1.9944)	Prec@1 0.000 (13.110)	Prec@5 50.000 (68.902)
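For reference, in this log format Epoch: [i][j/N] is typically epoch i, iteration j out of roughly ceil(num_train_videos / batch_size) iterations, with progress printed only every so many iterations (the reference log prints every 20). With the 59 training videos ("video number:59") and --batch-size 16 shown above, that arithmetic gives 4 iterations per epoch, so only iteration 0 is ever printed (this is an interpretation of the log, not confirmed against main.py):

```python
import math

num_train_videos = 59   # "video number:59" in the log above
batch_size = 16         # --batch-size 16 in the training command

iters_per_epoch = math.ceil(num_train_videos / batch_size)
print(iters_per_epoch)  # 4, matching the [0/4] in "Epoch: [5][0/4]"
```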

Also, my Prec@5 is always 100.000 (100.000).

Is this because I'm using Colab for training? The reason I ask is that the Colab training stops at 119 steps (close to only an hour of training). I suspect this is the issue, since I couldn't continue the training for more than an hour. Is there any place in the code to configure the training time?

Do I need to use a 1080 Ti or AWS for continuous training of at least 12 hours?

Originally posted by @Malathi15 in #46 (comment)

@ranjitswami
Author

Hello Brother,
I am facing the same issue: I get the same results for every test sample. Can you please help me?

@withinnoitatpmet

You only have 5 classes, so it is no surprise at all that Prec@5 is always 100%.
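To illustrate the point: Prec@5 asks whether the true class is among the five highest-scoring predictions, and with only five classes every prediction trivially contains the true class. A small plain-Python sketch (not the repo's accuracy() helper):

```python
def topk_correct(scores, target, k):
    """Return True if `target` is among the k highest-scoring classes."""
    topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return target in topk

# Per-class scores like those reported above, in category.txt order:
# shoplifting, normal, stealing, robbery, burglary.
scores = [0.324, 0.328, 0.161, 0.148, 0.039]

# With k=5 and only 5 classes, every target is always in the top 5.
print(all(topk_correct(scores, t, k=5) for t in range(5)))  # True
```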
