Error while training #16

kow10120 · 2019-02-26T02:34:30Z

Hello all,

I am trying to train a model on a Windows computer. When I input the following:

train(path_prefix = "F:/IERCMLWIC/TrainingImages/TRAININGIMAGES",
      data_info = "F:/IERCMLWIC/L1/data_info.csv",
      model_dir = "F:/IERCMLWIC",
      python_loc = "C:/Users/kvanatta/Anaconda3/",
      num_classes = 24,
      log_dir_train = "IERCMLWIC")

I get the following output:

train(path_prefix = "F:/IERCMLWIC/TrainingImages/TRAININGIMAGES",
      data_info = "F:/IERCMLWIC/L1/data_info.csv",
      model_dir = "F:/IERCMLWIC",
      python_loc = "C:/Users/kvanatta/Anaconda3/",
      num_classes = 24,
      log_dir_train = "IERCMLWIC")
Error in UseMethod("train") : 
  no applicable method for 'train' applied to an object of class "character"

Does anyone have experience with this?

Nova-Scotia · 2019-02-26T18:55:06Z

Did you try adding a "/" to the end of your path_prefix?

Nova-Scotia · 2019-02-26T18:56:04Z

And maybe to your model_dir as well, though I'm not sure. I know classify has some code that deals with missing slashes, but I can't say whether train does without doing some digging.

mikeyEcology · 2019-02-26T19:37:28Z

This does not look like an error message from MLWIC. You might have another package loaded that has a function called train. A way to be sure you're using the correct function is to be more specific when you use it, so try using MLWIC::train() instead.

kow10120 · 2019-02-27T17:46:28Z

Thank you both for the help. I've tried both of your suggestions. Mikey, following your suggestion to use MLWIC::train(), I am now getting different output:

MLWIC::train(path_prefix = "F:/IERCMLWIC/TRAININGIMAGES",  
         data_info = "F:/IERCMLWIC/L1/data_info.csv", 
         model_dir = "F:/IERCMLWIC", 
         python_loc = "C:/Users/kvanatta/Anaconda3/",  
         num_classes = 24, 
         log_dir_train = "IERCMLWIC" 
         )
C:\Users\kvanatta\ANACON~1\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
 from ._conv import register_converters as _register_converters
Traceback (most recent call last):
 File "train.py", line 339, in <module>
   main()
 File "train.py", line 319, in main
   args.num_samples = sum(1 for line in open(args.data_info))
FileNotFoundError: [Errno 2] No such file or directory: 'data_info_train.csv'
[1] "training of model took 2.23065400123596 secs. The trained model is in IERCMLWIC. Specify this directory as the log_dir when you use classify(). "

It appears that tensorflow was masking the train() function. I have tried every combination of trailing slashes on path_prefix and model_dir as suggested by Nova-Scotia, but the output shown above does not change.

mikeyEcology · 2019-02-27T20:13:26Z

It seems like for some reason, there are issues when you try to use the package when your files are on an external drive. (I'm assuming that your F drive is external?)

One potential solution is to move and rename your data_info.csv file manually. So rename this file data_info_train.csv and make sure that it is in the folder F:/IERCMLWIC/L1/.

You also might want to try setting retrain = FALSE, but this probably wouldn't fix the error that you're getting.
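
The manual rename suggested above can also be scripted. This is only a minimal sketch in Python, not part of MLWIC: the function name stage_data_info and its arguments are hypothetical, and it copies the file instead of renaming it so that the original data_info.csv stays in place.

```python
import shutil
from pathlib import Path


def stage_data_info(data_info, l1_dir, target_name="data_info_train.csv"):
    """Copy data_info into l1_dir under the file name train.py was looking for.

    Hypothetical helper for illustration; l1_dir must already exist.
    """
    source = Path(data_info)
    if not source.is_file():
        raise FileNotFoundError(f"data_info file not found: {source}")
    destination = Path(l1_dir) / target_name
    # Skip the copy when the file is already where it needs to be.
    if source.resolve() != destination.resolve():
        shutil.copyfile(source, destination)
    return destination


# Example call with the paths from this thread (not executed here):
# stage_data_info("F:/IERCMLWIC/L1/data_info.csv", "F:/IERCMLWIC/L1")
```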

Nova-Scotia · 2019-03-11T18:33:15Z

@kow10120, did you get this to work? The fix @mikeyEcology suggested (renaming the .csv to data_info_train.csv) worked for me after I got the same errors that you noted.

kow10120 · 2019-04-10T21:19:21Z

Thank you for the advice. Sorry, I have not had much free time to devote to this recently. I did not get it to work, but I may need to spend more time carefully combing through the large Excel file I'm working with and the actual photos to ensure there are no discrepancies. The joys of large data sets.
Thank you for the help, and I will update when I am ready to proceed again.

mikeyEcology · 2019-04-11T12:53:27Z

Hopefully you're not going through the file names manually? There are ways to do this in Unix that can save you a lot of time. In Unix, I would go to the directory where I have the files and type find $PWD -type f > listOfFiles.txt, which would create a file in my directory called listOfFiles.txt with the whole list. Presumably Windows has a similar function.
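
For anyone on Windows without find, here is a rough cross-platform sketch in Python of the same idea. The function names are made up for illustration, and missing_images assumes that the first column of data_info holds the image file name relative to path_prefix, which the thread does not confirm.

```python
import csv
from pathlib import Path


def list_files(root):
    """Return the absolute paths of all regular files under root, sorted."""
    root = Path(root).resolve()
    return sorted(str(p) for p in root.rglob("*") if p.is_file())


def write_file_list(root, out_path="listOfFiles.txt"):
    """Write one absolute path per line, similar to `find $PWD -type f`."""
    paths = list_files(root)
    with open(out_path, "w", encoding="utf-8") as handle:
        for path in paths:
            handle.write(path + "\n")
    return len(paths)


def missing_images(data_info, path_prefix, delimiter=","):
    """Return file names listed in data_info that do not exist under path_prefix.

    Assumption: column 1 of data_info is the file name relative to path_prefix.
    """
    missing = []
    with open(data_info, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            if row and not (Path(path_prefix) / row[0]).is_file():
                missing.append(row[0])
    return missing
```

Comparing the spreadsheet against the folder this way avoids checking file names by hand: an empty list from missing_images means every listed image was found.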

tundraboyce · 2019-05-30T19:09:08Z

Had this same error and just wanted to say that changing the name to data_info_train.csv worked. I then ran into another error in train that I'm hoping might be obvious. I'm guessing this has to do with the num_classes argument? I have 23 classes that I'm trying to train with; have I missed a step somewhere to specify this?

So far I have only specified it in this part of the train call: python_loc = "C:\Users\User\Anaconda3\", num_classes = 23, log_dir_train = "traindir"

Assign requires shapes of both tensors to match. lhs shape= [23] rhs shape= [28]
[[node save/Assign_1 (defined at train.py:198) ]]

mikeyEcology · 2019-05-31T10:32:26Z

@tundraboyce did you specify retrain=FALSE? Can you please post all of the code that you put in the train function and all of the output?

mikeyEcology · 2019-05-31T10:51:10Z

This was previously not explained in the readme. I just updated it so that this is more clear:

G) If your num_classes is not equal to the number in the built in model (num_classes != 28), you will need to specify retrain=FALSE.

tundraboyce · 2019-05-31T17:19:23Z

That sorted it, thanks again! Appreciate the responses.

Now I just have:

InvalidArgumentError (see above for traceback): targets[14] is out of range
[[node Tower_1/in_top_k/InTopKV2 (defined at train.py:127) ]]

mikeyEcology · 2019-06-11T17:52:53Z

Hey @Nova-Scotia, since you're the Windows expert on MLWIC, I'm wondering if you can try something for me when you get a chance. I updated train and classify so that they should properly move the data_info file on a Windows computer if you set os = "Windows" in the function call. I don't have a way to test it, though, because I'm only running Linux.

Nova-Scotia · 2019-06-12T13:20:16Z

Hi @mikeyEcology , sure, I can do that - might take me a couple days to get to it (busy week!) but I'll keep you posted. Let me know if another user gets to it first!

Erica

mikeyEcology · 2019-06-12T14:10:02Z

Thank you @Nova-Scotia !

Nova-Scotia · 2019-06-12T18:09:18Z

Hi again. Did a quick check of classify, haven't tried train yet. It wasn't working so I dug into the code and realized maybe there's an easy fix?

In classify, the code tells R to write a new file named "data_info_train.csv":

if (os == "Windows") {
        data_file <- read.table(data_info, header = FALSE, sep = ",")
        output.file <- file("data_info_train.csv", "wb")
        write.table(data_file, file = output.file, append = TRUE, 
            quote = FALSE, row.names = FALSE, col.names = FALSE, 
            sep = ",")
        close(output.file)
        rm(output.file)

but then later in the code it calls for "data_info.csv":

 eval_py <- paste0(python_loc, "python eval.py --architecture ", 
        architecture, " --depth ", depth, " --log_dir ", log_dir, 
        " --path_prefix ", path_prefix, " --batch_size 128 --data_info data_info.csv", 
        " --delimiter ", delimiter, " --save_predictions ", save_predictions, 
        " --top_n ", top_n, " --num_classes=", num_classes, "\n")

Maybe just a typo when copy-pasting from train code?

tundraboyce · 2019-06-12T18:21:55Z

I actually managed to get train to work this morning and I have another computer chugging away on it now. The issue you described was pretty spot on.

Train was looking for a data_info_train.csv regardless of what was passed in the call. I was trying to use a different file (e.g., data_info_pilot.csv), but the code would only work with, and only look for, "data_info_train" in the L1 folder. Also, my out-of-range error came from setting num_classes = 23: counting class 0, I actually have 24 classes (23 + 0 = 24, duh). Brain freeze.

I'll let you know how well classify works with this model on my images.
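
To catch the off-by-one described above before starting a long training run, a small check like the following may help. It is a sketch in Python with hypothetical function names, and it assumes that labels sit in the second column of data_info (as the tokens[1] lookup in data_loader.py suggests) and that they must lie between 0 and num_classes - 1.

```python
import csv


def read_labels(data_info, delimiter=","):
    """Read the integer labels from the second column of a data_info file."""
    with open(data_info, newline="", encoding="utf-8") as handle:
        return [
            int(row[1])
            for row in csv.reader(handle, delimiter=delimiter)
            if len(row) > 1
        ]


def out_of_range_labels(labels, num_classes):
    """Return the sorted distinct labels that fall outside 0..num_classes-1."""
    return sorted({label for label in labels if not 0 <= label < num_classes})


def smallest_valid_num_classes(labels):
    """Smallest num_classes that covers every label when labels start at 0."""
    return max(labels) + 1


# With labels 0..23, num_classes = 23 leaves label 23 out of range:
# out_of_range_labels(range(24), 23) returns [23]
```

If out_of_range_labels returns anything, either num_classes is too small or the labels need renumbering. Note that read_labels raises a ValueError on non-integer labels such as NA.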

Nova-Scotia · 2019-06-12T18:27:31Z

Just an update - the classify command does work as expected if you change "data_info_train.csv" to "data_info.csv" in the source code.

mikeyEcology · 2019-06-12T19:22:23Z

Thank you @Nova-Scotia and @tundraboyce for testing this. I corrected the error that you suggested with classify. That's what I get for trying to copy and paste.

pirocha · 2019-08-31T18:28:40Z

Hi!
I'm trying to train a model with my species, but I'm getting a different error from the ones mentioned in this thread.

The input I'm using is:

MLWIC::train(
    path_prefix = "C:/Users/User/Documents/CameraTrap/MLWIC_examples-master/images_africa/", 
    data_info = "C:/Users/User/Documents/CameraTrap/L1/data_info_train.csv",
    model_dir = "C:/Users/User/Documents/CameraTrap", 
    python_loc = "C:/Users/User/Anaconda3/", 
    os = "Windows", 
    num_classes = 51, 
    delimiter = ",", 
    architecture = "resnet", 
    depth = "152", 
    batch_size = "64",
    log_dir_train = "angola_output", 
    retrain = FALSE, 
    print_cmd = FALSE )

and I get the following output:
C:\Users\User\ANACON~1\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
 from ._conv import register_converters as _register_converters
Namespace(LR_steps=[19, 30, 44, 53], LR_values=[0.01, 0.005, 0.001, 0.0005, 0.0001], WD_steps=[30], WD_values=[0.0005, 0.0], architecture='resnet', batch_size=64, chunked_batch_size=32, crop_size=[224, 224], data_info='data_info_train.csv', delimiter=',', depth=152, load_size=[256, 256], log_debug_info=False, log_device_placement=False, log_dir='angola_output', num_batches=3095, num_channels=3, num_classes=51, num_epochs=55, num_gpus=2, num_samples=198073, num_threads=20, path_prefix='C:/Users/User/Documents/CameraTrap/MLWIC_examples-master/images_africa/', retrain_from=None, run_name='Run-31-08-2019_19-19-32', shuffle=True, snapshot_prefix='snapshot', top_n=2, transfer_mode=[0])
Saving everything in angola_output
Traceback (most recent call last):
 File "train.py", line 339, in <module>
   main()
 File "train.py", line 335, in main
   train(args)
 File "train.py", line 99, in train
   images, labels = data_loader.read_inputs(True, args)
 File "C:\Users\User\Documents\CameraTrap\L1\data_loader.py", line 23, in read_inputs
   filepaths, labels = _read_label_file(args.data_info, args.delimiter)
 File "C:\Users\User\Documents\CameraTrap\L1\data_loader.py", line 19, in _read_label_file
   labels.append(int(tokens[1]))
ValueError: invalid literal for int() with base 10: 'NA\n'
[1] "training of model took 3.71519708633423 secs. The trained model is in angola_output. Specify this directory as the log_dir when you use classify(). "

Can somebody help me with this?
Thank you!

mikeyEcology · 2019-08-31T19:00:21Z

Hi @pirocha,
I'm not sure exactly what the problem is, but to rule some things out, can you try running this:

MLWIC::train(
    path_prefix = "C:/Users/User/Documents/CameraTrap/MLWIC_examples-master/images_africa", 
    data_info = "C:/Users/User/Documents/CameraTrap/L1/data_info_train.csv",
    model_dir = "C:/Users/User/Documents/CameraTrap", 
    python_loc = "C:/Users/User/Anaconda3/", 
    os = "Windows", 
    num_classes = 51, 
    delimiter = ",", 
    architecture = "resnet", 
    depth = "18", 
    #depth = "152", 
    #batch_size = "64",
    batch_size = "128",
    log_dir_train = "angola_output", 
    retrain = FALSE, 
    print_cmd = FALSE )

pirocha · 2019-09-01T08:46:12Z

Hi @mikeyEcology ,
I tried to change the depth and batch_size as you suggested but I still get the same output.
I know nothing about programming, but is it possible that the Python script expects an integer somewhere that my data has a float? For instance, in data_info_train.csv?

mikeyEcology · 2019-09-01T11:52:49Z

Did you try also re-setting your path prefix to:
path_prefix = "C:/Users/User/Documents/CameraTrap/MLWIC_examples-master/images_africa"

pirocha · 2019-09-01T19:49:02Z

Hi Mikeyecology,

You mean without the slash at the end, don't you? Yes, I tried both ways. I tested the train command with the example you provided and it worked perfectly, so it had to be something with my files. I re-checked my 'data_info_train.csv' and found some NAs in the species column. I'm sorry for bothering you; it was only a mistake on my part! Anyway, it appears to be running now.

Thank you very much,
Filipe

mikeyEcology · 2019-09-01T20:08:54Z

Ok. No worries. Yes, NAs in the input file will cause errors.
Glad you got it running.
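
Since the traceback above shows the loader calling int() on the second column, rows like these can be found before training. The following is a minimal sketch in Python. The function name is hypothetical, and it only mirrors the int(tokens[1]) conversion seen in the traceback; it does not reproduce the actual data_loader.py.

```python
import csv


def find_bad_labels(data_info, delimiter=","):
    """Return (line_number, raw_label) for rows whose label is not an integer.

    Rows with no second column (including blank lines) are reported with an
    empty label.
    """
    bad = []
    with open(data_info, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        for line_number, row in enumerate(reader, start=1):
            raw = row[1].strip() if len(row) > 1 else ""
            try:
                int(raw)
            except ValueError:
                bad.append((line_number, raw))
    return bad


# Example call with the path from this thread (not executed here):
# find_bad_labels("C:/Users/User/Documents/CameraTrap/L1/data_info_train.csv")
```

This flags NA entries as well as float-formatted labels such as 3.0, because Python's int() rejects both when given a string.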
