Steering model does not generalize #42

Open
johnyboyoh opened this issue Feb 23, 2017 · 11 comments

Comments

@johnyboyoh

Hi,
I was able to train the model using the given code and data. The loss on the training data converges to ~350, but on the dev data the loss remains high (around 3500). This suggests the model is overfitting and failing to generalize.
Am I doing something wrong or are these the expected results using your code+data?
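
One way to put the two numbers above in context is to compare the dev loss against a trivial baseline that always predicts the mean training angle. The sketch below is only an illustration: the function names and the sample angles are made up, and it assumes the loss is a plain mean squared error over steering angles.

```python
def mse(targets, predictions):
    """Mean squared error between two equal-length sequences."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def constant_baseline_mse(train_angles, dev_angles):
    """MSE on the dev set of a 'model' that always outputs the mean training angle."""
    mean_angle = sum(train_angles) / len(train_angles)
    return mse(dev_angles, [mean_angle] * len(dev_angles))


# Hypothetical angles, not taken from the dataset.
train_angles = [-10.0, 0.0, 10.0]
dev_angles = [-20.0, 20.0]
baseline = constant_baseline_mse(train_angles, dev_angles)
print("constant-predictor dev MSE:", baseline)
```

If the dev loss of the trained network is close to this baseline, the network has probably learned little that transfers beyond the training drives. If the baseline itself differs a lot between the two splits, the splits may simply have different angle distributions.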

@johnyboyoh
Author

johnyboyoh commented Mar 1, 2017 via email

@ErlendFax

Hi,
I know I'm a bit late, but I have the same problem. Did you get any further on this? I'd really appreciate any suggestions.

@ghost

ghost commented Mar 10, 2018

Hello there,
Any word on this? I'm having the same problem. I tried tweaking the model, but nothing worked: the validation loss starts a lot higher than the training loss for some reason and does not go down.

@ErlendFax

Sorry for the late reply,

The network is probably working as intended (checked it with my professor), but my guess is that the data is corrupt. Maybe you could make a script that previews the images before they are fed into the network?
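
A preview script along those lines could be as small as the sketch below, which writes the first few frames to PPM files that most image viewers can open. It uses only the standard library. The names `frame_to_ppm` and `dump_previews` are made up for this example, and the channels-first layout (3 x height x width, values 0..255) is an assumption about how the frames are stored, so adjust the indexing if your data differs.

```python
import os


def frame_to_ppm(frame):
    """Encode a channels-first RGB frame (3 x H x W, values 0..255) as binary PPM bytes."""
    if len(frame) != 3:
        raise ValueError("expected 3 channels, got %d" % len(frame))
    height = len(frame[0])
    width = len(frame[0][0])
    header = ("P6\n%d %d\n255\n" % (width, height)).encode("ascii")
    pixels = bytearray()
    for y in range(height):
        for x in range(width):
            for c in range(3):
                value = int(frame[c][y][x])
                if not 0 <= value <= 255:
                    # Out-of-range pixels are one sign of corrupt or mis-scaled data.
                    raise ValueError("pixel value %d out of range at (%d, %d)" % (value, y, x))
                pixels.append(value)
    return header + bytes(pixels)


def dump_previews(frames, angles, out_dir, limit=5):
    """Write up to `limit` frames to out_dir, with the steering label in the file name."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, (frame, angle) in enumerate(zip(frames, angles)):
        if i >= limit:
            break
        path = os.path.join(out_dir, "frame_%03d_angle_%+.1f.ppm" % (i, angle))
        with open(path, "wb") as handle:
            handle.write(frame_to_ppm(frame))
        paths.append(path)
    return paths
```

Calling `dump_previews` on a batch taken right before it goes into the network, once for training data and once for validation data, makes it easy to spot swapped channels, blank frames, or labels that do not match the road in the picture.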

@ghost

ghost commented Mar 14, 2018

Thanks for replying,

So you are still facing the same problem? I also have a feeling that there's something wrong with the validation data since its loss is way higher.

@ErlendFax

I actually gave up. Can’t remember the different losses but you might be right. Maybe make your own dataset from KITTI?

@ghost

ghost commented Mar 14, 2018

I will look into that, thanks a lot.

@ErlendFax

ErlendFax commented Mar 15, 2018

By the way, @MahmoudKhaledAli, since this is a regression problem with MSE as the loss function, it makes more sense to use a linear activation function in the last layer.

Just to check, try removing the last model.add(ELU()) and replacing model.add(Dense(512)) with model.add(Dense(512, activation='linear')).
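
To illustrate the point about linear outputs: an ELU can never return less than -alpha, so if a network's output did pass through an ELU it could not reach strongly negative regression targets, whereas a linear output has no such floor. The sketch below is plain Python rather than Keras, and the target angle is a made-up value. Whether an ELU really sits at the output of this model depends on the layer order in the training script, which is worth checking.

```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit: identity for x > 0, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * math.expm1(x)


def linear(x):
    """Linear (identity) activation, the usual choice for a regression output."""
    return x


target = -30.0            # hypothetical steering angle, not from the dataset
elu_floor = elu(-1000.0)  # the most negative value an ELU output can take (alpha = 1)
min_squared_error = (elu_floor - target) ** 2

print("lowest ELU output:", elu_floor)
print("best achievable squared error through ELU:", min_squared_error)
print("squared error through a linear output:", (linear(target) - target) ** 2)
```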

@ghost

ghost commented Mar 16, 2018

I already tried setting the activation to None, but I will definitely give that a go, thanks for your help.

@ahmedyahia3393

When I try to run view_steering_model.py, I get this error: ValueError: bad marshal data (unknown type code).
Here is the output from the command prompt:

Traceback (most recent call last):
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 229, in func_load
    raw_code = codecs.decode(code.encode('ascii'), 'base64')
UnicodeEncodeError: 'ascii' codec can't encode character '\xe0' in position 46: ordinal not in range(128)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "view_steering_model.py", line 94, in <module>
    model = model_from_json(json.load(jfile))
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\models.py", line 349, in model_from_json
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 144, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\models.py", line 1349, in from_config
    layer = layer_module.deserialize(conf, custom_objects=custom_objects)
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\layers\__init__.py", line 55, in deserialize
    printable_module_name='layer')
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 144, in deserialize_keras_object
    list(custom_objects.items())))
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\layers\core.py", line 711, in from_config
    function = func_load(config['function'], globs=globs)
  File "C:\Users\lenovo\Anaconda3\lib\site-packages\keras\utils\generic_utils.py", line 234, in func_load
    code = marshal.loads(raw_code)
ValueError: bad marshal data (unknown type code)
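
The last frames of the traceback go through func_load and marshal.loads, which appears to be the path Keras uses to restore a Python function stored in the model JSON (for example the function inside a Lambda layer). The marshal format is not guaranteed to be portable between Python versions, so a likely cause is that the JSON was written under a different interpreter than the one loading it. The sketch below reproduces the mechanism with the standard library only. The `normalize` function is a made-up stand-in, not necessarily what the saved model contains.

```python
import base64
import marshal
import types


def normalize(x):
    """Hypothetical preprocessing function of the kind a Lambda layer might hold."""
    return x / 127.5 - 1.0


# Saving: the function's bytecode is marshalled and base64-encoded into a string.
payload = base64.b64encode(marshal.dumps(normalize.__code__)).decode("ascii")

# Loading: decode and unmarshal. This only works reliably when the interpreter
# that loads the payload matches the one that produced it.
code = marshal.loads(base64.b64decode(payload))
restored = types.FunctionType(code, globals())

print(restored(255.0))
```

Because the payload is bytecode rather than source, one way around the error is to rebuild the architecture in code under the current interpreter and load only the weights.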

@ghost

ghost commented Apr 20, 2018

Yeah, I had the same problem, so I went ahead and rewrote the graph myself and then loaded the weights from the .json file. However, the MSE is still very high.


3 participants