
Implement random sunlight variation during dataset generation #91

Closed
2 of 3 tasks
andrefdre opened this issue Feb 10, 2023 · 49 comments

@andrefdre
Collaborator

andrefdre commented Feb 10, 2023

As suggested during the meeting, the next steps will consist of:

  • Open the mesh where there are windows or doors.
  • Use ROS topics to change directional lights.
  • In case the model doesn't behave correctly with blank images through the windows, add a background.

During a quick test at the meeting, we found that Gazebo only simulates shadows for directional lights, which we will use to simulate the sun. The lights present in the room won't have shadow simulation, but we argued that this won't be a problem due to their low interaction with the scene.

@andrefdre andrefdre added the enhancement New feature or request label Feb 10, 2023
@andrefdre andrefdre self-assigned this Feb 10, 2023
andrefdre added a commit that referenced this issue Feb 13, 2023
@andrefdre
Collaborator Author

I managed to change the sun's rotation using the SetLightProperties service from gazebo_msgs.
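For reference, a minimal sketch of this kind of service call is below. It assumes the ROS Noetic version of gazebo_msgs/SetLightProperties (which exposes a pose field); it is not the exact synfeal code, and the light name and orientation values are only illustrative.

```python
#!/usr/bin/env python3
# Minimal sketch (not the exact synfeal code): rotate the "sun" directional
# light through /gazebo/set_light_properties. Assumes ROS Noetic gazebo_msgs,
# where SetLightProperties includes a pose field.
import rospy
from gazebo_msgs.srv import SetLightProperties, SetLightPropertiesRequest
from geometry_msgs.msg import Pose, Quaternion
from std_msgs.msg import ColorRGBA
from tf.transformations import quaternion_from_euler

rospy.init_node('sun_rotator')
rospy.wait_for_service('/gazebo/set_light_properties')
set_light = rospy.ServiceProxy('/gazebo/set_light_properties', SetLightProperties)

req = SetLightPropertiesRequest()
req.light_name = 'sun'                       # name of the directional light in the world
req.cast_shadows = True
req.diffuse = ColorRGBA(0.8, 0.8, 0.8, 1.0)
req.attenuation_constant = 0.9
req.attenuation_linear = 0.01
req.attenuation_quadratic = 0.001
# Orientation only: pitch the sun by 0.5 rad (roll and yaw left at zero).
q = quaternion_from_euler(0.0, 0.5, 0.0)
req.pose = Pose(orientation=Quaternion(*q))
set_light(req)
```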

@miguelriemoliveira
Collaborator

Hi @andrefdre ,

that's good news. Can you post an image or a video to exemplify?

@andrefdre
Collaborator Author

Adding the -uvl option when running the dataset acquisition implemented by @DanielCoelho112 now also generates a random sunlight rotation. Should I think about realistic sun behavior? Right now, I'm only changing the pitch between -1.57 and 1.57 rad.
Demonstration:
Screencast from 13-02-2023 15:30:44.webm

@miguelriemoliveira
Collaborator

Hi @andrefdre ,

Looks good, but I think you can add more variability to the angles, the intensity, and the presence of each light.

Also, during the video, try to rotate the point of view so that it's easier to see that the light is coming through the windows.

@andrefdre
Collaborator Author

One thing I noticed: with some sun angles, the rendering gets weird.
Rendering with weird artifacts:
Screenshot from 2023-02-13 17-46-28
Screenshot from 2023-02-13 17-46-45

This also happens with the point lights, but with attenuation it isn't visible. I tried changing the sun's attenuation values, but it isn't affected by them.

@miguelriemoliveira
Collaborator

Hi @andrefdre ,

I would say this should be a separate issue. I have no idea what it is. Perhaps some Google searches...

@andrefdre
Collaborator Author

andrefdre commented Feb 14, 2023

I added more variation to the lights. The first video changes the lights with continuity in mind; in the second video, every time a new pose is generated the lights are also randomized.
In the first video, it's possible to see the light starting to appear in the window on the right.

light.demonstration.mp4
test.mp4

Would it be interesting to play around with having a background, maybe a sky, and changing the time of day?
Something like this:
Screenshot from 2023-02-14 17-19-20

Screenshot from 2023-02-14 17-18-20

@miguelriemoliveira
Collaborator

miguelriemoliveira commented Feb 15, 2023

Hi @andrefdre ,

the second video is very nice. I liked it a lot and saved it in my videos.

The first one, where the light changes progressively, is not very good. Two reasons:

  1. The direction of the light only changes the pitch angle, whereas it would normally be more realistic to change the yaw angle.
  2. The changes are very slow and the effects are not very visible.

Can you try to make a new one taking this into consideration? Remember, these videos are to be submitted alongside a paper where you want to make the point that these virtual changes introduce realistic lighting changes in the scene.

@andrefdre
Collaborator Author

The direction of the light only changes the pitch angle, whereas it would normally be more realistic to change the yaw angle.

The teacher suggested changing more angles, so in the video I was changing all of them.

The changes are very slow and the effects are not very visible.

I also thought about that and will experiment with reducing the ambient color so the light has more effect. I had experimented with this before and was afraid of having a dark scene most of the time.

Can you try to make a new one taking this into consideration? Remember, these videos are to be submitted alongside a paper where you want to make the point that these virtual changes introduce realistic lighting changes in the scene.

I will record a new video following the teacher's recommendations. Is there a way to move the GUI camera along a path?

@miguelriemoliveira
Collaborator

Hi @andrefdre ,

What I said is to try to change the angles in a more realistic way.

Some inspiration here

https://youtu.be/mMgTCocmeSQ

@andrefdre
Collaborator Author

I found the pvlib library, which calculates the sun's azimuth and zenith angles at a given time. So now, instead of incrementing angles with no relation between them, the angles accurately describe the sun's rotation.
I will now work on disabling the sun when the time is after sunset and before sunrise.
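As a rough illustration of how pvlib can be used here (a sketch under assumed coordinates and frame conventions, not the actual synfeal code):

```python
# Sketch: convert a simulated timestamp into sun angles with pvlib, then into
# the pitch/yaw used to orient a directional light. The location (Aveiro) and
# the mapping from elevation/azimuth to roll-pitch-yaw are assumptions; the
# exact mapping depends on the world frame convention.
import math
import pandas as pd
from pvlib import solarposition

LAT, LON = 40.63, -8.66  # assumed site coordinates

def sun_angles(sim_time):
    """Return (pitch, yaw) in radians for the directional light at sim_time."""
    times = pd.DatetimeIndex([sim_time], tz='Europe/Lisbon')
    pos = solarposition.get_solarposition(times, LAT, LON)
    elevation = math.radians(pos['apparent_elevation'].iloc[0])
    azimuth = math.radians(pos['azimuth'].iloc[0])
    pitch = elevation    # tilt the beam down according to the solar elevation
    yaw = -azimuth       # sign/offset depends on how the world is oriented
    return pitch, yaw

print(sun_angles(pd.Timestamp('2023-02-16 12:00:00')))
```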

Results:

sun_rotation.mp4

@DanielCoelho112
Owner

Hi @andrefdre,

great job! That's a very nice feature you've added to synfeal.
I would say that when you finish the sunset/sunrise handling, you could create the train and test datasets with this feature, and then a baseline train dataset without illumination changes.

@miguelriemoliveira
Collaborator

I agree. Very nice indeed. Congratulations.

Do you have the internal lights configured yet?

I also think you can move ahead with collecting a dataset.

@DanielCoelho112 , you talk about collecting a baseline train dataset, but why not use the ones you already captured?

@pmdjdias
Collaborator

Nice job these past few days! I think the internal lights will also be a very important addition, as in many cases they may have a strong influence (daylight will only affect some portions of the room most of the time). Anyway, training on some datasets to test the pipeline and see some preliminary results seems the way to go.

@DanielCoelho112
Owner

Hi @miguelriemoliveira,

@DanielCoelho112 , you talk about collecting a baseline train dataset, but why not use the ones you already captured?

I would say to create a new one because the 3D model was updated (holes in the windows). It wouldn't be fair to train with windows and then test without them. Do you agree?

@miguelriemoliveira
Collaborator

miguelriemoliveira commented Feb 17, 2023 via email

@andrefdre
Collaborator Author

The internal lights were already done by @DanielCoelho112; I just added them to the scene. In the previous videos they were already present and changing.

Right now, I'm having a problem when collecting the dataset: the code seems to get stuck every time at around pose 300. So it may delay the new dataset a bit.

@miguelriemoliveira
Collaborator

Hi @andrefdre ,

In the previous video the changing of the internal lights is not very noticeable, I think. But I was not looking for it, so it may be me.

What do you mean "get stuck"? In these cases, it is better to offer a detailed description of what happens.

Does any of the nodes crash?
Does any of the nodes stop printing to the terminal?

Since you say "around the 300", I would guess this may be a memory leak.

Can you run htop while you are collecting the dataset and see if the memory footprint of the process is continuously increasing?

I am sure @DanielCoelho112 will have better suggestions ...

@andrefdre
Collaborator Author

andrefdre commented Feb 18, 2023

I agree that the changes aren't very noticeable; I will increase the step.

I was still trying to figure out where it gets stuck; I managed to track it down to this line:

rgb_msg = rospy.wait_for_message('/kinect/rgb/image_raw', Image)

@DanielCoelho112 is there any reason why you preferred to use wait_for_message instead of using a Subscriber?
It just stays waiting forever for the message; even if I restart the program, it keeps waiting for camera information. Only when I restart everything does it work again, and then it stops again near pose 300.

But when I run the code without light manipulation, it doesn't get stuck there.

With this finding, I don't think it is a memory leak.
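For what it's worth, a Subscriber-based variant (just a sketch, not the current synfeal code) would look roughly like the block below. Note that if /clock stops being published, a simulated-time timeout would also never expire, which matches the suspicion that a Subscriber alone wouldn't fix the underlying problem; the sketch deliberately uses wall-clock time for its timeout.

```python
# Sketch of the Subscriber alternative: keep the latest image from a callback
# instead of blocking on rospy.wait_for_message for every pose.
import time

import rospy
from sensor_msgs.msg import Image

class LatestImage:
    def __init__(self, topic='/kinect/rgb/image_raw'):
        self.msg = None
        self.sub = rospy.Subscriber(topic, Image, self._callback, queue_size=1)

    def _callback(self, msg):
        self.msg = msg

    def get(self, timeout=5.0):
        """Return the most recent image, or None if nothing arrives in time.
        Uses wall-clock time on purpose, so a frozen /clock cannot block it."""
        deadline = time.time() + timeout
        while self.msg is None and time.time() < deadline:
            time.sleep(0.01)
        return self.msg
```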

@miguelriemoliveira
Collaborator

When it gets stuck, use a terminal to run

rostopic echo '/kinect/rgb/image_raw'

Do you receive any message?

@andrefdre
Collaborator Author

The message I receive is:

WARNING: no messages received and simulated time is active.
Is /clock being published?

I already tried researching this but didn't reach any conclusion.
I tried pausing and unpausing the simulation in Gazebo, and it doesn't change anything.

@miguelriemoliveira
Collaborator

How about

rostopic echo /clock

Before getting stuck, and after? Do you get clock messages?

@andrefdre
Collaborator Author

Before it gets stuck I receive clock messages, but afterwards I get the same warning as before.

WARNING: no messages received and simulated time is active.
Is /clock being published?

@miguelriemoliveira
Collaborator

I am out of ideas... It seems like it's a problem inside Gazebo. You wait for some time before collecting a new image; can you change that time to see if something gets better?

@DanielCoelho112
Owner

Hi @andrefdre,

@DanielCoelho112 is there any reason why you preferred to use wait_for_message instead of using a Subscriber?

I don't think so, but I think the Subscriber won't solve this problem...

This bug never happened to me. Since it only happens with lights and around pose 300, I would say it could be a CPU or memory issue, as @miguelriemoliveira suggested. Maybe the shadows are computationally expensive... Can you log the CPU, RAM, and maybe GPU usage over time until failure?
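A minimal way to do that logging (just a sketch; psutil is assumed to be available, and GPU usage could be sampled separately with nvidia-smi):

```python
# Log CPU and RAM usage once per second to a CSV until interrupted, so the
# trend up to the hang can be inspected afterwards.
import csv
import time

import psutil

with open('/tmp/resource_log.csv', 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['elapsed_s', 'cpu_percent', 'ram_percent'])
    start = time.time()
    try:
        while True:
            writer.writerow([round(time.time() - start, 1),
                             psutil.cpu_percent(interval=1.0),  # blocks ~1 s
                             psutil.virtual_memory().percent])
            f.flush()
    except KeyboardInterrupt:
        pass
```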

Another option would be to run the same code with a smaller mesh to see if the problem persists, maybe using the 024 room.

You should also write here the sequence of commands needed to replicate this problem so I or @miguelriemoliveira can test.

@andrefdre
Collaborator Author

Steps to replicate:

roslaunch synfeal_bringup bringup_mesh.launch world:=santuario_light.world
roslaunch synfeal_bringup bringup_camera.launch  
rosrun synfeal_collection data_collector -nf 1000 -m path -mc 'santuario.yaml' -s santuario_light_train -f -uvl

Santuario with windows cut out mesh: https://uapt33090-my.sharepoint.com/:u:/g/personal/andref_ua_pt/EY-COfZod9BJnjiEXdXRXFkB6wl9ePkyLWt5jzr0jPisRA?e=fis5y7

I already checked memory usage, and it doesn't get near the limits for RAM or the GPU. The CPU is mostly around 60% with occasional spikes to near 100% on some cores, but it is weird that it always stops around pose 300.
My computer specs:

  • Nvidia Geforce GTX 1050 4GB
  • 16 GB of Ram
  • Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz

@miguelriemoliveira
Collaborator

How many steps do we need? 300 is one per image? So we are aiming at 20000? Is that it?

@andrefdre
Collaborator Author

How many steps do we need? 300 is one per image? So we are aiming at 20000? Is that it?

I don't know if I fully understood the question. An image is saved at each step; the more images, the better.

@miguelriemoliveira
Collaborator

An image is saved at each step

That's what I wanted to know ... and we need a lot of images, so stopping at 300 is really not feasible.

@miguelriemoliveira
Collaborator

Can you try to run the experiment with Gazebo in debug mode? That outputs a lot of information; perhaps we can get some intel from there.

You can paste the terminal messages from when it gets blocked.

@miguelriemoliveira
Collaborator

Could it be something like this occurring?

gazebosim/gazebo-classic#2725

there is a workaround, using gazebo messages:

https://answers.gazebosim.org/question/24982/delete_model-hangs-for-several-mins-after-repeated-additionsdeletions-of-a-sdf-model-which-sometimes-entirely-vanishes-from-the-scene-too-in-gazebo/?answer=25003#post-id-25003

Also, if we really need to, we could just tell Gazebo to delete the model and spawn a new one every 250 iterations.
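If we go that route, a rough sketch of the delete-and-respawn fallback using the standard gazebo_ros services could look like this (the model name and SDF path would be placeholders):

```python
# Sketch: delete a model and respawn it from its SDF every N iterations.
import rospy
from gazebo_msgs.srv import DeleteModel, SpawnModel
from geometry_msgs.msg import Pose

rospy.wait_for_service('/gazebo/delete_model')
rospy.wait_for_service('/gazebo/spawn_sdf_model')
delete_model = rospy.ServiceProxy('/gazebo/delete_model', DeleteModel)
spawn_model = rospy.ServiceProxy('/gazebo/spawn_sdf_model', SpawnModel)

def recycle_model(model_name, sdf_path, pose=None):
    """Remove `model_name` and spawn it again from the SDF file at `sdf_path`."""
    delete_model(model_name)
    with open(sdf_path) as f:
        sdf_xml = f.read()
    spawn_model(model_name=model_name, model_xml=sdf_xml, robot_namespace='',
                initial_pose=pose or Pose(), reference_frame='world')
```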

@andrefdre
Collaborator Author

I tried changing the way @DanielCoelho112's code sends the messages that change the lights to the same method I use for the sun, and I managed to get past image 300. But it is still only at image 400, so it's kind of early to tell.

But this confuses me: how does changing the way the lights are modified make the simulation stop after a while?
The way it was previously done was this:

# my_str = f'name: "{name}" \nattenuation_quadratic: {light}'

# with open('/tmp/set_light.txt', 'w') as f:
#     f.write(my_str)

# os.system(
#     f'gz topic -p /gazebo/santuario/light/modify -f /tmp/set_light.txt')

And now I use the service '/gazebo/set_light_properties'

@miguelriemoliveira
Collaborator

Hope it works

@DanielCoelho112
Owner

And now I use the service '/gazebo/set_light_properties'

I prefer that option too. When I implemented this feature, I also tried to use the service, but Gazebo wasn't responding to it. If it works now, it is much better than saving the .txt and then publishing it.

@andrefdre
Collaborator Author

I managed to generate a dataset with 1k images, so I think it's solved. I will now generate a dataset with 10k images and train a model. For nighttime, I didn't find any reasonable way to turn the sun off and then back on, and it is not affected by attenuation. Maybe removing the sun and then adding it again is a possible way. For now, I just make it point directly at the ground, so no light enters the windows.

If it works now, it is much better than saving the .txt and then publishing it.

It looks to be; I also noticed that the code is much faster than with the previous implementation.

@miguelriemoliveira
Collaborator

Great. Let us know when you have news.

@andrefdre
Collaborator Author

I have trained two models: one on a dataset generated in a scene without light manipulation, and another on a dataset generated with light manipulation. I trained both models for 50 epochs with a batch size of 10. Both datasets had 10k images for training and 2.5k for testing. For validation of both models, I used a dataset with light manipulation containing 500 images.
These are the results:
Blank 6 Grids Collage (1)

@DanielCoelho112
Owner

Hi @andrefdre,

the plot of the loss is weird. Have you ever obtained something like this?
image

Also, which model are these results from? I would recommend PoseNet with the dynamic loss.

I think you should take a step back, and try to run one model that performs similarly to what we used to have. To give you an idea, these were the results we were having:
image

Why did you use a batch size of 10? Is it because you're training locally? Or something else? 10 seems really small.

@andrefdre
Collaborator Author

the plot of the loss is weird. Have you ever obtained something like this?

I don't think I ever had a negative loss.

Also, these results are with which model? I would recommend PoseNet with dynamic loss.

Sorry, I totally forgot to paste the command.

./rgb_training -fn santuario_light -mn posenetgooglenet -train_set santuario_light_train -test_set santuario_light_test -n_epochs 50 -batch_size 10  -loss 'BetaLoss(100)' -c -im 'PoseNetGoogleNet(True,0.8)' -lr_step_size 60 -lr 1e-4 -lr_gamma 0.5 -wd 1e-2 -gpu 0

I think you should take a step back, and try to run one model that performs similarly to what we used to have.

I will try over the next few days to see what it can be; I already looked at previous versions of the code but didn't find anything. Are you free to have a meeting at the end of the week in case I don't find anything? Maybe you could help me.

Why did you use a batch size of 10? Is it because you're training locally? Or something else? 10 seems really small.

Yes, I trained locally: when I looked at DeepLar, the GPUs were all occupied and I wanted results as quickly as possible.
Out of curiosity, does batch size influence the results? Wouldn't it only matter if it were an RNN, since the loss is still averaged over the batch?

@miguelriemoliveira
Collaborator

miguelriemoliveira commented Feb 22, 2023 via email

@miguelriemoliveira
Collaborator

Oops, sorry, I'm reading the emails in FIFO mode. I agree with what @DanielCoelho112 says.

I don't think you will get any meaningful results training locally.

@DanielCoelho112
Owner

I don't think I ever had a negative loss.

The negative loss is due to the dynamic loss used. My point is that in your experiments I've never seen the test loss coming down, which is not a good sign.
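For context, assuming DynamicLoss follows the usual learned-uncertainty formulation for camera pose regression (Kendall & Cipolla), the loss can indeed go below zero because of the trainable log-variance terms, which are not bounded from below:

```latex
% Learned-uncertainty ("dynamic") pose loss: \hat{s}_x and \hat{s}_q are
% trainable log-variances (cf. the DynamicLoss(sx=0, sq=-1) initialization),
% so the total loss is not bounded below by zero.
\mathcal{L} = \lVert \mathbf{p} - \hat{\mathbf{p}} \rVert \, e^{-\hat{s}_x} + \hat{s}_x
            + \lVert \mathbf{q} - \hat{\mathbf{q}} \rVert \, e^{-\hat{s}_q} + \hat{s}_q
```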

I will try over the next few days to see what it can be; I already looked at previous versions of the code but didn't find anything. Are you free to have a meeting at the end of the week in case I don't find anything? Maybe you could help me.

I can't this week, but I'm free next Monday or Tuesday.

Out of curiosity, does batch size influence the results? Wouldn't it only matter if it were an RNN, since the loss is still averaged over the batch?

The batch size influences the results for all models. If the batch size is too small, you cannot guarantee that the batch is statistically representative of the full dataset, which in turn can lead to inaccurate gradients. Usually, the bigger the better. However, there are two main problems when using a really large batch size: the computational resources required, and poorer generalization (it's currently not well understood why this happens).
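To make the batch-size point concrete (this is just the standard mini-batch estimate, nothing specific to this project):

```latex
% Mini-batch loss and gradient: with uniformly sampled batches, the gradient
% is an unbiased estimate of the full-dataset gradient, and its variance
% shrinks roughly as 1/B, so a very small B gives noisy updates.
L_B(\theta) = \frac{1}{B} \sum_{i=1}^{B} \ell(x_i, \theta), \qquad
\mathbb{E}\!\left[\nabla_\theta L_B\right] = \nabla_\theta L_{\text{full}}, \qquad
\operatorname{Var}\!\left[\nabla_\theta L_B\right] \propto \frac{1}{B}
```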

@andrefdre
Collaborator Author

I will train in DeepLar to see if anything changes.

I can't this week, but I'm free next Monday or Tuesday.

If you want, I will probably be in LAR.

The batch size influences the results for all models. If the batch size is too small, you cannot guarantee that the batch is statistically representative of the full dataset, which in turn can lead to inaccurate gradients. Usually, the bigger the better. However, there are two main problems when using a really large batch size: the computational resources required, and poorer generalization (it's currently not well understood why this happens).

Okay, I think I got it. I was thinking only about the loss and not the backpropagation. Thank you for the clarification.

@andrefdre
Collaborator Author

The negative loss is due to the dynamic loss used. My point is that in your experiments I've never seen the test loss coming down, which is not a good sign.

I trained on DeepLar with a dataset without my implementations, and still the test loss is not coming down. I tried looking at older versions of the code to see what is different, but I can't really find anything that would lead to this.

./rgb_training -fn santuario -mn posenetgooglenet -train_set santuario_10k -test_set santuario_2k -n_epochs 50 -batch_size 40  -loss 'DynamicLoss(sx=0,sq=-1)' -c -im 'PoseNetGoogleNet(True,0.6)' -lr_step_size 150 -lr 1e-4 -lr_gamma 0.5 -wd 1e-3 -gpu 1

losses

@miguelriemoliveira
Collaborator

miguelriemoliveira commented Feb 24, 2023 via email

@andrefdre
Collaborator Author

I have created 3 small datasets with 1k images each to ask for your opinion.
In all of them the sun isn't random; instead, its rotation is based on the simulated time. Between 8 am and 7 pm the sun is on, while at night the other lights are on.

The first dataset has random lights and the sun creates shadows:
https://youtu.be/gpvPMZATkhk

The second has sequential lights and the sun creates shadows:
https://youtu.be/NkxhXsxXJGk

The third has random lights and the sun doesn't create shadows:
https://youtu.be/YI001r7ZS_E

I like the third option best; with the sun creating shadows, most of the time it doesn't have an impact on the viewed images and just makes the image darker.

@andrefdre
Collaborator Author

Here is a video demonstrating the use of data augmentation; I only changed the brightness value, to make it a fair comparison:
https://youtu.be/Y8KI9bcpeD4
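For reference, the brightness-only augmentation could be expressed like this (a sketch assuming torchvision; the jitter value is illustrative, not necessarily the one used in the video):

```python
# Brightness-only augmentation, keeping every other photometric property fixed
# so the comparison against the simulated lighting changes stays fair.
from torchvision import transforms

augmentation = transforms.Compose([
    transforms.ColorJitter(brightness=0.5),  # vary brightness only
    transforms.ToTensor(),
])
```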

@miguelriemoliveira
Collaborator

miguelriemoliveira commented Apr 14, 2023

Hi @andrefdre ,

great work.

I like the third option best; with the sun creating shadows, most of the time it doesn't have an impact on the viewed images and just makes the image darker.

I'm not sure about that. I mean, the random lights in the third case are a global change in the scene illumination, which suggests that if we were to do it instead on an image that captured the scene, the result would be more or less the same.

I think the game changer here should be the directional light and shadows, because those are the ones that are not possible to mimic using transformations at the image level, since they are non-uniform changes of light in the scene.

So I would try to keep the sun and the shadows, possibly even increasing their presence in the test dataset.

@andrefdre
Collaborator Author

So I would try to keep the sun and the shadows, possibly even increasing their presence in the test dataset.

The only way I think we could increase shadows is by opening more windows; do you think I should try it?

I also noticed something: since I can't disable lights, what I do to disable the sun is to disable its shadows (which creates the texture problem) and point it downwards. The sun also isn't affected by attenuation like the other lights. This all works fine, but when switching shadows on and off, Gazebo sometimes doesn't apply the change, keeping shadows off.
