Another implementation of Randomized #232

wasowski · 2024-01-12T23:31:18Z

@mohsen-ghaffari1992 In a push to resolve bug #59, I propose replacing Randomized with Randomized2, which is based on probula. This uncovered a design flaw in Randomized: it was not properly abstracted, so its implementation details were leaking into symsim code. As a result, the replacement required a massive number of changes.

I had no time to run the entire test suite, so I do not even know if this is correct (and it may have a positive or negative impact on convergence).

But simple maze seems to work, and I was able to run a million episodes. So try using this branch for your experiments.

mohsen-ghaffari1992 · 2024-01-13T07:57:58Z

@wasowski
Thanks!
I tested a 100x100 maze for 1000 episodes with timeout=2000, and it runs perfectly. I do not yet know how good the result is, but it is very promising that we can now run both long trials and a high number of episodes.

wasowski · 2024-01-13T08:53:05Z

@mohsen-ghaffari1992
There are many tests failing, so I am chasing bugs and pushing corrections to this branch. I think we will merge this branch after the deadline.

wasowski · 2024-01-13T12:57:31Z

concrete.simplebandit.Bandit is an Agent.agent.observe (step (s) (a)._1) ∈ ObservableState *** FAILED *** (44 milliseconds)
[info]   ArrayIndexOutOfBoundsException was thrown during property evaluation.

Gaussian Simple Bandit seems to fail because of a bug in spire (issue #233). Do we need Gaussian for the paper anywhere @mohsen-ghaffari1992 ?

mohsen-ghaffari1992 · 2024-01-13T13:11:36Z

No, we do not need the gaussian.

wasowski · 2024-01-13T13:28:42Z

OK. Then we fix this later.

Cartpole was failing because of an assertion, but as far as I can see the assertion itself was wrong. The last requirement of the state invariant was:

require (pv <= PvMin)

I switched this to PvMax and it now seems to work, although it is very strange how this mistake survived so long.
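To illustrate this kind of mistake, here is a minimal sketch of a state invariant that bounds the pole velocity. Only the names `pv`, `PvMin`, and `PvMax` come from the snippet above; the bound values, the `PoleState` class, and the presence of a separate lower-bound check are assumptions made for the example, not the actual symsim code.

```scala
// Hypothetical bounds; the real values in symsim may differ.
val PvMin: Double = -3.5
val PvMax: Double = 3.5

// Hypothetical state that holds only the pole velocity `pv`.
case class PoleState(pv: Double):
  // The lower bound is checked against PvMin.
  require(pv >= PvMin, s"pv below range: ¬($pv ≥ $PvMin)")
  // The upper bound must be checked against PvMax. Comparing against PvMin
  // here instead would reject every velocity except pv == PvMin.
  require(pv <= PvMax, s"pv above range: ¬($pv ≤ $PvMax)")
```

With both checks in place, `PoleState(0.0)` is accepted, while a value outside the interval fails with an `IllegalArgumentException` at construction time.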

Now cartpole still uses memory intensively. Even with 10K episodes I am running out of memory.

Do we use cartpole in the paper?
With what number of episodes?
Did it work before or also timed out?

mohsen-ghaffari1992 · 2024-01-13T14:34:53Z

Thanks! That is fine. No problem. We do not evaluate on CartPole. Best, Mohsen

wasowski · 2024-01-13T14:50:35Z

But I think something is buggy, which makes me question all the results. WindyGrid and CliffWalking are also not able to complete even one episode, and these should work just fine! CliffWalking works very well in ADPRO on an extremely similar design, so something is fishy.

mohsen-ghaffari1992 · 2024-01-13T15:21:32Z

I checked CliffWalking and the problem is the initialize function. You check that the state is not final, but sometimes it is outside the board. I believe WindyGrid has the same issue. Best, Mohsen

wasowski · 2024-01-13T19:00:40Z

Hmm... this is weird. The test for validity (CliffWalkingIsAgent) is passing, and we only produce states within the board? Also, I moved the assertions about the states being legal from move to the constructor in CWState, and they never fail, which should indicate that these states are all good? How can you see that this is happening?

(I suspected the dreaded function tailRecM in Randomized2, but looking at it for several hours, I start to believe that it is correct).

mohsen-ghaffari1992 · 2024-01-13T19:05:40Z

I just printed the states that are sent to the step function and noticed that a value is out of bounds. Since the state is created by the initialize function, we should check it. The point is that we check that the state is not final, but we do not check whether it is valid! Best, Mohsen

wasowski · 2024-01-13T19:22:37Z

But maybe I do not understand what valid states are.

  // from CliffWalking
  def initialize: Randomized2[CWState] = { for
    x <- Randomized2.between (0, BoardWidth  + 1)
    y <- Randomized2.between (0, BoardHeight + 1)
    s = CWState (x, y) 
  yield s }.filter { !this.isFinal (_) }

I do not understand how this can create states outside the board. Also I have assertions now checking when the state is created in CWState, so if I try to create an illegal state everything crashes with an exception, but I do not see these exceptions. For example:

scala> CWState(20,20)
java.lang.IllegalArgumentException: requirement failed: Out-Of-Width x: ¬(20 ≤ 11)
  at scala.Predef$.require(Predef.scala:337)
  at symsim.examples.concrete.cliffWalking.CWState.<init>(CliffWalking.scala:18)
  at symsim.examples.concrete.cliffWalking.CWState$.apply(CliffWalking.scala:16)
  ... 42 elided

So I think this is not happening. We already had this problem before: everything was hanging during the first episode (or, as it seems after some debugging, possibly during the second episode). But I do not remember what the reason was.
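A minimal, self-contained sketch of the two mechanisms discussed here: an invariant checked in the state constructor, and an initializer that rejects final states. It uses `scala.util.Random` instead of Randomized2, and `GridState` is a stand-in for CWState; the board dimensions and the definition of `isFinal` are assumptions for the example, not taken from symsim.

```scala
import scala.util.Random

// Hypothetical board: valid coordinates are 0..BoardWidth and 0..BoardHeight.
val BoardWidth = 11
val BoardHeight = 3

// The invariant lives in the constructor, so it is checked on every
// construction, whether the state comes from initialization or from a move.
case class GridState(x: Int, y: Int):
  require(x >= 0 && x <= BoardWidth, s"Out-Of-Width x: $x not in [0, $BoardWidth]")
  require(y >= 0 && y <= BoardHeight, s"Out-Of-Height y: $y not in [0, $BoardHeight]")

// Hypothetical final states: the bottom row except the start cell.
def isFinal(s: GridState): Boolean = s.y == 0 && s.x > 0

// Rejection sampling: draw uniformly until a non-final state comes up.
// Random.between excludes its upper bound, hence the + 1.
def initialize(rng: Random): GridState =
  Iterator
    .continually(GridState(rng.between(0, BoardWidth + 1), rng.between(0, BoardHeight + 1)))
    .filterNot(isFinal)
    .next()
```

Because the upper bound of `between` is exclusive, the `+ 1` is what makes the rightmost column and the top row reachable; an off-by-one in either direction would either skip the board edge or trip the constructor check.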

wasowski · 2024-01-13T19:29:39Z

Oh - some progress. I just discovered that I was misguided. It is the evaluation that is hanging, not learning! I have not been looking at the evaluation code, as I was sure it was learning. I will look into eval. It might be the same problem as with maze, that the early policies are bad and they need a timeout.
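The time-out idea can be sketched as follows. This is a minimal illustration on a made-up one-dimensional task, not the symsim evaluation code: `Goal`, `atGoal`, `advance`, `evaluate`, and the `horizon` parameter are all hypothetical names.

```scala
import scala.annotation.tailrec

// Hypothetical task: walk along positions 0..Goal; the episode ends at Goal.
val Goal = 5
def atGoal(pos: Int): Boolean = pos == Goal
def advance(pos: Int, action: Int): Int = math.min(Goal, math.max(0, pos + action))

// Runs one evaluation episode under `policy`. Returns the number of steps
// taken and whether the goal was reached before the time horizon ran out.
def evaluate(policy: Int => Int, start: Int, horizon: Int): (Int, Boolean) =
  @tailrec
  def loop(pos: Int, t: Int): (Int, Boolean) =
    if atGoal(pos) then (t, true)
    else if t >= horizon then (t, false) // give up: a bad policy may never terminate
    else loop(advance(pos, policy(pos)), t + 1)
  loop(start, 0)
```

A policy that always moves right, `evaluate(_ => 1, 0, 2000)`, returns `(5, true)`, while a policy that always moves left, `evaluate(_ => -1, 0, 2000)`, is cut off and returns `(2000, false)` instead of looping forever.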

mohsen-ghaffari1992 · 2024-01-13T19:34:35Z

If it is hanging, then the issue is probably the same. When I was testing, it reported an error that I traced to the require in the move function. That is how I arrived at initialize. However, the error I am talking about may also be caused by moving from one step to another, because we do not check that the state returned by move is valid. Best, Mohsen

wasowski · 2024-01-13T19:36:23Z

Now we check it every time (in cliffwalking) when a new state is constructed. So this checks both when moving and when initializing. I will look into eval.

mohsen-ghaffari1992 · 2024-01-13T19:37:17Z

[like] Mohsen Ghaffari reacted to your message.

wasowski · 2024-01-13T21:50:27Z

OK. I added a time horizon to mountaincar, cliffwalking, and windygrid. Now everything seems to behave. I think this is ready for your experiments, and I hope this branch will allow all the experiments you need. I definitely still hit heap problems, but I believe we can handle larger cases now.

Some tests are still failing (about 10). All of them are due to randomness or the spire bug mentioned above (with Gaussian). I will stop now and move to other things.

mohsen-ghaffari1992 · 2024-01-13T21:52:57Z

Thanks Andrzej! You put real effort into this, and I appreciate it. Best, Mohsen

wasowski added 6 commits January 11, 2024 11:21

Add probula to the mix

f0dfe4c

radnomizeD: Remove a stale import

c535c67

IData: Make toString less eager so that we do not clog heap when printing

3a2b0e5

Dist: Add uniform continous distribution

8e03c03

build: Bump scala up to 3.3.1

9a06dba

Replace Randomized with Randomized2 everywhere

b3a4dd1

wasowski marked this pull request as draft January 12, 2024 23:31

wasowski added 5 commits January 13, 2024 08:39

Randomized2: Remove repeat (not needed in Randomized2)

175280e

simplemaze: Make the experiments a notch faster for testing (less episodes)

c7dd99d

golf: Fix a broken import

c459a80

build: Bump up scalatest to reduce warnings

e69ae2f

car: Do not use min/max infix as Scala 3 generates warnings against it now

5da9e44

wasowski added 2 commits January 13, 2024 09:21

laws: Abstract away the random seed outside of the laws

f7e76c9

gitignore: Ignore the csv outputs from experiments

5fac470

probula: Sanitize where the rng state is created

06379f2

It should only be created in top-level executable files (which in this project means Spec files)

wasowski force-pushed the probula branch from 779475c to 06379f2 Compare January 13, 2024 11:59

Randomized2: Edit whitespace to be consistent with the rest of the project

9f89ca4

wasowski added 2 commits January 13, 2024 14:23

Gaussian simple bandit: Minor cleanup

e58640e

cartpole: Fix an assertion bug

0391a5a

wasowski added 5 commits January 13, 2024 14:36

cartpole: Reduce the number of episodes to save memory

72214de

- otherwise we have OOM errors

cliffwalking: Move the state invariant test to the state

c66ff2a

So that we can diagnostic information in more situations, not only when moving (it was previously placed in move)

Randomized2: fix a bug in integer between

2159d7c

It was not excluding the right point margin as the spec would indicate

cliffwalking: A slight bug in initializing the state

e8586f9

It seems that it was forgetting initializing at the max edges of the board

Car: Remove warning on using min/max infix

5605f63

minor clean up in various places

de2c9a3

wasowski added 5 commits January 13, 2024 21:51

simplemaze: Fix a bug in initialization + few minor cleanup edits

91b0bfd

This seems to be an old bug! It was not initializing in the rightmost column.

cliffwalking: Fix observable state in cliffwalking

6219ca1

cliffwalking: Separate observable and external states

9866cd4

Otherwise it is difficult to write some tests (becase class members are private for non-case classes apparently)

windygrid: Add timeout to this task to enable early evaluation

b72f4e3

mountaincar: Add time horizon to enable evaluation

dd10e24

wasowski added 3 commits January 13, 2024 23:06

probula: Allow using different Generators than Secure Random from Java

506ccf3

randomized2: Migrate the randomized spec to Randomized2

51d26de

One test fails, but this is the same bug in spire as others

Randomized: Remove the old randomized from the source

79f2a65

wasowski mentioned this pull request Jan 14, 2024

OutOfMemory with large number of episodes #59

Open

probula: Improve several comments + ws changes

31975af
