Is concurrent execution on one machine supported? #231

ghost · 2016-11-22T03:38:21Z

Id like to concurrently run citgm on a couple of different node on the same machine. Is this possible without one messing the other(s) up?

ghost · 2016-11-22T05:11:49Z

..running several now, so seems to be.. thx

ghost · 2016-11-22T17:48:25Z

..but looking closer, I'm seeing strange results, like modules not installing, or their tests just skipped.. not yet familiar enough with all of citgm's nondeterminism (which is an awful thing to have during testing) to know why I'm seeing what I am..

does anyone know if a globally installed citgm can safely run concurrently?

gdams · 2016-11-22T18:00:42Z

what os are you using? Can you send an example of the failures and what command are you running?

ghost · 2016-11-22T19:57:31Z

uname -a
Linux druid 4.4.0-45-generic #66-Ubuntu SMP Wed Oct 19 14:12:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

I've been re-running serially to rule out so I can see if my build of node is blowing chunks. The consistency I'm seeing in the results leads me to think concurrent execution doesn't quite work, but not positive on that.

I will come back a little later and specifically run citgm concurrently to see what, if any, issues that actually creates. However, if you already know citgm is using some global state, like storing a list of work-in-progess in a file or something, then we'll already know probably not a good idea to run concurrently

richardlau · 2016-11-22T21:43:06Z

If any modules contain native code running concurrently will probably run into nodejs/node-gyp#1054.

MylesBorins · 2016-11-22T22:45:50Z

This is not a design consideration of citgm. Closing.

If you have a concise proposal as to how this could be fixed please open another issue to discuss prior to implementation.

ghost · 2016-11-22T23:15:51Z

I'm getting the feeling I'm bothering you guys. This issue wasn't at all meant to be any kind of proposal, but was simply a question. I'm sorry, I'll stop asking questions... just new to citgm.

ghost · 2016-11-23T15:28:52Z

@thealphanerd I'm requesting your permission to post questions on this repo.

Can you please help me understand how I must phrase questions, and what constraints they have as to content, such that they are not immediately closed.

ghost · 2016-12-08T15:16:58Z

@thealphanerd It would have been kind of you to reference your related PR #144

MylesBorins · 2016-12-08T15:21:42Z

my apologies, from your original issue it did not sound like you were attempting to accomplish the same goals as my other pr

…

On Thu, Dec 8, 2016, 7:16 AM Paul D. Hester ***@***.***> wrote: @thealphanerd <https://github.com/TheAlphaNerd> It would have been kind of you to reference your related PR #144 <#144> — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV-8qKXHubP9-MvvQ6wCJDIE5dnYGks5rGB9qgaJpZM4K4-SL> .

ghost · 2016-12-08T15:30:57Z

Thank you kindly Myles for replying and apologizing; no worries. I know you're at least aware of my efforts around helping fix module symlinking. I was having to run citgm-all four times; v7.2.0 NODE_PRESERVE_SYMLINKS=0 and 1, and then v7.2.0-sjw NODE_SUPPORT_SYMLINKS=0 and 1.. and it was just hurting a little having to wait, which was the impetus for this issue. I'm happy to do anything I can to help, even if it's staying out of the way (fwiw, I'm continuing to work on my style of communications on this medium, and appreciate your tolerance)

MylesBorins · 2016-12-08T15:37:33Z

not a problem. at this current time there is not much we can do regarding running multiple instances of citgm on a single system... primarily due to the fact we have no control over the test suites we are running and how they want to work on the local system. (this problem also exists for running multiple instances of the node test suite)

…

On Thu, Dec 8, 2016, 7:30 AM Paul D. Hester ***@***.***> wrote: Thank you kindly Myles for replying and apologizing; no worries. I know you're at least aware of my efforts around helping fix module symlinking. I was having to run citgm-all four times; v7.2.0 NODE_PRESERVE_SYMLINKS=0 and 1, and then v7.2.0-sjw NODE_SUPPORT_SYMLINKS=0 and 1.. and it was just hurting a little having to wait, which was the impetus for this issue. I'm happy to do anything I can to help, even if it's staying out of the way (fwiw, I'm continuing to work on my style of communications on this medium, and appreciate your tolerance) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV1bbRg1UQjbLJzJmC4neWEXfcTWQks5rGCKygaJpZM4K4-SL> .

ghost · 2016-12-08T15:49:45Z

Scaling a single running instance of citgm-all will help :).

I had success by creating an additional user account, citgm1, and then using with nvm to completely isolate the environments (node/npm/node-gyp), and still use the same version of node. I was able to concurrently run two processes of citgm-all successfully.

Only one module differed while running under the citgm1 user account: a yeoman-generator test was trying to access a Gruntfile.js and threw EACCESS on some permission problem somewhere. I tried exploring a little bit, but haven't gotten back to uncovering the underlying cause. I did however confirm it was something related to user permissions, and not from the two processes clobbering each other in some way.

It might be worthwhile to explore integrating nvm, or implementing something similar in citgm with respect to environment isolation.

ghost · 2016-12-08T16:10:56Z

As an aside, my goal was just to use all the cores while testing (I was running on a 32 core server). A parallelized version of citgm-all alleviates the need to run two or more of them concurrently :).

I'n my fantasy world, citgm would be provided the module to install/test, and the versions of node and npm to use. It would then create an isolated environment, exercise the module, then store the results by hashes of node executable, module package, and date-time. It would have registered itself over a well known port to citgm-all which would be responsible for coordinating things, providing the needed info to the listeningcitgm instances. With this conceptual approach, if I was in a hurry, I could stand up two, 32 core servers and have all testing done in about as long is it takes the longest test to run (probably under a minute). Also, by keeping results by node + package hashes, it would be easy to compare results between any two versions at any time, and to not re-run (unless asked to) when unncessary, but still automatically re-run when a newer version of the module or node was being exercised.... but then again I'm a dreamer :)

MylesBorins · 2016-12-08T16:27:08Z

so I would view that type of logic something that should the be the responsibility of the runtime not the module. since we run citgm on ci on many platforms it would prove difficult to have a reliable solution for all runtime. thus type of functionality would likely best be implemented in a ci environment itself, or perhaps with docker, where citgm itself is embedded. I believe that a new reporter (maybe json? ) would enable you to build something like what you are envisioning and would happily review a pr for a new reporter.

…

On Thu, Dec 8, 2016, 8:10 AM Paul D. Hester ***@***.***> wrote: As an aside, my goal was just to use all the cores while testing (I was running on a 32 core server). A parallelized version of citgm-all alleviates the need to run two or more of them concurrently :). I'n my fantasy world, citgm would be provided the module to install/test, and the versions of node and npm to use. It would then create an isolated environment, exercise the module, then store the results by hashes of node executable, module package, and date-time. It would have registered itself over a well known port to citgm-all which would be responsible for coordinating things, providing the needed info to the listeningcitgm instances. With this conceptual approach, if I was in a hurry, I could stand up two, 32 core servers and have all testing done in about as long is it takes the longest test to run (probably under a minute). Also, by keeping results by node + package version hashes, it would be easy to compare results between any two versions at any time, and to not re-run (unless asked to) when unncessary, but still automatically re-run when a newer version of the module or node was being exercised.... but then again I'm a dreamer :) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV_dOSptJqH9DpOiBP8pAAxww1gtyks5rGCwQgaJpZM4K4-SL> .

ghost · 2016-12-08T17:01:15Z

Forgive my miscommunication. The described implementation is not the thing, but rather what it's implementing is. To further elaborate and expand on your comment:

The ideal is that a test-coordinator has a list of registered workers that can only run a single atomic test at a time, where each test is distinguished by a cpu, os-version, node-version, npm-version, and module-version, and the results are keyed and stored by those things (and timestamp) in a shared store. The workers would have registered themselves by cpu and os-version. The coordinator would be given a list of one or more test configurations to run, and would just go make it happen based on what workers had registered themselves.

I suggested one way to isolate a worker on a single machine. Using docker is just another way to box an environment, but I would still think some level of participation would be required by citgm as workers, and citgm-all as a coordinator, to manage the specifics in the way they're specifying, testing and collecting results.

Might you educate me a little, as I'm ignorant in many ways, as to how you see that concept implemented entirely as "the responsibility of the runtime not the module"? Are my presumptions correct, that by "runtime" you mean "cpu/os/node", and by "module" you mean "citgm/npm/mod-under-test"?

MylesBorins · 2016-12-08T17:07:24Z

by runtime I mean your environment and by module I mean citgm what you are suggesting sounds to me like a complex ci job that could utilize citgm, rather than what citgm itself should be doing

…

On Thu, Dec 8, 2016, 9:01 AM Paul D. Hester ***@***.***> wrote: Forgive my miscommunication. The described implementation is not the thing, but rather what it's implementing is. To further elaborate and expand on your comment: The ideal is that a test-coordinator has a list of registered workers that can only run a single atomic test at a time, where each test is distinguished by a cpu, os-version, node-version, npm-version, and module-version, and the results are keyed and stored by those things (and timestamp) in a shared store. The workers would have registered themselves by cpu and os-version. The coordinator would be given a list of one or more test configurations to run, and would just go make it happen based on what workers had registered themselves. I suggested one way to isolate a worker on a single machine. Using docker is just another way to box an environment, but I would still think some level of participation would be required by citgm as workers, and citgm-all as a coordinator, to manage the specifics in the way they're specifying, testing and collecting results. Might you educate me a little, as I'm ignorant in many ways, as to how you see that concept implemented entirely as "*the responsibility of the runtime not the module*"? Are my presumptions correct, that by "runtime" you mean "cpu/os/node", and by "module" you mean "citgm/npm/mod-under-test"? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV0ccd5z91H32q9Yp-vcKGRqY85Gxks5rGDfcgaJpZM4K4-SL> .

ghost · 2016-12-08T17:09:23Z

Thank you

ghost · 2016-12-08T17:21:10Z

...I am a little confused? wouldn't #144 then also be more appropriate for a complex ci? How is what its doing fundamentally different?

MylesBorins · 2016-12-08T17:27:52Z

the primary difference from that implementation and what you are suggesting is that #144 simply runs the suites of multiple modules at the same time. since we are using child processes we can get multi threading for free. what you are suggesting involves using multiple binaries, and platform specific sand boxes

…

On Thu, Dec 8, 2016, 9:21 AM Paul D. Hester ***@***.***> wrote: ...I am a little confused? wouldn't #144 <#144> then also be more appropriate for a complex ci? How is what its doing fundamentally different? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV0GodVmFNqKYU0SWVGskla-V9cDJks5rGDyGgaJpZM4K4-SL> .

ghost · 2016-12-08T17:32:12Z

I see. My original proposition did not include things related to platform specific sand boxes, I only added that from your comment:

since we run citgm on ci on many platforms it would prove difficult to have a reliable solution for all
runtime.

So is my understanding correct that the differing characteristic from what I originally suggested and #144 is merely being able to specify a different version of node to use for testing?

MylesBorins · 2016-12-08T17:36:28Z

at this point the only suggestion I can make is to submit a pr if you have a solution you think will work that will not involve adding substantial complexity or platform specific implementation details

…

On Thu, Dec 8, 2016, 9:32 AM Paul D. Hester ***@***.***> wrote: I see. My original proposition did not include things related to platform specific sand boxes, I only added that from your comment: since we run citgm on ci on many platforms it would prove difficult to have a reliable solution for all runtime. So is my understanding correct that the differing characteristic from what I originally suggested and #144 <#144> is merely being able to specify a different version of node to use for testing? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV969qYrW2jEB8FTMULZFo1oVPY8yks5rGD8egaJpZM4K4-SL> .

ghost · 2016-12-08T17:49:01Z

I can certainly do that, but I wouldn't want to duplicate something fundamentally being taken care of by #144, so I was just trying to understand the distinguishing characteristics.

Would you agree then, from everything you've described, that the only things a PR might offer on top of #144 is:

Being able to explicitly specify the version of node to test rather than the one resolved from PATH, and
Storing the results in way a little more convenient to then review and compare?

MylesBorins · 2016-12-08T17:53:35Z

setting the node bin for running install / test is a tangential feature storing the results in a different fashion is also tangential, and would be a fairly disruptive change. we are discussing doing a refactoring of the logger... thus would likely involve changing how some of the data will be stored / reported. are you hoping to have the data stored differently internally or just reported in a different fashion? if the latter you can always submit a PR for a new reporter as mentioned above

…

On Thu, Dec 8, 2016, 9:49 AM Paul D. Hester ***@***.***> wrote: I can certainly do that, but I wouldn't want to duplicate something fundamentally being taken care of by #144 <#144>, so I was just trying to understand the distinguishing characteristics. Would you agree then, from everything you've described, that the only things a PR might offer on top of #144 <#144> is: - Being able to explicitly specify the version of node to test rather than the one resolved from PATH, and - Storing the results in way a little more convenient to then review and compare? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV0juGzr94M2a694lysr9jiWPixZSks5rGEMOgaJpZM4K4-SL> .

ghost · 2016-12-08T19:11:30Z

Context to understand impetus of discussion:

Implemented changes to node v7.2.0, represented by node v7.2.0-sjw
Was required by @ sam-github to use citgm to show no regressions.
Test Environment:
- Azure west region 32 core 64 bit vm
- uname: Linux 3.4.0+ x86_64
- lsb_release: Ubuntu 14.04.5 LTS
Ran citgm-all -m | tee citgm.1.v7.2.0-sjw.md with PATH resolving node to v7.2.0-sjw.
citgm.1.v7.2.0-sjw.md contained failures that upon review appeared unrelated to changes.
Had no choice but to baseline v7.2.0 via running citgm-all -m | tee citgm.1.v7.2.0.md with PATH resolving node to v7.2.0
Comparing citgm.1.v7.2.0-sjw.md to citgm.1.v7.2.0.md showed one additional failure in v7.2.0: one test in spdy-transport failed when attempting to access a network port.
Ran citgm spdy-transport with PATH resolving to v7.2.0, all tests passed.
Re-ran citgm -all -m | tee citgm.2.v7.2.0.md, and one module failed, because download from github failed from network issue (unknown as to which side)
Re-ran citgm -all -m | tee citgm.3.v7.2.0.md, and pass/fail results equivalent to citgm.1.v7.2.0-sjw.md

Initial conclusions from above experience:

citgm-all does not scale
citgm-all itself runs with the version of node in PATH
citgm-all generates different output from the same input & is sensitive to environment; i.e. non-deterministic
citgm-all OOB downloads the latest versions of test modules from github.com, creating potential for implicit variance in test results
Comparing results between two different versions of node for a given module@version is tedious

Thoughts on addressing issues to mitigate non-determinism, speed up locating cause of regressions:

Parallelize citgm-all
Be explicit with version of node-under-test, rather than resolving from PATH
Ensure citgm-all itself does not use version of node-under-test
Store and index test results by sha1(node) + sha1(module@version) + timestamp
- Provides means to quickly compare results between two versions of node for a given module@version
- Enables running tests only when necessary (when bits change, or explicitly requested)
- Auto download of latest module@version creates distinguished results
- Specifics of test-output-encoding and storage engine entirely irrelevant

MylesBorins · 2016-12-08T19:21:58Z

Paul, with all due respect I do not have the time to go through this discussion. we outline individual features which could be submitted as standalone pull requests. please feel free to create a pr for each of these individual features.

…

On Thu, Dec 8, 2016, 11:11 AM Paul D. Hester ***@***.***> wrote: Context to understand impetus of discussion: - Implemented changes to node v7.2.0, represented by node v7.2.0-sjw - Was required <nodejs/node#9673 (comment)> by @ sam-github to use citgm to show no regressions. - Test Environment: - Azure west region 32 core 64 bit vm - uname: Linux 3.4.0+ x86_64 - lsb_release: Ubuntu 14.04.5 LTS - Ran citgm-all -m | tee citgm.1.v7.2.0-sjw.md with PATH resolving node to v7.2.0-sjw. - citgm.1.v7.2.0-sjw.md contained failures that upon review appeared unrelated to changes. - Had *no choice* but to baseline v7.2.0 via running citgm-all -m | tee citgm.1.v7.2.0.md with PATH resolving node to v7.2.0 - Comparing citgm.1.v7.2.0-sjw.md to citgm.1.v7.2.0.md showed one *additional* failure in v7.2.0: one test in spdy-transport failed when attempting to access a network port. - Ran citgm spdy-transport with PATH resolving to v7.2.0, all tests passed. - Re-ran citgm -all -m | tee citgm.2.v7.2.0.md, and one module failed, because download from github failed from network issue (unknown as to which side) - Re-ran citgm -all -m | tee citgm.3.v7.2.0.md, and pass/fail results equivalent to citgm.1.v7.2.0-sjw.md Initial conclusions from above experience: - citgm-all does not scale - citgm-all itself runs with the version of node in PATH - citgm-all generates different output from the same input & is sensitive to environment; i.e. non-deterministic - citgm-all OOB downloads the latest versions of test modules from github.com, creating potential for implicit variance in test results - Comparing results between two different versions of node for a given ***@***.*** is tedious Thoughts on addressing issues to mitigate non-determinism, speed up locating cause of regressions: - Parallelize citgm-all - Be explicit with version of node-under-test, rather than resolving from PATH - Ensure citgm-all itself does not use version of node-under-test - Store and index test results by sha1(node) + ***@***.***) + timestamp - Provides means to quickly compare results between two versions of node for a given ***@***.*** - Enables running tests only when necessary (when bits change, or explicitly requested) - Auto download of latest ***@***.*** creates distinguished results - Specifics of test-output-encoding and storage engine entirely irrelevant — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#231 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAecV-TI6f29MbwWnrwWoRtEmRSU54Czks5rGFZjgaJpZM4K4-SL> .

ghost · 2016-12-08T19:32:32Z

Not wanting to waste your time.

Please respect that I described real challenges & problems in using your tool and I'm seeking understanding in how they be best addressed.

This will be my last comment on this issue.

I will repeat here excluding my thoughts on how they might be addressed (ie. I'm NOT stating any type of feature request!!!!), and instead ask of you:

how the issues should be dealt with, or
request you point me to someone who knows how citgm works, or
allow me to repost in a new issue so others may respond:

Conclusions from using citgm:

citgm-all generates different output from the same input & is sensitive to environment; i.e. non-deterministic
citgm-all does not scale
citgm-all itself runs with the version of node in PATH
citgm-all OOB downloads the latest versions of test modules from github.com, creating potential for implicit variance in test results
Comparing results between two different versions of node for a given module@version is tedious

Conclusions arrived out be these explicit steps taken in using citgm:

Implemented changes to node v7.2.0, represented by node v7.2.0-sjw
Was required by @ sam-github to use citgm to show no regressions.
Test Environment:
- Azure west region 32 core 64 bit vm
- uname: Linux 3.4.0+ x86_64
- lsb_release: Ubuntu 14.04.5 LTS
Ran citgm-all -m | tee citgm.1.v7.2.0-sjw.md with PATH resolving node to v7.2.0-sjw.
citgm.1.v7.2.0-sjw.md contained failures that upon review appeared unrelated to changes.
Had no choice but to baseline v7.2.0 via running citgm-all -m | tee citgm.1.v7.2.0.md with PATH resolving node to v7.2.0
Comparing citgm.1.v7.2.0-sjw.md to citgm.1.v7.2.0.md showed one additional failure in v7.2.0: one test in spdy-transport failed when attempting to access a network port.
Ran citgm spdy-transport with PATH resolving to v7.2.0, all tests passed.
Re-ran citgm -all -m | tee citgm.2.v7.2.0.md, and one module failed, because download from github failed from network issue (unknown as to which side)
Re-ran citgm -all -m | tee citgm.3.v7.2.0.md, and pass/fail results equivalent to citgm.1.v7.2.0-sjw.md

ghost closed this as completed Nov 22, 2016

ghost reopened this Nov 22, 2016

MylesBorins closed this as completed Nov 22, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Is concurrent execution on one machine supported? #231

Is concurrent execution on one machine supported? #231

ghost commented Nov 22, 2016

ghost commented Nov 22, 2016

ghost commented Nov 22, 2016

gdams commented Nov 22, 2016 •

edited

Loading

ghost commented Nov 22, 2016

richardlau commented Nov 22, 2016 •

edited

Loading

MylesBorins commented Nov 22, 2016

ghost commented Nov 22, 2016

ghost commented Nov 23, 2016

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

ghost commented Dec 8, 2016 •

edited by ghost

Loading

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

Is concurrent execution on one machine supported? #231

Is concurrent execution on one machine supported? #231

Comments

ghost commented Nov 22, 2016

ghost commented Nov 22, 2016

ghost commented Nov 22, 2016

gdams commented Nov 22, 2016 • edited Loading

ghost commented Nov 22, 2016

richardlau commented Nov 22, 2016 • edited Loading

MylesBorins commented Nov 22, 2016

ghost commented Nov 22, 2016

ghost commented Nov 23, 2016

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

ghost commented Dec 8, 2016 • edited by ghost Loading

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

MylesBorins commented Dec 8, 2016 via email

ghost commented Dec 8, 2016

Conclusions from using citgm:

Conclusions arrived out be these explicit steps taken in using citgm:

gdams commented Nov 22, 2016 •

edited

Loading

richardlau commented Nov 22, 2016 •

edited

Loading

ghost commented Dec 8, 2016 •

edited by ghost

Loading