
parallel workers should have an ID that can be used to prevent conflicts among workers #4456

Closed
catdad opened this issue Oct 1, 2020 · 10 comments · Fixed by #4813
Labels
area: parallel mode (Regarding parallel mode) · type: feature (enhancement proposal)

Comments


catdad commented Oct 1, 2020

Is your feature request related to a problem or a nice-to-have? Please describe.
I have a problem. I need to start a server on a pre-known port (due to limitations of a third-party library). I usually hard-code this port, but when running parallel tests, I need to start multiple unique servers -- one for each worker. I can't just check for a free port due to race conditions. Usually, parallel systems such as this have a way to identify each worker -- for example, a unique integer ID is given to each worker.

Describe the solution you'd like
A unique integer ID would be great. This could be passed to the worker with an environment variable (MOCHA_WORKER_ID) or just as an extra value in the worker arguments (accessible with process.argv[2]). Really, any solution that identifies unique workers would be acceptable.
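
For illustration, a worker-side sketch of reading such an ID (assuming the proposed MOCHA_WORKER_ID variable or an argv position -- neither exists in mocha today):

```js
// Hypothetical: read the proposed per-worker ID from an env var,
// falling back to an argv position, and to 0 for non-parallel runs.
const workerId = Number(process.env.MOCHA_WORKER_ID || process.argv[2] || 0);
```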

Describe alternatives you've considered
I've considered using just a random port every time, but that gets problematic because tests run for different lengths of time, so I don't know how long any given port will be in use, or when a random port might conflict with something else running on the operating system. I also considered using a file on the file system to track used/free resources, but there are similarly many race conditions to consider when multiple processes write to a single file.

Additional context
Here is the same request in jest: jestjs/jest#2284

@catdad added the type: feature label Oct 1, 2020

rgroothuijsen commented Oct 3, 2020

If registering workers with a unique ID is all that's required, one option would be to use dedicated orchestration tools such as ZooKeeper. They're designed to handle the race conditions you describe.


catdad commented Oct 3, 2020

I'm not really describing cluster orchestration. In fact, my software has nothing to do with clustering. And I feel like integrating a complicated orchestration framework in order to write trivial tests is not feasible. Providing an ID for test workers is very common among parallel testing libraries. Am I missing something?

@outsideris
Contributor

Couldn't you use a module like get-port to address your issue?
Otherwise, how would you use MOCHA_WORKER_ID to get an available port?


catdad commented Oct 4, 2020

I actually did use that module, and there's a problem with doing so. This module and others like it work in a deterministic way every time: they start at one port and just increment up until they find a free port. That means that when you first start several workers, they will all find the same free port. Then when you try to start the server, only one of the workers will actually succeed in starting on that port.
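
To make the race concrete, here is a minimal sketch of a get-port-style deterministic scan (my own illustration, not the module's actual code): workers launched at the same moment all probe the same base port, all see it free, and all report it back before any of them has bound a long-lived server.

```js
// Sketch of a deterministic free-port scan, as get-port-style modules do it.
const net = require('net');

// Try to bind to `base`; if it's taken, try base + 1, and so on.
function findFreePort(base) {
  return new Promise((resolve) => {
    const probe = net.createServer();
    probe.once('error', () => resolve(findFreePort(base + 1)));
    probe.once('listening', () => probe.close(() => resolve(base)));
    probe.listen(base);
  });
}

// Two parallel workers calling findFreePort(3000) at the same time will both
// get 3000, because the probe socket is closed again before the real server
// starts -- only one worker's server can then actually bind that port.
```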

In my case, this is complicated by using a third-party module for the server, which silently carries on as if everything is fine rather than erroring when it can't start on the given port (I did also file a bug for this with that module). Ideally, I could just retry getting a port and starting the server until one actually succeeds, but this adds complication and makes the test setup slower.

In the PR, I mentioned that with this feature, a server can easily be started like this:

const port = 1234 + Number(process.env.MOCHA_WORKER_ID || 0);
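
For example, a root-level hook in each worker could then start its own server on that port -- a sketch only, since MOCHA_WORKER_ID is the proposed name and is not set by mocha today:

```js
// Sketch: per-worker server in root hooks, using the proposed MOCHA_WORKER_ID.
const http = require('http');

const port = 1234 + Number(process.env.MOCHA_WORKER_ID || 0);
let server;

before((done) => {
  server = http.createServer((req, res) => res.end('ok'));
  server.listen(port, done);
});

after((done) => {
  server.close(done);
});
```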

@outsideris added the area: parallel mode label Oct 9, 2020

psmarshall commented Nov 10, 2020

I ran into a similar problem for ports, and I solved it by starting the server using port 0. This requests a free port from the OS and avoids the races you get with other solutions. You need a way to get the resulting port from the server though. Even just parsing the stdout from the server might be ok.
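
A minimal sketch of that approach in node, for reference -- the OS assigns the port and the server reports back which one it got:

```js
// Sketch: listen on port 0 so the OS picks a free port, then read it back.
const http = require('http');

const server = http.createServer((req, res) => res.end('ok'));
server.listen(0, () => {
  // address() only returns the real port once the server is listening.
  const { port } = server.address();
  console.log(`test server listening on ${port}`);
});
```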

That said, for non-port shared resources you might run into similar problems which aren't solvable this way, so I think this is a good idea.


catdad commented Nov 10, 2020

Thanks for the input! Do you know if port 0 is something that node handles or something that happens at the OS level? I have use cases that go beyond just node servers (e.g. needing to provide a port to selenium, browser debug ports, ssh reverse proxy ports, mapping docker ports, etc.)

@psmarshall

It's an OS-level thing; node should pass it through without touching it. We are using it both to boot up a node server and as the remote debug port for chrome -- you just need to make sure that whatever is using the port reports which port it ended up with, so that you can get hold of it.


forty commented May 20, 2021

Hello,

I have a similar issue, but slightly different. We need a mysql database during our tests. With our current test tools, each process gets assigned an id (from 0 for the first process, to N-1 for the Nth process). This way each process uses its own database (mydb0, mydb1, mydb2, ...).

This use case is slightly more complicated than the port one, because there is no system-level shared state (as there is for ports, where you can basically try to open the port and see if it is available).

A possible workaround would be to use the OS PID of the process to name the DBs, but the downside of this approach is that we cannot create the databases beforehand (since we don't know the PID), which is less convenient.

So 👍 to the approach suggested by @catdad with MOCHA_WORKER_ID.
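
For what it's worth, with the proposed variable the per-worker database could be picked like this (a sketch, assuming MOCHA_WORKER_ID and the mysql2 client; the mydb0..mydbN-1 databases would still be created up front for as many workers as the run uses):

```js
// Sketch: select a pre-created database per worker via the proposed MOCHA_WORKER_ID.
const mysql = require('mysql2/promise');

const workerId = Number(process.env.MOCHA_WORKER_ID || 0);

async function getTestConnection() {
  // mydb0, mydb1, ... must already exist for every possible worker index.
  return mysql.createConnection({
    host: 'localhost',
    user: 'test',
    database: `mydb${workerId}`,
  });
}
```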


4ekki commented Sep 29, 2021

I have a similar use case, where I need to make sure that different workers pick their own part of the config and don't interfere with each other. Any luck implementing the variable suggested by @catdad?


forty commented Sep 29, 2021

@4ekki funnily enough, I revived that PR josdejong/workerpool#296 just yesterday, which should allow implementing this cleanly. I don't know if workerpool's maintainer is still interested in working on that PR; we'll see.
