Running ATM on a cluster #128

beevabeeva · 2019-04-18T11:44:57Z

Hi.
Sorry if I missed it in the docs or README, but I can't find details about running ATM on a (local) cluster. Do I have to implement this myself using something like Apache Spark?

Thanks

csala · 2019-04-25T12:19:05Z

Hi @beevabeeva

The current ATM version is already able to run as a cluster, but setting one up is currently the user's responsibility.

All you have to do to get a cluster running is start multiple worker instances.
These worker instances can all be on the same machine or spread across different machines. The only requirements are:

All the machines need to have access to the database being used.
All the machines need to access the data in the same way, either through a shared filesystem mounted at the same path on every machine or by using an S3 bucket as the dataset source.

For example, if you just want to start a cluster with 4 workers on your local machine, all you need to do is run the following two commands:

atm enter_data ...your enter_data options here..
for i in {1..4}; do atm worker ..your worker options here.. > /dev/null & done

The first command enters your data as usual. The second starts 4 workers as background processes and redirects their output to /dev/null to avoid cluttering your console; you can still find their logs in the logs/{your hostname}.txt file.
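If you also want to be able to stop those background workers later, one option is to record their process IDs when you start them. The following is only a minimal sketch, not something ATM provides: the function names start_workers and stop_workers and the PID file are hypothetical helpers, and the command you pass in is whatever worker command you would otherwise run by hand.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ATM): launch N background copies of a
# command, remember their PIDs in a file, and stop them again later.

# Usage: start_workers COUNT PIDFILE COMMAND [ARGS...]
start_workers() {
    count="$1"
    pidfile="$2"
    shift 2
    : > "$pidfile"                  # start with an empty PID file
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" > /dev/null &          # same redirection as the one-liner above
        echo "$!" >> "$pidfile"     # $! is the PID of the last background job
        i=$((i + 1))
    done
}

# Usage: stop_workers PIDFILE
stop_workers() {
    pidfile="$1"
    while read -r pid; do
        # Ignore PIDs that have already exited.
        kill "$pid" 2> /dev/null || true
    done < "$pidfile"
    rm -f "$pidfile"
}
```

Assuming the functions are loaded in your shell, you would call something like `start_workers 4 workers.pid atm worker ..your worker options here..` and later `stop_workers workers.pid`. The PID file name here is just an illustration.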

I hope this helps!

csala · 2019-04-25T12:45:44Z

Also see #130, which will make cluster management much easier once done.

csala · 2019-05-07T15:47:03Z

Closed via #133

beevabeeva mentioned this issue Apr 25, 2019

Add atm start, atm status and atm stop commands #130

Closed

csala added this to the 0.1.2 milestone Apr 28, 2019

csala closed this as completed May 7, 2019

csala assigned pvk-developer May 7, 2019
