Not running nimbus in YARN #66

Open
ekohlwey opened this issue Jan 27, 2014 · 1 comment

Comments

@ekohlwey

It seems like there may be a case to be made for not running Nimbus in YARN.

Nimbus is a relatively lightweight process, and the cluster only needs one of them running at any given time. Running Nimbus in YARN actually reduces the availability of SOY (Storm on YARN), which is not really the Right Thing(TM).

Relying on the Nimbus container never failing substantially complicates cluster operations, and it makes job submission in particular difficult.

Running Nimbus separately is not unlike running an independent MapReduce history server, so it makes a lot of sense in my opinion, and similar design patterns are apparent in other YARN-based frameworks.

At least until SOY is able to properly handle topology submission along with starting the framework, this seems like a good modification to make to simplify running SOY.

I would propose adding an "external Nimbus" configuration option that prevents SOY from starting its own Nimbus, for users who would prefer to operate that way. Thoughts?
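
To make the proposal concrete, here is a minimal sketch of how such an option might be interpreted at startup. It is illustrative only: the key `master.nimbus.external` and the function `plan_nimbus` are hypothetical names invented for this example, not existing SOY options. `nimbus.host` is the usual Storm setting that tells clients where Nimbus lives.

```python
# Hypothetical key name, made up for this sketch; not an existing SOY option.
EXTERNAL_NIMBUS_KEY = "master.nimbus.external"
# Standard Storm setting naming the host that runs Nimbus.
NIMBUS_HOST_KEY = "nimbus.host"


def plan_nimbus(conf, am_host):
    """Decide whether the application master should start Nimbus.

    conf    -- dict of configuration values (e.g. parsed from storm.yaml)
    am_host -- host the application master runs on
    """
    if conf.get(EXTERNAL_NIMBUS_KEY, False):
        # An operator runs Nimbus outside YARN, so we must be told where it is.
        host = conf.get(NIMBUS_HOST_KEY)
        if not host:
            raise ValueError(
                NIMBUS_HOST_KEY + " must be set when " + EXTERNAL_NIMBUS_KEY + " is true"
            )
        return {"launch_nimbus": False, "nimbus_host": host}
    # Default behaviour: Nimbus is started alongside the application master.
    return {"launch_nimbus": True, "nimbus_host": am_host}
```

With the option unset, nothing changes for existing users. With it set, supervisors and clients would simply be pointed at the externally managed Nimbus host.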

@revans2
Collaborator

revans2 commented Jan 27, 2014

If you want to add an option to not launch Nimbus, I think that would be fine. You probably also want to have the AM run as an unmanaged AM in that case, similar to how Impala tries to integrate with YARN. When Hadoop runs with security enabled, the individual processes run as the user that launched the job. If we have one Nimbus serving multiple different AMs, then it is possible that part of a topology will run as one user and part of it as a different user.
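
As a rough illustration of the mixed-user concern, the sketch below flags topologies whose workers ended up under more than one user. The input shape (one `(topology_id, user)` pair per worker) and the function name are assumptions made for this example; nothing like this exists in the code today.

```python
from collections import defaultdict


def topologies_with_mixed_users(worker_assignments):
    """Return {topology_id: [users]} for topologies that span several users.

    worker_assignments -- iterable of (topology_id, user) pairs, one per
    worker, where user is the account of the AM that launched the worker.
    """
    users_by_topology = defaultdict(set)
    for topology_id, user in worker_assignments:
        users_by_topology[topology_id].add(user)
    # Keep only the topologies whose workers run under more than one user.
    return {
        topology_id: sorted(users)
        for topology_id, users in users_by_topology.items()
        if len(users) > 1
    }
```

A single external Nimbus would need some check along these lines, or a rule that pins each topology to one AM, to avoid splitting a topology across users.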

The original plan for Nimbus was to wait for the Nimbus HA pull requests to be merged in.

nathanmarz/storm#422

Then we could write a simple plug-in to place the important state files in HDFS instead of the local file system, and we would get recovery, especially if we combine this with the work that has been going on in YARN to not shoot worker processes when the AM dies. With Storm's transition to Apache, the HA pull request appears to have stagnated.

Long term I also like the idea of having Nimbus communicate with the AM directly, which would be simpler if they are collocated on the same box. If we want the AM to be able to request resources in a way that reduces network traffic, it almost needs Nimbus and the Nimbus scheduler to decide where the resources should be placed and then tell the AM that information. It would also allow the cluster to respond to demand automatically, instead of relying on an external call to add new nodes to the cluster.
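
A minimal sketch of that hand-off, assuming the scheduler produces a mapping from worker to preferred host: the AM side only has to collapse it into per-host container counts before asking YARN for resources. The data shapes and the name `placement_to_requests` are hypothetical and only meant to show the direction of the information flow.

```python
from collections import Counter


def placement_to_requests(placement):
    """Turn a scheduler placement into per-host container requests.

    placement -- dict mapping a worker id to the host the scheduler prefers
    Returns a list of {"host": ..., "containers": ...} dicts sorted by host,
    which the AM could translate into node-specific YARN requests.
    """
    per_host = Counter(placement.values())
    return [
        {"host": host, "containers": count}
        for host, count in sorted(per_host.items())
    ]
```

The same call could be repeated whenever the scheduler decides a topology needs more workers, which is what would let the cluster grow on demand rather than through an external request.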
