This repository has been archived by the owner on Jan 30, 2020. It is now read-only.

Help understanding how to control which services go to which machines #979

Closed
lgomez opened this issue Oct 17, 2014 · 5 comments

Comments

@lgomez

lgomez commented Oct 17, 2014

Hi all,

I'm getting started with Fleet and feel like I'm getting a handle on it, but I'm struggling to understand how to control what gets started on which machines. I think metadata is the answer, but I'm not sure.

You can restrict where a unit runs by setting something like MachineMetadata=location=chicago, like so (some unit file):

...
[X-Fleet]
MachineMetadata=region=us-west
...

You can set metadata: region=us-west for fleet on each machine like so (user-data file):

#cloud-config

coreos:
  etcd:
    # generate a new token for each unique cluster from https://discovery.etcd.io/new
    # WARNING: replace each time you 'vagrant destroy'
    # discovery: https://discovery.etcd.io/<token>
    addr: $public_ipv4:4001
    peer-addr: $public_ipv4:7001
  fleet:
    public-ip: $public_ipv4
    metadata: region=us-west
  units:
    ...

You can check that your machines have that metadata by running fleetctl list-machines, like so:

$ fleetctl list-machines
MACHINE     IP            METADATA
29db5063... 172.17.8.101  region=us-west
ebb97ff7... 172.17.8.102  region=us-west
f823e019... 172.17.8.103  region=us-west

What I don't understand, though, is: how do I control which services go to which machines?

I understand I can use metadata to set some constraints but wouldn't I want machines to have different metadata? Say, what if I want to make sure 50% of my machines are used for a database service and 50% for a webapp service? If I use Conflicts=database* on my webapp unit and Conflicts=webapp* on my database unit, how do I tell fleet what the split is?
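
To put that in concrete terms, here's roughly what I have in mind (just a sketch; the unit names are made up):

webapp.service:

[X-Fleet]
Conflicts=database*

database.service:

[X-Fleet]
Conflicts=webapp*

As far as I can tell, those directives only keep the two apart; nothing in them tells fleet how many of each to schedule.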

Maybe fleet will split them 50/50, but this is just a simple example; in reality I want to deploy webapp, database, api and sessionstore services in different proportions and with some overlap depending on the service. And that's not counting ambassador/sidekick-type services.

Do I need to keep more than one cloud-config file, differing only in machine metadata?

Thank you so much for taking the time to read this.

Luis

@lgomez
Author

lgomez commented Oct 17, 2014

Oh... Another note that is, perhaps, worth mentioning...

The way I got to wondering about how to control this is that I have vagrant provisioning a number of VMs for local development. I started fiddling with the cloud-config file and it worked fine. Then I wondered if I could use the same file when deploying to Google Compute Engine.

If I could (I hope I can), then fleet could automatically figure out service distribution within the environment in which it's deployed. If I deploy it through vagrant to five local VMs, it'll distribute my services in the right proportions across the five VMs. If I deploy it through Google's Cloud SDK (the gcloud client) and my cloud has 400 machines, it'll distribute the services proportionally across that number of machines.

Sorry if this sounds confusing... Ultimately I want to separate the sizing of my cluster(s) from the configuration for fleet, so the same configuration can be used regardless of how I expand or contract my cloud.

Thanks again!

@lgomez changed the title from "Help understanding MachineMetadata" to "Help understanding how to control which services go to which machines" on Oct 17, 2014
@robszumski
Member

It'd be a better idea to use the machine metadata to express attributes of the machines themselves, like SSD/spinning disk, cloud region, cabinet number, etc. If you have units that have specific needs, you can specify MachineMetadata=disk=SSD to place them on a machine with those resources.
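
For example, a unit that needs fast storage might carry something like this (just a sketch):

[X-Fleet]
MachineMetadata=disk=SSD

and only machines whose cloud-config sets metadata: disk=SSD would be eligible to run it.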

If you use the metadata for unit grouping, what happens when you lose all of the machines intended to run databases, but all of the web machines are still up? It would make more sense to enable fleet to move some of the units to other machines in the cluster so you don’t have downtime.

I think the core topic that you're not embracing is the fact that the way to achieve higher density is to run multiple services on the machines in your cluster. In a traditional VM use-case, you'd have one dedicated to X and another for Y. There's no reason that X and Y can't run on the same box if needed, assuming they have the correct resource constraints enabled so they don't clobber each other.
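
For example, the [Service] section of each unit can carry systemd resource directives to keep co-located services in check (a rough sketch; tune the values for your workloads):

[Service]
CPUShares=512
MemoryLimit=1G

With bounds like these, X and Y can share a box without starving each other.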

Regarding your cloud-config issue, you should be able to reuse the same cloud-config, but you will need to change the discovery url. If you end up going down the machine metadata route, you will have to maintain a version of the cloud-config for each machine type in the cluster. There is also an issue (#555) to be able to modify the metadata after the machine has booted.

@lgomez
Author

lgomez commented Oct 18, 2014

It'd be a better idea to use the machine metadata to express attributes of the machines themselves, like SSD/spinning disk, cloud region, cabinet number, etc. If you have units that have specific needs, you can specify MachineMetadata=disk=SSD to place them on a machine with those resources.

Makes sense to keep MachineMetadata for machine metadata. Didn't think of it that way. Agree.

If you use the metadata for unit grouping, what happens when you lose all of the machines intended to run databases, but all of the web machines are still up? It would make more sense to enable fleet to move some of the units to other machines in the cluster so you don’t have downtime.

You're absolutely right. My question now is, how does Fleet decide what to put where? Does that mean I'm gonna end up with everything on every machine unless I specify conflicts or have two cloud-config files?

Then there's the topic of load balancers (lb) where I may have:

request->lb->web.{1..3}->lb->api.{1..3}->lb->db.{1..5}...

Would you suggest having each of those services in different load-balanced clusters and therefore separate cloud-config files?

I think the core topic that you're not embracing is the fact that the way to achieve higher density is to run multiple services on the machines in your cluster. In a traditional VM use-case, you'd have one dedicated to X and another for Y. There's no reason that X and Y can't run on the same box if needed, assuming they have the correct resource constraints enabled so they don't clobber each other.

In reality, at least in what I foresee being my case, I'll probably be fine having web (express.js), api (express.js), db (mongodb) and sessionstore (redis) on every box, reading from a shared file system where appropriate. What I was wondering is what the setup would look like if I find I need more instances of one container/service than I do db instances.

Regarding your cloud-config issue, you should be able to reuse the same cloud-config, but you will need to change the discovery url. If you end up going down the machine metadata route, you will have to maintain a version of the cloud-config for each machine type in the cluster. There is also an issue (#555) to be able to modify the metadata after the machine has booted.

I'm not gonna take this route. Keeping the machine metadata for attributes specific to the machine, like you suggest, feels clean and makes sense to me.

Thank you for your answer. Very helpful.

Luis

@superstructor

Hi @lgomez

Template units are great for finer-grained control over the number of instances running on a cluster.

You only need a single template unit, e.g. nginx@.service, and then you can start as many instances as you want from that one template: fleetctl start nginx@1.service nginx@2.service.
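
A minimal template might look roughly like this (only a sketch; I'm assuming the service runs as a Docker container with the stock nginx image, so adjust for your own setup):

nginx@.service:

[Unit]
Description=nginx instance %i
After=docker.service
Requires=docker.service

[Service]
ExecStart=/usr/bin/docker run --rm --name nginx-%i -p 80:80 nginx
ExecStop=/usr/bin/docker stop nginx-%i

[X-Fleet]
Conflicts=nginx@*.service

%i expands to whatever comes after the @ in each instance name, and the Conflicts line keeps instances on separate machines, so the number of instances you start is effectively your split across the cluster.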

@bcwaldon
Contributor

No activity in two months.
