reclass can't find node definition if it is defined in a subdirectory #10

Closed
octo47 opened this issue Nov 26, 2013 · 12 comments


octo47 commented Nov 26, 2013

When a node definition resides in a subdirectory of nodes, reclass fails because it cannot find the node definition file.

    $ mkdir examples/nodes/subdir
    $ cp examples/nodes/localhost.yml examples/nodes/subdir/n1.yml
    $ ./reclass.py -b examples -i
    No such file: reclass/examples/nodes/n1.yml
    $ ./reclass.py -b examples -n localhost
    ... correct output
    $ ./reclass.py -b examples -n n1
    No such file: reclass/examples/nodes/n1.yml

Owner

madduck commented Nov 26, 2013

Thank you for taking the time to file this issue!

Subdirectories are currently not implemented, and I am hesitant to implement them because I have not worked out the namespacing issues.

For instance, what should happen when subdir1 and subdir2 both contain a file n1.yml?

This would be a nice way to implement node groups (see pull request #9), but first the namespacing should be worked out, and we need to figure out if subdirectories are the best approach for this.

I think there would be a benefit to implementing this for classes, though, e.g. debiannode/squeeze.yml and debiannode/wheezy.yml instead of debiannode@squeeze.yml and debiannode@wheezy.yml, which is what I do currently.

Author

octo47 commented Nov 26, 2013

With my pull request, reclass simply throws an exception in that case. I don't think namespacing makes sense here at all.
We can't have identical hostnames across an installation (or at least it is a bad idea).

Author

octo47 commented Nov 26, 2013

I mean for nodes. For classes, namespaces are fine, because they can naturally implement environments for Salt.

Owner

madduck commented Nov 26, 2013

So how about we implement this for classes, but not for nodes?

Author

octo47 commented Nov 26, 2013

I can fix my pull request, and it will support namespaces for classes and uniqueness for nodes.

My idea is this: classes are immutable; they don't know about nodes at all. They can be part of Salt states, for example. The node inventory, on the other hand, is dynamic, and directories are needed to organize nodes in large installations (I have a couple of clusters with more than 2k nodes in total). Placing all node definitions in one directory is a bad idea.

Owner

madduck commented Nov 26, 2013

I suggest we keep the two issues separate: namespaces/subdirectories, and node groups. Ideally, there would be two pull requests too, if this isn't asking too much.

With modern filesystems, you can have thousands of files in a directory and that works just fine thanks to B-Trees. Or is the issue more from a human side?

Author

octo47 commented Nov 26, 2013

We can split the issues, that is fine. But they do depend on each other to some extent.

Concerning filesystems, there are a couple of issues:

  1. Human: it is good to keep data organized.
  2. All these files have to be kept in sync (or symlinks or something similar have to be used).
  3. Removing hosts from a list is not atomic when every host is a separate file (a single file makes it atomic).
  4. A minor issue: the .hosts solution needs one read I/O per group, whereas many files need one per file. On a network filesystem this can matter (for example, some sort of Gluster or Ceph filesystem used to keep host lists in sync in multi-master Salt installations).
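
Point 3 can be illustrated with a minimal Python sketch. It assumes a hypothetical plain-text host list with one hostname per line (the actual format of the .hosts files from pull request #9 is not shown in this thread), and the function name `write_host_list` is made up for illustration. The atomicity claim holds for a rename on a local POSIX filesystem; network filesystems may behave differently.

```python
import os
import tempfile


def write_host_list(path, hosts):
    """Atomically replace the host list at `path`.

    Assumed format (not taken from reclass): one hostname per line.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory, so that the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            for host in sorted(set(hosts)):
                handle.write(host + "\n")
        # os.replace swaps the file in one step: readers see either the
        # old list or the new one, never a half-updated group.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)
        raise
```

Removing a host from a group is then a single call with the shortened list, whereas with one file per node each removal is a separate filesystem operation.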

Owner

madduck commented Nov 26, 2013

Yes, they depend on each other, which is why I am trying to get small, logical changes implemented one after another.

About your issues:

  1. Agreed, so we would read files in subdirectories and write an error if we encounter a nodename twice;

  2. If you factor out all commonalities between nodes into classes, the files do not need to be kept in sync.

  3. Now you are talking about node groups (pull request #9, "node lists support implemented") again, which is separate from this issue. Once we have node namespacing, the directory is your node group, right? For instance, if you have group1/node1.yml and group1/node2.yml, then we could work with mappings again:

    group1/* → group1
    

    which would assign class group1 to all nodes in the group1 namespace.

  4. See (3.).
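
The behaviour proposed in (1.), reading node files from subdirectories and reporting an error when a node name occurs twice, could look roughly like the following Python sketch. This is not reclass's implementation; `scan_nodes` and `DuplicateNodeError` are hypothetical names, and the `.yml` extension is taken from the examples in this thread.

```python
import os


class DuplicateNodeError(Exception):
    """Raised when two files define the same node name (hypothetical)."""


def scan_nodes(nodes_dir, extension=".yml"):
    """Map each node name to the path of its definition file.

    Walks `nodes_dir` recursively. The node name is the file name without
    its extension, so subdirectories do not become part of the name.
    """
    found = {}
    for dirpath, _dirnames, filenames in os.walk(nodes_dir):
        for filename in sorted(filenames):
            name, ext = os.path.splitext(filename)
            if ext != extension:
                continue
            path = os.path.join(dirpath, filename)
            if name in found:
                raise DuplicateNodeError(
                    "node %r defined in both %s and %s"
                    % (name, found[name], path)
                )
            found[name] = path
    return found
```

With the layout from the original report, `scan_nodes("examples/nodes")` would return entries for both `localhost` and `n1`, and a second `n1.yml` in another subdirectory would raise the error.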

Author

octo47 commented Nov 26, 2013

  1. But I still need to create dozens of files. Moreover, I need to delete nodes that are no longer in the group, so I need some logic to sync two states. With a single file this is quite natural, but with a bunch of files it is somewhat more complicated.
  2. This solution is redundant. Why not simply create group1.hosts and group1.yml and place whatever is needed (even classes: group1) into the configuration for that group? That would be explicit. With automatic class assignment it is easy to lose this class and get an error at runtime, and reclass will be unable to complete the node definition. You also need the same group1 directory in classes, which complicates things a bit more.
  3. How can (3) solve the problem of multiple files for each node?

One more thought: why do you want to hardcode a meaning for node directories? It may be more flexible to allow admins to arrange nodes in whatever layout they want. Moreover, we could allow _dir.yaml definitions (for example), which are merged from the top down into the node definition. In that case the admin can do whatever he wants, for example assign classes according to the physical structure of his own network.
Suppose the directory structure is /datacenter/queue/rack/node. In that case the admin can create a _dir.yaml at each level of this structure and define the physical properties that are essential for the nodes. And of course he can define classes at each level, if he likes.
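
A rough Python sketch of this top-down merge follows. It is only an illustration of the proposal, not something reclass provides: the merge rules (dicts merged recursively, lists concatenated, other values overridden) are an assumption, `deep_merge` and `node_definition` are hypothetical names, and the content of each _dir.yaml is passed in as an already parsed dict to keep the example free of a YAML dependency.

```python
def deep_merge(base, override):
    """Return a new dict with `override` merged into `base`.

    Assumed rules: nested dicts are merged recursively, lists are
    concatenated, any other value in `override` replaces the old one.
    """
    merged = dict(base)
    for key, value in override.items():
        current = merged.get(key)
        if isinstance(current, dict) and isinstance(value, dict):
            merged[key] = deep_merge(current, value)
        elif isinstance(current, list) and isinstance(value, list):
            merged[key] = current + value
        else:
            merged[key] = value
    return merged


def node_definition(node_path, dir_defs, node_def):
    """Merge directory-level definitions from the top down, then the node.

    `dir_defs` maps a directory path such as "dc1/q1" to the parsed
    content of its (hypothetical) _dir.yaml.
    """
    parts = node_path.strip("/").split("/")[:-1]  # drop the node itself
    result = {}
    for depth in range(1, len(parts) + 1):
        level = "/".join(parts[:depth])
        result = deep_merge(result, dir_defs.get(level, {}))
    return deep_merge(result, node_def)
```

Deeper levels are merged later, so a rack-level value overrides a datacenter-level value with the same key, and the node's own definition is applied last.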

Owner

madduck commented Nov 27, 2013

Please have a look at the class_mappings branch I pushed, which is still under development.

This should already do almost everything you want, except it cannot enumerate nodes, obviously.

Put into your config file:

    class_mappings:
      - group1/* group1

and then call e.g. reclass -n group1/somehost and you will see that you get output without a host file.
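
How such a mapping might be evaluated can be sketched in Python as follows. This is not the code from the class_mappings branch: it assumes shell-style glob matching via the standard `fnmatch` module and an entry format of a pattern followed by one or more whitespace-separated class names; both function names are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_class_mappings(entries):
    """Split entries such as "group1/* group1" into (pattern, classes)."""
    mappings = []
    for entry in entries:
        pattern, *classes = entry.split()
        mappings.append((pattern, classes))
    return mappings


def classes_for_node(nodename, mappings):
    """Collect, in order, the classes of every pattern matching `nodename`."""
    matched = []
    for pattern, classes in mappings:
        if not fnmatchcase(nodename, pattern):
            continue
        for cls in classes:
            if cls not in matched:
                matched.append(cls)
    return matched
```

Under these assumptions, `classes_for_node("group1/somehost", parse_class_mappings(["group1/* group1"]))` yields `["group1"]`, while a bare `somehost` matches nothing because the pattern requires the `group1/` prefix.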

This does not hardcode the meaning of directories. But directories have a special meaning anyway for nodes, as a node can only be defined in one subdirectory, so essentially, the directory name is just a prefix to the node name, not a hierarchical element (like it could be for classes).

Once we implement subdirectory scanning for nodes and classes, then you can create node files as well to assist in enumeration, and I will also think more about other means to enumerate, including your suggested group files.

Your example /datacenter/queue/rack/node is already fully implemented using classes, which themselves are hierarchical, i.e. node is in rack, so it specifies class rack, which specifies class queue, which specifies class datacenter.
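
The class chain described here can be sketched in Python. The example below only illustrates the idea of classes specifying other classes; it is not reclass's resolution code, the ancestors-first ordering is an assumption, and `resolve_classes` is a hypothetical name.

```python
def resolve_classes(names, class_defs, resolved=None, visiting=None):
    """Return `names` plus all classes they specify, ancestors first.

    `class_defs` maps a class name to the list of classes it specifies.
    """
    resolved = [] if resolved is None else resolved
    visiting = set() if visiting is None else visiting
    for name in names:
        if name in resolved or name in visiting:
            continue  # already handled, or part of a cycle
        visiting.add(name)
        resolve_classes(class_defs.get(name, []), class_defs, resolved, visiting)
        visiting.discard(name)
        resolved.append(name)
    return resolved


# The node only specifies "rack"; the rest follows from the class definitions.
CLASS_DEFS = {"rack": ["queue"], "queue": ["datacenter"], "datacenter": []}
NODE_CLASSES = resolve_classes(["rack"], CLASS_DEFS)
```

Here `NODE_CLASSES` ends up containing `datacenter`, `queue` and `rack`, so properties defined at the datacenter level reach the node without any directory-level definition files.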

I am willing to entertain ideas about making yaml_fs more useful and understandable for humans, and even to play with the idea of namespacing classes and grouping hosts in cosmetic ways, but I will not reimplement reclass functionality in yaml_fs.

Author

octo47 commented Nov 27, 2013

Thank you for the clarification. I understand your point.

The dc/queue/rack idea can be dropped, I agree. That is over-engineering.
But reclass -n group1/somehost will not work for me, and I suppose not for Salt either. Salt will call reclass -n somehost (via the adapter, if I understand correctly).

For now I will stick with the solution in my branch; it suits me well and leaves room for further extension (I am thinking about validations and aggregators).
Actually, my pull request #9 is not about the .hosts file itself but about directory scanning, which enables some interesting things such as generated properties. For example, I can aggregate all nodes with class==zk and parameter: { cluster = some } into one list some.zk.quorum, or add validations like validate { r1: "len(some.zk.quorum) % 3 == 0" }.
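
The aggregation and validation idea can be sketched in Python as follows. The thread does not show how pull request #9 implements this, so everything here is an assumption made for illustration: nodes are represented as a dict of already parsed definitions with `classes` and `parameters` keys, and `aggregate` and `validate_quorum` are hypothetical names.

```python
def aggregate(nodes, wanted_class, cluster):
    """List the names of nodes that have `wanted_class` and belong to `cluster`."""
    return sorted(
        name
        for name, definition in nodes.items()
        if wanted_class in definition.get("classes", [])
        and definition.get("parameters", {}).get("cluster") == cluster
    )


def validate_quorum(members):
    """Mirror the rule from the comment: len(some.zk.quorum) % 3 == 0."""
    return len(members) % 3 == 0


NODES = {
    "zk1": {"classes": ["zk"], "parameters": {"cluster": "some"}},
    "zk2": {"classes": ["zk"], "parameters": {"cluster": "some"}},
    "zk3": {"classes": ["zk"], "parameters": {"cluster": "some"}},
    "zk9": {"classes": ["zk"], "parameters": {"cluster": "other"}},
    "web1": {"classes": ["web"], "parameters": {"cluster": "some"}},
}
QUORUM = aggregate(NODES, "zk", "some")
```

Both steps need the full set of node definitions, which is why they depend on being able to enumerate nodes by scanning the directory.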

Thank you.

Owner

madduck commented Nov 30, 2013

The current master branch now implements reading nodes from subdirectories using yaml_fs.

@madduck madduck closed this as completed Nov 30, 2013
AndrewPickford pushed a commit to AndrewPickford/reclass that referenced this issue Sep 18, 2017
Use isinstance() insted of type()