Upgrading generate_yamls.py for easier configuration automation. #506

Open
wants to merge 4 commits into
base: dev
Changes from 3 commits
72 changes: 72 additions & 0 deletions README.md
@@ -89,6 +89,78 @@ For example, the configuration file in [conf/dynomite.yml](conf/dynomite.yml)
Finally, to make writing syntactically correct configuration files easier, dynomite provides a command-line argument -t or --test-conf that can be used to test the YAML configuration file for any syntax error.


### Configuration YAML Generator

The utility [generate_yamls.py](https://github.com/Netflix/dynomite/blob/dev/scripts/dynomite/generate_yamls.py) automates
the creation of the .yaml configuration files needed for every node.

The following usage example shows how to create a cluster configuration
from the command line.

**Usage example 1**:
- 2 datacenters (europe / usa)
- Each with 1 rack (europe_rack1 / usa_rack1)
- One node per rack; external IPs should be used (1.1.1.2 / 1.1.1.3)


```
python generate_yamls.py 1.1.1.2:europe_rack1:europe 1.1.1.3:usa_rack1:usa -o mycluster
```

This generates two .yml files, one per node, and writes them to the 'mycluster' folder.
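As a rough illustration of what the two files will contain, the sketch below mirrors the equidistant token calculation from the script in this pull request for the two nodes in the example. The helper name `equidistant_tokens` is hypothetical and only used here; the script itself performs this calculation inline.

```python
MAX_TOKEN = 4294967295  # same upper bound the script uses


def equidistant_tokens(nodes):
    """Map each 'ip:rack:datacenter' spec to an evenly spaced token (hypothetical helper)."""
    step = MAX_TOKEN // len(nodes)
    return {node: min(step * (i + 1), MAX_TOKEN) for i, node in enumerate(nodes)}


nodes = ["1.1.1.2:europe_rack1:europe", "1.1.1.3:usa_rack1:usa"]
tokens = equidistant_tokens(nodes)

for node in nodes:
    ip = node.split(":")[0]
    # One file per node, named after its IP address.
    print("{}.yml -> token {}".format(ip, tokens[node]))
```

Each generated file then lists the other node as a seed, so the nodes can find each other.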

More commands:
```
python generate_yamls.py --help
Contributor:
Let's just leave a reference to the fact that help is available, and not include the whole help text. Something like:

For full help output, do `scripts/dynomite/generate_yamls.py --help`


usage:
Dynomite Configuration YAML Generator

Script for generating Dynomite yaml configuration files for distribution with every node.
generated yaml files will be outputted for each node, named as {ipaddress}.yml
so cluster wide can be easily configured


[-h] [-cp CLIENT_PORT] [-o OUTPUT_DIR] [-sp SERVER_PORT]
[-pp PEER_PORT] [-rc {DC_QUORUM,DC_ONE,DC_SAFE_QUORUM}]
[-sso {datacenter,none,rack}] [--redis] [--mem]
nodes [nodes ...]

positional arguments:
nodes Usage: <script> publicIp:rack_name:datacenter
publicIp:rack_name:datacenter ... outputs one yaml
file per input node(for a single rack) restrict
generation of the confs for all hosts per rack and not
across rack.

optional arguments:
-h, --help show this help message and exit
-cp CLIENT_PORT Client port to use (The client port Dynomite provides
instead of directly accessing redis or memcache)
Default is: 8102 Your redis or memcache clients should
connect to this port
-o OUTPUT_DIR Output directory for the YAML files, if does not exist
will be created, Default is current directory (.)
-sp SERVER_PORT The port your redis or memcache will run locally on
each node, assuming it's uniform, (default is: 6379)
-pp PEER_PORT The port Dynamo clients will use to communicate with
each other, (default is: 8101)
-rc {DC_QUORUM,DC_ONE,DC_SAFE_QUORUM}
Sets the read_consistency of the cluster operation
mode (default is: DC_ONE)
-sso {datacenter,none,rack}
Type of communication between Dynomite nodes, Must be
one of 'none', 'rack', 'datacenter', or 'all' (default
is: datacenter)
--redis Sets the data_store property to use redis (0), Default
is 0.
--mem Sets the data_store property to use memcache (1),
Default is 0.

```



## License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
231 changes: 187 additions & 44 deletions scripts/dynomite/generate_yamls.py
@@ -9,52 +9,195 @@
'''

import yaml, sys
import argparse
import os

APPNAME='dyn_o_mite'
CLIENT_PORT='8102'
DYN_PEER_PORT=8101
DEFAULT_CLIENT_PORT='8102'
DYN_PEER_PORT= '8101'
MEMCACHE_PORT='11211'
REDIS_PORT='6379'
MAX_TOKEN = 4294967295

DEFAULT_DC = 'default_dc'

# generate the equidistant tokens for the number of nodes given. max 4294967295
token_map = dict()
token_item = (MAX_TOKEN // (len(sys.argv) -1))
for i in range(1, len(sys.argv)):
    node = sys.argv[i]
    token_value = (token_item * i)
    if token_value > MAX_TOKEN:
        token_value = MAX_TOKEN

    token_map[node] = token_value

for k,v in token_map.items():
    # get the peers ready, and yank the current one from the dict
    dyn_seeds_map = token_map.copy()
    del dyn_seeds_map[k]
    dyn_seeds = []
    for y,z in dyn_seeds_map.items():
        key = y.split(':')
        dyn_seeds.append(key[0] + ':' + str(DYN_PEER_PORT) + ':' + key[1] + ':' + DEFAULT_DC + ':' + str(z));

    ip_dc = k.split(':');
    data = {
        'listen': '0.0.0.0:' + CLIENT_PORT,
        'timeout': 150000,
        'servers': ['127.0.0.1:' + MEMCACHE_PORT + ':1'],
        'dyn_seed_provider': 'simple_provider',

        'dyn_port': DYN_PEER_PORT,
        'dyn_listen': '0.0.0.0:' + str(DYN_PEER_PORT),
        'datacenter': DEFAULT_DC,
        'rack': ip_dc[1],
        'tokens': v,
        'dyn_seeds': dyn_seeds,
    }

    outer = {APPNAME: data}

    file_name = ip_dc[0] + '.yml'
    with open(file_name, 'w') as outfile:
        outfile.write( yaml.dump(outer, default_flow_style=False) )
SECURE_SERVER_OPTION = "datacenter"
READ_CONSISTENCY = "DC_ONE"



if __name__ == '__main__':

parser = argparse.ArgumentParser("""
Dynomite Configuration YAML Generator

Script for generating Dynomite yaml configuration files for distribution with every node.
generated yaml files will be outputted for each node, named as {ipaddress}.yml
so cluster wide can be easily configured

""")

parser.add_argument('nodes', type=str, nargs='+',
help="""
Contributor:
Could we get this help text down to a single line describing how to correctly format a node description? I think some extraneous stuff snuck in here.

Usage: <script> publicIp:rack_name:datacenter publicIp:rack_name:datacenter ...

outputs one yaml file per input node(for a single rack)
restrict generation of the confs for all hosts per rack and not across rack.
""")


parser.add_argument(
'-cp',
Contributor:
Can we go with two dashes for multi-character arguments?

dest='client_port',
type=str,
default=DEFAULT_CLIENT_PORT,
help="""
Client port to use (The client port Dynomite provides instead of directly accessing redis or memcache)
Default is: {}
Contributor:
Instead of formatting the string ourselves, let's do %(default)s here, and let argparse do the work.


Your redis or memcache clients should connect to this port
""".format(DEFAULT_CLIENT_PORT)
)
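The two review suggestions on this argument (long option names with two dashes, and `%(default)s` in the help text) could be combined roughly as follows. This is a minimal sketch, not the final patch: the option names `-c` and `--client-port` are assumptions chosen for illustration.

```python
import argparse

DEFAULT_CLIENT_PORT = '8102'

parser = argparse.ArgumentParser(description="Dynomite configuration YAML generator")
parser.add_argument(
    '-c', '--client-port',  # hypothetical names: one dash for short, two for long
    dest='client_port',
    default=DEFAULT_CLIENT_PORT,
    # argparse substitutes %(default)s when it renders the help text,
    # so the help string no longer needs a manual .format() call.
    help="Port that redis or memcache clients connect to (default: %(default)s)",
)

# Passing an explicit list keeps the example independent of sys.argv.
args = parser.parse_args(['--client-port', '9000'])
print(args.client_port)
```

Because the default lives in one place, the help text cannot drift out of sync with the actual default value.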

parser.add_argument(
'-o',
dest='output_dir',
type=str,
default='./',
help="""
Output directory for the YAML files, if does not exist will be created, Default is current directory (.)
"""
)

parser.add_argument(
'-sp',
dest='server_port',
type=str,
Contributor:
should this be an int type?

default=REDIS_PORT,
help="""
The port your redis or memcache will run locally on each node, assuming it's uniform, (default is: {})
""".format(REDIS_PORT)
)
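If the port arguments become integers, as the review comment asks, a small `type` callable can also reject values outside the valid TCP range. The sketch below assumes a hypothetical `port` helper and a hypothetical `--server-port` option name; neither is part of this pull request.

```python
import argparse


def port(value):
    """argparse type: parse a TCP port and reject out-of-range values (hypothetical helper)."""
    number = int(value)  # a ValueError here is reported by argparse as a usage error
    if not 1 <= number <= 65535:
        raise argparse.ArgumentTypeError("port must be in 1..65535, got {}".format(number))
    return number


parser = argparse.ArgumentParser()
parser.add_argument('--server-port', dest='server_port', type=port, default=6379,
                    help="Local redis or memcache port (default: %(default)s)")

args = parser.parse_args(['--server-port', '6380'])

# With integer ports, string concatenation ('127.0.0.1:' + port) would raise
# TypeError, so the server entry is built with format() instead.
server_entry = '127.0.0.1:{}:1'.format(args.server_port)
print(server_entry)
```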

parser.add_argument(
'-pp',
dest='peer_port',
type=str,
default=DYN_PEER_PORT,
help="""
The port Dynamo clients will use to communicate with each other, (default is: {})
""".format(DYN_PEER_PORT)
)

parser.add_argument(
'-rc',
dest='read_consistency',
type=str,
default=READ_CONSISTENCY,
choices=set(('DC_ONE', 'DC_QUORUM', 'DC_SAFE_QUORUM')),
help="""
Sets the read_consistency of the cluster operation mode (default is: DC_ONE)
"""
)

parser.add_argument(
'-sso',
dest='secure_server_option',
type=str,
default=SECURE_SERVER_OPTION,
choices=set(('none', 'rack', 'datacenter')),
help="""
Type of communication between Dynomite nodes, Must be one of 'none', 'rack', 'datacenter', or 'all' (default is: {})
""".format(SECURE_SERVER_OPTION)
)


parser.add_argument(
Contributor:
This and --mem would be a good use of action="store_const" and ArgumentParser.add_mutually_exclusive_group

'--redis',
action="store_true",
default=True,
help="""
Sets the data_store property to use redis (0), Default is 0.
"""
)

parser.add_argument(
'--mem',
action="store_true",
default=False,
help="""
Sets the data_store property to use memcache (1), Default is 0.
"""
)
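One way to read the suggestion about `action="store_const"` and `add_mutually_exclusive_group` is sketched below. Both flags write to a shared `data_store` destination, which is an assumption made for this example; it would replace the later `if (args.mem)` branch.

```python
import argparse

parser = argparse.ArgumentParser()

# Both flags store into the same destination, and argparse refuses to accept
# them together because they belong to one mutually exclusive group.
store = parser.add_mutually_exclusive_group()
store.add_argument('--redis', dest='data_store', action='store_const', const=0,
                   help="Use redis as the data store (data_store: 0, the default)")
store.add_argument('--mem', dest='data_store', action='store_const', const=1,
                   help="Use memcache as the data store (data_store: 1)")
parser.set_defaults(data_store=0)

print(parser.parse_args([]).data_store)
print(parser.parse_args(['--mem']).data_store)
```

Passing `--redis --mem` together makes argparse exit with a usage error instead of silently preferring one of the two.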


args = parser.parse_args()


dir = os.path.dirname(__file__)

output_path = args.output_dir

if os.path.isabs(dir):
Contributor:
This error message doesn't match the test condition, which checks whether the path is absolute.

print "path exists:{}".format(dir)
else:
output_path = os.path.join(dir, args.output_dir)
Contributor:
To me, if I specify generate_yamls.py ... -o foobar, I expect foobar to be created in my current working directory. It would be surprising to me if it were instead put next to the script itself.

if (not os.path.exists(output_path)):
os.makedirs(output_path)
print "creating output path %s" % output_path
Contributor:
Following the Python convention of EAFP, I would do (in Python 3):

try:
    os.makedirs(output_path)
    print("Created output path {}".format(output_path), file=sys.stderr)
except FileExistsError:
    pass
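The script in this pull request uses Python 2 `print` statements, and `FileExistsError` exists only in Python 3. A variant of the same EAFP idea that should run on both versions checks `errno.EEXIST` instead. The sketch below also resolves the output directory against the current working directory, as requested in the earlier comment. The function name `ensure_output_dir` is hypothetical.

```python
import errno
import os
import sys


def ensure_output_dir(output_dir):
    """Resolve output_dir against the current working directory and create it if missing."""
    output_path = os.path.abspath(output_dir)
    try:
        os.makedirs(output_path)
        sys.stderr.write("Created output path {}\n".format(output_path))
    except OSError as exc:
        # Only swallow "already exists"; permission errors and similar still propagate.
        if exc.errno != errno.EEXIST or not os.path.isdir(output_path):
            raise
    return output_path
```

Calling `ensure_output_dir(args.output_dir)` once before the write loop would then replace both the `os.path.isabs` branch and the `os.path.exists` check.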





    if (args.mem):
        data_store = 1
    else:
        data_store = 0

# generate the equidistant tokens for the number of nodes given. max 4294967295
token_map = dict()
total_nodes = len(args.nodes)
token_item = (MAX_TOKEN // total_nodes)

#print "token_item:{}".format(token_item)
Contributor:
Could you go through and remove any commented out code?

    for i in range(0, total_nodes):
        node = args.nodes[i]
        #print "Iterating node file ... > {}.yml".format(node)
        token_value = (token_item * (i+1))
        if token_value > MAX_TOKEN:
            token_value = MAX_TOKEN

        token_map[node] = token_value

    for k,v in token_map.items():
        # get the peers ready, and yank the current one from the dict
        #print "k:{} v:{}".format(k,v)
        dyn_seeds_map = token_map.copy()
        del dyn_seeds_map[k]
        dyn_seeds = []
        for y,z in dyn_seeds_map.items():
            key = y.split(':')
            dyn_seeds.append(key[0] + ':' + str(args.peer_port) + ':' + key[1] + ':' + key[2] + ':' + str(z))

        ip_dc = k.split(':')
Contributor:
I think this would work better as ip, rack, datacenter = k.split(':')

#print ip_dc
data = {
'data_store': data_store,
'listen': '0.0.0.0:' + args.client_port,
'timeout': 150000,
'servers': ['127.0.0.1:' + args.server_port + ':1'],
'dyn_seed_provider': 'simple_provider',
'read_consistency': 'DC_ONE',
Contributor:
Should this instead be args.read_consistency?

'secure_server_option': args.secure_server_option,
'dyn_port': args.peer_port,
'dyn_listen': '0.0.0.0:' + args.peer_port,
'datacenter': ip_dc[2],
'rack': """{}""".format(ip_dc[1]),
Contributor:
Any reason not to do just 'rack': rack?

'tokens': """{}""".format(v),
'dyn_seeds': dyn_seeds,
}

        outer = {APPNAME: data}

        file_name = """{}.yml""".format(os.path.join(output_path, ip_dc[0]))

        with open(file_name, 'w') as outfile:
            outfile.write( yaml.dump(outer, default_flow_style=False))
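Applying the tuple-unpacking suggestion to the seed-building loop might look like the sketch below. The helper name `build_seeds` is hypothetical; the seed format `ip:port:rack:datacenter:token` is taken from the loop in this diff.

```python
def build_seeds(current, token_map, peer_port):
    """List every node except `current` as 'ip:port:rack:datacenter:token'."""
    seeds = []
    for spec, token in token_map.items():
        if spec == current:
            continue  # a node does not list itself as a seed
        # Unpacking names each field instead of indexing key[0], key[1], key[2].
        ip, rack, datacenter = spec.split(':')
        seeds.append('{}:{}:{}:{}:{}'.format(ip, peer_port, rack, datacenter, token))
    return seeds


token_map = {
    '1.1.1.2:europe_rack1:europe': 2147483647,
    '1.1.1.3:usa_rack1:usa': 4294967294,
}
seeds = build_seeds('1.1.1.2:europe_rack1:europe', token_map, 8101)
print(seeds)
```

Unpacking also fails loudly with a `ValueError` when a node spec does not have exactly three colon-separated parts, which surfaces malformed input early.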