Support Accumulo installs on Microsoft Azure
* Add new cluster type 'azure' which leverages VM Scale Sets
* Add HA (high-availability) capabilities for the Hadoop Name Node,
  Accumulo master, and Zookeeper roles within Muchos. Note: HA is on by
  default and should not be disabled
* Enable central collection of metrics and logs using Azure Monitor
* Increase some CentOS defaults to improve cluster stability
* Fix latent bugs which prevent Spark from being set up correctly
* Add checksums for specific Spark and Hadoop versions, as well as for
  Accumulo 2.0.0
Raj Tiwari committed Aug 14, 2019
1 parent 80281f2 commit be5ae7a
Showing 61 changed files with 1,457 additions and 114 deletions.
95 changes: 80 additions & 15 deletions README.md
@@ -4,7 +4,7 @@

**Muchos automates setting up [Apache Accumulo][accumulo] or [Apache Fluo][fluo] (and their dependencies) on a cluster**

Muchos makes it easy to launch a cluster in Amazon's EC2 and deploy Accumulo or Fluo to it. Muchos
Muchos makes it easy to launch a cluster in Amazon's EC2 or Microsoft Azure and deploy Accumulo or Fluo to it. Muchos
enables developers to experiment with Accumulo or Fluo in a realistic, distributed environment.
Muchos installs all software using tarball distributions which makes it easy to experiment
with the latest versions of Accumulo, Hadoop, Zookeeper, etc. without waiting for downstream packaging.
@@ -17,35 +17,67 @@ Muchos is structured into two high level components:

* [Ansible] scripts that install and configure Fluo and its dependencies on a cluster.
* Python scripts that push the Ansible scripts from a local development machine to a cluster and
run them. These Python scripts can also optionally launch a cluster in EC2 using [boto].
run them. These Python scripts can also optionally launch a cluster in EC2 using [boto] or in Azure using Azure CLI.

Check out [Uno] for setting up Accumulo or Fluo on a single machine.

## Requirements
## Requirements

Muchos requires the following:
### Common

Muchos requires the following common components for installation and setup:

* Python 3.6.8 with a virtual environment set up.
Create a Python environment and switch to it:
```bash
cd ~
python3.6 -m venv env
source env/bin/activate
```
* `ssh-agent` installed and running, with ssh-agent forwarding enabled. Note that this may also require creating an SSH public-private [key pair](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/mac-create-ssh-keys).
```bash
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
```
* Git (current version)

### EC2

Muchos requires the following for EC2 installations:

* Python 3
* [awscli] & [boto3] libraries - Install using `pip3 install awscli boto3 --upgrade --user`
* `ssh-agent` installed and running
* An AWS account with your SSH public key uploaded. When you configure [muchos.props], set `key.name`
to the name of your key pair in AWS.
* `~/.aws` [configured][aws-config] on your machine. Can be created manually or using [aws configure][awscli-config].

### Azure

Muchos requires the following for Azure installations:

* [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) must be installed, configured, and authenticated to an Azure subscription. On CentOS, install [Azure CLI 2.0.69](https://packages.microsoft.com/yumrepos/azure-cli/azure-cli-2.0.69-1.el7.x86_64.rpm). Higher versions of Azure CLI are unsupported for Muchos on CentOS until [this issue](https://github.com/Azure/azure-cli/issues/10128) in Azure CLI 2.0.70 is fixed. Example commands to install Azure CLI 2.0.69 on CentOS:
```bash
wget https://packages.microsoft.com/yumrepos/azure-cli/azure-cli-2.0.69-1.el7.x86_64.rpm
sudo yum install azure-cli-2.0.69-1.el7.x86_64.rpm
```
* An Azure account with permissions to either use existing or create new Resource Groups, Virtual Networks, and Subnets
* A machine that can connect to Azure to securely deploy the cluster. If the machine is running CentOS, check whether `SELinux` is enabled by running `sestatus`. If it is, disable it by editing the `/etc/selinux/config` file and setting the SELinux mode to `disabled`, then save the file and reboot the VM.
* Install [Ansible for Azure](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/ansible-install-configure) within the Python virtual environment by using `pip install ansible[azure]`

## Quickstart

The following commands will install Muchos, launch an EC2 cluster, and setup/run Accumulo:
The following commands will install Muchos, launch a cluster, and setup/run Accumulo:

```bash
git clone https://github.com/apache/fluo-muchos.git

cd fluo-muchos/
cp conf/muchos.props.example conf/muchos.props
vim conf/muchos.props # Edit to configure Muchos cluster
./bin/muchos launch -c mycluster # Launches Muchos cluster in EC2
./bin/muchos launch -c mycluster # Launches Muchos cluster in EC2 or Azure
./bin/muchos setup # Set up cluster and start Accumulo
```

The `setup` command can be run repeatedly to fix any failures and will not repeat successful operations.
The `launch` command will create a cluster with the name specified in the command (e.g. 'mycluster'). The `setup` command can be run repeatedly to fix any failures and will not repeat successful operations.

After your cluster is launched, SSH to it using the following command:

@@ -92,9 +124,36 @@ You can check the status of the nodes using the EC2 Dashboard or by running the

./bin/muchos status

## Launching an Azure cluster

Before launching a cluster, you will need to complete the requirements for Azure above, clone the Muchos repo, and
create [muchos.props] by making a copy of the existing [muchos.props.example]. If you want to give others access to your cluster, add their public keys to a file named `keys` in your `conf/` directory. During the setup of your cluster, this file will be appended on each node to the `~/.ssh/authorized_keys` file for the user set by the `cluster.username` property. You will also need to ensure you have authenticated to Azure and set the target subscription using the [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/manage-azure-subscriptions-azure-cli?view=azure-cli-latest).

By default, Muchos uses a CentOS 7 image that is hosted in the Azure Marketplace. The Azure Linux Agent is pre-installed on the Azure Marketplace images and is typically available from the distribution's package repository. Azure requires that the publishers of the endorsed Linux distributions regularly update their images in the Azure Marketplace with the latest patches and security fixes, at a quarterly or faster cadence. Updated images in the Azure Marketplace are available automatically to customers as new versions of an image SKU.

Edit the values in the sections within [muchos.props] as described below.
Under the `general` section, edit the following values as per your configuration:
* `cluster_type = azure`
* `cluster_user` should be set to the name of the administrative user
* `proxy_hostname` (optional) is the name of the machine which has access to the cluster VNET

Under the `azure` section, edit the following values as per your configuration:
* `resource_group` should be set to the name of the resource group created in your Azure subscription for the cluster deployment
* `vnet` should be set to the name of the virtual network created in your Azure subscription for the cluster deployment
* `subnet` should be set to the name of the subnet created in your Azure subscription for the cluster deployment
* `numnodes` can be changed as per the cluster size
* `vm_sku` can be specified from the available SKUs in the [selected Azure region](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/cli-ps-findimage)

Within Azure the `nodes` section is auto populated with the hostnames and their default roles.
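
The property names listed above can be sanity-checked before launching. The following is a minimal sketch, assuming [muchos.props] is an INI-style file with `general` and `azure` sections as described; the sample values (`azureuser`, `muchos-rg`, `Standard_D8s_v3`, and so on) and the `check_azure_props` helper are hypothetical and are not part of Muchos.

```python
from configparser import ConfigParser

# Hypothetical muchos.props fragment; every value is a placeholder.
SAMPLE_PROPS = """
[general]
cluster_type = azure
cluster_user = azureuser
proxy_hostname =

[azure]
resource_group = muchos-rg
vnet = muchos-vnet
subnet = muchos-subnet
numnodes = 8
vm_sku = Standard_D8s_v3
"""

REQUIRED_AZURE_KEYS = ("resource_group", "vnet", "subnet", "numnodes", "vm_sku")


def check_azure_props(text):
    """Return a list of problems found in an Azure-style muchos.props text."""
    config = ConfigParser()
    config.read_string(text)
    problems = []
    if config.get("general", "cluster_type", fallback="") != "azure":
        problems.append("general.cluster_type is not 'azure'")
    if not config.get("general", "cluster_user", fallback=""):
        problems.append("general.cluster_user is empty")
    for key in REQUIRED_AZURE_KEYS:
        if not config.get("azure", key, fallback=""):
            problems.append(f"azure.{key} is missing or empty")
    numnodes = config.get("azure", "numnodes", fallback="")
    if numnodes and not numnodes.isdigit():
        problems.append("azure.numnodes is not a whole number")
    return problems


if __name__ == "__main__":
    print(check_azure_props(SAMPLE_PROPS) or "no problems found")
```

`proxy_hostname` is left empty in the sample because the text above marks it as optional, so the check does not require it.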

After following the steps above, run the following command to launch an Azure VMSS cluster called `mycluster` (where 'mycluster' is the name assigned to your cluster):
```bash
./bin/muchos launch -c mycluster # Launches Muchos cluster in Azure
```

## Set up the cluster

The `./bin/muchos setup` command will set up your cluster and start Hadoop, Zookeeper, & Accumulo. It
Once your cluster is built in EC2 or Azure, the `./bin/muchos setup` command will set up your cluster and start Hadoop, Zookeeper & Accumulo. It
will download release tarballs of Fluo, Accumulo, Hadoop, etc. The versions of these tarballs are
specified in [muchos.props] and can be changed if desired.

@@ -199,16 +258,18 @@ master, etc. It also has variables in the `[all:vars]` section that contain sett
useful in user playbooks. It is recommended that any user-defined Ansible playbooks should be
managed in their own git repository (see [mikewalch/muchos-custom][mc] for an example).

## Terminating your EC2 cluster
## Terminating your cluster

If you launched your cluster on EC2, run the following command terminate your cluster. WARNING - All
If you launched your cluster, run the following command to terminate your cluster. WARNING - All
data on your cluster will be lost:

./bin/muchos terminate

## Automatic shutdown of EC2 clusters
Note: The terminate command is currently unsupported for Azure-based clusters. Instead, you should delete the underlying Azure VMSS resources when you need to terminate the cluster.

## Automatic shutdown of clusters

With the default configuration, EC2 clusters will not shutdown automatically after a delay and the default
With the default configuration, clusters will not shutdown automatically after a delay and the default
shutdown behavior will be stopping the node. If you would like your cluster to terminate after 8 hours,
set the following configuration in [muchos.props]:

@@ -233,8 +294,10 @@ $ ./bin/muchos config -p leader.public.ip
Muchos is powered by the following projects:

* [boto] - Python library used by `muchos launch` to start a cluster in AWS EC2.
* [Ansible] - Cluster management tool that is used by `muchos setup` to install, configure, and
* [ansible] - Cluster management tool that is used by `muchos setup` to install, configure, and
start Fluo, Accumulo, Hadoop, etc on an existing EC2 or bare metal cluster.
* [azure-cli] - The Azure CLI is a command-line tool for managing Azure resources.
* [ansible-azure] - Ansible includes a suite of modules for interacting with Azure Resource Manager.

## Muchos Testing

@@ -250,6 +313,8 @@ The following command runs the unit tests:
[aws-config]: http://docs.aws.amazon.com/cli/latest/userguide/cli-config-files.html
[awscli]: https://docs.aws.amazon.com/cli/latest/userguide/installing.html
[awscli-config]: http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html#cli-quick-configuration
[azure-cli]: https://packages.microsoft.com/yumrepos/azure-cli/azure-cli-2.0.69-1.el7.x86_64.rpm
[ansible-azure]: https://docs.ansible.com/ansible/latest/scenario_guides/guide_azure.html
[fluo-app]: https://github.com/apache/fluo/blob/master/docs/applications.md
[WebIndex]: https://github.com/apache/fluo-examples/tree/master/webindex
[Phrasecount]: https://github.com/apache/fluo-examples/tree/master/phrasecount
4 changes: 2 additions & 2 deletions ansible/accumulo.yml
@@ -19,10 +19,10 @@
tasks:
- import_tasks: roles/accumulo/tasks/download.yml
when: download_software
- hosts: all
- hosts: all:!{{ azure_proxy_host }}
roles:
- accumulo
- hosts: accumulomaster
- hosts: accumulomaster[0]
tasks:
- import_tasks: roles/accumulo/tasks/init-accumulo.yml
handlers:
22 changes: 22 additions & 0 deletions ansible/azure.yml
@@ -0,0 +1,22 @@
#
# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements. See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#


---
- hosts: localhost
roles:
- azure
14 changes: 10 additions & 4 deletions ansible/common.yml
@@ -15,23 +15,29 @@
# limitations under the License.
#

- hosts: nodes
become: yes
tasks:
- import_tasks: roles/common/tasks/hosts.yml
- hosts: proxy
become: yes
roles:
- proxy
tasks:
- import_tasks: roles/proxy/tasks/main.yml
- import_tasks: roles/proxy/tasks/download.yml
when: download_software
- hosts: nodes
become: yes
tasks:
- import_tasks: roles/common/tasks/hosts.yml
- hosts: all
become: yes
roles:
- common
tasks:
- import_tasks: roles/common/tasks/ssh.yml
- import_tasks: roles/common/tasks/os.yml
- import_tasks: roles/common/tasks/azure.yml
when: cluster_type == 'azure'
- import_tasks: roles/common/tasks/azure_selinux.yml
when: cluster_type == 'azure'
- import_tasks: roles/common/tasks/ec2.yml
when: cluster_type == 'ec2'
- import_tasks: roles/common/tasks/existing.yml
2 changes: 2 additions & 0 deletions ansible/conf/ansible.cfg
@@ -17,3 +17,5 @@
host_key_checking = False
forks = 50
gathering = smart
callback_whitelist = profile_tasks
timeout=30
2 changes: 1 addition & 1 deletion ansible/docker.yml
@@ -15,7 +15,7 @@
# limitations under the License.
#

- hosts: all
- hosts: all:!{{ azure_proxy_host }}
become: yes
roles:
- docker
22 changes: 20 additions & 2 deletions ansible/hadoop.yml
@@ -15,12 +15,30 @@
# limitations under the License.
#

- hosts: all
- hosts: all:!{{ azure_proxy_host }}
roles:
- hadoop
- hosts: journalnode
tasks:
- import_tasks: roles/hadoop/tasks/start-journal.yml
- hosts: namenode[0]
tasks:
- import_tasks: roles/hadoop/tasks/format-nn.yml
- hosts: namenode[0]
tasks:
- import_tasks: roles/hadoop/tasks/format-zk.yml
- hosts: namenode
tasks:
- import_tasks: roles/hadoop/tasks/start-hdfs.yml
- import_tasks: roles/hadoop/tasks/start-zkfc.yml
- hosts: namenode[0]
tasks:
- import_tasks: roles/hadoop/tasks/start-nn1.yml
- hosts: namenode[1]
tasks:
- import_tasks: roles/hadoop/tasks/start-nn2.yml
- hosts: workers
tasks:
- import_tasks: roles/hadoop/tasks/start-dn.yml
- hosts: resourcemanager
tasks:
- import_tasks: roles/hadoop/tasks/start-yarn.yml
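
The playbook above relies on two host-pattern forms: `all:!{{ azure_proxy_host }}` excludes one host from `all`, and `namenode[0]` / `namenode[1]` pick a host by its position in a group. The sketch below simulates only these two forms in plain Python to show which hosts each play would target. It does not call Ansible, the inventory hostnames are hypothetical, and the `select` helper is illustrative rather than Ansible's actual resolution logic.

```python
import re

# Hypothetical inventory: group name -> ordered list of hosts.
INVENTORY = {
    "all": ["proxy1", "nn1", "nn2", "worker1", "worker2"],
    "namenode": ["nn1", "nn2"],
    "workers": ["worker1", "worker2"],
}


def select(pattern, inventory):
    """Resolve a small subset of Ansible-style host patterns.

    Supported forms: "group", "group[N]", and "group:!name", where name is
    either a host or another group to exclude.
    """
    base, *exclusions = pattern.split(":!")
    match = re.fullmatch(r"(\w+)\[(\d+)\]", base)
    if match:
        group, index = match.group(1), int(match.group(2))
        # Slicing yields an empty list when the group has too few hosts.
        hosts = inventory[group][index:index + 1]
    else:
        hosts = list(inventory[base])
    for name in exclusions:
        excluded = set(inventory.get(name, [name]))
        hosts = [h for h in hosts if h not in excluded]
    return hosts


if __name__ == "__main__":
    print(select("all:!proxy1", INVENTORY))
    print(select("namenode[0]", INVENTORY))
    print(select("namenode[1]", INVENTORY))
```

In this simulation, a single-host `namenode` group makes `namenode[1]` resolve to no hosts, so the second Name Node step would have nothing to run on.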
2 changes: 1 addition & 1 deletion ansible/mesos.yml
@@ -15,7 +15,7 @@
# limitations under the License.
#

- hosts: all
- hosts: all:!{{ azure_proxy_host }}
become: yes
roles:
- mesos
8 changes: 7 additions & 1 deletion ansible/roles/accumulo/tasks/main.yml
@@ -59,4 +59,10 @@
args:
creates: "{{ accumulo_home }}/lib/native/libaccumulo.so"
- name: "Create accumulo log dir"
file: path={{ worker_data_dirs[0] }}/logs/accumulo state=directory
file: path={{ worker_data_dirs[1] }}/logs/accumulo state=directory
- name: "Configure max log file size to 10G for Azure log analytics integration"
replace:
path: "{{ accumulo_home }}/conf/generic_logger.xml"
regexp: '.*\"MaxFileSize\".*value.*'
replace: " <param name=\"MaxFileSize\" value=\"10000MB\"/>"
when: cluster_type == 'azure' and accumulo_major_version == '1'
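
The `replace` task above rewrites any line that sets `MaxFileSize`. The sketch below tries the same pattern with Python's `re` module, assuming the task applies the pattern with Python regular-expression semantics. The sample XML is a hypothetical excerpt rather than the real `generic_logger.xml`, and `raise_max_file_size` is an illustrative helper.

```python
import re

# The pattern from the task above, written as a Python raw string.
PATTERN = re.compile(r'.*\"MaxFileSize\".*value.*', re.MULTILINE)
REPLACEMENT = ' <param name="MaxFileSize" value="10000MB"/>'

# Hypothetical excerpt of a log4j-style appender configuration.
SAMPLE = """<appender name="A1" class="org.apache.log4j.RollingFileAppender">
  <param name="MaxFileSize" value="100MB"/>
  <param name="MaxBackupIndex" value="10"/>
</appender>"""


def raise_max_file_size(text):
    """Replace every line that sets MaxFileSize with the 10000MB setting."""
    return PATTERN.sub(REPLACEMENT, text)


if __name__ == "__main__":
    print(raise_max_file_size(SAMPLE))
```

Because `.` does not match a newline, each match stays within one line, and the leading `.*` means the whole line (including its indentation) is replaced.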
13 changes: 7 additions & 6 deletions ansible/roles/accumulo/templates/accumulo-env.sh
@@ -16,21 +16,22 @@
# limitations under the License.
#

export ACCUMULO_LOG_DIR={{ worker_data_dirs[0] }}/logs/accumulo
export ACCUMULO_LOG_DIR={{ worker_data_dirs[1] }}/logs/accumulo
export ZOOKEEPER_HOME={{ zookeeper_home }}
export JAVA_HOME={{ java_home }}
export ACCUMULO_PID_DIR={{ worker_data_dirs[1] }}/accumulo

{% if accumulo_major_version == '1' %}

export HADOOP_PREFIX={{ hadoop_home }}
export HADOOP_CONF_DIR="$HADOOP_PREFIX/etc/hadoop"
export ACCUMULO_TSERVER_OPTS="-Xmx{{ accumulo_tserv_mem }} -Xms{{ accumulo_tserv_mem }}"
export ACCUMULO_MASTER_OPTS="-Xmx256m -Xms256m"
export ACCUMULO_MONITOR_OPTS="-Xmx128m -Xms64m"
export ACCUMULO_GC_OPTS="-Xmx128m -Xms128m"
export ACCUMULO_SHELL_OPTS="-Xmx256m -Xms64m"
export ACCUMULO_MASTER_OPTS="-Xmx4g -Xms1g"
export ACCUMULO_MONITOR_OPTS="-Xmx4g -Xms1g"
export ACCUMULO_GC_OPTS="-Xmx4g -Xms1g"
export ACCUMULO_SHELL_OPTS="-Xmx4g -Xms2g"
export ACCUMULO_GENERAL_OPTS="-XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -Djava.net.preferIPv4Stack=true -XX:+CMSClassUnloadingEnabled"
export ACCUMULO_OTHER_OPTS="-Xmx256m -Xms64m"
export ACCUMULO_OTHER_OPTS="-Xmx4g -Xms1g"
export ACCUMULO_KILL_CMD='kill -9 %p'
export NUM_TSERVERS=1
export MALLOC_ARENA_MAX=${MALLOC_ARENA_MAX:-1}
4 changes: 3 additions & 1 deletion ansible/roles/accumulo/templates/gc
@@ -1 +1,3 @@
{{ groups['accumulomaster'][0] }}
{% for host in groups['accumulomaster'] %}
{{ host }}
{% endfor %}
@@ -20,3 +20,11 @@ accumulo.sink.graphite.class=org.apache.hadoop.metrics2.sink.GraphiteSink
accumulo.sink.graphite.server_host={{ groups['metrics'][0] }}
accumulo.sink.graphite.server_port=2004
accumulo.sink.graphite.metrics_prefix=accumulo

{% if cluster_type == 'azure' %}
*.sink.statsd.class=org.apache.hadoop.metrics2.sink.StatsDSink
accumulo.sink.statsd.server.host=127.0.0.1
accumulo.sink.statsd.server.port=8125
accumulo.sink.statsd.skip.hostname=true
accumulo.sink.statsd.service.name=master
{% endif %}
4 changes: 3 additions & 1 deletion ansible/roles/accumulo/templates/masters
Original file line number Diff line number Diff line change
@@ -1 +1,3 @@
{{ groups['accumulomaster'][0] }}
{% for host in groups['accumulomaster'] %}
{{ host }}
{% endfor %}
4 changes: 3 additions & 1 deletion ansible/roles/accumulo/templates/monitor
Original file line number Diff line number Diff line change
@@ -1 +1,3 @@
{{ groups['accumulomaster'][0] }}
{% for host in groups['accumulomaster'] %}
{{ host }}
{% endfor %}
4 changes: 3 additions & 1 deletion ansible/roles/accumulo/templates/tracers
Original file line number Diff line number Diff line change
@@ -1 +1,3 @@
{{ groups['accumulomaster'][0] }}
{% for host in groups['accumulomaster'] %}
{{ host }}
{% endfor %}
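
The `gc`, `masters`, `monitor`, and `tracers` templates above switch from naming only the first `accumulomaster` host to listing every host in that group. As a rough illustration of the rendered result, the plain-Python sketch below mirrors the loop. It assumes the template engine trims the newline after each `{% ... %}` block so that no blank lines appear, the hostnames are hypothetical, and `render_hosts_file` is not part of Muchos.

```python
def render_hosts_file(groups, group_name="accumulomaster"):
    """Return one hostname per line, mirroring the for-loop in the templates."""
    return "".join(f"{host}\n" for host in groups[group_name])


if __name__ == "__main__":
    # Hypothetical inventory with two Accumulo masters for HA.
    groups = {"accumulomaster": ["leader1", "leader2"]}
    print(render_hosts_file(groups), end="")
```

With a single host in the group, the output is the same one-line file the old templates produced.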