Embedded Ansible setup guide #276

Closed
wants to merge 5 commits into from
1 change: 1 addition & 0 deletions README.md
@@ -10,6 +10,7 @@
- [Providers development guide](providers/dev-guide.md)
- Provider setup instructions
- [Amazon AWS](providers/amazon_aws_config.md)
- [Embedded Ansible](providers/embedded_ansible.md)
- [Openshift](providers/openshift.md)
- [Openstack Infra](providers/openstack_infra_provider.md)
- [Interactive debugging with Pry-Remote](developer_setup/debugging.md)
227 changes: 227 additions & 0 deletions providers/embedded_ansible.md
@@ -0,0 +1,227 @@
## Embedded Ansible - via AWX in a docker container

ManageIQ supports more than one way (`manageiq/lib/embedded_ansible*`) of connecting to the Embedded Ansible provider.

Here, we'll set up `DockerEmbeddedAnsible`, introduced in [ManageIQ/manageiq#16205](https://github.com/ManageIQ/manageiq/pull/16205). It downloads an AWX docker container and configures the local ManageIQ to connect to it.



### Dependencies

You need docker.
Member

I don't think we should add any more than this. Different platforms have it packaged with their native package manager otherwise Docker has instructions for installation.

Contributor Author

Well, we needed to find out how to do this on a mac. I can either throw this info away, or include it in the guide :).

Since many of our devels use macs, I think this is valuable :).


On a mac, you also need `docker-machine`:

```
brew install docker docker-machine

# read the output and start the service

# run this in your .bashrc, or every shell where needed
eval `docker-machine env default`
```


### DB config

Your PostgreSQL must be configured to allow connections from the docker container, so that AWX can connect to manageiq's database.

Your ManageIQ DB user (the one in `manageiq/config/database.yml`) must have the `SUPERUSER` privilege (or at least needs to be able to create roles and databases).
Member

That's already a requirement of the development setup.

Contributor Author

See #276 (comment)

Nobody who knows about DB security follows these instructions precisely :).



Config file locations (where to expect it):
Member

Again, do we need this? It's going to be different for every platform

Contributor Author

Again, we can make it easy by giving people info they'll probably need, or we can tell them to go looking.

If you're willing to answer all the questions from people trying to follow the document, I'm willing to remove it :).


* Debian: `/etc/postgresql/9.6/main/`
* Fedora: `/var/lib/pgsql/data/`
* MacOSX: `/usr/local/var/postgres/`
Member

I would imagine these would depend heavily on where you installed the package from.

For example I'm using Fedora and my postgresql.conf file is in /var/lib/pgsql/9.5/data

Member

Something like this is probably more portable.

$ psql -d postgres -c 'show config_file'
               config_file               
-----------------------------------------
 /var/lib/pgsql/9.5/data/postgresql.conf
(1 row)

Contributor Author

Except then we need the right invocation for each system :)...

$ psql -d postgres -c 'show config_file'
ERROR:  must be superuser to examine "config_file"

Contributor Author

@himdel himdel Dec 8, 2017

I think I'll just add a mention that these dirs may vary, and that you may have some luck running that command, WDYT?

The idea was mostly so that people have a place to start looking, it will always depend on the current version, etc.

Contributor Author

Fixed, added a note about these depending on other stuff, and a note that you can try running that command.


Note that these may depend on your version, or even on how you installed PostgreSQL.
If you still can't find the right location, you may have luck running `psql -d postgres -c 'show config_file'`.
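
As a starting point for that search, here is a minimal Ruby sketch that probes the three directories listed above for a `postgresql.conf` file. The helper name `find_postgres_config_dir` and the constant are hypothetical, and the paths are only the ones from this guide, so a miss does not mean PostgreSQL is not installed.

```ruby
# Candidate directories are the ones listed in this guide; they vary by
# PostgreSQL version and by how it was installed.
CANDIDATE_CONFIG_DIRS = [
  "/etc/postgresql/9.6/main",
  "/var/lib/pgsql/data",
  "/usr/local/var/postgres",
].freeze

# Hypothetical helper: returns the first directory that contains a
# postgresql.conf file, or nil when none of the candidates match.
def find_postgres_config_dir(candidates = CANDIDATE_CONFIG_DIRS)
  candidates.find { |dir| File.file?(File.join(dir, "postgresql.conf")) }
end

puts(find_postgres_config_dir || "not found; try `psql -d postgres -c 'show config_file'`")
```

If nothing is found, the `psql` command mentioned above is still the more reliable route, provided your role is allowed to run it.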



Make sure your `postgresql.conf` contains this line:

```
listen_addresses = '*'
```


Make sure your `pg_hba.conf` contains:

```
host all all 172.17.0.1/24 md5
Member

I'm not sure this subnet is the same for all docker installations.

Maybe an explicit address isn't a good idea here. I would say instead to allow connections from the subnet used by the docker0 interface.

For example, ip a shows this for me, but that gateway address might not be the same for everyone.

    docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default 
    link/ether 02:42:3a:94:df:ba brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever

Contributor Author

Added a note on running ip addr show dev docker0, thanks :)

Member

Maybe these postgresql.conf and pg_hba.conf changes should be made to the developer setup guide with a note that you'll only need them if you want to use embedded ansible.

```

Note that `172.17.0.1/24` may depend on the address of your docker network interface. Run `ip addr show dev docker0` and you should see an `inet` line with a similar address; use that one, including its prefix length.
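
To make that step less error-prone, here is a minimal Ruby sketch that turns `ip addr show dev docker0` output into a `pg_hba.conf` line. The helper name `pg_hba_line_for` is hypothetical, the sample output is the one quoted in the review of this guide, and the sketch assumes a single IPv4 `inet` line. It writes the network address (host bits zeroed) rather than the interface address.

```ruby
require "ipaddr"

# Hypothetical helper: extracts the first IPv4 "inet" address (with prefix
# length) from `ip addr show dev docker0` output and returns the matching
# pg_hba.conf line, or nil when no inet line is present.
def pg_hba_line_for(ip_addr_output)
  match = ip_addr_output.match(%r{inet (\d+\.\d+\.\d+\.\d+)/(\d+)})
  return nil unless match

  prefix = match[2].to_i
  network = IPAddr.new("#{match[1]}/#{prefix}") # masks the host bits
  "host all all #{network}/#{prefix} md5"
end

sample = <<~OUTPUT
  docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
      link/ether 02:42:3a:94:df:ba brd ff:ff:ff:ff:ff:ff
      inet 172.17.0.1/16 scope global docker0
         valid_lft forever preferred_lft forever
OUTPUT

puts pg_hba_line_for(sample) # prints "host all all 172.17.0.0/16 md5"
```

On a real machine you could pass in the command output instead of `sample`, for example `pg_hba_line_for(%x(ip addr show dev docker0))`, assuming the `ip` tool is available.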


Mac users: you may also need to add this one, for `docker-machine`.

```
host all all 192.168.99.0/24 md5
```


Ensure your DB user (the one in `config/database.yml`) has `superuser` rights.
Member

I don't think this needs to be specified here as this is suggested in the main developer setup document

Contributor Author

I'd keep it just to make sure :). It surprised me so it could surprise more people, and ... assuming every devel has removed any security from their postgres install is probably wrong, even though those instructions mention it.

Member

> I don't think this needs to be specified here as this is suggested in the main developer setup document

Well I don't have it that way. Security is one of the reasons.

This setup adds more stuff that depends on the superadmin config?

I don't think we should be doing that and I definitely agree that if it's really needed we write it here (again).

Contributor Author

@himdel himdel Dec 11, 2017

> This setup adds more stuff that depends on the superadmin config?

Unfortunately the worker needs to create a DB and a role, so yes. :(

Agreed it's a bad idea, in the gitter discussion @Fryguy mentioned this should be something that should be optional for the more exotic setups ... but for now we seem to be stuck with it, yes.

Member

So, you can't rake evm:db:reset either?

Member

If you run differently than the default guides, you're going to have issues because we can't know how you diverged from the guides. Please either change the default guides to be "right", include sensible alternatives in the guides, or use the defaults. It's hard enough to maintain these guides when people use the defaults; if they don't use the defaults, the guides become stale and it's easy to accidentally break someone else's setup.


* Debian: `sudo -u postgres psql -c 'ALTER ROLE "root" SUPERUSER'`
* Fedora / MacOSX: `psql -c 'ALTER ROLE "root" SUPERUSER' postgres`


### Clean up

If you've already set up an AWX instance this way and want to clean it up:

```
psql -d postgres -c 'DROP DATABASE awx'
Member

We should probably enhance rake evm:db:reset to handle these for us.

psql -d postgres -c 'DROP ROLE awx'
bin/rake evm:db:reset
bin/rake db:seed
```


If you had previously added an embedded ansible using the [old way](http://talk.manageiq.org/t/howto-setup-embedded-ansible/2291/5?u=himdel), you'll need to clean up the provider (in Rails console):

```
ManageIQ::Providers::EmbeddedAnsible::Provider.first.destroy!
```


In both cases, you may also need to clean up the old authentications (in Rails console):

```
db = MiqDatabase.first
Member

These should probably be cleaned up on provider destroy.

Contributor Author

Should.. but aren't :).

I had to do this to get it working, so IMHO this needs to be in the guide

Member

This sounds like a provider bug we should fix. I'm fine with keeping these steps but open an issue and link to the issue so we can circle back and remove this if and when it's fixed.

db.authentication_type('ansible_secret_key').delete # db.ansible_secret_key.delete
Member

Also, shouldn't these be destroy?

db.ansible_rabbitmq_authentication.delete
db.ansible_admin_authentication.delete
db.ansible_database_authentication.delete
```


### Procfiles

Under your `manageiq/` directory, there should be a `Procfile.example` file.

You need to uncomment these lines:

```
generic: ruby lib/workers/bin/run_single_worker.rb MiqGenericWorker
ansible: ruby lib/workers/bin/run_single_worker.rb EmbeddedAnsibleWorker
embedded_ansible_refresh: ruby lib/workers/bin/run_single_worker.rb --ems-id <id> ManageIQ::Providers::EmbeddedAnsible::AutomationManager::RefreshWorker
embedded_ansible_event: ruby lib/workers/bin/run_single_worker.rb --ems-id <id> ManageIQ::Providers::EmbeddedAnsible::AutomationManager::EventCatcher
```

And you'll need to replace that `<id>` with the id of the newly created **manager** instance (`ManageIQ::Providers::EmbeddedAnsible::AutomationManager`, not `ManageIQ::Providers::EmbeddedAnsible::Provider`).
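
The replacement itself can be sketched in Ruby as below. The helper name `fill_in_manager_id` is hypothetical and the id `2` is only an example value; look up the real manager id in your own database. The sketch only substitutes the placeholder and does not uncomment anything.

```ruby
# Hypothetical helper: returns Procfile text with every `<id>` placeholder
# replaced by the given manager id.
def fill_in_manager_id(procfile_text, manager_id)
  procfile_text.gsub("<id>", manager_id.to_s)
end

example = "embedded_ansible_refresh: ruby lib/workers/bin/run_single_worker.rb " \
          "--ems-id <id> ManageIQ::Providers::EmbeddedAnsible::AutomationManager::RefreshWorker\n"

# 2 is an assumed example id, not a value taken from this guide.
puts fill_in_manager_id(example, 2)
```

To apply it to the real file, you could run something like `File.write("Procfile.example", fill_in_manager_id(File.read("Procfile.example"), id))` from the `manageiq/` directory.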


### Setting it up

* configure your server to enable the ansible role (from Rails console):
Member

Also, you should be able to rake evm:start and turn the role on. Right @carbonin ?

Contributor Author

I've never run evm:start and don't really intend to. That's simply not how UI does it..

Member

@himdel If that's the case, there's some things you just won't be able to do in the UI. I would not expect all features on the appliance to be feasible or desirable in a rails server process.

Contributor Author

@himdel himdel Dec 11, 2017

Do you know which? So far, everything seemed to work...

But, as long as this is mostly around refresh or things like that I think that's perfectly acceptable - at least, I have not needed to evm:start for any PR I've reviewed in the last year, so.. :)

At least, still better than that "Embedded Ansible not supported, deal with it" :)

Member

@himdel replication, server failover, various platform/provider specific things such as scanning vms/instances on various providers. We will try to make things easier if we can but at some point we have to say it's too much effort and frankenstein code if there's an easy workaround.

Contributor Author

@himdel himdel Dec 11, 2017

@jrafanie Ah, perfect, thanks! :) I don't think we get many UI PRs dealing with those things.

I agree that when you want to test replication, etc., you need to run the whole app.

But this mostly is so that every member of the team can actually test embedded ansible PRs in their devel setups, and for that I think this is OK :).

Member

I long ago accepted that there's just some things that are easier to let run on appliances. Thanks for questioning this because if there are easy changes to do, we should do them. With that said, though we have to be careful not to add too much hard to maintain hackery. At some point, we'll need to step back, measure the size of each yak and see if it's worth shaving.

Contributor Author

Agreed there.. but.. if the UI team can't run code needed to merge PRs, we would literally waste hours on reviews, so... balance :)

Member

If you need to run most PRs locally before merge, that sounds like the problem in the workflow. The only ones I ever run manually on an appliance are ones where tests would be brittle or even useless since they'd require mocks around everything, such as running system commands in ruby and testing some side effect much later (black console partitioning, etc.)


```
server = MiqServer.my_server(true)
server.role = "embedded_ansible,ems_inventory,ems_operations,event"
server.activate_roles(%w(embedded_ansible ems_inventory ems_operations event))
server.save!
```

* run rails: `bin/rails s`

* run the worker that will download and set up the container: `foreman start -f Procfile.example` (only the `ansible` worker is needed at this point).

* grab a coffee or two - you can watch the progress by watching:
  * authentication errors, docker problems: `tail -f manageiq/log/evm.log`
  * running containers: `docker ps`
  * container logs: `docker logs -f awx_web`
  * AWX initial upgrade progress: `localhost:54321`

* once everything has succeeded, you should see `Finished starting embedded ansible service.` in `evm.log`

* if you got that far, AWX is running and ManageIQ has an EmbeddedAnsible provider instance

* you need to edit `Procfile.example`, to replace that `<id>` with the actual id of the new manager (not provider) instance:

```
ManageIQ::Providers::EmbeddedAnsible::Provider.first.managers.first.id
```

* run `foreman start -f Procfile.example`

* try adding a Repository in ManageIQ (Automate > Ansible > Repositories) :)


If you're on MacOSX, you will also need to run these first:

```
# redirect local 54321 to docker-machine - otherwise, localhost:54321 doesn't work
docker-machine ssh default -L 54321:127.0.0.1:54321
Member

@Glutexo Glutexo May 30, 2018

Wouldn’t it be better to set up a proper port forwarding in VirtualBox? This way it wouldn’t be necessary to keep the SSH session open. It’d be a less hacky solution too.

See https://stackoverflow.com/questions/32174560/port-forwarding-in-docker-machine

You’d need to run something like this

VBoxManage modifyvm "default" --natpf1 "awx,tcp,127.0.0.1,54321,,54321"

or you can set it up in the VirtualBox GUI.

Contributor Author

Sounds better, how would you do that so that it only affects the docker machine and not other VirtualBox machines?

Member

The setting is machine-specific. The default there in the command is a name of the Docker Machine.

Contributor Author

mentioned as an alternative


# inside that docker-machine ssh shell - redirect postgres from the docker machine to the real one (otherwise, awx_web can't connect to manageiq DB)
sudo sh -c 'echo 1 > /proc/sys/net/ipv4/ip_forward'
sudo iptables -t nat -I PREROUTING --dst 172.17.0.1 -p tcp --dport 5432 -j DNAT --to-destination 192.168.99.1:5432

# don't exit the shell
```

(`172.17.0.1` is the docker host IP address, `192.168.99.1` is the address `docker-machine` gives to the host (the VM will have `192.168.99.100` most likely), and `default` is the default name for the docker machine)

(Alternately, something like `VBoxManage modifyvm "default" --natpf1 "awx,tcp,127.0.0.1,54321,,54321"` might work too.)


### Running it again

Just run these, each in a different terminal:

```
bin/rails s
foreman start -f Procfile.example
```


On a mac, you'll also need to keep the `docker-machine ssh...` command running.
And if you've restarted the machine (or `docker-machine`) since the last time, you'll also need the `iptables...` command.


### Connecting to AWX directly

Your docker awx instance is listening on `localhost:54321`.

To log in, you can use the `admin` account - to determine the password, run Rails console and do:

```
MiqDatabase.first.ansible_admin_authentication.password
```


### Troubleshooting

* watch `manageiq/log/evm.log`

```
# should see this in evm.log
[----] I, [2017-12-07T11:32:46.833998 #29139:2acb0db2ef8c] INFO -- : MIQ(EmbeddedAnsibleWorker::Runner#setup_ansible) Starting embedded ansible service ...
[----] I, [2017-12-07T11:33:06.637266 #29139:2acb0db2ef8c] INFO -- : MIQ(DockerEmbeddedAnsible#start) Waiting for Ansible container to respond
... a whole lot of this ....
[----] I, [2017-12-07T11:33:08.732190 #29139:2acb0db2ef8c] INFO -- : MIQ(DockerEmbeddedAnsible#start) Waiting for Ansible container to respond
[----] I, [2017-12-07T11:33:13.530599 #29139:2acb0db2ef8c] INFO -- : MIQ(EmbeddedAnsibleWorker::Runner#setup_ansible) Finished starting embedded ansible service.
[----] I, [2017-12-07T11:33:15.605973 #29139:2acb0db2ef8c] INFO -- : MIQ(ManageIQ::Providers::EmbeddedAnsible::Provider#with_provider_connection) Connecting through ManageIQ::Providers::EmbeddedAnsible::Provider: [Embedded Ansible]
[----] I, [2017-12-07T11:33:16.033227 #29139:2acb0db2ef8c] INFO -- : MIQ(AuthUseridPassword#validation_successful) [Provider] [1], previously valid/invalid on: []/[], previous status: []
```

* watch docker container output - for problems like awx not being able to connect to ManageIQ database

```
docker logs -f awx_web
```

* watch `docker ps` output

```
# should see this in `docker ps`
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d10993d1f25b ansible/awx_task:latest "/tini -- /bin/sh ..." 6 seconds ago Up 5 seconds 8052/tcp awx_task
b63a677d32a7 ansible/awx_web:latest "/tini -- /bin/sh ..." 7 seconds ago Up 6 seconds 0.0.0.0:54321->8052/tcp awx_web
59806de1bcd1 memcached:alpine "docker-entrypoint..." 27 seconds ago Up 26 seconds 11211/tcp memcached
a89aa0e4a395 rabbitmq:3 "docker-entrypoint..." 27 seconds ago Up 26 seconds 4369/tcp, 5671-5672/tcp, 25672/tcp rabbitmq
```
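
The `evm.log` check from the first troubleshooting step can also be scripted. This is a minimal Ruby sketch, assuming the two log messages shown in the sample above; the helper name `embedded_ansible_state` and the constants are hypothetical. It looks at the whole file, so a marker left over from an earlier run will also count as started.

```ruby
STARTED_MARKER = "Finished starting embedded ansible service."
WAITING_MARKER = "Waiting for Ansible container to respond"

# Hypothetical helper: classifies the worker's progress from evm.log lines.
# Returns :started, :waiting or :unknown.
def embedded_ansible_state(log_lines)
  return :started if log_lines.any? { |line| line.include?(STARTED_MARKER) }
  return :waiting if log_lines.any? { |line| line.include?(WAITING_MARKER) }

  :unknown
end

log_path = "log/evm.log" # assumes you run this from the manageiq/ directory
puts embedded_ansible_state(File.foreach(log_path)) if File.exist?(log_path)
```

A result of `:unknown` usually means the `ansible` worker has not logged anything yet, in which case `docker ps` and `docker logs -f awx_web` are the next places to look.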