Shield integration #399

Closed
electrical opened this issue Jul 1, 2015 · 49 comments

@electrical

Shield enables authentication for API calls.
Since we do a few of those with the puppet module, we need to expose options to support that.
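
For illustration only, the exposed options might end up looking something like this (parameter names are hypothetical at this point):

class { 'elasticsearch':
  # hypothetical parameters for authenticated API calls
  api_basic_auth_username => 'es_admin',
  api_basic_auth_password => 'changeme',
}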

@cdenneen
Contributor

cdenneen commented Mar 1, 2016

@electrical any update?

@electrical
Author

@cdenneen I'm afraid I left the company a few weeks ago and someone else is taking over.

@cdenneen
Contributor

cdenneen commented Mar 2, 2016

@electrical thanks.

Do we know who this will be? Is there any way we can get even the basics of the shield and kibana modules published, with public contribution help to work on them (even if they aren't published modules right away)?
The problem right now is that with little insight into logstash 2.x support, beats, kibana, shield, etc., people are left fending for themselves, hacking forks of these modules that diverge significantly, or creating their own because of the time lag on most of these. (For example, I think the logstash module was getting its revamp after the 2.0 release of ES, with all its goodness and multi-instance support; that was about a year ago, and we're on 2.2.x and soon 3.x.)

Having these re-opened as public modules, like the original electrical ones were, might not be a bad idea if @elastic doesn't have the cycles to maintain them solely?

@tylerjl
Contributor

tylerjl commented Mar 3, 2016

Hi @cdenneen, sorry for the delayed response. The short story is that @Jarpy and I will be helping to further the excellent work @electrical has already done. As a prerequisite to diving into the guts of the module, we'd really like to give the CI/testing environment some love so it's back up to speed running the thorough test suite, which will give us some confidence moving forward.

Once that's done (and ideally in a place the community can access and interact with), we'll be in a great place to start actively accepting community PRs. The community work is greatly appreciated, and if we can put good testing infrastructure in place to lower the barrier to entry, community contributions can happen more quickly and confidently.

I'll try to provide as much transparency as possible as we transition some things; please let us know about additional needs as well! 💯

@cdenneen
Contributor

cdenneen commented Mar 5, 2016

@tylerjl thanks for the update
I think the modules we're waiting on are:

  • shield
  • kibana
  • logstash 2.x and up support

(I put shield at the top because I figured it would be a quick win to get an initial release out.)

I left filebeats off this list for now because @pcfens has done a great job of creating and maintaining one. So besides the above list of missing modules, I think the backlog of issues would be next in importance, ahead of a filebeats module.

Maybe create a bullet list as an issue to track the order of what we can expect?

@tylerjl
Contributor

tylerjl commented Mar 7, 2016

@cdenneen yep, that aligns pretty closely to the pulse I've gotten from others as well. Hopefully we can pick up the shield work efficiently to keep it moving.

I'm currently knee-deep in CI environment work, but hopefully we'll have a fully functioning CI pipeline Soon™.

@cdenneen
Contributor

@tylerjl Any luck in releasing the beta for the puppet-shield plugin?

@tylerjl
Contributor

tylerjl commented Apr 21, 2016

@cdenneen I'm happy to report I've been making lots of progress on Shield integration. I still need to put some work in on TLS management, but if anyone wants to offer feedback regarding design choices, please check out the branch diff and let me know if the design pattern looks good from an end-user perspective. I'm first and foremost interested to know whether this is usable for people.

I've gotten most of the CI shakiness sorted out, so Shield support is the primary focus right now.

@cdenneen
Contributor

@tylerjl awesome. Can you add realm support to that fork as well?
(http://blog.comperiosearch.com/blog/2015/08/21/elasticsearch-security-shield/)

@tylerjl
Contributor

tylerjl commented Apr 21, 2016

@cdenneen assuming you mean Active Directory realms, standing up a Windows test environment is obviously less straightforward than a Linux one. That said, I think at the very least managing the role mappings yaml file should be very doable now that I've written the native puppet provider for the yaml roles file. Again, I'm not sure what the most preferable way would be for people to define those mappings in puppet, so feel free to comment on the branch regarding how it's shaping up.
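
For illustration, a role with a realm mapping might be declared roughly like this (a sketch against the in-progress branch; resource and parameter names may change):

elasticsearch::shield::role { 'kibana_admins':
  privileges => {
    'cluster' => 'monitor',
    'indices' => {
      '.kibana*' => 'all',
    },
  },
  # hypothetical: map an LDAP/AD group DN onto this role
  mappings => [
    'cn=kibana_admins,ou=groups,dc=example,dc=com',
  ],
}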

@cregev

cregev commented Apr 21, 2016

Any update on when the PR is going to be merged?

@tylerjl
Contributor

tylerjl commented Apr 21, 2016

No ETA yet, but I'll be pushing updates regularly if anyone is anxious to watch progress.

@cdenneen
Contributor

@tylerjl sort of... AD is just one option for a realm; you could use just about anything (LDAP, AD, OAuth, SAML, etc.)... it's basically because of custom realms (https://www.elastic.co/guide/en/shield/current/custom-realms.html) that esusers is just one realm among many.
If your fork lets you define the realm type, it could handle esusers for the basic use case but also just about any other provider anyone wants to use for shield authentication.

So, for example, most of my auth will be esusers for applications, but I would probably secure Kibana with shield and then use AD/LDAP to authenticate users to Kibana.

@fgimian

fgimian commented May 2, 2016

Would love to see this too, so that we can easily manage local Shield users and LDAP configuration with Puppet 😄

For now, I'll write my own little module to manage these files.

Keep up the great work guys!
Fotis

@cdenneen
Contributor

cdenneen commented May 4, 2016

@tylerjl I see the mapping support in your fork... I think once the realm support is added to the elasticsearch.yml.erb this should be working, right?

@tylerjl
Contributor

tylerjl commented May 4, 2016

@cdenneen from what I understand, configuring realms is just a matter of defining the configuration in elasticsearch.yml.erb. If that's the case, you'd just need to set the config => parameter in puppet as needed, so there isn't anything realm-specific the module needs (other than the mappings support). Does that sound right? (This is helpful - I haven't done anything with AD in the past.)

@cdenneen
Contributor

cdenneen commented May 4, 2016

Yeah, I'm guessing a custom config block would work, and the result should look something like this:

shield.authc:
  realms:
    esusers:
      type: esusers
      order: 0
    ldap1:
      type: ldap
      order: 1
      enabled: false
      url: 'url_to_ldap1'
    ...
    ad1:
      type: active_directory
      order: 3
      url: 'url_to_ad'

Otherwise, maybe an if condition and loop in the erb to do a hiera lookup for all shield::authc::realms, similar to https://github.com/jfryman/puppet-nginx/blob/master/templates/vhost/location_header.erb#L31-L45?

@tylerjl
Contributor

tylerjl commented May 4, 2016

I'd prefer to avoid hiera lookups in the erb per Puppet Labs' recommendation, but defining realm configuration in hiera should be as simple as something like the following in a hiera yaml config:

---
elasticsearch::config:
  shield.authc:
    realms:
      esusers:
        type: esusers
...

The config on the elasticsearch class gets merged with instance configs so there's no need for hiera gymnastics to get a config into a defined instance in a manifest.
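
For example (a minimal sketch), realm settings defined once on the class flow into every instance:

class { 'elasticsearch':
  config => {
    'shield.authc' => {
      'realms' => {
        'esusers' => { 'type' => 'esusers', 'order' => 0 },
      },
    },
  },
}

# the instance's own config is merged with the class-level config above
elasticsearch::instance { 'es-01':
  config => { 'http.port' => '9200' },
}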

@cdenneen
Contributor

cdenneen commented May 4, 2016

Understood... makes sense... so what's left before your fork gets merged? Documentation updates?

@tylerjl
Contributor

tylerjl commented May 4, 2016

Just getting the tests passing; still some failures on different OS/Puppet combinations... close, though! 😅

@tylerjl
Contributor

tylerjl commented May 5, 2016

The PR is open at #624; do provide feedback if there are any parts that need changing.

@cdenneen
Contributor

cdenneen commented Jun 2, 2016

@tylerjl using 0.11.0 and getting

==> eslog: Error: Could not find a suitable provider for elasticsearch_shield_user_roles

@tylerjl
Contributor

tylerjl commented Jun 2, 2016

@cdenneen those types of errors crop up when the requirements for a provider haven't been met - my guess is that it comes from this line, which checks for the existence of the users_roles file.

Is /usr/share/elasticsearch/shield/users_roles on the target system? (assuming this isn't OpenBSD)

@cdenneen
Contributor

cdenneen commented Jun 2, 2016

@tylerjl no, it doesn't exist there:

[vagrant@eslog ~]$ ls /usr/share/elasticsearch/shield/*
/usr/share/elasticsearch/shield/roles.yml  /usr/share/elasticsearch/shield/users

but does here:

[vagrant@eslog ~]$ ls /etc/elasticsearch/shield/
logging.yml  role_mapping.yml  roles.yml  roles.yml.new  users  users_roles

@cdenneen
Contributor

cdenneen commented Jun 2, 2016

@tylerjl this is a fresh vagrant box (so no cruft there).
Is there something in the config hash I need to add in order for users_roles to be created under /usr/share/elasticsearch/shield? This is a 2.3.3 CentOS RPM install from the managed repo.

@tylerjl
Contributor

tylerjl commented Jun 2, 2016

So, my guess is that the shield directory exists there in /etc because the plugin was either installed 1) by hand or 2) by an earlier version of the module.

In version 0.11.0 of the module, plugin installations should drop configuration files into /usr/share/elasticsearch, which helps centralize the shield config files. Earlier versions (or running the plugin script by hand) assume your elasticsearch config lives under /etc/elasticsearch, which doesn't work with the puppet module's notion of instances; so the config files are centralized in /usr/share and recursively kept in sync with all instance directories (i.e., /etc/elasticsearch/es-01/shield).

In your case, you could either remove the old plugin installation and then define it in puppet to manage, or, if you just want to start from scratch without any of the pre-packaged roles/other files, you could always create users_roles by hand.

It's a bit of an ugly workaround, but it's a compromise between respecting the security manager's need to only read config files within the CONF_DIR and keeping /etc/elasticsearch consistent with directories representing instance CONF_DIRs.
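
In shell terms, the two options would look roughly like this (paths assume the default RPM layout shown above; @bflad confirms a variant of the first below):

# option 1: migrate the existing shield config to the centralized location
cp -R /etc/elasticsearch/shield /usr/share/elasticsearch/

# option 2: start from scratch with an empty users_roles file
touch /usr/share/elasticsearch/shield/users_roles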

@bflad
Contributor

bflad commented Jun 6, 2016

@tylerjl thanks for that last comment. In our environment we installed the shield plugin with an older version of this Puppet module, and we needed to run this before running Puppet with the newer module: cp -R /etc/elasticsearch/es-01/shield /usr/share/elasticsearch/. Worked like a charm afterwards!

@cdenneen
Contributor

cdenneen commented Jun 6, 2016

@bflad thanks, I haven't had time to test.
@tylerjl my concern here is that you can't always assume this will be a fresh install, so I would suggest some method of cleaning up the old shield plugin prior to the new plugin install, so that this works when the module is added to an already-built system - say, one running ES 1.7.3 and 0.10.0 of the module.
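
For example, a manual cleanup on an existing 2.x system might look roughly like this (untested sketch; on ES 1.x the plugin script takes --remove instead of remove):

# remove the previously installed plugin so the module can reinstall it
/usr/share/elasticsearch/bin/plugin remove shield

# relocate any old shield config to the new centralized location
mv /etc/elasticsearch/shield /usr/share/elasticsearch/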

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

@tylerjl There must still be an ordering issue here.
I rebuilt a fresh box using 0.11.0 and it worked... made some modifications, destroyed my vagrant boxes, went to rebuild, and got the same error:
==> eslogmd: Error: Could not find a suitable provider for elasticsearch_shield_user_roles

Here is the output of those directories as before:

[vagrant@eslogmd ~]$ ls -la /etc/elasticsearch/
total 12
drwxr-xr-x   3 elasticsearch elasticsearch   20 Jun  9 10:48 .
drwxr-xr-x. 84 root          root          8192 Jun  9 10:48 ..
drwxr-x---   2 root          elasticsearch    6 May 17 11:49 scripts

[vagrant@eslogmd ~]$ ls -la /usr/share/elasticsearch/shield/
total 0
drwxr-xr-x 2 root          root            6 Jun  9 10:48 .
drwxr-xr-x 8 elasticsearch elasticsearch 146 Jun  9 10:48 ..

The only large difference is that I changed the default elasticsearch::config in hiera to allow per-instance configs.

Here is all my data, in case it helps tell me where I made a stupid mistake ;-):

---
profile::elasticsearch::es_instances:
  - 'es-01'
  - 'es-02'
profile::elasticsearch::data::config:
  cluster.name: 'loggingtest'
  node.name: "%{::hostname}"
  index.number_of_shards: 5
  index.number_of_replicas: 1
  bootstrap.mlockall: true
  discovery.zen.ping.unicast.hosts:
    - 'eslogmd'
    - 'eslogm'
  discovery.zen.ping.multicast.enabled: false
  node.data: true
  node.master: false
  http.port: '9200'
profile::elasticsearch::master::config:
  cluster.name: 'loggingtest'
  node.name: "%{::hostname}"
  index.number_of_shards: 5
  index.number_of_replicas: 1
  bootstrap.mlockall: true
  discovery.zen.ping.unicast.hosts:
    - 'eslogmd'
    - 'eslogm'
  discovery.zen.ping.multicast.enabled: false
  node.data: false
  node.master: true
  http.port: '9201'
elasticsearch::init_defaults:
  ES_HOME: '/usr/share/elasticsearch'
  MAX_OPEN_FILES: '65535'
  MAX_MAP_COUNT: '262144'
  LOG_DIR: '/var/log/elasticsearch'
  DATA_DIR: '/data/elasticsearch'
  WORK_DIR: '/tmp/elasticsearch'
  ES_USER: 'elasticsearch'
  ES_GROUP: 'elasticsearch'
  ES_JAVA_OPTS: '-Djava.net.preferIPv4Stack=true'
  ES_HEAP_SIZE: '128m'
  MAX_LOCKED_MEMORY: 'unlimited'
elasticsearch::java: true
elasticsearch::datadir: '/data/elasticsearch'
elasticsearch::status: 'enabled'

class profile::elasticsearch(
  $es_instances = 'es-01'
) {
  class {'::elasticsearch':
    manage_repo   => true,
    repo_version  => '2.x',
    java_package  => 'java-1.8.0-openjdk-headless',
    package_pin   => true,
    version       => '2.3.3',
  }
  ::elasticsearch::plugin{'mobz/elasticsearch-head':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'lmenezes/elasticsearch-kopf':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'royrusso/elasticsearch-hq':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'license':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'shield':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'watcher':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'marvel-agent':
    instances => $es_instances
  }
  elasticsearch::shield::user { 'admin':
    password => 'changeme',
    roles    => ['admin'],
  }
}
class profile::elasticsearch::master(
  $config = {}
) {
  include ::profile::elasticsearch
  ::elasticsearch::instance {'es-02':
    config => $config
  }
  firewall { '101 ES MASTER HTTP':
    proto  => 'tcp',
    action => 'accept',
    dport  => ['9201','9301'],
  }
}
class profile::elasticsearch::data(
  $config = {}
) {
  include ::profile::elasticsearch
  ::elasticsearch::instance {'es-01':
    config => $config
  }
  firewall { '100 ES HTTP':
    proto  => 'tcp',
    action => 'accept',
    dport  => ['9200','9300'],
  }
}

@tylerjl
Contributor

tylerjl commented Jun 9, 2016

Regarding the assumption about previous installations, you're right - we should handle that case. It may be tricky; I'll have to think on that one.

Regarding the ordering issue: that's strange; as seen here, the resource ordering explicitly enforces that the plugin should be installed before any shield users/roles are handled:

...
    -> Elasticsearch::Plugin <| |>
    -> Elasticsearch::Shield::Role <| |>
    -> Elasticsearch::Shield::User <| |>
...

One ordering issue that may come up in that manifest is that shield will fail to install unless the license plugin is there first, but it sounds like that isn't happening. I wonder if any other errors would show up if you got rid of the elasticsearch::shield::user resource and one of the plugins failed to install, etc.

Agreed about the hiera config; I don't think that would be related to the issue you're seeing.

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

From the clean run, here are the errors in the order received:

==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[mobz/elasticsearch-head]/Elasticsearch_plugin[mobz/elasticsearch-head]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[lmenezes/elasticsearch-kopf]/Elasticsearch_plugin[lmenezes/elasticsearch-kopf]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[royrusso/elasticsearch-hq]/Elasticsearch_plugin[royrusso/elasticsearch-hq]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[license]/Elasticsearch_plugin[license]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[shield]/Elasticsearch_plugin[shield]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[watcher]/Elasticsearch_plugin[watcher]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[marvel-agent]/Elasticsearch_plugin[marvel-agent]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Shield::User[admin]/Elasticsearch_shield_user[admin]: Provider esusers is not functional on this host
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Shield::User[jaguar]/Elasticsearch_shield_user[jaguar]: Provider esusers is not functional on this host
==> eslogmd: Error: Could not find a suitable provider for elasticsearch_shield_user_roles

The only other thing I can think of is the $es_instances automatic parameter lookup (APL) for the profiles?
Anything you'd like me to try next?

@tylerjl
Contributor

tylerjl commented Jun 9, 2016

Ah, so the plugins aren't getting installed. That'll definitely cause problems. The specific error comes from this scan call, which scans the output string looking for the ES version. So something goes wrong when the plugin provider tries to call elasticsearch -version.

I'd dig into how that call behaves (inside and outside puppet), it may be a Java issue of some sort.
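
For example, reproducing the call the provider makes (assuming the default install path; the exact flag spelling may vary by ES version):

# run the same version check the plugin provider performs
sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch --version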

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

@tylerjl I think the issue might be that I have some class parameters for elasticsearch in profile::elasticsearch and some in hiera, like elasticsearch::java_install: true.
Can you not mix parameters like this?
I have deep_merge on in hiera.yaml, if that helps.
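
(For reference, deep merging in hiera 3 needs the deep_merge gem plus roughly this in hiera.yaml:)

:merge_behavior: deeper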

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

@tylerjl merging wasn't happening because of a typo... that's fixed now, java is installing, and the rest is working... just getting an odd start error, with nothing logged to any of the logs:

[vagrant@eslogmd ~]$ sudo service elasticsearch-es-01 status
Redirecting to /bin/systemctl status  elasticsearch-es-01.service
● elasticsearch-es-01.service - Starts and stops a single elasticsearch instance on this system
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-es-01.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Thu 2016-06-09 17:42:23 EDT; 2s ago
     Docs: http://www.elasticsearch.org
  Process: 14767 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-01.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR} (code=exited, status=0/SUCCESS)
 Main PID: 14778 (code=killed, signal=KILL)

Jun 09 17:42:22 eslogmd.vm.local systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Jun 09 17:42:22 eslogmd.vm.local systemd[1]: PID file /var/run/elasticsearch/elasticsearch-es-01.pid not readable (yet?) after start.
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: elasticsearch-es-01.service: main process exited, code=killed, status=9/KILL
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: Unit elasticsearch-es-01.service entered failed state.
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: elasticsearch-es-01.service failed.

@tylerjl
Contributor

tylerjl commented Jun 10, 2016

Progress! 🏁

What shows up in the Elasticsearch instance's logs in /var/log? If the systemd status output doesn't have anything, there's bound to be something in some of the separate instance logs...

@cdenneen
Contributor

cdenneen commented Jun 10, 2016

All the logs are 0 bytes. I'll need to execute it manually to debug, I'm afraid.

Update: starting through systemctl isn't working, but starting manually works. Is there something wrong with the systemd service file?

[root@eslogmd systemd]# more system/multi-user.target.wants/elasticsearch-es-02.service
[Unit]
Description=Starts and stops a single elasticsearch instance on this system
Documentation=http://www.elasticsearch.org

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/elasticsearch-es-02
User=elasticsearch
Group=elasticsearch
PIDFile=/var/run/elasticsearch/elasticsearch-es-02.pid
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR}
 -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR}
# See MAX_OPEN_FILES in sysconfig
LimitNOFILE=65535
# See MAX_LOCKED_MEMORY in sysconfig, use "infinity" when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true

LimitMEMLOCK=infinity

# Shutdown delay in seconds, before process is tried to be killed with KILL (if configured)
TimeoutStopSec=20

[Install]
WantedBy=multi-user.target
[root@eslogmd systemd]# ps -ef |grep elastic
root     15267 14810  0 10:09 pts/0    00:00:00 grep --color=auto elastic
[root@eslogmd systemd]# su -s /bin/bash -c "source /etc/sysconfig/elasticsearch-es-02; /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR}" elasticsearch
[root@eslogmd systemd]# ps -ef |grep elastic
elastic+ 15281     1 97 10:10 ?        00:00:08 /bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true -Des.path.home=/usr/share/elasticsearch -cp /usr/share/elasticsearch/lib/elasticsearch-2.3.3.jar:/usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch/es-02 -Des.default.path.data=/data/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch/es-02
root     15297 14810  0 10:10 pts/0    00:00:00 grep --color=auto elastic
[root@eslogmd systemd]# systemctl status elasticsearch-es-02
● elasticsearch-es-02.service - Starts and stops a single elasticsearch instance on this system
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-es-02.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Fri 2016-06-10 10:06:51 EDT; 3min 51s ago
     Docs: http://www.elasticsearch.org
  Process: 15214 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR} (code=exited, status=0/SUCCESS)
 Main PID: 15225 (code=killed, signal=KILL)

Jun 10 10:06:49 eslogmd.vm.local systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Jun 10 10:06:49 eslogmd.vm.local systemd[1]: PID file /var/run/elasticsearch/elasticsearch-es-02.pid not readable (yet?) after start.
Jun 10 10:06:50 eslogmd.vm.local systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Jun 10 10:06:51 eslogmd.vm.local systemd[1]: elasticsearch-es-02.service: main process exited, code=killed, status=9/KILL
Jun 10 10:06:51 eslogmd.vm.local systemd[1]: Unit elasticsearch-es-02.service entered failed state.
Jun 10 10:06:51 eslogmd.vm.local systemd[1]: elasticsearch-es-02.service failed.
[root@eslogmd systemd]# ls -la /var/run/elasticsearch/elasticsearch-es-02.pid
-rw-r--r-- 1 elasticsearch elasticsearch 5 Jun 10 10:10 /var/run/elasticsearch/elasticsearch-es-02.pid
[root@eslogmd systemd]# more /var/run/elasticsearch/elasticsearch-es-02.pid
15281

Adding the init defaults just for good measure:

CONF_DIR=/etc/elasticsearch/es-02
CONF_FILE=/etc/elasticsearch/es-02/elasticsearch.yml
DATA_DIR=/data/elasticsearch
ES_GROUP=elasticsearch
ES_HEAP_SIZE=128m
ES_HOME=/usr/share/elasticsearch
ES_JAVA_OPTS=-Djava.net.preferIPv4Stack=true
ES_USER=elasticsearch
LOG_DIR=/var/log/elasticsearch/es-02
MAX_LOCKED_MEMORY=unlimited
MAX_MAP_COUNT=262144
MAX_OPEN_FILES=65535
WORK_DIR=/tmp/elasticsearch

@cdenneen
Contributor

@tylerjl looks like it's this init_defaults parameter:

MAX_LOCKED_MEMORY: 'unlimited'

Any idea why?

@tylerjl
Contributor

tylerjl commented Jun 10, 2016

I found some discussion about this in the past. It looks like the unit file itself explains what you need to set if you've configured mlockall to be true and instructed ES to lock all memory - basically, LimitMEMLOCK needs to be "infinity" (note that I haven't tried this; I'm just reading the documentation comments in the unit file).

@cdenneen
Contributor

@tylerjl are you sure?
It says that if MAX_LOCKED_MEMORY is set to 'unlimited', as I have it, it will write out 'infinity' in the unit file.
Am I reading that condition correctly?
Maybe it's worth checking with the ES team that 'infinity' actually works?

@tylerjl
Contributor

tylerjl commented Jun 13, 2016

@cdenneen do you have a distro and version so I can try and emulate what you're seeing? I tried setting up an instance with the same mlockall and MAX_LOCKED_MEMORY settings in CentOS 7 but didn't see the same behavior.

@cdenneen
Contributor

@tylerjl CentOS Linux release 7.2.1511 (Core)

sysconfig

CONF_DIR=/etc/elasticsearch/data
CONF_FILE=/etc/elasticsearch/data/elasticsearch.yml
DATA_DIR=/var/lib/elasticsearch
ES_GROUP=elasticsearch
ES_HEAP_SIZE=128m
ES_HOME=/usr/share/elasticsearch
ES_USER=elasticsearch
LOG_DIR=/var/log/elasticsearch/data
MAX_LOCKED_MEMORY=unlimited
MAX_OPEN_FILES=65535

systemd

[Unit]
Description=Starts and stops a single elasticsearch instance on this system
Documentation=http://www.elasticsearch.org

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/elasticsearch-data
User=elasticsearch
Group=elasticsearch
PIDFile=/var/run/elasticsearch/elasticsearch-data.pid
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-data.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR}
-Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR}
# See MAX_OPEN_FILES in sysconfig
LimitNOFILE=65535
# See MAX_LOCKED_MEMORY in sysconfig, use "infinity" when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true

LimitMEMLOCK=infinity

# Shutdown delay in seconds, before process is tried to be killed with KILL (if configured)
TimeoutStopSec=20

[Install]
WantedBy=multi-user.target

Looks like it's doing what it needs to, but when MAX_LOCKED_MEMORY is set in the init_defaults, systemctl status shows:

[vagrant@eslogd ~]$ sudo systemctl status elasticsearch-data
● elasticsearch-data.service - Starts and stops a single elasticsearch instance on this system
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-data.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Tue 2016-06-14 11:40:57 EDT; 7min ago
     Docs: http://www.elasticsearch.org
 Main PID: 12142 (code=killed, signal=KILL)

Jun 14 11:40:53 eslogd.vm.local systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Jun 14 11:40:54 eslogd.vm.local systemd[1]: PID file /var/run/elasticsearch/elasticsearch-data.pid not readable (yet?) after start.
Jun 14 11:40:54 eslogd.vm.local systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Jun 14 11:40:57 eslogd.vm.local systemd[1]: elasticsearch-data.service: main process exited, code=killed, status=9/KILL
Jun 14 11:40:57 eslogd.vm.local systemd[1]: Unit elasticsearch-data.service entered failed state.
Jun 14 11:40:57 eslogd.vm.local systemd[1]: elasticsearch-data.service failed.

and the logs are all 0 bytes:

[vagrant@eslogd ~]$ ls -la /var/log/elasticsearch/data/
total 4
drwxr-xr-x 2 elasticsearch root          4096 Jun 14 11:40 .
drwxr-xr-x 3 elasticsearch elasticsearch   17 Jun 14 11:40 ..
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging-access.log
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging_index_indexing_slowlog.log
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging_index_search_slowlog.log
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging.log

I've created a gist with all puppet and hieradata if it helps: https://gist.github.com/cdenneen/e43be80ed3c429c94e2f30767ef603b0

@tylerjl
Copy link
Contributor

tylerjl commented Jun 14, 2016

I got an instance up and running with the sysconfig and yaml config you've got, so I'm not really sure what could be happening. I'd suggest maybe bumping up the logging verbosity level to see if you can get Elasticsearch to spit out any additional information, as from the Puppet side everything seems to be writing configurations correctly.
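
For example, in the instance's logging.yml (2.x-style logging config), something like:

# raise the root logger level to capture more startup detail
es.logger.level: DEBUG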

@cdenneen
Contributor

cdenneen commented Jun 14, 2016

Setting the instance to DEBUG leaves me with these in the log file:

[2016-06-14 17:49:26,039][DEBUG][bootstrap                ] seccomp(SECCOMP_SET_MODE_FILTER): Function not implemented, falling back to prctl(PR_SET_SECCOMP)...
[2016-06-14 17:49:26,043][DEBUG][bootstrap                ] Linux seccomp filter installation successful, threads: [app]

Does your instance have SELinux enabled?

@tylerjl
Contributor

tylerjl commented Jun 14, 2016

I doubt it - maybe try disabling it across the board and seeing if that fixes things; if so, we can try and determine which access violations are hindering service startup.
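
For example, with the standard SELinux tooling:

getenforce                        # check the current SELinux mode
sudo setenforce 0                 # switch to permissive for testing
sudo ausearch -m avc -ts recent   # then look for access denials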

@cdenneen
Contributor

I've narrowed down the issue to the following lines:

bootstrap:
  mlockall: true

These are what cause:

[2016-06-15 16:26:31,228][DEBUG][bootstrap                ] seccomp(SECCOMP_SET_MODE_FILTER): Function not implemented, falling back to prctl(PR_SET_SECCOMP)...
[2016-06-15 16:26:31,228][DEBUG][bootstrap                ] Linux seccomp filter installation successful, threads: [app]

If I leave everything else but remove those 2 lines... it works fine.
MAX_LOCKED_MEMORY=unlimited is still in the defaults... the 'infinity' in the systemd unit is still there...

@tylerjl
Contributor

tylerjl commented Jun 15, 2016

@cdenneen this is probably outside the scope of this ticket at this point - if you want to post a summary of what you've got so far on https://discuss.elastic.co/c/elasticsearch, we can probably get more eyes on it and pull in some ES devs if we need to.

@cdenneen
Contributor

@tylerjl thanks... I opened a support case.

@cdenneen
Contributor

@tylerjl the support ticket determined that a memory issue was the cause - at least 512MB is needed. The logging should be corrected to show something in this situation.
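
If that means the JVM heap, the fix in the hiera data above would be something like the following (the Vagrant box itself also needs enough RAM to back it, especially with mlockall locking the heap):

elasticsearch::init_defaults:
  ES_HEAP_SIZE: '512m'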

@tylerjl tylerjl added this to the 0.13.0 milestone Jul 27, 2016
@tylerjl
Contributor

tylerjl commented Jul 28, 2016

AFAIK, the only API access that requires username/password auth credentials is template management, and with the merge of #663 the top-level api_* parameters expose SSL, username, and password options for people using shield. As per normal practice, elasticsearch::template will inherit those top-level settings, but they can be set explicitly on template resources if needed.
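
For example, usage might look roughly like this (parameter names approximate; see #663 for the exact list):

class { 'elasticsearch':
  api_protocol            => 'https',
  api_basic_auth_username => 'admin',
  api_basic_auth_password => 'changeme',
  validate_tls            => false,
}

# templates inherit the class-level API settings unless overridden
elasticsearch::template { 'logstash':
  source => 'puppet:///modules/profile/logstash.json',
}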

If there are additional API options that need to be exposed aside from the API protocol, host, port, basic auth username and password, and whether or not to validate SSL certs, do raise an issue to address it.

@tylerjl tylerjl closed this as completed Jul 28, 2016