Shield integration #399

Closed
electrical opened this issue Jul 1, 2015 · 49 comments

@electrical

Shield enables authentication for API calls.
Since we do a few of those with the puppet module, we need to expose options to support that.
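
For illustration only, the exposed options might end up looking something like this (parameter names are hypothetical at this point):

class { 'elasticsearch':
  # hypothetical parameters for authenticated API calls
  api_basic_auth_username => 'es_admin',
  api_basic_auth_password => 'changeme',
}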

@cdenneen
Contributor

cdenneen commented Mar 1, 2016

@electrical any update?

@electrical
Author

@cdenneen I'm afraid I left the company a few weeks ago and someone else is taking over.

@cdenneen
Contributor

cdenneen commented Mar 2, 2016

@electrical thanks.

Do we know who this will be? Is there any way we can get even the basics of the shield and kibana modules published, with public contribution help to work on them (even if they aren't published modules right away)?
The problem right now is that with little insight into logstash 2.x support, beats, kibana, shield, etc., people are left fending for themselves, hacking forks of these modules that diverge significantly, or creating their own because of the time lag on most of these. (For example, I think the logstash module was getting its revamp after the 2.0 release of ES, with all its goodness and multi-instance support; that was about a year ago, and we're on 2.2.x and soon 3.x.)

Having these re-opened as public modules, like the original electrical ones were, might not be a bad idea if @elastic doesn't have the cycles to maintain them solely?

@tylerjl
Contributor

tylerjl commented Mar 3, 2016

Hi @cdenneen, sorry for the delayed response. The short story is that @Jarpy and I will be helping to further the excellent work @electrical has already done. As a prerequisite to diving into the guts of the module, we'd really like to give the CI/testing environment some love so it's back up to speed running the thorough test suite, which will give us some confidence moving forward.

Once that's done (and ideally in a place the community can access and interact with), we'll be in a great place to start actively accepting community PRs. The community work is greatly appreciated, and if we can put good testing infrastructure in place to lower the barrier to entry, community contributions can happen more quickly and confidently.

I'll try to provide as much transparency as possible as we transition some things; please let us know about additional needs as well! 💯

@cdenneen
Contributor

cdenneen commented Mar 5, 2016

@tylerjl thanks for the update
I think the modules we're waiting on are:

  • shield
  • kibana
  • logstash 2.x and up support

(I put shield at the top because I figured it would be a quick win to get an initial release out.)

I left filebeats off this list for now because @pcfens has done a great job of creating and maintaining one. So besides the above list of missing modules, I think the backlog of issues would be next in importance, ahead of a filebeats module.

Maybe create a bullet list as an issue to track the order of what we can expect?

@tylerjl
Contributor

tylerjl commented Mar 7, 2016

@cdenneen yep, that aligns pretty closely to the pulse I've gotten from others as well. Hopefully we can pick up the shield work efficiently to keep it moving.

I'm currently knee-deep in CI environment work, but hopefully we'll have a fully functioning CI pipeline Soon™.

@cdenneen
Contributor

@tylerjl Any luck in releasing the beta for the puppet-shield plugin?

@tylerjl
Contributor

tylerjl commented Apr 21, 2016

@cdenneen I'm happy to report I've been making lots of progress on Shield integration. I still need to put some work in on TLS management, but if anyone wants to offer feedback regarding design choices, please check out the branch diff and let me know if the design pattern looks good from an end-user perspective. I'm first and foremost interested to know whether this is usable for people.

I've gotten most of the CI shakiness sorted out, so Shield support is the primary focus right now.

@cdenneen
Contributor

@tylerjl awesome. Can you add realm support to that fork as well?
(http://blog.comperiosearch.com/blog/2015/08/21/elasticsearch-security-shield/)

@tylerjl
Contributor

tylerjl commented Apr 21, 2016

@cdenneen assuming you mean Active Directory realms, standing up a Windows test environment is obviously less straightforward than a Linux one. That said, I think at the very least managing the role mappings yaml file should be very doable now that I've written the native puppet provider for the yaml roles file. Again, I'm not sure what the most preferable way would be for people to define those mappings in puppet, so feel free to comment on the branch regarding how it's shaping up.
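
For illustration, a role with a realm mapping might be declared roughly like this (a sketch against the in-progress branch; resource and parameter names may change):

elasticsearch::shield::role { 'kibana_admins':
  privileges => {
    'cluster' => 'monitor',
    'indices' => {
      '.kibana*' => 'all',
    },
  },
  # hypothetical: map an LDAP/AD group DN onto this role
  mappings => [
    'cn=kibana_admins,ou=groups,dc=example,dc=com',
  ],
}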

@cregev

cregev commented Apr 21, 2016

Any update on when the PR is going to be merged?

@tylerjl
Contributor

tylerjl commented Apr 21, 2016

No ETA yet, but I'll be pushing updates regularly if anyone is anxious to watch progress.

@cdenneen
Contributor

@tylerjl sort of... AD is just one option for a realm; you could use just about anything (LDAP, AD, OAuth, SAML, etc.)... it's basically because of custom realms (https://www.elastic.co/guide/en/shield/current/custom-realms.html) that esusers is just one realm among many.
If your fork lets you define the realm type, it could handle esusers for the basic use case but also just about any other provider anyone wants to use for shield authentication.

So, for example, most of my auth will be esusers for applications, but I would probably secure Kibana with shield and then use AD/LDAP to authenticate users to Kibana.

@fgimian

fgimian commented May 2, 2016

Would love to see this too, so that we can easily manage local Shield users and LDAP configuration with Puppet 😄

For now, I'll write my own little module to manage these files.

Keep up the great work guys!
Fotis

@cdenneen
Contributor

cdenneen commented May 4, 2016

@tylerjl I see the mapping support in your fork... I think once the realm support is added to the elasticsearch.yml.erb this should be working, right?

@tylerjl
Contributor

tylerjl commented May 4, 2016

@cdenneen from what I understand, configuring realms is just a matter of defining the configuration in elasticsearch.yml.erb. If that's the case, you'd just need to set the config => parameter in puppet as needed, so there isn't anything realm-specific the module needs (other than the mappings support). Does that sound right? (This is helpful - I haven't done anything with AD in the past.)

@cdenneen
Contributor

cdenneen commented May 4, 2016

Yeah, I'm guessing a custom config block would work, and the result should look something like this:

shield.authc:
  realms:
    esusers:
      type: esusers
      order: 0
    ldap1:
      type: ldap
      order: 1
      enabled: false
      url: 'url_to_ldap1'
    ...
    ad1:
      type: active_directory
      order: 3
      url: 'url_to_ad'

Otherwise, maybe an if condition and loop in the erb to do a hiera lookup for all shield::authc::realms, similar to https://github.com/jfryman/puppet-nginx/blob/master/templates/vhost/location_header.erb#L31-L45?

@tylerjl
Contributor

tylerjl commented May 4, 2016

I'd prefer to avoid hiera lookups in the erb per Puppet Labs' recommendation, but defining realm configuration in hiera should be as simple as something like the following in a hiera yaml config:

---
elasticsearch::config:
  shield.authc:
    realms:
      esusers:
        type: esusers
...

The config on the elasticsearch class gets merged with instance configs so there's no need for hiera gymnastics to get a config into a defined instance in a manifest.
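
For example (a minimal sketch), realm settings defined once on the class flow into every instance:

class { 'elasticsearch':
  config => {
    'shield.authc' => {
      'realms' => {
        'esusers' => { 'type' => 'esusers', 'order' => 0 },
      },
    },
  },
}

# the instance's own config is merged with the class-level config above
elasticsearch::instance { 'es-01':
  config => { 'http.port' => '9200' },
}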

@cdenneen
Contributor

cdenneen commented May 4, 2016

Understood... makes sense... so what's left before your fork gets merged? Documentation updates?

@tylerjl
Contributor

tylerjl commented May 4, 2016

Just getting the tests passing; still some failures on different OS/Puppet combinations... close, though! 😅

@tylerjl
Contributor

tylerjl commented May 5, 2016

The PR is open at #624; do provide feedback if there are any parts that need changing.

@cdenneen
Contributor

cdenneen commented Jun 2, 2016

@tylerjl using 0.11.0 and getting

==> eslog: Error: Could not find a suitable provider for elasticsearch_shield_user_roles

@tylerjl
Contributor

tylerjl commented Jun 2, 2016

@cdenneen those types of errors crop up when the requirements for a provider haven't been met - my guess is that it comes from this line, which checks for the existence of the users_roles file.

Is /usr/share/elasticsearch/shield/users_roles on the target system? (assuming this isn't OpenBSD)

@cdenneen
Contributor

cdenneen commented Jun 2, 2016

@tylerjl no, it doesn't exist there:

[vagrant@eslog ~]$ ls /usr/share/elasticsearch/shield/*
/usr/share/elasticsearch/shield/roles.yml  /usr/share/elasticsearch/shield/users

but does here:

[vagrant@eslog ~]$ ls /etc/elasticsearch/shield/
logging.yml  role_mapping.yml  roles.yml  roles.yml.new  users  users_roles

@cdenneen
Contributor

cdenneen commented Jun 2, 2016

@tylerjl this is a fresh vagrant box (so no cruft there).
Is there something in the config hash I need to add in order for users_roles to be created under /usr/share/elasticsearch/shield? This is a 2.3.3 CentOS RPM install from the managed repo.

@tylerjl
Contributor

tylerjl commented Jun 2, 2016

So, my guess is that the shield directory exists there in /etc because the plugin was either installed 1) by hand or 2) by an earlier version of the module.

In version 0.11.0 of the module, plugin installations should drop configuration files into /usr/share/elasticsearch, which helps centralize the shield config files. Earlier versions (or running the plugin script by hand) assume your elasticsearch config lives under /etc/elasticsearch, which doesn't work with the puppet module's notion of instances; so the config files are centralized in /usr/share and recursively kept in sync with all instance directories (i.e., /etc/elasticsearch/es-01/shield).

In your case, you could either remove the old plugin installation and then define it in puppet to manage, or, if you just want to start from scratch without any of the pre-packaged roles/other files, you could always create users_roles by hand.

It's a bit of an ugly workaround, but it's a compromise between respecting the security manager's need to only read config files within the CONF_DIR and keeping /etc/elasticsearch consistent with directories representing instance CONF_DIRs.
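
In shell terms, the two options would look roughly like this (paths assume the default RPM layout shown above; @bflad confirms a variant of the first below):

# option 1: migrate the existing shield config to the centralized location
cp -R /etc/elasticsearch/shield /usr/share/elasticsearch/

# option 2: start from scratch with an empty users_roles file
touch /usr/share/elasticsearch/shield/users_roles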

@bflad
Contributor

bflad commented Jun 6, 2016

@tylerjl thanks for that last comment. In our environment we installed the shield plugin with an older version of this Puppet module, and we needed to run this before running Puppet with the newer module: cp -R /etc/elasticsearch/es-01/shield /usr/share/elasticsearch/. Worked like a charm afterwards!

@cdenneen
Contributor

cdenneen commented Jun 6, 2016

@bflad thanks, I haven't had time to test.
@tylerjl my concern here is that you can't always assume this will be a fresh install, so I would suggest some method of cleaning up the old shield plugin prior to the new plugin install, so that this works when the module is added to an already-built system - say, one running ES 1.7.3 and 0.10.0 of the module.
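
For example, a manual cleanup on an existing 2.x system might look roughly like this (untested sketch; on ES 1.x the plugin script takes --remove instead of remove):

# remove the previously installed plugin so the module can reinstall it
/usr/share/elasticsearch/bin/plugin remove shield

# relocate any old shield config to the new centralized location
mv /etc/elasticsearch/shield /usr/share/elasticsearch/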

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

@tylerjl There must still be an ordering issue here.
I rebuilt a fresh box using 0.11.0 and it worked... made some modifications, destroyed my vagrant boxes, went to rebuild, and got the same error:
==> eslogmd: Error: Could not find a suitable provider for elasticsearch_shield_user_roles

Here is the output of those directories as before:

[vagrant@eslogmd ~]$ ls -la /etc/elasticsearch/
total 12
drwxr-xr-x   3 elasticsearch elasticsearch   20 Jun  9 10:48 .
drwxr-xr-x. 84 root          root          8192 Jun  9 10:48 ..
drwxr-x---   2 root          elasticsearch    6 May 17 11:49 scripts

[vagrant@eslogmd ~]$ ls -la /usr/share/elasticsearch/shield/
total 0
drwxr-xr-x 2 root          root            6 Jun  9 10:48 .
drwxr-xr-x 8 elasticsearch elasticsearch 146 Jun  9 10:48 ..

The only large difference is that I changed the default elasticsearch::config in hiera to allow per-instance configs.

Here is all my data, in case it helps tell me where I made a stupid mistake ;-):

---
profile::elasticsearch::es_instances:
  - 'es-01'
  - 'es-02'
profile::elasticsearch::data::config:
  cluster.name: 'loggingtest'
  node.name: "%{::hostname}"
  index.number_of_shards: 5
  index.number_of_replicas: 1
  bootstrap.mlockall: true
  discovery.zen.ping.unicast.hosts:
    - 'eslogmd'
    - 'eslogm'
  discovery.zen.ping.multicast.enabled: false
  node.data: true
  node.master: false
  http.port: '9200'
profile::elasticsearch::master::config:
  cluster.name: 'loggingtest'
  node.name: "%{::hostname}"
  index.number_of_shards: 5
  index.number_of_replicas: 1
  bootstrap.mlockall: true
  discovery.zen.ping.unicast.hosts:
    - 'eslogmd'
    - 'eslogm'
  discovery.zen.ping.multicast.enabled: false
  node.data: false
  node.master: true
  http.port: '9201'
elasticsearch::init_defaults:
  ES_HOME: '/usr/share/elasticsearch'
  MAX_OPEN_FILES: '65535'
  MAX_MAP_COUNT: '262144'
  LOG_DIR: '/var/log/elasticsearch'
  DATA_DIR: '/data/elasticsearch'
  WORK_DIR: '/tmp/elasticsearch'
  ES_USER: 'elasticsearch'
  ES_GROUP: 'elasticsearch'
  ES_JAVA_OPTS: '-Djava.net.preferIPv4Stack=true'
  ES_HEAP_SIZE: '128m'
  MAX_LOCKED_MEMORY: 'unlimited'
elasticsearch::java: true
elasticsearch::datadir: '/data/elasticsearch'
elasticsearch::status: 'enabled'

class profile::elasticsearch(
  $es_instances = 'es-01'
) {
  class {'::elasticsearch':
    manage_repo   => true,
    repo_version  => '2.x',
    java_package  => 'java-1.8.0-openjdk-headless',
    package_pin   => true,
    version       => '2.3.3',
  }
  ::elasticsearch::plugin{'mobz/elasticsearch-head':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'lmenezes/elasticsearch-kopf':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'royrusso/elasticsearch-hq':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'license':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'shield':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'watcher':
    instances => $es_instances
  }
  ::elasticsearch::plugin{'marvel-agent':
    instances => $es_instances
  }
  elasticsearch::shield::user { 'admin':
    password => 'changeme',
    roles    => ['admin'],
  }
}
class profile::elasticsearch::master(
  $config = {}
) {
  include ::profile::elasticsearch
  ::elasticsearch::instance {'es-02':
    config => $config
  }
  firewall { '101 ES MASTER HTTP':
    proto  => 'tcp',
    action => 'accept',
    dport  => ['9201','9301'],
  }
}
class profile::elasticsearch::data(
  $config = {}
) {
  include ::profile::elasticsearch
  ::elasticsearch::instance {'es-01':
    config => $config
  }
  firewall { '100 ES HTTP':
    proto  => 'tcp',
    action => 'accept',
    dport  => ['9200','9300'],
  }
}

@tylerjl
Contributor

tylerjl commented Jun 9, 2016

Regarding the assumption about previous installations, you're right - we should handle that case. It may be tricky; I'll have to think on that one.

Regarding the ordering issue: that's strange; as seen here, the resource ordering explicitly enforces that the plugin should be installed before any shield users/roles are handled:

...
    -> Elasticsearch::Plugin <| |>
    -> Elasticsearch::Shield::Role <| |>
    -> Elasticsearch::Shield::User <| |>
...

One ordering issue that may come up in that manifest is that shield will fail to install unless the license plugin is there first, but it sounds like that isn't happening. I wonder if any other errors would show up if you got rid of the elasticsearch::shield::user resource and one of the plugins failed to install, etc.

Agreed about the hiera config; I don't think that would be related to the issue you're seeing.

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

From the clean run, here are the errors in the order received:

==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[mobz/elasticsearch-head]/Elasticsearch_plugin[mobz/elasticsearch-head]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[lmenezes/elasticsearch-kopf]/Elasticsearch_plugin[lmenezes/elasticsearch-kopf]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[royrusso/elasticsearch-hq]/Elasticsearch_plugin[royrusso/elasticsearch-hq]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[license]/Elasticsearch_plugin[license]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[shield]/Elasticsearch_plugin[shield]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[watcher]/Elasticsearch_plugin[watcher]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Plugin[marvel-agent]/Elasticsearch_plugin[marvel-agent]: Could not evaluate: undefined method `scan' for nil:NilClass
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Shield::User[admin]/Elasticsearch_shield_user[admin]: Provider esusers is not functional on this host
==> eslogmd: Error: /Stage[main]/Profile::Elasticsearch/Elasticsearch::Shield::User[jaguar]/Elasticsearch_shield_user[jaguar]: Provider esusers is not functional on this host
==> eslogmd: Error: Could not find a suitable provider for elasticsearch_shield_user_roles

The only other thing I can think of is the $es_instances automatic parameter lookup (APL) for the profiles?
Anything you'd like me to try next?

@tylerjl
Contributor

tylerjl commented Jun 9, 2016

Ah, so the plugins aren't getting installed. That'll definitely cause problems. The specific error comes from this scan call, which scans the output string looking for the ES version. So something goes wrong when the plugin provider tries to call elasticsearch -version.

I'd dig into how that call behaves (inside and outside puppet), it may be a Java issue of some sort.
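
For example, reproducing the call the provider makes (assuming the default install path; the exact flag spelling may vary by ES version):

# run the same version check the plugin provider performs
sudo -u elasticsearch /usr/share/elasticsearch/bin/elasticsearch --version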

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

@tylerjl I think the issue might be that I have some class parameters for elasticsearch in profile::elasticsearch and some in hiera, like elasticsearch::java_install: true.
Can you not mix parameters like this?
I have deep_merge on in hiera.yaml, if that helps.
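
(For reference, deep merging in hiera 3 needs the deep_merge gem plus roughly this in hiera.yaml:)

:merge_behavior: deeper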

@cdenneen
Contributor

cdenneen commented Jun 9, 2016

@tylerjl merging wasn't happening because of a typo... that's fixed now, java is installing, and the rest is working... just getting an odd start error, with nothing logged to any of the logs:

[vagrant@eslogmd ~]$ sudo service elasticsearch-es-01 status
Redirecting to /bin/systemctl status  elasticsearch-es-01.service
● elasticsearch-es-01.service - Starts and stops a single elasticsearch instance on this system
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-es-01.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Thu 2016-06-09 17:42:23 EDT; 2s ago
     Docs: http://www.elasticsearch.org
  Process: 14767 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-01.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR} (code=exited, status=0/SUCCESS)
 Main PID: 14778 (code=killed, signal=KILL)

Jun 09 17:42:22 eslogmd.vm.local systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Jun 09 17:42:22 eslogmd.vm.local systemd[1]: PID file /var/run/elasticsearch/elasticsearch-es-01.pid not readable (yet?) after start.
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: elasticsearch-es-01.service: main process exited, code=killed, status=9/KILL
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: Unit elasticsearch-es-01.service entered failed state.
Jun 09 17:42:23 eslogmd.vm.local systemd[1]: elasticsearch-es-01.service failed.

@tylerjl
Contributor

tylerjl commented Jun 10, 2016

Progress! 🏁

What shows up in the Elasticsearch instance's logs in /var/log? If the systemd status output doesn't have anything, there's bound to be something in some of the separate instance logs...

@cdenneen
Contributor

cdenneen commented Jun 10, 2016

All the logs are 0 bytes. I'll need to execute it manually to debug, I'm afraid.

Update: starting through systemctl isn't working, but starting manually works. Is there something wrong with the systemd service file?

[root@eslogmd systemd]# more system/multi-user.target.wants/elasticsearch-es-02.service
[Unit]
Description=Starts and stops a single elasticsearch instance on this system
Documentation=http://www.elasticsearch.org

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/elasticsearch-es-02
User=elasticsearch
Group=elasticsearch
PIDFile=/var/run/elasticsearch/elasticsearch-es-02.pid
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR}
 -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR}
# See MAX_OPEN_FILES in sysconfig
LimitNOFILE=65535
# See MAX_LOCKED_MEMORY in sysconfig, use "infinity" when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true

LimitMEMLOCK=infinity

# Shutdown delay in seconds, before process is tried to be killed with KILL (if configured)
TimeoutStopSec=20

[Install]
WantedBy=multi-user.target
[root@eslogmd systemd]# ps -ef |grep elastic
root     15267 14810  0 10:09 pts/0    00:00:00 grep --color=auto elastic
[root@eslogmd systemd]# su -s /bin/bash -c "source /etc/sysconfig/elasticsearch-es-02; /usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR}" elasticsearch
[root@eslogmd systemd]# ps -ef |grep elastic
elastic+ 15281     1 97 10:10 ?        00:00:08 /bin/java -Xms256m -Xmx1g -Djava.awt.headless=true -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError -XX:+DisableExplicitGC -Dfile.encoding=UTF-8 -Djna.nosys=true -Des.path.home=/usr/share/elasticsearch -cp /usr/share/elasticsearch/lib/elasticsearch-2.3.3.jar:/usr/share/elasticsearch/lib/* org.elasticsearch.bootstrap.Elasticsearch start -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=/usr/share/elasticsearch -Des.default.path.logs=/var/log/elasticsearch/es-02 -Des.default.path.data=/data/elasticsearch -Des.default.path.work=/tmp/elasticsearch -Des.default.path.conf=/etc/elasticsearch/es-02
root     15297 14810  0 10:10 pts/0    00:00:00 grep --color=auto elastic
[root@eslogmd systemd]# systemctl status elasticsearch-es-02
● elasticsearch-es-02.service - Starts and stops a single elasticsearch instance on this system
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-es-02.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Fri 2016-06-10 10:06:51 EDT; 3min 51s ago
     Docs: http://www.elasticsearch.org
  Process: 15214 ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-es-02.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR} -Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR} (code=exited, status=0/SUCCESS)
 Main PID: 15225 (code=killed, signal=KILL)

Jun 10 10:06:49 eslogmd.vm.local systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Jun 10 10:06:49 eslogmd.vm.local systemd[1]: PID file /var/run/elasticsearch/elasticsearch-es-02.pid not readable (yet?) after start.
Jun 10 10:06:50 eslogmd.vm.local systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Jun 10 10:06:51 eslogmd.vm.local systemd[1]: elasticsearch-es-02.service: main process exited, code=killed, status=9/KILL
Jun 10 10:06:51 eslogmd.vm.local systemd[1]: Unit elasticsearch-es-02.service entered failed state.
Jun 10 10:06:51 eslogmd.vm.local systemd[1]: elasticsearch-es-02.service failed.
[root@eslogmd systemd]# ls -la /var/run/elasticsearch/elasticsearch-es-02.pid
-rw-r--r-- 1 elasticsearch elasticsearch 5 Jun 10 10:10 /var/run/elasticsearch/elasticsearch-es-02.pid
[root@eslogmd systemd]# more /var/run/elasticsearch/elasticsearch-es-02.pid
15281

Adding the init defaults just for good measure:

CONF_DIR=/etc/elasticsearch/es-02
CONF_FILE=/etc/elasticsearch/es-02/elasticsearch.yml
DATA_DIR=/data/elasticsearch
ES_GROUP=elasticsearch
ES_HEAP_SIZE=128m
ES_HOME=/usr/share/elasticsearch
ES_JAVA_OPTS=-Djava.net.preferIPv4Stack=true
ES_USER=elasticsearch
LOG_DIR=/var/log/elasticsearch/es-02
MAX_LOCKED_MEMORY=unlimited
MAX_MAP_COUNT=262144
MAX_OPEN_FILES=65535
WORK_DIR=/tmp/elasticsearch

@cdenneen
Contributor

@tylerjl looks like it's this init_defaults parameter:

MAX_LOCKED_MEMORY: 'unlimited'

Any idea why?

@tylerjl
Contributor

tylerjl commented Jun 10, 2016

I found some discussion about this in the past. It looks like the unit file itself explains what you need to set if you've configured mlockall to be true and instructed ES to lock all memory - basically, LimitMEMLOCK needs to be "infinity" (note that I haven't tried this; I'm just reading the documentation comments in the unit file).

@cdenneen
Contributor

@tylerjl are you sure?
It says that if MAX_LOCKED_MEMORY is set to 'unlimited', as I have it, it will write out 'infinity' in the unit file.
Am I reading that condition correctly?
Maybe it's worth checking with the ES team that 'infinity' actually works?

@tylerjl
Contributor

tylerjl commented Jun 13, 2016

@cdenneen do you have a distro and version so I can try and emulate what you're seeing? I tried setting up an instance with the same mlockall and MAX_LOCKED_MEMORY settings in CentOS 7 but didn't see the same behavior.

@cdenneen
Contributor

@tylerjl CentOS Linux release 7.2.1511 (Core)

sysconfig

CONF_DIR=/etc/elasticsearch/data
CONF_FILE=/etc/elasticsearch/data/elasticsearch.yml
DATA_DIR=/var/lib/elasticsearch
ES_GROUP=elasticsearch
ES_HEAP_SIZE=128m
ES_HOME=/usr/share/elasticsearch
ES_USER=elasticsearch
LOG_DIR=/var/log/elasticsearch/data
MAX_LOCKED_MEMORY=unlimited
MAX_OPEN_FILES=65535

systemd

[Unit]
Description=Starts and stops a single elasticsearch instance on this system
Documentation=http://www.elasticsearch.org

[Service]
Type=forking
EnvironmentFile=/etc/sysconfig/elasticsearch-data
User=elasticsearch
Group=elasticsearch
PIDFile=/var/run/elasticsearch/elasticsearch-data.pid
ExecStart=/usr/share/elasticsearch/bin/elasticsearch -d -p /var/run/elasticsearch/elasticsearch-data.pid -Des.default.path.home=${ES_HOME} -Des.default.path.logs=${LOG_DIR} -Des.default.path.data=${DATA_DIR}
-Des.default.path.work=${WORK_DIR} -Des.default.path.conf=${CONF_DIR}
# See MAX_OPEN_FILES in sysconfig
LimitNOFILE=65535
# See MAX_LOCKED_MEMORY in sysconfig, use "infinity" when MAX_LOCKED_MEMORY=unlimited and using bootstrap.mlockall: true

LimitMEMLOCK=infinity

# Shutdown delay in seconds, before process is tried to be killed with KILL (if configured)
TimeoutStopSec=20

[Install]
WantedBy=multi-user.target

Looks like it's doing what it needs to, but when MAX_LOCKED_MEMORY is set in the init_defaults, systemctl status shows:

[vagrant@eslogd ~]$ sudo systemctl status elasticsearch-data
● elasticsearch-data.service - Starts and stops a single elasticsearch instance on this system
   Loaded: loaded (/usr/lib/systemd/system/elasticsearch-data.service; enabled; vendor preset: disabled)
   Active: failed (Result: signal) since Tue 2016-06-14 11:40:57 EDT; 7min ago
     Docs: http://www.elasticsearch.org
 Main PID: 12142 (code=killed, signal=KILL)

Jun 14 11:40:53 eslogd.vm.local systemd[1]: Starting Starts and stops a single elasticsearch instance on this system...
Jun 14 11:40:54 eslogd.vm.local systemd[1]: PID file /var/run/elasticsearch/elasticsearch-data.pid not readable (yet?) after start.
Jun 14 11:40:54 eslogd.vm.local systemd[1]: Started Starts and stops a single elasticsearch instance on this system.
Jun 14 11:40:57 eslogd.vm.local systemd[1]: elasticsearch-data.service: main process exited, code=killed, status=9/KILL
Jun 14 11:40:57 eslogd.vm.local systemd[1]: Unit elasticsearch-data.service entered failed state.
Jun 14 11:40:57 eslogd.vm.local systemd[1]: elasticsearch-data.service failed.

and the logs are all 0 bytes:

[vagrant@eslogd ~]$ ls -la /var/log/elasticsearch/data/
total 4
drwxr-xr-x 2 elasticsearch root          4096 Jun 14 11:40 .
drwxr-xr-x 3 elasticsearch elasticsearch   17 Jun 14 11:40 ..
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging-access.log
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging_index_indexing_slowlog.log
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging_index_search_slowlog.log
-rw-r--r-- 1 elasticsearch elasticsearch    0 Jun 14 11:40 vagrant-logging.log

I've created a gist with all puppet and hieradata if it helps: https://gist.github.com/cdenneen/e43be80ed3c429c94e2f30767ef603b0

@tylerjl
Copy link
Contributor

tylerjl commented Jun 14, 2016

I got an instance up and running with the sysconfig and yaml config you've got, so I'm not really sure what could be happening. I'd suggest maybe bumping up the logging verbosity level to see if you can get Elasticsearch to spit out any additional information, as from the Puppet side everything seems to be writing configurations correctly.
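
For example, in the instance's logging.yml (2.x-style logging config), something like:

# raise the root logger level to capture more startup detail
es.logger.level: DEBUG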

@cdenneen
Contributor

cdenneen commented Jun 14, 2016

Setting the instance to DEBUG leaves me with these in the log file:

[2016-06-14 17:49:26,039][DEBUG][bootstrap                ] seccomp(SECCOMP_SET_MODE_FILTER): Function not implemented, falling back to prctl(PR_SET_SECCOMP)...
[2016-06-14 17:49:26,043][DEBUG][bootstrap                ] Linux seccomp filter installation successful, threads: [app]

Does your instance have SELinux enabled?

@tylerjl
Contributor

tylerjl commented Jun 14, 2016

I doubt it - maybe try disabling it across the board and seeing if that fixes things; if so, we can try and determine which access violations are hindering service startup.
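
For example, with the standard SELinux tooling:

getenforce                        # check the current SELinux mode
sudo setenforce 0                 # switch to permissive for testing
sudo ausearch -m avc -ts recent   # then look for access denials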

@cdenneen
Contributor

I've narrowed down the issue to the following lines:

bootstrap:
  mlockall: true

These are what cause:

[2016-06-15 16:26:31,228][DEBUG][bootstrap                ] seccomp(SECCOMP_SET_MODE_FILTER): Function not implemented, falling back to prctl(PR_SET_SECCOMP)...
[2016-06-15 16:26:31,228][DEBUG][bootstrap                ] Linux seccomp filter installation successful, threads: [app]

If I leave everything else but remove those 2 lines... it works fine.
MAX_LOCKED_MEMORY=unlimited is still in the defaults... the 'infinity' in the systemd unit is still there...

@tylerjl
Contributor

tylerjl commented Jun 15, 2016

@cdenneen this is probably outside the scope of this ticket at this point - if you want to post a summary of what you've got so far on https://discuss.elastic.co/c/elasticsearch, we can probably get more eyes on it and pull in some ES devs if we need to.

@cdenneen
Contributor

@tylerjl thanks... I opened a support case.

@cdenneen
Contributor

@tylerjl the support ticket determined that a memory issue was the cause - at least 512MB is needed. The logging should be corrected to show something in this situation.
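
If that means the JVM heap, the fix in the hiera data above would be something like the following (the Vagrant box itself also needs enough RAM to back it, especially with mlockall locking the heap):

elasticsearch::init_defaults:
  ES_HEAP_SIZE: '512m'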

@tylerjl tylerjl added this to the 0.13.0 milestone Jul 27, 2016
@tylerjl
Contributor

tylerjl commented Jul 28, 2016

AFAIK, the only API access that requires username/password auth credentials is template management, and with the merge of #663 the top-level api_* parameters expose SSL, username, and password options for people using shield. As per normal practice, elasticsearch::template will inherit those top-level settings, but they can be set explicitly on template resources if needed.
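
For example, usage might look roughly like this (parameter names approximate; see #663 for the exact list):

class { 'elasticsearch':
  api_protocol            => 'https',
  api_basic_auth_username => 'admin',
  api_basic_auth_password => 'changeme',
  validate_tls            => false,
}

# templates inherit the class-level API settings unless overridden
elasticsearch::template { 'logstash':
  source => 'puppet:///modules/profile/logstash.json',
}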

If there are additional API options that need to be exposed aside from the API protocol, host, port, basic auth username and password, and whether or not to validate SSL certs, do raise an issue to address it.

@tylerjl tylerjl closed this as completed Jul 28, 2016