This repository has been archived by the owner on Jun 1, 2023. It is now read-only.

Use a healthcheck as recommended by the puppet-grafana docs #26

Closed
wants to merge 1 commit into from

genebean
Contributor

@genebean genebean commented Jan 3, 2019

This builds on #25. It implements some additional ordering of resources so that ones that require the API are not run before the API is up.
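A minimal sketch of that kind of ordering, assuming the `http_conn_validator` type from the puppet-healthcheck module (the branch name suggests this is what the PR uses). The resource title, test URL, port, and datasource values below are illustrative and not taken from the diff:

```puppet
# Assumes the grafana class is declared elsewhere and that Grafana listens
# on localhost:3000; both are illustrative values.
http_conn_validator { 'grafana-conn-validator':
  host     => 'localhost',
  port     => '3000',
  use_ssl  => false,
  test_url => '/public/img/grafana_icon.svg',
  require  => Class['grafana'],
}
-> grafana_datasource { 'influxdb_telegraf':
  grafana_url      => 'http://localhost:3000',
  grafana_user     => 'admin',
  grafana_password => 'admin',
  type             => 'influxdb',
  url              => 'http://localhost:8086',
  database         => 'telegraf',
  access_mode      => 'proxy',
  is_default       => true,
}
```

The chaining arrow means the datasource is only evaluated after the validator has reached the API, so a slow service start shows up as a retry instead of a failed resource.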

@genebean genebean added the WIP Work in progress label Jan 4, 2019
@genebean genebean removed the WIP Work in progress label Jan 4, 2019
@genebean
Contributor Author

genebean commented Jan 4, 2019

Should have done this earlier, my mistake. Tests should pass now:

$ pdk test unit
pdk (INFO): Using Ruby 2.5.1
pdk (INFO): Using Puppet 6.0.2
[✔] Preparing to run the unit tests.
[✔] Running unit tests.
  Evaluated 343 tests in 126.045424 seconds: 0 failures, 0 pending.

Total resources:   58
Touched resources: 58
Resource coverage: 100.00%

@genebean
Contributor Author

genebean commented Jan 4, 2019

Annnd now it seems that the node declaration in spec/classes/init_spec.rb is being ignored on Travis...

expected that the catalogue would contain Class[puppet_metrics_dashboard] with master_list set to ["testhost.example.com"] but it is set to ["travis-job-3a96c840-be26-47d9-b045-fe7ef1499c0c.c.eco-emissary-99515.internal"], and parameter puppetdb_list set to ["testhost.example.com"] but it is set to ["travis-job-3a96c840-be26-47d9-b045-fe7ef1499c0c.c.eco-emissary-99515.internal"]

@genebean
Contributor Author

Testing:

Fresh CentOS 7, all defaults

~ » sudo -i puppet apply -e "include puppet_metrics_dashboard"
Notice: Compiled catalog for localhost.localdomain in environment production in 0.43 seconds
Notice: /Stage[main]/Puppet_metrics_dashboard::Repos/Yumrepo[influxdb]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Repos/Yumrepo[grafana-repo]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Install/Package[influxdb]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Service/Service[influxdb]/ensure: ensure changed 'stopped' to 'running'
Notice: Unable to connect to the server (http://localhost:8086): Failed to open TCP connection to localhost:8086 (Address family not supported by protocol - socket(2) for "localhost" port 8086)
Notice: Failed to make an HTTP connection; sleeping 10 seconds before retry
Notice: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Exec[create influxdb admin user]/returns: executed successfully
Notice: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Exec[create influxdb puppet_metrics database telegraf]/returns: executed successfully
Notice: /Stage[main]/Grafana::Install/Package[fontconfig]/ensure: created
Notice: /Stage[main]/Grafana::Install/Package[grafana]/ensure: created
Notice: /Stage[main]/Grafana::Config/File[/etc/grafana/grafana.ini]/content: content changed '{md5}f2042cbb229fba6b450b339ad549b6e6' to '{md5}719ef8c42415b35ed434b2f0e3a8e945'
Notice: /Stage[main]/Grafana::Config/File[/etc/grafana/grafana.ini]/owner: owner changed 'root' to 'grafana'
Notice: /Stage[main]/Grafana::Config/File[/etc/grafana/grafana.ini]/seluser: seluser changed 'unconfined_u' to 'system_u'
Notice: /Stage[main]/Grafana::Config/File[/var/lib/grafana/plugins]/ensure: created
Notice: /Stage[main]/Grafana::Service/Service[grafana-server]/ensure: ensure changed 'stopped' to 'running'
Notice: /Stage[main]/Puppet_metrics_dashboard::Grafana/Exec[update Grafana admin password]: Triggered 'refresh' from 1 event
Notice: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Grafana_datasource[influxdb_telegraf]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Install/Package[telegraf]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - interval]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - collection_jitter]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - flush_interval]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - flush_jitter]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - precision]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - logfile]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - hostname]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - urls]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - database]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - retention_policy]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - write_consistency]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - timeout]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/File[/etc/telegraf/telegraf.d/puppet_metrics_dashboard.conf]/ensure: defined content as '{md5}342ce8ae5c4114f945f2d5e39b0b798a'
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Service/Service[telegraf]/ensure: ensure changed 'stopped' to 'running'
Notice: Applied catalog in 59.50 seconds
------------------------------------------------------------
~ » sudo -i puppet apply -e "include puppet_metrics_dashboard"
Notice: Compiled catalog for localhost.localdomain in environment production in 0.40 seconds
Notice: Applied catalog in 1.37 seconds
------------------------------------------------------------

Fresh CentOS 7, custom password, add dashboards

~ » sudo -i puppet apply -e "class{'puppet_metrics_dashboard': grafana_password => 'puppet123', overwrite_dashboards => false, add_dashboard_examples => true,}"
Notice: Compiled catalog for localhost.localdomain in environment production in 0.42 seconds
Notice: /Stage[main]/Puppet_metrics_dashboard::Repos/Yumrepo[influxdb]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Repos/Yumrepo[grafana-repo]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Install/Package[influxdb]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Service/Service[influxdb]/ensure: ensure changed 'stopped' to 'running'
Notice: Unable to connect to the server (http://localhost:8086): Failed to open TCP connection to localhost:8086 (Address family not supported by protocol - socket(2) for "localhost" port 8086)
Notice: Failed to make an HTTP connection; sleeping 10 seconds before retry
Notice: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Exec[create influxdb admin user]/returns: executed successfully
Notice: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Exec[create influxdb puppet_metrics database telegraf]/returns: executed successfully
Notice: /Stage[main]/Grafana::Install/Package[fontconfig]/ensure: created
Notice: /Stage[main]/Grafana::Install/Package[grafana]/ensure: created
Notice: /Stage[main]/Grafana::Config/File[/etc/grafana/grafana.ini]/content: content changed '{md5}f2042cbb229fba6b450b339ad549b6e6' to '{md5}719ef8c42415b35ed434b2f0e3a8e945'
Notice: /Stage[main]/Grafana::Config/File[/etc/grafana/grafana.ini]/owner: owner changed 'root' to 'grafana'
Notice: /Stage[main]/Grafana::Config/File[/etc/grafana/grafana.ini]/seluser: seluser changed 'unconfined_u' to 'system_u'
Notice: /Stage[main]/Grafana::Config/File[/var/lib/grafana/plugins]/ensure: created
Notice: /Stage[main]/Grafana::Service/Service[grafana-server]/ensure: ensure changed 'stopped' to 'running'
Notice: /Stage[main]/Puppet_metrics_dashboard::Grafana/Exec[update Grafana admin password]: Triggered 'refresh' from 1 event
Notice: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Grafana_datasource[influxdb_telegraf]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Dashboards/File[/opt/puppetlabs/puppet/cache/state/overwrite_dashboards_disabled]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Dashboards::Telegraf/Grafana_dashboard[Telegraf PuppetDB Performance]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Dashboards::Telegraf/Grafana_dashboard[Telegraf PuppetDB Workload]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Dashboards::Telegraf/Grafana_dashboard[Telegraf Puppetserver Performance]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Dashboards::Telegraf/Grafana_dashboard[Telegraf File Sync Metrics]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Install/Package[telegraf]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - interval]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - collection_jitter]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - flush_interval]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - flush_jitter]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - precision]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - logfile]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf agent - hostname]/value: value changed [redacted sensitive information] to [redacted sensitive information]
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - urls]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - database]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - retention_policy]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - write_consistency]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/Ini_setting[telegraf outputs.influxdb - timeout]/ensure: created
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Config/File[/etc/telegraf/telegraf.d/puppet_metrics_dashboard.conf]/ensure: defined content as '{md5}342ce8ae5c4114f945f2d5e39b0b798a'
Notice: /Stage[main]/Puppet_metrics_dashboard::Telegraf::Service/Service[telegraf]/ensure: ensure changed 'stopped' to 'running'
Notice: Applied catalog in 66.27 seconds
------------------------------------------------------------
~ » sudo -i puppet apply -e "class{'puppet_metrics_dashboard': grafana_password => 'puppet123', overwrite_dashboards => false, add_dashboard_examples => true,}"
Notice: Compiled catalog for localhost.localdomain in environment production in 0.43 seconds
Notice: Applied catalog in 1.31 seconds
------------------------------------------------------------

I logged into the web interface to verify the password and the dashboards.

Contributor

@suckatrash suckatrash left a comment

Did some testing just now and ran into a few issues.

manifests/repos.pp Outdated Show resolved Hide resolved

exec { 'update Grafana admin password':
path => '/usr/bin',
command => @("CHANGE_GRAFANA_PW"),
Contributor

Still seeing issues with this on the initial run

Error: /Stage[main]/Puppet_metrics_dashboard::Grafana/Exec[update Grafana admin password]: Failed to call refresh: 'curl -X PUT -H "Content-Type: application/json" -d '{
  "oldPassword": "admin",
  "newPassword": "admin",
  "confirmNew": "admin"
}' http://admin:admin@localhost:8080/api/user/password
' returned 7 instead of one of [0]

and attempting to change the password:

Error: /Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Grafana_datasource[influxdb_telegraf]: Could not evaluate: Fail to retrieve datasources (HTTP response: 401/{"message":"Invalid username or password"})

Contributor Author

What's your test methodology here? I ask because, as shown in the output above, a single pass works fine on CentOS 7.

Contributor

This is a vmpooler host (Debian 9) running against a 2018.1.5 master. I did notice that the compiled catalog looks a little odd wrt the exec resource in the puppet_metrics_dashboard::grafana class:

            "parameters": {
                "command": "curl -X PUT -H \"Content-Type: application/json\" -d '{\n  \"oldPassword\": \"admin\",\n  \"newPassword\": \"password\",\n  \"confirmNew\": \"password\"\n}' http://admin:admin@localhost:3000/api/user/password\n",
                "cwd": "/usr/share/grafana",
                "path": "/usr/bin",
                "refreshonly": true
            },

Maybe the line breaks are the problem?

Contributor Author

It shouldn't be, but at least now I've got an idea of what to look at.

@genebean genebean force-pushed the http_conn_validator branch 2 times, most recently from c535d39 to 55a5521 Compare February 21, 2019 16:24
@genebean
Contributor Author

@suckatrash Sorry for the hiatus! I have just rebased this on current master. Will you see if you are still having problems? If the admin stuff is still messing up it might be better to abandon that aspect of this and just use the task that I got added to the Grafana module via voxpupuli/puppet-grafana#148

@suckatrash
Contributor

Just tested this, and I think that during the agent run puppet_metrics_dashboard::post_start_configs is applied before puppet_metrics_dashboard::grafana (where the password is set). post_start_configs authenticates with puppet_metrics_dashboard::grafana_password, which is the new password and has not been set yet, so it fails.

Cool that you got a task added! Want to just go with the healthcheck changes since that's a nice improvement? Or maybe some added ordering could solve the problem above?
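The added ordering suggested above could be sketched as follows. The class names come from the comment above; whether a plain chaining arrow is enough, or fits the module's existing dependency graph without creating a cycle, is an assumption:

```puppet
# Apply the class that sets the admin password before the class that
# talks to the Grafana API using the new password.
# Both classes must already be declared in the catalog.
Class['puppet_metrics_dashboard::grafana']
-> Class['puppet_metrics_dashboard::post_start_configs']
```

Without an explicit relationship like this, Puppet is free to apply the two classes in either order, which would be consistent with a run passing on one host and failing on another.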

@genebean
Contributor Author

Running some tests now to see if some simple ordering will help

@genebean genebean force-pushed the http_conn_validator branch 3 times, most recently from bd1c2ff to 9610367 Compare March 25, 2019 15:23
@genebean
Contributor Author

@suckatrash try this one please

@suckatrash
Contributor

@genebean I'm getting the same error as before :(

Error: 
/Stage[main]/Puppet_metrics_dashboard::Post_start_configs/Grafana_datasource[influxdb_telegraf]: 
Could not evaluate: Fail to retrieve datasources (HTTP response: 401/{"message":"Invalid username 
or password"})

@genebean
Contributor Author

Can we pair on this, @suckatrash? There seems to be a serious disconnect between our two testing methods.

Implement setting the admin password
- Setting occurs during installation or update.
- This fixes puppetlabs#24

Use a healthcheck as recommended by the docs
- This implements some additional ordering of resources so that ones that
  require the API are not run before the API is up.

Also added 'descr' to yum repo to make the yum command happy

@suckatrash
Contributor

@genebean - this looks good now. I think the only thing we're missing is a README update to reflect how the password change works: the password will be set during the initial install; after that, you'll have to trigger a refresh by making some other config change (setting or unsetting SSL would work, as would changing Grafana's port).
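The refresh behaviour described above is what a `refreshonly` exec gives you. A rough sketch, assuming the exec subscribes to the grafana class: the `subscribe` target, the `$new_password` variable, and the single-line payload are illustrative, while the path, URL, and JSON keys are copied from the catalog excerpt earlier in this thread:

```puppet
# Sketch only: the subscribe target and variable names are assumptions.
$new_password = 'password'
$payload = "{\"oldPassword\": \"admin\", \"newPassword\": \"${new_password}\", \"confirmNew\": \"${new_password}\"}"

exec { 'update Grafana admin password':
  path        => '/usr/bin',
  command     => "curl -X PUT -H 'Content-Type: application/json' -d '${payload}' http://admin:admin@localhost:3000/api/user/password",
  # Runs only when a subscribed resource reports a change, for example on
  # first install or when grafana.ini is rewritten. An unchanged run never
  # re-sends the request.
  refreshonly => true,
  subscribe   => Class['grafana'],
}
```

Under this arrangement, changing only the password parameter produces no event in the subscribed class, which is why an unrelated config change is needed to trigger the exec again.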

@genebean
Contributor Author

Given that, should we pull the password stuff back out and just let people change it post-install using one of the fully supported methods?

@suckatrash
Contributor

I just found out that in Grafana 6 you'll be prompted to change the password from the default when you log in for the first time. That means that a user might go with a default configuration, log in, change the password and then end up with failing puppet runs until they update the grafana_password parameter.

With that in mind I think we might need a password change mechanism built in.

@runejuhl

runejuhl commented Nov 7, 2019

> I just found out that in Grafana 6 you'll be prompted to change the password from the default when you log in for the first time.

You should be able to avoid that by setting these environment variables:

GF_AUTH_DISABLE_LOGIN_FORM=true
GF_AUTH_ANONYMOUS_ENABLED=true
GF_AUTH_ANONYMOUS_ORG_ROLE="Admin"

This post has some more info on configuring Dockerized Grafana: https://riamf.github.io/posts/dockerized_grafana_setup/

@gavindidrichsen gavindidrichsen changed the base branch from master to main May 7, 2021 08:28
@gavindidrichsen gavindidrichsen requested a review from a team as a code owner May 7, 2021 08:28
@CLAassistant

CLAassistant commented Sep 1, 2021

CLA assistant check
All committers have signed the CLA.

@MartyEwings
Contributor

@genebean is this still relevant, or has everything moved on too much?

@jarretlavallee
Contributor

This would be a welcome addition but this PR is stale with no updates for over 2 years. I am closing this PR, but please reopen it with changes or open a new one with this feature.

7 participants