Custom facts provided by module slow and thrash glusterd #137

Open
sammcj opened this issue Sep 13, 2017 · 3 comments
sammcj commented Sep 13, 2017

The custom facts for volumes / bricks that this module provides cause gluster volume info and gluster volume status to be run every time facts are resolved; this may be the cause of #86.

Neither gluster volume status nor gluster volume info is designed to be run frequently: they query every node in the cluster along with the volume metadata, and they create quite a bit of load on the cluster.

On a small cluster with 120 volumes, this puppet module takes over 2.5 minutes just loading facts.

As such, if you have a three-node cluster with each node running puppet every 10 minutes, you're spending over 7 minutes of combined time (3 nodes x ~2.5 minutes) just in facter for every 10-minute window.

I don't have a solution for this, but I don't think people should be constantly running these commands across their clusters. I did just find the gluster get-state command, which dumps the state of the cluster to a local file and is very fast; perhaps this could be used instead? A rough sketch of what that might look like follows the timings below.

root@int-gluster-01:~ # time facter -p
real	2m8.956s
user	0m8.099s
sys	0m0.603s

vs

root@int-gluster-01:~ # time gluster get-state
glusterd state dumped to /var/run/gluster/glusterd_state_20170913_160143

real	0m0.471s
user	0m0.010s
sys	0m0.004s
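
For reference, a rough sketch of what a get-state based fact might look like on the ruby side. The fact name and the parsing of the dump path out of the command output are just my assumptions based on the output above, not anything the module does today:

# A minimal sketch, not the module's current implementation: run
# `gluster get-state` once per facter run and expose the path of the
# dumped state file. The fact name and the regex are assumptions based
# on the command output shown above.
Facter.add(:gluster_state_file) do
  confine { Facter::Core::Execution.which('gluster') }
  setcode do
    output = Facter::Core::Execution.exec('gluster get-state')
    # Output looks like: "glusterd state dumped to /var/run/gluster/glusterd_state_..."
    match = output && output.match(%r{state dumped to (\S+)})
    match && match[1]
  end
end

The other facts could then read that file instead of each shelling out to gluster volume info / gluster volume status.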
ekohl commented Sep 13, 2017

Calling these commands in a loop means you have linear complexity, whereas dumping the state once would be constant, so reading from the dump sounds like a very good idea. I don't have a gluster setup, but I'd be willing to help you with the puppet/ruby side of it. Am I reading the docs right that the state dump is an inifile? Could you post an example of a state dump?

sammcj commented Sep 13, 2017 via email

sammcj commented Sep 14, 2017

OK, so the following is the format of the state dump. At the time of this dump the volume had just been created (via this puppet module) with replica 3 and arbiter 1, and it had not yet been started.

[Global]
MYUUID: 917df952-a8a2-4019-ae6e-177ffe938334
op-version: 31200

[Global options]
cluster.brick-multiplex: enable

[Peers]
Peer1.primary_hostname: int-gluster-03.fqdn.com
Peer1.uuid: a348a96e-3e32-4c52-974f-72e1e316c745
Peer1.state: Peer in Cluster
Peer1.connected: Connected
Peer1.othernames:
Peer2.primary_hostname: int-gluster-02.fqdn.com
Peer2.uuid: 8a93118f-9fd9-47aa-b0a3-a5bb5f0f3acd
Peer2.state: Peer in Cluster
Peer2.connected: Connected
Peer2.othernames:

[Volumes]
Volume1.name: my_cool_volume
Volume1.id: d50fec2b-cf69-4ebd-bdd2-0516c2f72f49
Volume1.type: Replicate
Volume1.transport_type: tcp
Volume1.status: Created
Volume1.brickcount: 3
Volume1.Brick1.path: int-gluster-01.fqdn.com:/mnt/gluster-storage/my_cool_volume
Volume1.Brick1.hostname: int-gluster-01.fqdn.com
Volume1.Brick1.port: 0
Volume1.Brick1.rdma_port: 0
Volume1.Brick1.status: Stopped
Volume1.Brick1.spacefree: 53622665216Bytes
Volume1.Brick1.spacetotal: 53656686592Bytes
Volume1.Brick2.path: int-gluster-02.fqdn.com:/mnt/gluster-storage/my_cool_volume
Volume1.Brick2.hostname: int-gluster-02.fqdn.com
Volume1.Brick3.path: int-gluster-03.fqdn.com:/mnt/gluster-storage/my_cool_volume
Volume1.Brick3.hostname: int-gluster-03.fqdn.com
Volume1.snap_count: 0
Volume1.stripe_count: 1
Volume1.replica_count: 3
Volume1.subvol_count: 1
Volume1.arbiter_count: 0
Volume1.disperse_count: 0
Volume1.redundancy_count: 0
Volume1.quorum_status: not_applicable
Volume1.snapd_svc.online_status: Offline
Volume1.snapd_svc.inited: False
Volume1.rebalance.id: 00000000-0000-0000-0000-000000000000
Volume1.rebalance.status: not_started
Volume1.rebalance.failures: 0
Volume1.rebalance.skipped: 0
Volume1.rebalance.lookedup: 0
Volume1.rebalance.files: 0
Volume1.rebalance.data: 0Bytes
Volume1.time_left: 0
Volume1.gsync_count: 0
Volume1.options.storage.linux-aio: enable
Volume1.options.server.event-threads: 1
Volume1.options.performance.rda-cache-limit: 10MB
Volume1.options.performance.quick-read: true
Volume1.options.performance.parallel-readdir: true
Volume1.options.performance.client-io-threads: true
Volume1.options.performance.cache-size: 32MB
Volume1.options.network.ping-timeout: 5
Volume1.options.diagnostics.client-log-level: WARNING
Volume1.options.diagnostics.brick-log-level: WARNING
Volume1.options.cluster.readdir-optimize: true
Volume1.options.cluster.lookup-optimize: true
Volume1.options.client.event-threads: 2
Volume1.options.auth.allow: 10.51.65.*
Volume1.options.transport.address-family: inet
Volume1.options.nfs.disable: true

And that continues on with Volume2, Volume3, etc.
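
If it helps on the puppet/ruby side, here's a rough sketch of how that format could be parsed into a hash. This is a plain line-by-line parse; the function name and the way the keys are stored are my assumptions, not the module's code:

# A minimal sketch, assuming the format shown above: "[Section]" headers
# followed by "Some.Dotted.Key: value" lines. Not the module's current code.
def parse_gluster_state(path)
  state = {}
  section = nil
  File.readlines(path).each do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?('[') && line.end_with?(']')
      section = line[1..-2]
      state[section] = {}
    elsif section && line.include?(':')
      key, value = line.split(':', 2)
      state[section][key.strip] = value.strip
    end
  end
  state
end

# Example: pull the volume names out of the [Volumes] section.
# volumes = parse_gluster_state('/var/run/gluster/glusterd_state_20170913_160143')['Volumes']
# names = volumes.select { |k, _| k =~ /\AVolume\d+\.name\z/ }.values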
