Using logging in crmsh #614

Merged

Conversation

@liangxin1300 liangxin1300 commented Jul 28, 2020

Table of contents

  • Motivation
  • Newly added features
    • Log in bootstrap
    • Message redirection
    • Hierarchical loggers
    • Record interactive input
    • All in /var/log/crmsh.log
  • Existing functionalities comparison
    • Common logging interface
    • Conditional debug
    • Add line number
    • Buffered message

Motivation

The current message/log system in crmsh has the following issues:

  • There are at least two message/log systems: msg.py for common use, plus the log-related functions in bootstrap.py
  • The log records written during the bootstrap process use a non-standard format
  • All levels are redirected to sys.stderr, which is inconvenient for log-consuming programs such as hb_report
  • The message API has several near-duplicate functions (common_error/common_err/err_buf.error); a clearer, unified interface is preferable

The Python logging module can provide the same functionality and address the above issues in a more flexible way.

Newly added features

Log in bootstrap

# Current
# After running crm cluster init -y
# cat /var/log/crmsh/ha-cluster-bootstrap.log
+ systemctl reload rsyslog.service
================================================================
2020-08-03 11:14:57+08:00 /usr/sbin/crm cluster init -y
----------------------------------------------------------------
WARNING: chronyd.service is not configured to start at system boot.
+ rpm -q --quiet firewalld
+ firewall-offline-cmd --zone=public --add-port=30865/tcp
success
Warning: ALREADY_ENABLED: '30865:tcp'
+ firewall-offline-cmd --zone=public --add-port=5560/tcp
success
Warning: ALREADY_ENABLED: '5560:tcp'
+ firewall-offline-cmd --zone=public --add-port=7630/tcp
success
Warning: ALREADY_ENABLED: '7630:tcp'
+ firewall-offline-cmd --zone=public --add-port=21064/tcp
success
Warning: ALREADY_ENABLED: '21064:tcp'
+ systemctl enable sshd.service
# Configuring csync2
+ rm -f /etc/csync2/key_hagroup
# Generating csync2 shared key (this may take a while)...
+ csync2 -k /etc/csync2/key_hagroup
# done
+ systemctl enable csync2.socket
# csync2 checking files...
+ csync2 -cr /
# done
+ corosync-keygen -l
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/urandom.
Writing corosync key to /etc/corosync/authkey.
# Configuring corosync
+ csync2 -rm /etc/corosync/corosync.conf
+ csync2 -rxv /etc/corosync/corosync.conf
Connection closed.
Finished with 0 errors.
+ dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000106306 s, 38.5 MB/s
WARNING: Not configuring SBD (/etc/sysconfig/sbd left untouched).
+ rpm -q --quiet firewalld
+ firewall-offline-cmd --zone=public --add-port=5405/udp
success
Warning: ALREADY_ENABLED: '5405:udp'
+ firewall-offline-cmd --zone=public --add-port=5404/udp
success
Warning: ALREADY_ENABLED: '5404:udp'
+ rm -f /var/lib/heartbeat/crm/* /var/lib/pacemaker/cib/*
WARNING: Hawk not installed - not configuring web management interface.
+ systemctl disable sbd.service
+ systemctl enable pacemaker.service
+ systemctl start pacemaker.service
# Waiting for cluster...
# done
# Loading initial cluster configuration
: loading crm config (update), content is:

property cib-bootstrap-options: stonith-enabled=false
op_defaults op-options: timeout=600 record-pending=true
rsc_defaults rsc-options: resource-stickiness=1 migration-threshold=3

# Done (log saved to /var/log/crmsh/ha-cluster-bootstrap.log)
# This PR
# After running crm cluster init -y
Sep 13 11:20:20 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)# cluster init -y
Sep 13 11:20:20 15sp2-1 crmsh.bootstrap: INFO: ================================================================
Sep 13 11:20:20 15sp2-1 crmsh.bootstrap: INFO: /usr/sbin/crm cluster init -y
Sep 13 11:20:20 15sp2-1 crmsh.bootstrap: INFO: ----------------------------------------------------------------
Sep 13 11:20:20 15sp2-1 crmsh.ui_context: ERROR: cluster.init: Cluster is currently active - can't run
Sep 13 11:20:29 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)# cluster stop
Sep 13 11:20:32 15sp2-1 crmsh.ui_cluster: INFO: Cluster services stopped
Sep 13 11:20:33 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)# cluster init -y
Sep 13 11:20:33 15sp2-1 crmsh.bootstrap: INFO: ================================================================
Sep 13 11:20:33 15sp2-1 crmsh.bootstrap: INFO: /usr/sbin/crm cluster init -y
Sep 13 11:20:33 15sp2-1 crmsh.bootstrap: INFO: ----------------------------------------------------------------
Sep 13 11:20:33 15sp2-1 crmsh.bootstrap: INFO: Loading "default" profile from /etc/crm/profiles.yml
Sep 13 11:20:33 15sp2-1 crmsh.bootstrap: WARNING: chronyd.service is not configured to start at system boot.
Sep 13 11:20:33 15sp2-1 crmsh.bootstrap: INFO: invoke: firewall-offline-cmd --zone=public --add-port=30865/tcp
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stdout: success
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stderr: Warning: ALREADY_ENABLED: '30865:tcp'
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: invoke: firewall-offline-cmd --zone=public --add-port=5560/tcp
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stdout: success
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stderr: Warning: ALREADY_ENABLED: '5560:tcp'
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: invoke: firewall-offline-cmd --zone=public --add-port=7630/tcp
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stdout: success
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stderr: Warning: ALREADY_ENABLED: '7630:tcp'
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: invoke: firewall-offline-cmd --zone=public --add-port=21064/tcp
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stdout: success
Sep 13 11:20:34 15sp2-1 crmsh.bootstrap: INFO: stderr: Warning: ALREADY_ENABLED: '21064:tcp'
Sep 13 11:20:35 15sp2-1 crmsh.bootstrap: INFO: Configuring csync2
Sep 13 11:20:35 15sp2-1 crmsh.bootstrap: INFO: invoke: rm -f /etc/csync2/key_hagroup
Sep 13 11:20:35 15sp2-1 crmsh.bootstrap: INFO: Generating csync2 shared key (this may take a while)...
Sep 13 11:20:35 15sp2-1 crmsh.bootstrap: INFO: invoke: csync2 -k /etc/csync2/key_hagroup
Sep 13 11:20:35 15sp2-1 crmsh.bootstrap: INFO: done
Sep 13 11:20:36 15sp2-1 crmsh.bootstrap: INFO: csync2 checking files...
Sep 13 11:20:36 15sp2-1 crmsh.bootstrap: INFO: invoke: csync2 -cr /
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: done
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: invoke: corosync-keygen -l -k /etc/corosync/authkey
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: stdout: Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/urandom.
Writing corosync key to /etc/corosync/authkey.
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: Configuring corosync
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: invoke: csync2 -rm /etc/corosync/corosync.conf
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: invoke: csync2 -rxv /etc/corosync/corosync.conf
Sep 13 11:20:39 15sp2-1 crmsh.bootstrap: INFO: stderr: Connection closed.
Finished with 0 errors.
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: invoke: dd if=/dev/urandom of=/etc/pacemaker/authkey bs=4096 count=1
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: stderr: 1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000340365 s, 12.0 MB/s
Sep 13 11:20:40 15sp2-1 crmsh.sbd: WARNING: Not configuring SBD - STONITH will be disabled.
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: invoke: systemctl disable sbd.service
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: invoke: firewall-offline-cmd --zone=public --add-port=5405/udp
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: stdout: success
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: stderr: Warning: ALREADY_ENABLED: '5405:udp'
Sep 13 11:20:40 15sp2-1 crmsh.bootstrap: INFO: invoke: firewall-offline-cmd --zone=public --add-port=5404/udp
Sep 13 11:20:41 15sp2-1 crmsh.bootstrap: INFO: stdout: success
Sep 13 11:20:41 15sp2-1 crmsh.bootstrap: INFO: stderr: Warning: ALREADY_ENABLED: '5404:udp'
Sep 13 11:20:41 15sp2-1 crmsh.bootstrap: INFO: invoke: rm -f /var/lib/heartbeat/crm/* /var/lib/pacemaker/cib/*
Sep 13 11:20:41 15sp2-1 crmsh.bootstrap: WARNING: Hawk not installed - not configuring web management interface.
Sep 13 11:20:41 15sp2-1 crmsh.bootstrap: INFO: Starting pacemaker...
Sep 13 11:20:42 15sp2-1 crmsh.bootstrap: INFO: done
Sep 13 11:20:42 15sp2-1 crmsh.bootstrap: INFO: Waiting for cluster...
Sep 13 11:21:04 15sp2-1 crmsh.bootstrap: INFO: done
Sep 13 11:21:05 15sp2-1 crmsh.bootstrap: INFO: Loading initial cluster configuration
Sep 13 11:21:05 15sp2-1 crmsh.bootstrap: INFO: Loading crm config (update), content is:
Sep 13 11:21:05 15sp2-1 crmsh.bootstrap: INFO: property cib-bootstrap-options: stonith-enabled=false
op_defaults op-options: timeout=600 record-pending=true
rsc_defaults rsc-options: resource-stickiness=1 migration-threshold=3

Sep 13 11:21:05 15sp2-1 crmsh.bootstrap: INFO: Done (log saved to /var/log/crmsh/crmsh.log)

Message redirection

# Current
# cat test1.py 
from crmsh.msg import common_info, common_error

common_info("This is an info")
common_error("This is an error")
# python3 test1.py 1> /dev/null
INFO: This is an info
ERROR: This is an error
# python3 test1.py 2> /dev/null
# No output this time
# This PR
# cat test2.py
from crmsh import log

log.setup_logging()
logger = log.setup_logger("crmsh.bootstrap")

logger.info("This is an info")
logger.error("This is an error")
# python3 test2.py 1> /dev/null
ERROR: This is an error
# python3 test2.py 2> /dev/null
INFO: This is an info
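
The split between stdout and stderr can be reproduced with two StreamHandlers on one logger, using a level filter on the stdout side. The following is a minimal sketch of the idea only (names like MaxLevelFilter and setup_split_logger are illustrative, not the actual crmsh.log API):

```python
import logging
import sys

class MaxLevelFilter(logging.Filter):
    """Let through only records strictly below a given level."""
    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno < self.max_level

def setup_split_logger(name="crmsh.demo"):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    fmt = logging.Formatter("%(levelname)s: %(message)s")

    # DEBUG/INFO/WARNING go to stdout
    out = logging.StreamHandler(sys.stdout)
    out.addFilter(MaxLevelFilter(logging.ERROR))
    out.setFormatter(fmt)

    # ERROR and above go to stderr
    err = logging.StreamHandler(sys.stderr)
    err.setLevel(logging.ERROR)
    err.setFormatter(fmt)

    logger.addHandler(out)
    logger.addHandler(err)
    return logger

logger = setup_split_logger()
logger.info("This is an info")    # written to stdout
logger.error("This is an error")  # written to stderr
```

With this arrangement, `1> /dev/null` leaves only the ERROR line and `2> /dev/null` only the INFO line, matching the behavior shown above.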

Hierarchical loggers

# record examples in crmsh.log
Aug 23 15:54:37 15sp2-1 crmsh.bootstrap: INFO: Waiting for cluster...
Aug 23 15:54:22 15sp2-1 crmsh.sbd: INFO: Initializing SBD......
Aug 23 15:25:04 15sp2-1 crmsh.ui_cluster: INFO: Cluster services stopped
Aug 23 16:01:53 15sp2-1 crmsh.ra: ERROR: ocf:heartbeat:xxxx: got no meta-data, does this RA exist?
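
Hierarchical logger names come for free with the logging module: a handler attached to the top-level "crmsh" logger receives records from every "crmsh.*" child through propagation, and %(name)s in the format string identifies the sub-module. A small sketch:

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s: %(message)s"))

top = logging.getLogger("crmsh")
top.setLevel(logging.INFO)
top.addHandler(handler)

# The child loggers need no handlers of their own:
# their records propagate up to the "crmsh" handler.
logging.getLogger("crmsh.bootstrap").info("Waiting for cluster...")
logging.getLogger("crmsh.sbd").info("Initializing SBD......")

print(stream.getvalue(), end="")
# crmsh.bootstrap: INFO: Waiting for cluster...
# crmsh.sbd: INFO: Initializing SBD......
```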

Record interactive input

# record examples in crmsh.log
Aug 23 15:54:08 15sp2-1 crmsh.bootstrap: INFO: IP or network address to bind to [192.168.122.193]
Aug 23 15:54:12 15sp2-1 crmsh.bootstrap: INFO: input result: 192.168.122.193
Aug 23 15:54:12 15sp2-1 crmsh.bootstrap: INFO: Multicast address [239.25.45.161]
Aug 23 15:54:13 15sp2-1 crmsh.bootstrap: INFO: input result: 239.25.45.161
Aug 23 15:54:13 15sp2-1 crmsh.bootstrap: INFO: Multicast port [5405]
Aug 23 15:54:13 15sp2-1 crmsh.bootstrap: INFO: input result: 5405
Aug 23 15:54:14 15sp2-1 crmsh.bootstrap: INFO: /etc/pacemaker/authkey already exists - overwrite (y/n)?
Aug 23 15:54:16 15sp2-1 crmsh.bootstrap: INFO: input result: y
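
The interactive prompt and the user's answer can both be routed through the same logger. The helper below is a hypothetical sketch (prompt_and_log is not the real crmsh function name); the read parameter is injectable so the helper can be exercised without a terminal:

```python
import logging

logger = logging.getLogger("crmsh.bootstrap")

def prompt_and_log(prompt, default="", read=input):
    """Ask a question, log the prompt, and log the answer that came back."""
    logger.info("%s [%s]", prompt, default)
    answer = read("{} [{}] ".format(prompt, default)).strip() or default
    logger.info("input result: %s", answer)
    return answer

# Example with a canned answer instead of a live terminal:
# prompt_and_log("Multicast port", "5405", read=lambda _: "")  # returns "5405"
```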

All in /var/log/crmsh.log

bootstrap

Aug 25 14:38:10 15sp2-1 crmsh.bootstrap: INFO: Starting pacemaker...
Aug 25 14:38:11 15sp2-1 crmsh.bootstrap: INFO: done
Aug 25 14:38:11 15sp2-1 crmsh.bootstrap: INFO: Waiting for cluster...
Aug 25 14:38:34 15sp2-1 crmsh.bootstrap: INFO: done
Aug 25 14:38:34 15sp2-1 crmsh.bootstrap: INFO: Loading initial cluster configuration

other sub-commands

Sep 13 12:05:40 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)# configure
Sep 13 12:05:40 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)configure# show
Sep 13 12:05:42 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)configure# up
Sep 13 12:05:43 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)# cluster
Sep 13 12:05:44 15sp2-1 crmsh.ui_context: INFO: crm(live/15sp2-1)cluster# up

hb_report

Aug 25 15:45:00 15sp2-1 hb_report: WARNING: could not figure out the log format of /var/log/cluster/corosync.log
Aug 25 15:45:10 15sp2-1 hb_report: INFO: The report is saved in ./hb_report-Wed-25-Aug-2021.tar.bz2
Aug 25 15:45:10 15sp2-1 hb_report: INFO: Report timespan: 08/25/21 03:44:00 - 08/25/21 15:45:10
Aug 25 15:45:10 15sp2-1 hb_report: INFO: Thank you for taking time to create this report.

cluster crash test

Aug 25 15:45:27 15sp2-1 crmsh.ui_context: INFO: Input line: cluster
Aug 25 15:45:32 15sp2-1 crmsh.ui_context: INFO: Input line: crash_test
Aug 25 15:45:47 15sp2-1 crmsh.ui_context: INFO: Input line: crash_test --split-brain-iptables
Aug 25 15:45:47 15sp2-1 crmsh.crash_test.utils: INFO: Simulate split brain by blocking corosync ports
Aug 25 15:45:47 15sp2-1 crmsh.ui_context: INFO: Input line: configure get_property stonith-enabled
Aug 25 15:45:47 15sp2-1 crmsh.ui_context: INFO: Input line: configure get_property stonith-action
Aug 25 15:45:48 15sp2-1 crmsh.ui_context: INFO: Input line: configure get_property stonith-timeout
Aug 25 15:45:52 15sp2-1 crmsh.crash_test.utils: INFO: Trying to temporarily block 15sp2-2 communication ip
Aug 25 15:45:55 15sp2-1 crmsh.crash_test.utils: INFO: Node "15sp2-2" will be fenced by "15sp2-1"!
Aug 25 15:46:13 15sp2-1 crmsh.crash_test.utils: INFO: Node "15sp2-2" was successfully fenced by "15sp2-1"
Aug 25 15:46:13 15sp2-1 crmsh.crash_test.utils: INFO: Trying to recover 15sp2-2 communication ip

Existing functionalities comparison

Common logging interface

# Current
from .msg import common_warn, common_err, common_debug, common_info, err_buf

common_info("This is an info")
common_err("This is an error")
common_warn("This is a warning")
common_debug("This is a debug")
err_buf.warning("This is a warning")
# This PR
from crmsh import log

log.setup_logging()
logger = log.setup_logger("crmsh.bootstrap")

logger.info("This is an info")
logger.error("This is an error")
logger.warning("This is a warning")

# console output
INFO: This is an info
ERROR: This is an error
WARNING: This is a warning

# file output
Aug 21 23:46:25 15sp2-1 crmsh.bootstrap: INFO: This is an info
Aug 21 23:46:25 15sp2-1 crmsh.bootstrap: ERROR: This is an error
Aug 21 23:46:25 15sp2-1 crmsh.bootstrap: WARNING: This is a warning
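
The syslog-like file records can be produced by an ordinary Formatter. The format and date strings below are assumptions that reproduce the sample output; the real ones live in crmsh's log.py:

```python
import logging
import socket

# Assumed format strings that mimic the sample records above.
FILE_FORMAT = ("%(asctime)s {} %(name)s: %(levelname)s: %(message)s"
               .format(socket.gethostname()))
DATE_FORMAT = "%b %d %H:%M:%S"

def make_file_handler(path):
    """A file handler writing records in the syslog-like layout."""
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter(FILE_FORMAT, datefmt=DATE_FORMAT))
    return handler

# Formatting a sample record without touching any file:
record = logging.LogRecord("crmsh.bootstrap", logging.INFO,
                           "demo.py", 0, "This is an info", None, None)
print(logging.Formatter(FILE_FORMAT, datefmt=DATE_FORMAT).format(record))
```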

Conditional debug

# Current
from crmsh.msg import common_debug
from crmsh import config

config.core.debug = "yes"
common_debug("This is a debug")
# This PR
from crmsh import log
from crmsh import config

log.setup_logging()
logger = log.setup_logger("crmsh.bootstrap")
config.core.debug = "yes"

logger.debug("This is a debug")
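
Conditional debug maps naturally onto a logging.Filter that consults the config flag at emit time, so config.core.debug can be flipped even after the logger is set up. A sketch under that assumption (DebugGate is an illustrative name; a Config stand-in replaces crmsh.config):

```python
import logging
import sys

class Config:           # stand-in for crmsh.config.core
    debug = "no"

config = Config()

class DebugGate(logging.Filter):
    """Drop DEBUG records unless the debug option is switched on."""
    def filter(self, record):
        if record.levelno == logging.DEBUG:
            return config.debug == "yes"
        return True

logger = logging.getLogger("crmsh.demo")
logger.setLevel(logging.DEBUG)        # let DEBUG reach the handler...
logger.propagate = False
handler = logging.StreamHandler(sys.stderr)
handler.addFilter(DebugGate())        # ...and gate it at filter time
logger.addHandler(handler)

logger.debug("suppressed")            # config.debug == "no": nothing emitted
config.debug = "yes"
logger.debug("This is a debug")       # now emitted
```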

Add line number (only for the original regression test)

# Current
from crmsh.msg import common_info, common_error, err_buf

err_buf.start_tmp_lineno()
err_buf.incr_lineno()
common_info("This is an info")
err_buf.incr_lineno()
common_error("This is an error")
err_buf.stop_tmp_lineno()
# Current output
INFO: 1: This is an info
ERROR: 2: This is an error
# This PR
from crmsh import log

log.setup_logging()
logger = log.setup_logger("crmsh.bootstrap")
logger_utils = log.LoggerUtils(logger)

with logger_utils.line_number():
    logger_utils.incr_lineno()
    logger.info("This is an info")
    logger_utils.incr_lineno()
    logger.error("This is an error")
# This PR output
INFO: 1: This is an info
ERROR: 2: This is an error
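
The line_number()/incr_lineno() pair can be built on a logging.Filter that prefixes a running counter to each message while the context manager is active. This is a sketch of one possible implementation, not the actual LoggerUtils code:

```python
import contextlib
import logging
import sys

class LineNumberFilter(logging.Filter):
    """Prefix each record's message with the current line counter."""
    def __init__(self):
        super().__init__()
        self.lineno = 0

    def filter(self, record):
        record.msg = "{}: {}".format(self.lineno, record.msg)
        return True

class LineNumberUtils:
    def __init__(self, logger):
        self.logger = logger
        self._filter = None

    @contextlib.contextmanager
    def line_number(self):
        self._filter = LineNumberFilter()
        self.logger.addFilter(self._filter)
        try:
            yield
        finally:
            self.logger.removeFilter(self._filter)
            self._filter = None

    def incr_lineno(self):
        if self._filter is not None:
            self._filter.lineno += 1

logger = logging.getLogger("crmsh.demo.lineno")
logger.setLevel(logging.INFO)
logger.propagate = False
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger.addHandler(handler)

utils = LineNumberUtils(logger)
with utils.line_number():
    utils.incr_lineno()
    logger.info("This is an info")    # INFO: 1: This is an info
    utils.incr_lineno()
    logger.error("This is an error")  # ERROR: 2: This is an error
```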

Buffered message

# Current
from crmsh.msg import common_info, common_error, err_buf
from crmsh import options

options.batch = True

err_buf.buffer()
common_info("This is an info from buffer")
common_error("This is an error from buffer")
print("This is a text show firstly")
err_buf.release()
# Current output
This is a text show firstly
INFO: This is an info from buffer
ERROR: This is an error from buffer
# This PR
from crmsh import log
from crmsh import options

options.batch = True

log.setup_logging()
logger = log.setup_logger("crmsh.bootstrap")
logger_utils = log.LoggerUtils(logger)

with logger_utils.buffer():
    logger.info("This is an info from buffer")
    logger.error("This is an error from buffer")
    print("This is a text show firstly")
# This PR output
This is a text show firstly
INFO: This is an info from buffer
ERROR: This is an error from buffer
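
The buffer() context manager maps onto logging.handlers.MemoryHandler from the standard library, which queues records and replays them to a target handler on flush. One detail: MemoryHandler's default flushLevel is ERROR, so it must be raised for the buffered ERROR record to stay queued. A sketch (not the actual LoggerUtils implementation):

```python
import logging
import logging.handlers
import sys

logger = logging.getLogger("crmsh.demo.buffer")
logger.setLevel(logging.INFO)
logger.propagate = False

target = logging.StreamHandler(sys.stderr)
target.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))

# Queue records in memory; flushLevel is raised above CRITICAL so that
# only an explicit flush() releases them (the default flushLevel, ERROR,
# would flush as soon as logger.error() is called).
memory = logging.handlers.MemoryHandler(capacity=10000,
                                        flushLevel=logging.CRITICAL + 1,
                                        target=target)
logger.addHandler(memory)

logger.info("This is an info from buffer")
logger.error("This is an error from buffer")
print("This is a text show firstly")  # printed before the buffered records
memory.flush()                        # the two records are emitted here
```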

rpm package

https://build.opensuse.org/package/show/home:XinLiang:branches:logging/crmsh

@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch 28 times, most recently from 1721d9f to e8912c5 on August 2, 2020 06:43
@juadk

juadk commented Aug 4, 2020

Good work, Xin. From what I can see, it looks like nothing will break our openQA tests. Anyway, I think this improvement will be delivered through QAM, which means we will see how our tests behave.

@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch from e8912c5 to 7e12d16 on August 27, 2020 08:24
@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch 7 times, most recently from 125a84a to bbc0c66 on August 26, 2021 00:42
@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch 2 times, most recently from f038780 to 9bce87a on August 26, 2021 07:18
@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch 3 times, most recently from c00b291 to 3f68c39 on September 3, 2021 04:28
@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch from 3f68c39 to 219af9c on September 7, 2021 05:18
@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch from 219af9c to fc17d30 on September 13, 2021 02:01
@liangxin1300 liangxin1300 force-pushed the 20200720_using_python_logging branch from fc17d30 to 8f7fe29 on September 13, 2021 02:06
@liangxin1300 liangxin1300 merged commit d735666 into ClusterLabs:master Sep 13, 2021
Successfully merging this pull request may close these issues.

Consider using logging
5 participants