Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatic log-file rotation seems to be missing #173

Closed
mpvader opened this issue May 25, 2024 · 20 comments · Fixed by #174
Closed

Automatic log-file rotation seems to be missing #173

mpvader opened this issue May 25, 2024 · 20 comments · Fixed by #174

Comments

@mpvader
Copy link

mpvader commented May 25, 2024

Hi,

I tried to help a user, and encountered that his data partition is 100% full. Can I suggest that you change the project to prevent that, and at a minimum always have log file rotation? Perhaps just use the same type of logging that we do in venus os; it takes care of rotation.

See also: https://community.victronenergy.com/questions/286286/after-cerbo-gx-update-to-v331-i-now-get-error-67-b.html?childToView=287794#comment-287794

I checked his config.ini, logging is set to ERROR.
And the current.log file, 1.134.657.536 bytes large, is full of errors, like below one.

Thanks!

  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 504, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='192.168.178.9', port=80): Max retries exceeded with url: /api/livedata/status (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xb5bbac28>, 'Connection to 192.168.178.9 timed out. (connect timeout=2.5)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
OSError: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1302, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 171, in _new_conn
    raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0xb5bbae20>: Failed to establish a new connection: [Errno 113] No route to host

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.178.9', port=80): Max retries exceeded with url: /api/livedata/status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb5bbae20>: Failed to establish a new connection: [Errno 113] No route to host'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/dbus-opendtu/dbus_service.py", line 441, in fetch_url
    json_str = requests.get(
  File "/usr/lib/python3.8/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 516, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='192.168.178.9', port=80): Max retries exceeded with url: /api/livedata/status (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0xb5bbae20>: Failed to establish a new connection: [Errno 113] No route to host'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 159, in _new_conn
    conn = connection.create_connection(
  File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 84, in create_connection
    raise err
  File "/usr/lib/python3.8/site-packages/urllib3/util/connection.py", line 74, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 665, in urlopen
    httplib_response = self._make_request(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/lib/python3.8/http/client.py", line 1256, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1302, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1251, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/lib/python3.8/http/client.py", line 1011, in _send_output
    self.send(msg)
  File "/usr/lib/python3.8/http/client.py", line 951, in send
    self.connect()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 187, in connect
    conn = self._new_conn()
  File "/usr/lib/python3.8/site-packages/urllib3/connection.py", line 164, in _new_conn
    raise ConnectTimeoutError(
urllib3.exceptions.ConnectTimeoutError: (<urllib3.connection.HTTPConnection object at 0xb5bea2f8>, 'Connection to 192.168.178.9 timed out. (connect timeout=2.5)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 439, in send
    resp = conn.urlopen(
  File "/usr/lib/python3.8/site-packages/urllib3/connectionpool.py", line 719, in urlopen
    retries = retries.increment(
  File "/usr/lib/python3.8/site-packages/urllib3/util/retry.py", line 436, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='192.168.178.9', port=80): Max retries exceeded with url: /api/livedata/status (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0xb5bea2f8>, 'Connection to 192.168.178.9 timed out. (connect timeout=2.5)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/dbus-opendtu/dbus-opendtu.py", line 93, in main
    service = DbusService(
  File "/data/dbus-opendtu/dbus_service.py", line 84, in __init__
    self.numberofinverters = self.get_number_of_inverters()
  File "/data/dbus-opendtu/dbus_service.py", line 289, in get_number_of_inverters
    meter_data = self._get_data()
  File "/data/dbus-opendtu/dbus_service.py", line 474, in _get_data
    self._refresh_data()
  File "/data/dbus-opendtu/dbus_service.py", line 363, in _refresh_data
    meter_data = self.fetch_url(url)
  File "/data/dbus-opendtu/helpers.py", line 89, in wrapped_func
    result = func(*args, **kwargs)
  File "/data/dbus-opendtu/dbus_service.py", line 466, in fetch_url
    return self.fetch_url(url, try_number + 1)
  File "/data/dbus-opendtu/helpers.py", line 89, in wrapped_func
    result = func(*args, **kwargs)
  File "/data/dbus-opendtu/dbus_service.py", line 466, in fetch_url
    return self.fetch_url(url, try_number + 1)
  File "/data/dbus-opendtu/helpers.py", line 89, in wrapped_func
    result = func(*args, **kwargs)
  File "/data/dbus-opendtu/dbus_service.py", line 441, in fetch_url
    json_str = requests.get(
  File "/usr/lib/python3.8/site-packages/requests/api.py", line 76, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 530, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3.8/site-packages/requests/sessions.py", line 643, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3.8/site-packages/requests/adapters.py", line 504, in send
    raise ConnectTimeout(e, request=request)
requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='192.168.178.9', port=80): M^C
@henne49
Copy link
Owner

henne49 commented May 25, 2024

That is a very good idea, we state to set this back to info or less in documentation. Can you point to the documentation of venus logging? Rhank you!

@mpvader
Copy link
Author

mpvader commented May 25, 2024

Hi,

what we do else where in Venus OS, is make apps and services, python, c or whatever, write their logging info to stdout and stderr.

So no in-app writing to log files.

Then, we use multilog, part of daemontools, to capture stdout and stderr, and logging that to disk. How is shown here, note how for dbus-opendtu the ./log/run file is missing, while for a random one of us its there:

root@einstein:/service/dbus-opendtu# ls
run        service    supervise

root@einstein:/service/dbus-fronius# ls
log        run        supervise

root@einstein:/service/dbus-fronius# cat ./log/
run        supervise/ 

root@einstein:/service/dbus-fronius# cat ./log/run 
#!/bin/sh
exec 2>&1
exec multilog t s25000 n4 /var/log/dbus-fronius

root@einstein:/service/dbus-fronius# cat ./run 
#!/bin/sh
echo "*** starting dbus-fronius ***"
exec 2>&1
exec softlimit -d 100000000 -s 1000000 -a 100000000 /opt/victronenergy/dbus-fronius/dbus-fronius

hope that helps!

ps. important: in your next release ,can you add a line of code that just deletes that current.log and any other log files that your code might have generated? I'm sure doing that will help a lot of people. Or even better, as part of the update/install script that you might have do that: installing a newer version of dbus-opendtu on that system I was logged into is most likely impossible now, since data partition is full..

@0x7878
Copy link
Collaborator

0x7878 commented May 26, 2024

I have a different meaning. First of all, the user has a bigger problem than the log.
Secondly, we use the log to granularly log things from debug to error to debug the software ourselves. also it is easier to find.

I have been running this service for a year with occasional connection errors. My log is 62MB after a year. Have a look:

Screenshot_20240526-104530.png

I think what we could do is give an option "logging=none" and write errors to stderr instead. This is manageable then. I also like your idea of clearing the log from time to time. Maybe check the size? Over 100Mb?

@mpvader
Copy link
Author

mpvader commented May 26, 2024

Hi,

I have a different meaning. First of all, the user has a bigger problem than the log.

I have a different meaning on that. The user had a small problem, ie. something related only to, I suppose misconfiguration of, or a bug in-, dbus-opendtu. But because of how the logging of dbus-opendtu works, this small problem has escalated itself into near-bricking the whole GX device.

And moreover it resulted in the user thinking there might be some bug per Venus OS v3.31 and that resulted in a post on our forum with this title: After Cerbo GX update to V3.31 I now get Error #67 BMS connection lost.

Please think about what seeing such message does to our developers and my-self; as well as anyone else reading it.

I think what we could do is give an option "logging=none" and write errors to stderr instead. This is manageable then. I also like your idea of clearing the log from time to time. Maybe check the size? Over 100Mb?

100Mb is far too much. On that system, here is the number of services, its around 60. Sixty times 100Mb is 6 gigabyte. The data partition is only 1 Gb and has to do more than just store log files. Especially ancient logfiles.

root@einstein:/data/log# ls
boot                         dbus-systemcalc-py           messages.3                   vecan-dbus.can0
boot~                        dbus-vebus-to-pvinverter     messages.4                   vedirect.ttyS5
can-bus-bms.can0             dmesg                        messages.5                   vedirect.ttyS6
can-bus-bms.can1             dmesg.old                    mk2_dbus.ttyS4               vedirect.ttyS7
dbus-adc                     flashmq                      mosquitto                    vedirect.ttyUSB0
dbus-cgwacs.ttyUSB0          gps-dbus.ttyUSB0             mqtt-rpc                     venus-access
dbus-digitalinputs           gui                          netmon                       venus-button-handler
dbus-fronius                 hiawatha                     nginx                        venus-html5-app
dbus-fzsonick-48tl.ttyUSB0   hostapd                      node-red-venus               venus-html5-logger
dbus-generator-starter       hub4control                  serial-starter               venus-platform
dbus-imt-si-rs485tc.ttyUSB0  lastlog                      service-advertiser           vesmart-server
dbus-modbus-client           localsettings                simple-upnpd                 vrmlogger
dbus-modbus-client.ttyUSB0   messages                     ssh-tunnel                   websockify-c
dbus-mqtt                    messages.0                   sshd                         wtmp
dbus-qwacs                   messages.1                   start-gui
dbus-shelly                  messages.2                   swupdate

I'd really prefer that any open source project uses our method + configuration; (or if it exists a better one). Added advantage that I see of all doing it the same is that its more efficient, easier for anyone to help each other, and so forth.

here is how that works:

Contents of /service/<servicename>/log/run:

#!/bin/sh
exec 2>&1
exec multilog t s25000 n4 /var/log/dbus-fronius

What it does is make 4 files, and each 25.000 bytes large. So 100kB in total max; and in practice it varies between 75kB and 100kB.

I hope that this in more detail explains why I'm concerned about this.

@mpvader
Copy link
Author

mpvader commented May 26, 2024

ps. one thing that we have on our backlog that will help people once they are in trouble is this one: victronenergy/venus#1234

@henne49
Copy link
Owner

henne49 commented May 26, 2024

thank you, log rotation was already in, we have now 100kb max and setting to CRITICAL for quick fix, and will later on change to your log file handling #44

@henne49 henne49 linked a pull request May 26, 2024 that will close this issue
@mr-manuel
Copy link

Is there a reason why you are not using the system build in logging and log rotation? It is as easy as creating a file with three lines.

Just create a folder log in service with a file named run in it. Paste this code in the run file and make it executable:

#!/bin/sh
exec 2>&1
exec multilog t s25000 n4 /var/log/dbus-opendtu

Thats it. Everything which is normally printed to the console is written to the log.

See https://github.com/Louisvdw/dbus-serialbattery/tree/master/etc/dbus-serialbattery/service for an example.

The handlers should then not be needed anymore in this code:

log_rotate_handler = logging.handlers.RotatingFileHandler(
maxBytes=5*1024*1024*10,
backupCount=2,
encoding=None,
delay=0,
filename="%s/current.log" % (os.path.dirname(os.path.realpath(__file__)))
)
logging.basicConfig(
format="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
level=logging_level,
handlers=[
logging.StreamHandler(),
log_rotate_handler
],
)

@mr-manuel
Copy link

Secondly, we use the log to granularly log things from debug to error to debug the software ourselves. also it is easier to find.

The granularly does not change, only the path and the fact that Venus OS handles log rotation and all other.

Everyone would expect a log file to be found in the default log folder var/log on Linux systems.

Give it a try :-)

@henne49
Copy link
Owner

henne49 commented May 26, 2024

very good and thank you! that works
unaware of the built in functionality - the logrotate was build to limit log size. Still cannot find the documentation, how it is properly done.

One last question, I see that all log levels are now written, how are you handling, that only the log level required is written? To avoid to much logging, when not necessary?

@henne49
Copy link
Owner

henne49 commented May 26, 2024

if I do not remove this part, we can limit the log detail level, will rework the PR and built this in.
logging.basicConfig( format="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s", datefmt="%Y-%m-%d %H:%M:%S", level=logging_level, )

Thank you again, very helpful!

@henne49
Copy link
Owner

henne49 commented May 26, 2024

Built into PR #174 - waiting for review

@mr-manuel
Copy link

Unfortunately I don't even know where the documentation is, but GitHub Copilot and ChatGPT help you, if you want to customize the log settings.

You handle the log level with

 logging.basicConfig(
    level=logging_level
 )

therefore in your case yo can use

 logging.basicConfig( 
     format="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s", 
     datefmt="%Y-%m-%d %H:%M:%S", 
     level=logging_level, 
 )

instead of

log_rotate_handler = logging.handlers.RotatingFileHandler(
maxBytes=5*1024*1024*10,
backupCount=2,
encoding=None,
delay=0,
filename="%s/current.log" % (os.path.dirname(os.path.realpath(__file__)))
)
logging.basicConfig(
format="%(asctime)s,%(msecs)d %(name)s %(levelname)s %(message)s",
datefmt="%Y-%m-%d %H:%M:%S",
level=logging_level,
handlers=[
logging.StreamHandler(),
log_rotate_handler
],
)

I do it this way:

https://github.com/mr-manuel/venus-os_dbus-mqtt-temperature/blob/ecafd22c17ba668685e5b232a42f2fd843a9c468/dbus-mqtt-temperature/dbus-mqtt-temperature.py#L52-L67

@henne49
Copy link
Owner

henne49 commented May 26, 2024

Thank you, but the documentation about how venus handles system log rotation. With the service and log folder?

@mr-manuel
Copy link

mr-manuel commented May 26, 2024

#!/bin/sh
exec 2>&1
exec multilog t s25000 n4 /var/log/dbus-opendtu

This shell script does the following:

  • #!/bin/sh: This is a shebang line that specifies the script should be executed using the sh shell.

  • exec 2>&1: This line redirects standard error (2) to the same location as standard output (1). This means that all output, including errors, will be sent to the same place (usually the terminal).

  • exec multilog t s25000 n4 /var/log/dbus-opendtu: This line runs the multilog command with the specified parameters. multilog is a tool for managing log files, part of the daemontools package.

    • t: This option adds a timestamp to each log line.
    • s25000: This option sets the maximum size of each log file to 25000 bytes.
    • n4: This option sets the maximum number of log files to keep to 4.
    • /var/log/dbus-opendtu: This is the directory where the log files will be stored.

In summary, this script is used to manage log files for some process, adding timestamps to each log entry and limiting the size and number of log files.

Therefore it's not directly Venus OS which handles the log rotation but the multilog command which is launched by the service/deamon.

@henne49
Copy link
Owner

henne49 commented May 26, 2024

Ok, I get that as well, sorry maybe to let to clearly articulate

wanted to know where the best practise documentation is for how to properly write scripts that work best with venus os and in specific the piece that says do log rotation like you showed?

anyway thank you it works and will hopefully soon merge into main..

@mr-manuel
Copy link

I found also none documentation. My path was to get a Victron written script, check how it's build and running. Then I build my script the same way.

@dsteinkopf
Copy link
Collaborator

Thank you @mr-manuel for your help and @henne49 for your work. I've found this one:
https://manpages.debian.org/stretch/daemontools/multilog.8.en.html

@mpvader
Copy link
Author

mpvader commented May 27, 2024

Good morning!

wanted to know where the best practise documentation is for how to properly write scripts that work best with venus os and in specific the piece that says do log rotation like you showed?

We have this document: https://github.com/victronenergy/venus/wiki/howto-add-a-driver-to-Venus .

I added an initial section about logging to it.

@mpvader
Copy link
Author

mpvader commented May 27, 2024

Ps. all of you are quick! 👍

@henne49
Copy link
Owner

henne49 commented May 27, 2024

Thank you @mpvader for this link, that is very useful, also to review and align to best practise.

kai-morich added a commit to kai-morich/dbus-d0-smartmeter that referenced this issue Sep 5, 2024
else errors from 'run' script are not logged
kai-morich added a commit to kai-morich/dbus-d0-smartmeter that referenced this issue Sep 5, 2024
else errors from 'run' script are not logged
kai-morich added a commit to kai-morich/dbus-d0-smartmeter that referenced this issue Sep 5, 2024
else errors from 'run' script are not logged
kai-morich added a commit to kai-morich/dbus-d0-smartmeter that referenced this issue Sep 5, 2024
kai-morich added a commit to kai-morich/dbus-d0-smartmeter that referenced this issue Sep 6, 2024
else errors from 'run' script are not logged
kai-morich added a commit to kai-morich/dbus-d0-smartmeter that referenced this issue Sep 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants