Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: reentrant call inside <_io.BufferedWriter name=''> #1003

Closed
likg opened this issue May 21, 2019 · 10 comments
Closed

RuntimeError: reentrant call inside <_io.BufferedWriter name=''> #1003

likg opened this issue May 21, 2019 · 10 comments
Assignees
Labels
Bug Heisenbug a problem that is not reproducible but random (non-deterministic)

Comments

@likg
Copy link

likg commented May 21, 2019

With backintime (1.2.0) when you try to backup whole system (/ mount point) backup task fails with the following error:

Back In Time
Version: 1.2.0

Back In Time comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions; type `backintime --license' for details.

INFO: Lock
INFO: Take a new snapshot. Profile: 1 Main profile
INFO: Call rsync to take the snapshot
Traceback (most recent call last):
  File "/usr/share/backintime/common/backintime.py", line 1165, in <module>
    startApp()
  File "/usr/share/backintime/common/backintime.py", line 517, in startApp
    args.func(args)
  File "/usr/share/backintime/common/backintime.py", line 739, in backup
    ret = takeSnapshot(cfg, force)
  File "/usr/share/backintime/common/backintime.py", line 94, in takeSnapshot
    ret = snapshots.Snapshots(cfg).backup(force)
  File "/usr/share/backintime/common/snapshots.py", line 692, in backup
    ret_val, ret_error = self.takeSnapshot(sid, now, include_folders)
  File "/usr/share/backintime/common/snapshots.py", line 1036, in takeSnapshot
    proc.run()
  File "/usr/share/backintime/common/tools.py", line 1981, in run
    self.callback(line, self.user_data)
  File "/usr/share/backintime/common/snapshots.py", line 797, in rsyncCallback
    self.setTakeSnapshotMessage(0, _('Take snapshot') + " (rsync: %s)" % line)
  File "/usr/share/backintime/common/snapshots.py", line 143, in setTakeSnapshotMessage
    self.snapshotLog.append('[I] '  + message, 3)
  File "/usr/share/backintime/common/snapshotlog.py", line 195, in append
    self.logFile.write(msg + '\n')
  File "/usr/share/backintime/common/tools.py", line 1513, in handler
    self.callback()
  File "/usr/share/backintime/common/snapshotlog.py", line 203, in flush
    self.logFile.flush()
RuntimeError: reentrant call inside <_io.BufferedWriter name='/root/.local/share/backintime/takesnapshot_.log'>

Simple workaround in this case is to add backintime log file (e.g. /root/.local/share/backintime/takesnapshot_.log) to exclude list.

Please review this case, maybe log file should be excluded (or handled in a special way), to avoid such issue.

@MAProsper
Copy link

MAProsper commented Jul 10, 2019

Exact same issue, but in my case it was a backup from one mount point to another, the log file was not included in anyway. Also if you take a look at the log file itself, the rsync command already exclude it --exclude=/root/.local/share/backintime so it can't be the issue.

What I find weird is the way the rsync command is composed (for reference, running as root with only include /mnt/data/, to /mnt/backup/).

Why does it have to change the permissions? --chmod=Du+wx
Why is this necessary? --exclude=.local/share/backintime/mnt

Edit: Weirdly enough a reboot solve the issue for me.

@hannes101
Copy link

I got the same issue as the downstream maintainer in fedora:
https://bugzilla.redhat.com/show_bug.cgi?id=1729773

@tatokis
Copy link
Contributor

tatokis commented Apr 22, 2020

I just had the exact same issue with the same backtrace, after backing up multiple terabytes of data over 24 hours.
Please consider ignoring the log file unconditionally, as it is a very nasty surprise.

@hannes101
Copy link

Still happening on a recent fedora version with the latest upstream backintime https://bugzilla.redhat.com/show_bug.cgi?id=2016628

@rvega
Copy link

rvega commented Jun 9, 2022

Same error with Backintime 1.3.2 running in Manjaro. Trying to backup files from one USB drive onto another USB drive.

@emtiu emtiu added Distro-Specific only for certain distributions, desktop environments or display servers Bug labels Sep 25, 2022
@emtiu
Copy link
Member

emtiu commented Sep 25, 2022

There may be useful hints in the closed OpenSUSE bug 1188115.

@hannes101
Copy link

hannes101 commented Dec 3, 2022

another crash on fedora https://bugzilla.redhat.com/show_bug.cgi?id=2150491
Could this be caused by the reason, described in https://bugzilla.opensuse.org/show_bug.cgi?id=1188115#c4 ? So it would be possible to fix it by excluding the log file by default from the backup?

@aryoda
Copy link
Contributor

aryoda commented Dec 3, 2022

Reason for this bug

RuntimeError: reentrant call inside <_io.BufferedWriter ...

This is a known issue of Python3 and works as expected from the Python point-of-view ("won't fix") since this code is not reentrant: https://bugs.python.org/issue24283

Looking into the traceback I think this a non-deterministic bug in BiT that happens whenever BiT tries to write into the log file in the main thread while triggering a file buffer flush in another thread (via a "timer" signal) so excluding the BiT log file will not eliminate this RTE!

Code that causes the bug

The snapshot logger establishes a signal handler that is called in regular time intervals to flush the log file (in another thread!):

self.timer = tools.Alarm(self.flush, overwrite = False)

A log entry is written in the main thread and the signal handler instructed to be called after 5 secs to flush the file buffer:

self.logFile.write(msg + '\n')
self.timer.start(5)

After 5 secs the flush method is called for the log file

signal.signal(signal.SIGALRM, self.handler)

Whenever writing a snapshot log entry (snapshotlog.py#L195) is interrupted by the flush call we get the RTE with the traceback as reported here.

Proposed fix:

The Python3 runtime will not support reentrance and BiT should flush the log every n seconds to make the progress visible and persist the up-to-date log file content even if BiT crashes.

So the easiest fix is to catch this RTE in the flush method and log it as INFO only (and ignore it).

Edit: Logging of this RTE is not reliable since it would cause another RTE of the same type ;-)

I will prepare a PR to fix this.

Consider excluding BiT files from the backups

BiT uses rsync and this is not meant to backup all files of the whole system since open files may cause problems.
but users want to to do this anyhow so ignoring BiT files in backups would be a good idea.

From what I see in the code

backintime/common/config.py

Lines 184 to 189 in bff7aeb

HOME_FOLDER = os.path.expanduser('~')
DATA_FOLDER = '.local/share'
CONFIG_FOLDER = '.config'
BIT_FOLDER = 'backintime'
self._DEFAULT_LOCAL_DATA_FOLDER = os.path.join(HOME_FOLDER, DATA_FOLDER, BIT_FOLDER)
self._LOCAL_CONFIG_FOLDER = os.path.join(HOME_FOLDER, CONFIG_FOLDER, BIT_FOLDER)

the following folders should be excluded too by default (and manually as workaround for older versions of BiT):

  • /root/.local/share/backintime
  • /<currentuser>/.local/share/backintime

Ignoring specific files kept open by BiT would be even better since ignoring the whole folder means that

  • the configs (backup profiles) and
  • the user-callback script

are not contained in the backup 😲

This would require to add a positive list of all file names (or extensions) that are uses in the config.py (e.g. worker*.* files).

The default config can be changed here in the source code:

backintime/common/config.py

Lines 132 to 135 in bff7aeb

DEFAULT_EXCLUDE = [ '.gvfs', '.cache/*', '.thumbnails*',
'.local/share/[Tt]rash*', '*.backup*', '*~', '.dropbox*', '/proc/*',
'/sys/*', '/dev/*', '/run/*', '/etc/mtab', '/var/cache/apt/archives/*.deb',
'lost+found/*', '/tmp/*', '/var/tmp/*', '/var/backups/*', '.Private' ]

@aryoda aryoda self-assigned this Dec 3, 2022
@aryoda aryoda added Heisenbug a problem that is not reproducible but random (non-deterministic) and removed Distro-Specific only for certain distributions, desktop environments or display servers labels Dec 3, 2022
@aryoda aryoda closed this as completed in 110d82a Dec 4, 2022
@aryoda
Copy link
Contributor

aryoda commented Dec 4, 2022

FYI: I have fixed the bug and opened a separate issue as feature request (#1382) to exclude BiT files by default so that this issue can be closed now.

@hannes101
Copy link

Thanks a lot! I am really impressed by the new BiT team, please keep up the good work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Heisenbug a problem that is not reproducible but random (non-deterministic)
Projects
None yet
Development

No branches or pull requests

7 participants