[BUG] Backup stopped working after upgrade to 1.3.2 #8677
Comments
On another server, where influxdb was installed much later (thus a more recent version), backup still works after the upgrade. |
I was able to successfully perform the backup after downgrading to 1.3.1-1 (from deb file luckily still in apt cache) and manually restarting influxdb daemon. The failing influxdb instance was originally installed back in 2016, version 1.0.0:
|
I have the same problem with the Windows version 1.3.2-1.
|
We have ~10 influx instances we just upgraded to 1.3.2-1 today and we are now seeing this same issue. |
same issue here:
Downgrading to 1.3.1 solved the issue. It looks like the fix from https://github.com/influxdata/influxdb/pull/8378/commits somehow got reverted. |
fix #8677: check for snapshot size == 0
[backport] fix #8677: check for snapshot size == 0
Works for me again since latest nightly: 1.4.0~n201708170800-0 |
Apparently the patch did not make it into 1.3.3. |
Can confirm that it's also not working for me in 1.3.3. |
It will be in 1.3.4. |
This is broken in 1.3.3. Metastore backup is fine but database backups produce errors. Can't get it to work on Mac or Ubuntu. I can also confirm that the backup directories have correct permissions, so it doesn't seem to be that.
|
1.3.4 is out, backup works again. Thanks! |
I still observe it on 1.3.5 although less often. Shard 1784 is the one that currently gets many updates:
|
influx --version Same issue
|
I am getting the same on 1.5.0. Are there any updates on this?
|
@dgnorton Can you work on reproducing this David? |
@iliketosneeze this looks like #9618. In your case, is the |
@dgnorton We are using 1.5.0 for both the running instance and the backup. |
@iliketosneeze is TSI enabled? |
@dgnorton I'm having this issue on influxd-1.5.0-1 with TSI enabled. |
@dgnorton Yes, we have TSI enabled as well. |
I have been able to reproduce this. In the repro, it happens fairly often but not every time. Setup using AWS
|
@dgnorton the above repro seems to occur if another snapshot is being taken at the same time, as in the case of periodic compactions. I've seen it fail out after 10 attempts. I've also seen it fail 5, 6 times and then succeed. We could consider this to be a resource/locking error, but regardless this scenario doesn't seem to be a blocker. @iliketosneeze does the problem happen consistently, like 20/50/100% of the time? Or more sporadically? Could you share the influxd logs from the time of the backup? |
@aanthony1243 It's not 100%, but it is very often. It also seems to depend on the size of the db/shards. Smaller ones don't seem to do it that often. |
@iliketosneeze I don't see any of the errors in your gist that I saw when reproducing above, but the symptoms are still there. It happens more frequently on larger DB/shards because there's a resource competition when taking shard snapshots. If another process holds the resource for too long the backup will fail. We're looking into the root cause now. |
@iliketosneeze we've adjusted the backup to use an exponential backoff to give the server more time to free resources. It's out in the just-released 1.5.2. If you continue to see this occur frequently after upgrading, please open a new issue and we will follow up. |
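For readers unfamiliar with the technique mentioned above, here is a minimal sketch of retrying with exponential backoff. It is not the code that shipped in 1.5.2; the function names `retry_with_backoff` and `run_backup`, the attempt limit, and the delay values are all assumptions chosen for illustration. The only detail taken from this thread is the backup command from the original report.

```python
import subprocess
import time


def retry_with_backoff(operation, max_attempts=10, base_delay=1.0,
                       max_delay=5.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    The wait doubles after every failure (base_delay, 2x, 4x, ...) and is
    capped at max_delay. All parameter names and defaults are hypothetical.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay)


def run_backup():
    """Run the backup command from the original report (hypothetical wrapper).

    check=True raises CalledProcessError on a non-zero exit status, which
    retry_with_backoff treats as a failed attempt.
    """
    subprocess.run(
        ["influxd", "backup", "-database", "telegraf", "/tmp/influxbackup"],
        check=True,
    )
```

A client-side wrapper along these lines could be called as `retry_with_backoff(run_backup)`. This assumes a failed backup exits with a non-zero status, which this thread does not confirm. Doubling the wait gives a concurrent snapshot or compaction progressively more time to release the shard, which matches the intent described in the comment above.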
Hate to comment on a closed bug, but it appears this is still a problem with later versions:
We need the back-up as we have to migrate things around in the Docker container InfluxDB is installed in. Long story short, due to a typo in So far, I have one instance running 1.3.2, and one running 1.5.3, and both exhibit the same problem regarding backups: the back up fails because it can't copy files (for no reason; if I'm to understand the What's the safest way to back up these instances if |
It sounds like you are prepared to tolerate some downtime. It should be safe to stop influxd, move the entire influxdb directory to a new location, and then update your influxdb config to use that new location. |
This is true, that is safe enough, but how about backups? We'll ultimately want to bump these up to the latest release before long (there are a lot of features in the newer InfluxDB), but doing so without a good backup is a risky proposition. In the meantime we need to be able to back up that instance, and do so without disruption. We do these backups on a daily basis, and restarting InfluxDB every 24 hours will likely not be welcome. How do we safely back up the data without shutting down InfluxDB? |
I misunderstood your issue. You can check the influxdb logs on the server side to see if an error is being logged when nothing is returned on the connection. If you find something, perhaps we should open it as a new issue and continue from there? |
I have the same issue. I found out that whenever I mount my volume under When I mount my volume under |
After yesterday's upgrade to version 1.3.2 (from 1.3.1 via Debian repo), backup fails with this error message:
influxd backup -database telegraf /tmp/influxbackup
Version: 1.3.2
OS: Debian jessie on x86_64