
Detect dirty regions containing only zeroes during incremental backup (fstrim) #250

Closed
abbbi opened this issue Feb 13, 2025 · 9 comments · Fixed by #251
Labels
enhancement (New feature or request), Needs-testing (Feature needs testing)

Comments

@abbbi
Owner

abbbi commented Feb 13, 2025

As discussed in:

#139 (comment)
#139 (reply in thread)

The current situation is:

  • fstrim within the virtual machine marks trimmed (all-zero) blocks as changed in the dirty bitmap; these blocks are then backed up, which may result in a higher backup volume than required.
  • during backup, virtnbdbackup currently does not compare the changed-block state of the dirty bitmap against the BASE_ALLOCATION bitmap, thus it backs up zeroes that it doesn't have to, resulting in big backups if fstrim is run within the VM.

The proposed solution could be that during incremental/differential backup, the NBD client requests both the actual dirty bitmap and the base allocation bitmap from the NBD service.

The extent handler could then check whether a block marked as changed in the dirty bitmap is actually a zero block according to the BASE_ALLOCATION bitmap, and skip those.

It is a little bit more complex: the list of extents for block allocation and the list of extents for the bitmap can differ, e.g.:

allocation: [0-1G] - zero, present; [1G-2G] - data, present
dirty: [0-512M] - clean, [512M-2G] - dirty

though this is not a fundamental problem.
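The mismatch between the two extent lists boils down to a plain interval intersection. A minimal sketch of the idea (all names hypothetical, not virtnbdbackup's actual API):

```python
# Sketch: find the parts of dirty extents that base:allocation reports
# as zero, so they can be skipped during backup.
# Extents are (offset, length) tuples; names here are made up.

def intersect(a, b):
    """Return the overlap of two (offset, length) ranges, or None."""
    start = max(a[0], b[0])
    end = min(a[0] + a[1], b[0] + b[1])
    return (start, end - start) if end > start else None

def skippable(dirty_extents, zero_extents):
    """Dirty regions that are all zeroes according to base:allocation."""
    out = []
    for d in dirty_extents:
        for z in zero_extents:
            ov = intersect(d, z)
            if ov:
                out.append(ov)
    return out

M = 1024 ** 2
G = 1024 ** 3
# allocation: [0-1G] zero, [1G-2G] data; dirty: [512M-2G]
zero = [(0, G)]
dirty = [(512 * M, 2 * G - 512 * M)]
print(skippable(dirty, zero))  # the [512M-1G] half of the dirty range can be skipped
```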

Current workarounds for users:

  • Schedule fstrim related operations within VM before FULL backups.

To reproduce:

  • Virtual machine disk must be configured with discard=unmap option:

<driver name='qemu' type='qcow2' discard='unmap'/>

  • Create some data on a specific disk:
root@debian12:/mnt# dd if=/dev/zero  of=data bs=1M count=15
15+0 records in
15+0 records out
15728640 bytes (16 MB, 15 MiB) copied, 0.0465533 s, 338 MB/s

The incremental backup will back up:

Total saved disk data: [15.9MB]

Now delete the data file and run fstrim:

root@debian12:/mnt# rm data
root@debian12:/mnt# sync
root@debian12:/mnt# fstrim -v /mnt
/mnt: 18.1 MiB (18966528 bytes) trimmed

The next incremental backup will be just as big:

Total saved disk data: [18.6MB]

@abbbi abbbi added the enhancement New feature or request label Feb 13, 2025
@abbbi abbbi changed the title Skip dirty zero blocks during backup Skip dirty zero blocks during incremental backup (fstrim) Feb 13, 2025
@abbbi abbbi changed the title Skip dirty zero blocks during incremental backup (fstrim) Dected dirty regions containing only zeroes during incremental backup (fstrim) Feb 13, 2025
@abbbi abbbi changed the title Dected dirty regions containing only zeroes during incremental backup (fstrim) Detect dirty regions containing only zeroes during incremental backup (fstrim) Feb 13, 2025
@rmk177

rmk177 commented Feb 17, 2025

Good day! Here https://github.com/oVirt/ovirt-imageio/tree/master they use the qemu:allocation-depth context together with base:allocation to detect holes. Maybe we should use it?
In nbd.py 613:

# qemu:allocation-depth is required to detect holes in qcow2 images -
# unallocated clusters exposing data from the backing chain.
# Added in qemu 5.2.0.

@abbbi
Owner Author

abbbi commented Feb 17, 2025

Good day! Here https://github.com/oVirt/ovirt-imageio/tree/master they use the qemu:allocation-depth context together with base:allocation to detect holes. Maybe we should use it? In nbd.py 613:

The ovirt-imageio project ships a complete NBD client implementation, i.e. it speaks the plain NBD protocol itself. This project in turn uses the higher-level libnbd bindings, and I don't want to change the current implementation to ship a complete NBD client, so their code is hardly re-usable here. In this branch:

https://github.com/abbbi/virtnbdbackup/tree/issue_250

I am attempting a POC for the functionality. I've already enhanced the extent handler to query both the base:allocation
and qemu-XX contexts during incremental/differential backup. What needs to be done now is to detect the overlaps between the extents returned by both contexts and skip the corresponding data during backup. Not sure where this is going yet ..

@rmk177

rmk177 commented Feb 17, 2025

Good day! I see your efforts in the issue_250 branch. I have one remark, if I understand your code correctly:

        base_extents = [
            e for e in extents if e.context == CONTEXT_BASE_ALLOCATION and not e.data
        ]
        backup_extents = [
            e for e in extents if e.context == self._metaContext and not e.data
        ]

Maybe we should use e for e in extents if e.context == self._metaContext and e.data instead, because after fstrim these blocks are marked dirty in the bitmap context, while the base:allocation metacontext marks them as zero or zero,hole.
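The difference between the two filters can be illustrated with a small stand-in Extent type (hypothetical, not the project's actual class; the dirty-bitmap context name is also made up):

```python
from dataclasses import dataclass

CONTEXT_BASE_ALLOCATION = "base:allocation"
DIRTY_CONTEXT = "qemu:dirty-bitmap:backup"  # hypothetical context name

@dataclass
class Extent:
    context: str
    data: bool  # in the dirty-bitmap context: True means the block changed
    offset: int
    length: int

# After fstrim: the block is dirty in the bitmap, but zero in base:allocation.
extents = [
    Extent(CONTEXT_BASE_ALLOCATION, False, 0, 1 << 20),  # zero region
    Extent(DIRTY_CONTEXT, True, 0, 1 << 20),             # marked changed
]

# Filtering on `not e.data` misses the dirty block entirely:
assert [e for e in extents if e.context == DIRTY_CONTEXT and not e.data] == []
# Filtering on `e.data` selects the changed block, as suggested:
assert len([e for e in extents if e.context == DIRTY_CONTEXT and e.data]) == 1
```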

@abbbi
Owner Author

abbbi commented Feb 17, 2025

Yes, this is WIP and probably very buggy. Could also be that I'm on the wrong track here :-)

@rmk177

rmk177 commented Feb 17, 2025

I think the best solution would be to write zero extents without payload, so we can save space. And if the virtnbdbackup algorithm includes extents that are zeroes, the impact will be minimal because of the empty payload. Is the virtnbdbackup backend ready for zero extents with no payload?

@abbbi
Owner Author

abbbi commented Feb 17, 2025

I think the best solution would be to write zero extents without payload, so we can save space. And if the virtnbdbackup algorithm includes extents that are zeroes, the impact will be minimal because of the empty payload. Is the virtnbdbackup backend ready for zero extents with no payload?

Zero extents are already stored in this way. It's just that the fstrim situation is not handled during incremental backup.
See: https://github.com/abbbi/virtnbdbackup/blob/master/libvirtnbdbackup/sparsestream/types.py#L27 and https://github.com/abbbi/virtnbdbackup/blob/master/libvirtnbdbackup/sparsestream/types.py#L62
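The idea of payload-free zero extents can be sketched generically. This is NOT virtnbdbackup's actual sparse-stream format (see types.py linked above for that), just an illustration of why zero extents cost almost nothing: they are recorded as metadata only.

```python
import io
import json

# Generic sparse-stream illustration: data extents carry payload bytes,
# zero extents are recorded as a header only (offset + length, no bytes).
# Record layout and names are made up for this sketch.

def write_stream(extents, fh):
    """extents: list of (kind, offset, length, payload-or-None)."""
    for kind, offset, length, payload in extents:
        header = json.dumps({"kind": kind, "offset": offset, "length": length})
        fh.write(header.encode() + b"\n")
        if kind == "data":
            fh.write(payload)  # only data extents consume stream space

buf = io.BytesIO()
write_stream(
    [
        ("data", 0, 4, b"abcd"),
        ("zero", 4, 1 << 30, None),  # 1 GiB of zeroes costs only a header
    ],
    buf,
)
```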

@rmk177

rmk177 commented Feb 18, 2025

Good day! As far as I understand, your current approach is: take the dirty-bitmap and base-allocation extents, use the base-allocation extents to filter out extents that do not hold valid data, and write the data in the remaining dirty-bitmap ranges to the backup file. I think we can do this task another way: during the incremental backup phase, collect the base-allocation extents and use the dirty-bitmap extents (those marked dirty) to select the base-allocation extents that fall within dirty ranges, then unify the extents before writing data. With this approach we include both data extents and zero extents in the backup to reflect the changes.
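Clipping base-allocation extents to the dirty ranges can be sketched as follows (a simplified model, not the project's code; extents here are plain tuples):

```python
def clip_to_dirty(base_extents, dirty_ranges):
    """Clip base:allocation extents to the dirty ranges.

    base_extents: list of (offset, length, is_data)
    dirty_ranges: list of (offset, length)
    Returns clipped extents, keeping both data and zero pieces so the
    backup reflects all changes.
    """
    out = []
    for d_off, d_len in dirty_ranges:
        d_end = d_off + d_len
        for b_off, b_len, is_data in base_extents:
            start = max(b_off, d_off)
            end = min(b_off + b_len, d_end)
            if end > start:
                out.append((start, end - start, is_data))
    return out

M = 1024 ** 2
G = 1024 ** 3
# allocation: [0-1G] zero, [1G-2G] data; dirty: [512M-2G]
base = [(0, G, False), (G, G, True)]
dirty = [(512 * M, 2 * G - 512 * M)]
# yields a 512M zero piece (no payload needed) and a 1G data piece
print(clip_to_dirty(base, dirty))
```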

@rmk177

rmk177 commented Feb 18, 2025

Good day! Could you please review my patch for fixing extents calculation?

From: Roman Grigoriev <grigoriev177@gmail.com>
Date: Tue, 18 Feb 2025 16:59:17 +0300
Subject: [PATCH 7/7] Fix calculation of extents

---
 .../extenthandler/extenthandler.py            | 75 +++++++++----------
 1 file changed, 34 insertions(+), 41 deletions(-)

diff --git a/libvirtnbdbackup/extenthandler/extenthandler.py b/libvirtnbdbackup/extenthandler/extenthandler.py
index e5f5e99..36294d3 100644
--- a/libvirtnbdbackup/extenthandler/extenthandler.py
+++ b/libvirtnbdbackup/extenthandler/extenthandler.py
@@ -137,7 +137,7 @@ class ExtentHandler:
         for myExtent in extentObjects:
             if cur is None:
                 cur = myExtent
-            elif cur.type == myExtent.type:
+            elif cur.type == myExtent.type and cur.context == myExtent.context:
                 cur.length += myExtent.length
             else:
                 yield cur
@@ -198,24 +198,33 @@ class ExtentHandler:
     def overlap(self, extents: List[Extent]) -> List[Extent]:
         """Find overlaps between base allocation and incremental bitmap to detect zero regions"""
         base_extents = [
-            e for e in extents if e.context == CONTEXT_BASE_ALLOCATION and not e.data
+            e for e in extents if e.context == CONTEXT_BASE_ALLOCATION
         ]
         backup_extents = [
-            e for e in extents if e.context == self._metaContext and not e.data
+            e for e in extents if e.context == self._metaContext and e.data
         ]
 
-        skippable_extents = []
+        selected_extents = []
+
+        totalLength = 0
 
         for backup in backup_extents:
+            backup_end = backup.offset + backup.length
+            log.debug("check extent backup %d %d",backup.offset,backup_end)
             for base in base_extents:
                 base_end = base.offset + base.length
-                backup_end = backup.offset + backup.length
-
-                if not (base_end <= backup.offset or backup_end <= base.offset):
-                    skippable_extents.append(backup)
-                    break
-
-        return skippable_extents
+                log.debug("-add base0 %d %d %d %d",backup.offset,base.offset,base_end,backup_end)
+                ext = Extent(base.context, base.data,base.offset,base.length)
+                ext.offset = max(base.offset,backup.offset)
+                ext_end = min(base_end,backup_end)
+                ext.length = max(0,ext_end - ext.offset) 
+                if (ext.length):
+                    log.debug("->extent %d %d %d",ext.offset,ext.offset+ext.length,ext.length)
+                    selected_extents.append(ext)
+                    totalLength+=ext.length
+                
+        log.debug("totalLength %d",totalLength)
+        return selected_extents
 
     def queryBlockStatus(self) -> List[Extent]:
         """Check the status for each extent, whether if it is
@@ -226,41 +235,25 @@ class ExtentHandler:
 
         extents = self.queryExtentsNbd()
         extentList: List[Extent] = []
-        fullExtentList: List[Extent] = []
         start = 0
-        fullStart = 0
+        baseStart = 0
+        totalLength = 0
         for extent in self._unifyExtents(extents):
             extObj = Extent(
                 extent.context,
                 self.setBlockType(extent.context, extent.type),
-                start,
+                baseStart if extent.context == CONTEXT_BASE_ALLOCATION else start,
                 extent.length,
             )
-
-            fullExtentList.append(extObj)
-            fullStart += extent.length
-
-            if extent.context != self._metaContext:
-                continue
-
             extentList.append(extObj)
-            start += extent.length
-
-        log.warning("----------")
-        log.warning("%s", extentList)
-        overlapping_regions = self.overlap(fullExtentList)
-        if not overlapping_regions:
-            log.warning("No overlapping regions found")
-        else:
-            log.warning("%s overlapping regions found", len(overlapping_regions))
-            for skipme in overlapping_regions:
-                log.warning("Omittable extent found: %s\n", skipme)
-
-        filtered_extents = [e for e in extentList if e not in overlapping_regions]
-
-        log.warning(
-            "Returning filtered extent list with %s objects", len(filtered_extents)
-        )
-        log.warning("%s", filtered_extents)
-        log.warning("----------")
-        return filtered_extents
+            if extent.context == CONTEXT_BASE_ALLOCATION:
+                baseStart += extent.length
+            else:
+                start += extent.length
+                if (extObj.data):
+                    totalLength+=extent.length
+            log.debug("%s %d %d %d", extObj.context, extObj.data, extObj.offset, extObj.offset+extObj.length)
+        log.debug("changed data %d", totalLength)
+        if (self._metaContext != CONTEXT_BASE_ALLOCATION):
+            extentList = self.overlap(extentList)
+        return extentList
-- 
2.39.2

@abbbi
Owner Author

abbbi commented Feb 18, 2025

Hi Roman,

thanks for digging into this! This looks quite good; first tests show that the behavior on fstrim has improved a lot.
Restore also looks OK: I tested with various delete/fstrim actions and the restored image was intact.

I'm going to tidy things up a bit in the next weeks (the --qemu option needs to be fixed to pass the bitmap too, amongst
other improvements) and see if I can come up with an automated testcase that can run in GitHub CI and verify the functionality.

@abbbi abbbi added the Needs-testing Feature needs testing label Feb 18, 2025
abbbi added a commit that referenced this issue Feb 20, 2025
…#251)

* Detect dirty regions containing only zeroes during incremental backup (fstrim) #250
* add github action and testcases