EOS-25302 After delete, storage is not reclaimed completely. (Seagate…
…#1215)

Problem:
In some cases, after an object is deleted, the space it used is not
fully reclaimed.

Root cause:
This issue is observed when a complete parity group is not written in a
single IO request. For example, in a 3+2 configuration:
1. Write 1M at offset 0: space consumed is 3M (1M for data and 2M for parity).
2. Write 2M at offset 1048576: space consumed is 4M (2M for data and 2M for parity).
Total space consumed is 7M; after unlink, space reclaimed is 5M. Leak of 2M.
This leak is caused by the parity units.

The space leak happens in step 2.
We do not deallocate the space consumed by the parity units in step 1 before
we allocate space for the parity units again.
Because of this leak, we cannot reclaim the space even after the object is deleted.

The leak happens because the motr code checks a condition before releasing a balloc
segment, and that condition is currently always false.
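
The accounting in the 3+2 example can be reproduced with a toy model. This is only a sketch under stated assumptions: `toy_store`, `toy_write_unit`, `toy_unlink` and `toy_leak_after_unlink` are hypothetical names invented for illustration, units are fixed at 1M, and none of this is the motr balloc or AD stob code. The `release_old` flag stands in for the condition that decides whether a previously allocated extent is released on overwrite.

```c
#include <stdbool.h>

enum { DATA_UNITS = 3, PARITY_UNITS = 2, UNITS = DATA_UNITS + PARITY_UNITS };

/* Toy allocator state (hypothetical). Sizes are in M; one unit is 1M. */
struct toy_store {
	bool mapped[UNITS]; /* does the unit currently point at an extent? */
	int  allocated;     /* space currently taken from the allocator */
};

/* Writing a unit always allocates a fresh 1M extent. */
static void toy_write_unit(struct toy_store *s, int unit, bool release_old)
{
	if (s->mapped[unit] && release_old)
		s->allocated -= 1; /* old extent goes back to the allocator */
	s->allocated += 1;
	s->mapped[unit] = true;
}

/* Unlink can only free extents that are still mapped. */
static int toy_unlink(struct toy_store *s)
{
	int reclaimed = 0;
	int i;

	for (i = 0; i < UNITS; i++) {
		if (s->mapped[i]) {
			s->mapped[i] = false;
			reclaimed++;
		}
	}
	s->allocated -= reclaimed;
	return reclaimed;
}

/* Returns the space (in M) still allocated after both writes and unlink. */
int toy_leak_after_unlink(bool release_old)
{
	struct toy_store s = { .allocated = 0 };
	int p;

	/* Step 1: write 1M at offset 0 (data unit 0 plus both parity units). */
	toy_write_unit(&s, 0, release_old);
	for (p = 0; p < PARITY_UNITS; p++)
		toy_write_unit(&s, DATA_UNITS + p, release_old);

	/* Step 2: write 2M at offset 1048576 (data units 1, 2 plus parity). */
	toy_write_unit(&s, 1, release_old);
	toy_write_unit(&s, 2, release_old);
	for (p = 0; p < PARITY_UNITS; p++)
		toy_write_unit(&s, DATA_UNITS + p, release_old);

	toy_unlink(&s);
	return s.allocated;
}
```

In this model, `toy_leak_after_unlink(false)` leaves the two parity extents from step 1 allocated, matching the 2M leak described above, while `toy_leak_after_unlink(true)` leaves nothing allocated.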

Solution:
Removed the condition check that was preventing the balloc segment from being
released and causing this leak.

Testing done:
Verified that the issue is no longer reproduced after the changes.
Ran STs (mostly related to healthy IO path, dgmode IO and SNS repair).

Signed-off-by: Shipra Gupta <shipra.gupta@seagate.com>

* EOS-25302 After delete, storage is not reclaimed completely

Solution:
sad_overwrite was added and kept false by default to fix bulk_server_ut.
sad_overwrite is now set to true by default.
Added a fault point that sets sad_overwrite to false for bulk_server_ut.

Signed-off-by: Shipra Gupta <shipra.gupta@seagate.com>
Co-authored-by: Yatin Mahajan <yatin.mahajan@seagate.com>
gshipra and yatin-mahajan authored Dec 1, 2021
1 parent e0bb65c commit ab22d23
Showing 2 changed files with 20 additions and 1 deletion.
2 changes: 2 additions & 0 deletions ioservice/ut/bulkio_ut.c
@@ -1775,6 +1775,7 @@ static void bulkio_init(void)
 	 */
 	m0_fi_enable("io_fop_di_prepare", "skip_di_for_ut");
 	m0_fi_enable("m0_file_init", "skip_di_for_ut");
+	m0_fi_enable("stob_ad_domain_create", "write_undo");

 	M0_ALLOC_PTR(bp);
 	M0_ASSERT(bp != NULL);
@@ -1806,6 +1807,7 @@ static void bulkio_fini(void)

 	m0_fi_disable("io_fop_di_prepare", "skip_di_for_ut");
 	m0_fi_disable("m0_file_init", "skip_di_for_ut");
+	m0_fi_disable("stob_ad_domain_create", "write_undo");
 }

 /*
19 changes: 18 additions & 1 deletion stob/ad.c
@@ -579,7 +579,10 @@ static int stob_ad_domain_create(struct m0_stob_type *type,
 				cfg->adg_spare_blocks_per_group;
 #endif
 	adom->sad_bstore_id = cfg->adg_id;
-	adom->sad_overwrite = false;
+	if (M0_FI_ENABLED("write_undo"))
+		adom->sad_overwrite = false;
+	else
+		adom->sad_overwrite = true;
 	strcpy(adom->sad_path, location_data);
 	m0_format_footer_update(adom);
 	emap = &adom->sad_adata;
@@ -1172,10 +1175,24 @@ static void stob_ad_write_credit(const struct m0_stob_domain *dom,
 	 */
 	m0_be_emap_credit(&adom->sad_adata, M0_BEO_PASTE, frags + 1, accum);

+	/*
+	 * Commenting out below part of code with #if 0, earlier it was based
+	 * on assumption that that adom->sad_overwrite will be always false,
+	 * Now we have set it to True by default to fix EOS-25302 hence
+	 * disabling it with "#if 0"
+	 * TODO: Probably sad_overwrite is introduced for COW(Copy on Write) and
+	 * for Object versioning in Motr, which is not implemented yet. Need to
+	 * revisit this part while implementing COW and object versioning.
+	 * We do not know whether bo_free_credit should be commented out or not,
+	 * but this is done to maintain existing behavior as the code was
+	 * anyway redundant earlier.
+	 */
+#if 0
 	if (adom->sad_overwrite && ballroom->ab_ops->bo_free_credit != NULL) {
 		/* for each emap_paste() seg_free() could be called 3 times */
 		ballroom->ab_ops->bo_free_credit(ballroom, 3 * frags, accum);
 	}
+#endif
 	m0_stob_io_credit(io, m0_stob_dom_get(adom->sad_bstore), accum);
 }
