-
Notifications
You must be signed in to change notification settings - Fork 275
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support dynamically removing store from storage manager #1232
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1232 +/- ##
=============================================
- Coverage 88.92% 72.21% -16.71%
- Complexity 60 6054 +5994
=============================================
Files 6 439 +433
Lines 352 34968 +34616
Branches 37 4437 +4400
=============================================
+ Hits 313 25252 +24939
- Misses 29 8560 +8531
- Partials 10 1156 +1146
Continue to review full report at Codecov.
|
boolean removeBlobStore(BlobStore store) { | ||
boolean result; | ||
if (compactionExecutor == null) { | ||
stores.remove(store); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once #1226 is merged, stores
will become concurrent set which should be ok here.
1. Given partition id, remove the corresponding store from storage manager, disk manager and compaction manager. 2. Support deleting allocated files of store and returning swap segments to reserver pool if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM after comments addressed
String[] swapSegmentsInUse = compactor.getSwapSegmentsInUse(); | ||
if (swapSegmentsInUse.length > 0) { | ||
for (String fileName : swapSegmentsInUse) { | ||
logger.trace("Returning swap segment {} to reserve pool", fileName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe log this at info level since it is relatively rare
} | ||
// step1: return occupied swap segments (if any) to reserve pool | ||
String[] swapSegmentsInUse = compactor.getSwapSegmentsInUse(); | ||
if (swapSegmentsInUse.length > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't need this if around the for loop
logger.info("Deleting store {} directory", storeId); | ||
File storeDir = new File(dataDir); | ||
try { | ||
Files.walk(storeDir.toPath()).sorted(Comparator.reverseOrder()).map(Path::toFile).forEach(File::delete); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also have Utils.deleteFileOrDirectory
that you can use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I didn't notice that. Will change this piece of code.
if (compactionExecutor == null) { | ||
stores.remove(store); | ||
result = true; | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can remove this nested else and use else if (!compact...
* @param id the {@link PartitionId} associated with store | ||
* @return {@code true} if removal succeeds. {@code false} otherwise. | ||
*/ | ||
public boolean removeBlobStore(PartitionId id) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason for this method (and the ones in DiskManager and CompactionManager) to return success/failure booleans instead of throwing exceptions? With exceptions, you can log just once at the top level
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am slightly leaning towards current way to make it sort of consistent with scheduleNextForCompaction
, controlCompactionForBlobStore
etc. Returning boolean can be explicitly verified in unit tests.
@@ -219,6 +219,8 @@ private void deregisterIndexGauges(String storeId) { | |||
registry.remove(MetricRegistry.name(Log.class, prefix + "CurrentCapacityUsed")); | |||
registry.remove(MetricRegistry.name(Log.class, prefix + "PercentageUsedCapacity")); | |||
registry.remove(MetricRegistry.name(Log.class, prefix + "CurrentSegmentCount")); | |||
registry.remove(MetricRegistry.name(Log.class, "ByteBufferForAppendTotalCount")); | |||
registry.remove(MetricRegistry.name(Log.class, "UnderCompaction" + SEPERATOR + "ByteBufferForAppendTotalCount")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor typo: SEPARATOR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
*/ | ||
@Test | ||
public void deleteStoreFilesTest() throws Exception { | ||
assumeTrue(isLogSegmented); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this code be tested with non segmented logs? Perhaps just put the test of swap segments within if (isLogSegmented)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I added some codes to ensure both segmented and non-segmented cases are tests.
StoreConfig config = new StoreConfig(new VerifiableProperties(properties)); | ||
MetricRegistry registry = new MetricRegistry(); | ||
StoreMetrics metrics = new StoreMetrics(registry); | ||
//createBlobStore(getMockReplicaId(storeDir.getAbsolutePath())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove commented code
Addressed @cgtz 's comments. @lightningrob gentle reminder to review. |
ambry-store/src/main/java/com.github.ambry.store/BlobStore.java
Outdated
Show resolved
Hide resolved
ambry-store/src/main/java/com.github.ambry.store/CompactionManager.java
Outdated
Show resolved
Hide resolved
stores.remove(store); | ||
result = true; | ||
} else if (!compactionExecutor.getStoresDisabledCompaction().contains(store)) { | ||
logger.error("Fail to remove store ({}) from compaction manager because compaction of it is still enabled", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it should be an IllegalStateException. Is there a valid scenario where compaction is still enabled when removeStore is called? If we return false and compaction is turned off later, will the store ever be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with your point, however, I am trying to use same way in scheduleNextForCompaction
method (line 360). Returning false may happen when disabling compaction on the store hasn't completed or hasn't been executed. The Helix state model will retry entire workflow (disable compaction, remove store) to guarantee the store is correctly removed.
logger.info("Deleting store {} directory", storeId); | ||
File storeDir = new File(dataDir); | ||
try { | ||
Utils.deleteFileOrDirectory(storeDir); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method already throws IOException, so you probably don't have to catch the exception, unless you want dataDir
to be in the message.
Also, I would recommend setting the cause exception to e
if you keep the catch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reminding me of this. I prefer to keep this catch in case there is any non-IOException. Also, I would take your advice to set the exception e
manager, disk manager and compaction manager.
segments to reserver pool if necessary.