
[FAB-17992] Allow remove ledger data for a channel #1403

Closed
wants to merge 1 commit

Conversation

@wenjianqiao (Contributor) commented Jun 15, 2020

Signed-off-by: Wenjian Qiao wenjianq@gmail.com

Type of change

  • New feature

Description

Add a Remove function to the block storage provider in order to
remove ledger data for a channel. It creates a temporary file
to indicate the channel is to be removed and starts a goroutine
to remove the channel's ledger data in the background. If the removal
fails or the orderer is stopped before it is done, then upon ledger
restart it checks for the existence of the temporary file and completes
the removal as needed.

Additional details

Related issues

Story: https://jira.hyperledger.org/browse/FAB-17992
Epic: https://jira.hyperledger.org/browse/FAB-17712

@wenjianqiao wenjianqiao changed the title Allow remove ledger data for a channel [FAB-17992] Allow remove ledger data for a channel Jun 15, 2020
@wenjianqiao wenjianqiao marked this pull request as ready for review June 15, 2020 18:04
@wenjianqiao wenjianqiao requested a review from a team as a code owner June 15, 2020 18:04
@cendhu (Contributor) left a comment


Thanks, Wenjian. I have a few suggestions on the approach.

}

// Remove block index and blocks for the given ledgerid (channelID).
// It creates a temporary file to indicate the channel is to be removed and deletes the ledger data in a separate goroutine.
@cendhu (Contributor)

I am not sure about this. Given that the block store is common to both the orderer and the peer, isn't it better to handle the failures outside the block store?

Let's assume we support deletion of a channel from an orderer as well as a peer. In that case, for the peer, we would use the idStore in the kvledger pkg to manage the channel removal request (similar to what we do for a join request). This is because at a peer we need to remove data from many stores, not just the block store.

Similarly, I think the orderer needs to handle the failure (not the block store). IMO, the orderer needs to have something like idStore to manage join/delete requests. We shouldn't have this logic in the common block store. Maybe it would be good to introduce idStore at

type fileLedgerFactory struct {

@wenjianqiao (Author) Jun 15, 2020

Thank you @cendhu. Since the blockstore has only one provider, failure handling can be self-contained in the blockstore even if it is called via the peer. Having said that, I agree it is better to do this at the higher level, above the blockstore.

}
f.Close()

go p.removeLedgerData(ledgerid)
@cendhu (Contributor)

There are multiple issues with this line.

  1. What would happen when removeLedgerData() returns an error?
  2. There is a fundamental assumption that the channel removal request is non-blocking. While this may be true for the orderer (need to verify), I am not sure whether it would be true for the peer (the behaviour of peer channel delete might be similar to peer channel rollback, i.e., a blocking call).

I would suggest leaving this blocking/non-blocking decision to the caller, i.e., either orderer or peer. If orderer wants to support a non-blocking delete, it can call Remove() within a goroutine. If peer wants to support a blocking delete, it can just call Remove().
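For illustration, the split described here could look like this at the two call sites (a sketch; provider, channelID, and logger are placeholders, not names from this PR):

// Orderer: non-blocking delete, errors handled inside the goroutine.
go func() {
	if err := provider.Remove(channelID); err != nil {
		logger.Errorf("Failed to remove ledger data for channel %s: %s", channelID, err)
	}
}()

// Peer: blocking delete, the error propagates to the caller.
if err := provider.Remove(channelID); err != nil {
	return err
}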

@tock-ibm (Contributor) Jun 16, 2020

The plan in the channel participation API is to have a blocking call that returns only when the resources are gone. Gone, as far as the API is concerned, means that after the call returns, (API) List will not find that channel, and moreover, that we can create a new channel with the same name immediately after it was removed.

This doesn't mean that the underlying implementation cannot remove resources in a lazy fashion, if it knows how to mark them as deleted and support the creation of a new channel before the old resources were fully removed.

This is a rare call, and its performance impact is negligible. I assume (hope) it is also relatively fast - a fraction of a second - such that a human user going through a UI won't lose patience. Therefore the preference is for the REST call to return after all resources are removed.


// completePendingRemoves checks __toBeRemoved_xxx files and removes the corresponding channel ledger data
// if any temporary file(s) is found. This function should only be called upon ledger init.
func (p *BlockStoreProvider) completePendingRemoves() {
@cendhu (Contributor)

This is similar to

func (p *Provider) recoverUnderConstructionLedger() {

Recovery of creation or deletion requests is better done at the higher layer using the idStore (i.e., by the peer or orderer).

Comment on lines 114 to 124
 func (p *BlockStoreProvider) Exists(ledgerid string) (bool, error) {
 	exists, _, err := util.FileExists(p.conf.getLedgerBlockDir(ledgerid))
-	return exists, err
+	if !exists || err != nil {
+		return false, err
+	}
+	toBeRemoved, _, err := util.FileExists(p.conf.getToBeRemovedFilePath(ledgerid))
+	if err != nil {
+		return false, err
+	}
+	return !toBeRemoved, nil
 }
@cendhu (Contributor)

Is this function used by production code? I see it used only in the test. If it is not used in production, would it be better to remove it rather than complicate this method?

@wenjianqiao (Author)

The Remove function calls this function to check whether a channel exists. Before this PR, no production code used it.

Comment on lines +127 to +144
 // A channel is filtered out if it has a temporary __toBeRemoved_ file.
 func (p *BlockStoreProvider) List() ([]string, error) {
-	return util.ListSubdirs(p.conf.getChainsDir())
+	subdirs, err := util.ListSubdirs(p.conf.getChainsDir())
+	if err != nil {
+		return nil, err
+	}
+	channelNames := []string{}
+	for _, subdir := range subdirs {
+		toBeRemoved, _, err := util.FileExists(p.conf.getToBeRemovedFilePath(subdir))
+		if err != nil {
+			return nil, err
+		}
+		if !toBeRemoved {
+			channelNames = append(channelNames, subdir)
+		}
+	}
+	return channelNames, nil
 }
@cendhu (Contributor)

In the peer, we only call Exists() and List() methods on the idStore at the kvledger pkg.

func (p *Provider) Exists(ledgerID string) (bool, error) {

The List() on idStore returns only the active ledger IDs.

func (p *Provider) List() ([]string, error) {

Would it be good to implement idStore in factory.go, which is used only by the orderer?

type fileLedgerFactory struct {

This idStore in fileLedgerFactory may take care of managing non-blocking channel removal requests and failures. It can also implement List() similar to what we have in the kvledger. As a result, we can remove this List() from the block store.

Comment on lines +178 to +182
batch.Delete(key)
numKeys++
if batch.Len() >= maxBatchSize {
if err := h.WriteBatch(batch, true); err != nil {
return err
@cendhu (Contributor)

I assume that we limit the batch size to

  1. reduce the memory usage
  2. reduce huge continuous disk reads

If yes, instead of having a maxBatchSize, we need a limit on the batch's memory footprint. For example, we can have a memory limit of ~10 MB and sum len(key) to measure the memory utilization of the batch.
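For example, a sketch of such a memory bound, assuming a goleveldb-style batch (Delete/Len/Reset) and the h.WriteBatch call from the snippet above; maxBatchMemory is illustrative, not part of this PR:

const maxBatchMemory = 10 * 1024 * 1024 // ~10 MB cap on buffered keys

batchMemory := 0
for iter.Next() {
	key := iter.Key()
	batch.Delete(key)
	batchMemory += len(key) // approximate the batch's memory footprint
	if batchMemory >= maxBatchMemory {
		if err := h.WriteBatch(batch, true); err != nil {
			return err
		}
		batch.Reset()
		batchMemory = 0
	}
}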

Comment on lines +176 to +177
for iter.Next() {
key := iter.Key()
@cendhu (Contributor)

Not much can be done here. It is unfortunate that we need to read the whole DB to delete the DB. When we call iter.Next(), it internally reads both the key and value.

Comment on lines +185 to +186
sleepTime := time.Duration(batchesInterval)
logger.Infof("Sleep for %d milliseconds between batches of deletion. Entries have been removed for channel %s: %d", sleepTime, h.dbName, numKeys)
@cendhu (Contributor)

Having a sleep of 1 second might increase the channel removal time. Moreover, the delete might only set some flags or markers rather than write a huge amount of data to disk. Unless we do some benchmarks, I am not sure about the sleep.

@wenjianqiao (Author)

The sleep is between batches so that the deletion does not cause a sudden burst of I/O that throttles other operations. The sleep was added because Remove is a non-blocking method; this needs to be reconsidered if Remove becomes a blocking method.

@wenjianqiao (Author) commented Jun 15, 2020

@cendhu Thank you for your comments. The approach taken in this PR was based on our discussion in scrum - a temporary file is used for simplicity. The asynchronous remove and the sleep were based on a discussion with @manish-sethi; the purpose is to prevent a sudden I/O burst. However, an asynchronous remove prevents error propagation when removing a channel from a peer. I agree it is better to let the caller decide how to call Remove (blocking vs. non-blocking). Will discuss offline with you and Manish regarding idStore and how to limit the batch size.

@manish-sethi (Contributor)


In any case, we should still mark a channel as being in an 'under-deletion' state internally and

  • Not return this channel in the list of available channels
  • Or rather, enhance the list API to include the status of the channel (e.g., active / under deletion) so the consumer does not necessarily have to maintain the state and can query the blockstore to invoke the continuation of a deletion (see the sketch below).
    @tock-ibm - do you have an opinion on this?
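One hypothetical shape for that second bullet, reusing the helpers from this PR's diff (the status type and its values are illustrative, not part of the PR):

type ChannelStatus string

const (
	Active        ChannelStatus = "ACTIVE"
	UnderDeletion ChannelStatus = "UNDER_DELETION"
)

// List reports every channel the blockstore knows about together with its
// status, so a consumer can resume an interrupted deletion without
// maintaining the state itself.
func (p *BlockStoreProvider) List() (map[string]ChannelStatus, error) {
	subdirs, err := util.ListSubdirs(p.conf.getChainsDir())
	if err != nil {
		return nil, err
	}
	statuses := make(map[string]ChannelStatus, len(subdirs))
	for _, subdir := range subdirs {
		toBeRemoved, _, err := util.FileExists(p.conf.getToBeRemovedFilePath(subdir))
		if err != nil {
			return nil, err
		}
		if toBeRemoved {
			statuses[subdir] = UnderDeletion
		} else {
			statuses[subdir] = Active
		}
	}
	return statuses, nil
}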

@cendhu (Contributor) commented Jun 16, 2020


@manish-sethi why do you say so? IMO what you have done with idStore at the kvledger is the clean approach. The peer does not ask the block store to find out the list of active channels --

func (s *idStore) getActiveLedgerIDs() ([]string, error) {
If we have idStore at the orderer too, we can use it to store

  1. the last config block used for the join & under construction flag for a channel
  2. whether a channel is paused. (@tock-ibm at the peer, we can pause/resume a channel. See whether it is applicable/useful for the orderer too)
  3. whether a channel is being removed
  4. all active channels

I am not convinced that we need to add multiple internal files at the block store to manage these when it can be done cleanly using an idStore.
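As a speculative sketch of the kind of per-channel metadata such an orderer-side idStore could track (the key layout, names, and status set are illustrative, not Fabric's):

import (
	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

type channelStatus byte

const (
	statusUnderConstruction channelStatus = iota
	statusActive
	statusPaused
	statusUnderRemoval
)

// idStore persists one status byte per channel in a small LevelDB instance.
type idStore struct {
	db *leveldb.DB
}

func (s *idStore) setStatus(channelID string, st channelStatus) error {
	return s.db.Put([]byte("status/"+channelID), []byte{byte(st)}, nil)
}

// activeChannels returns only the channels marked active, so ChannelIDs()
// never reports half-created or half-removed channels.
func (s *idStore) activeChannels() ([]string, error) {
	iter := s.db.NewIterator(util.BytesPrefix([]byte("status/")), nil)
	defer iter.Release()
	var ids []string
	for iter.Next() {
		if val := iter.Value(); len(val) == 1 && channelStatus(val[0]) == statusActive {
			ids = append(ids, string(iter.Key()[len("status/"):]))
		}
	}
	return ids, iter.Error()
}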

Moreover, ListDir() at the block store cannot be trusted in its current form. When a peer channel join command is issued, we first create all necessary DBs, including a folder at the block store. However, if the peer fails before committing the genesis block, the folder at the block store is still present when the peer starts again. If the user does not issue another peer channel join, ListDir() would return an inconsistent result. Currently, we have a no-op cleanup function

func (p *Provider) runCleanup(ledgerID string) error {

that can be enhanced to remove these empty folders and DBs, or we can store the genesis block itself in the idStore to manage failures, but I still think these things need to be managed by the consumer. I am not familiar with the orderer code, and hence I am not sure whether the orderer does a cleanup after a failure so as to rely on ListDir().

@manish-sethi (Contributor)


Yes, I agree, and that's why I did that at the kvledger level in the first place. But somehow, the orderer never maintained that status on its own. I guess that it hit a corner-case bug for not supporting atomic creation of a channel with the genesis block commit. Unfortunately, we made the mistake of supporting the list function, which is not robust enough to recover from failures.

So, now the call we have to make is whether the orderer maintains its status on its own or we make the blockstore more robust in maintaining this. However, I did not want to drag that discussion into this PR, as it is beyond its scope.

@cendhu (Contributor) commented Jun 16, 2020


Sure. So we are on the same page. I will move this discussion offline. I think @wenjianqiao can limit the scope of this PR to just the removal of the channel's data, without handling failure/recovery.

@tock-ibm (Contributor) commented Jun 16, 2020


The orderer panics if it finds a folder that it got from r.ledgerFactory.ChannelIDs() but that has no blocks in it:

func (r *Registrar) Initialize(consenters map[string]consensus.Consenter) {

It expects to get a configTX for every ChannelID that r.ledgerFactory.ChannelIDs() finds.

@tock-ibm (Contributor)

The orderer uses this API:

type Factory interface {
	// GetOrCreate gets an existing ledger (if it exists) or creates it if it does not
	GetOrCreate(channelID string) (ReadWriter, error)

	// ChannelIDs returns the channel IDs the Factory is aware of
	ChannelIDs() []string

	// Remove removes block indexes and blocks for the given channelID
	Remove(channelID string) error

	// Close releases all resources acquired by the factory
	Close()
}

So from that perspective, there are two options:

Blocking
Remove is a blocking call that returns when the resources are removed, or marked for removal, moved elsewhere, and lazily removed in the background. When Remove returns, ChannelIDs() should not return this channel, and GetOrCreate() should succeed in creating a new empty channel. Remove() should ideally complete fast - without a dependency on the amount of storage in the channel. The lazy cleanup can take a longer time. From that perspective, it would also be useful to add a Get() and a Create() to this API. The orderer generally knows whether a channel exists or not, and can select the correct call.

Non-Blocking
If Remove() can return and leave a channel in an intermediate state, i.e. "pending-removal", then this needs to be reflected in the API. That is, both ChannelIDs() and GetOrCreate() (or Get(), etc.) should return a special code to reflect that.

I vote for the first approach, because it is simpler and easier to work with. The latter will trigger a lot of changes in every place in the code that uses this API.

As for crash tolerance, I think this API should handle it, i.e. the one created by:
func New(directory string, metricsProvider metrics.Provider) (blockledger.Factory, error)
called from here:

func createLedgerFactory(conf *config.TopLevel, metricsProvider metrics.Provider) (blockledger.Factory, string, error) {

Comment on lines +60 to +63
// Remove removes block indexes and blocks for the given channelID
func (flf *fileLedgerFactory) Remove(channelID string) error {
return flf.blkstorageProvider.Remove(channelID)
}
@tock-ibm (Contributor)

When is a channelID removed from flf.ledgers?
Why is this not in sync with GetOrCreate()?
I don't see anything that will prevent a user from doing GetOrCreate() while this is going on and getting stale results.
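For illustration, keeping Remove in sync might look like the following, assuming fileLedgerFactory guards its ledgers map with a mutex the way GetOrCreate does (a sketch, not the PR's code):

func (flf *fileLedgerFactory) Remove(channelID string) error {
	flf.mutex.Lock()
	defer flf.mutex.Unlock()
	if err := flf.blkstorageProvider.Remove(channelID); err != nil {
		return err
	}
	// Drop the cached ReadWriter so a concurrent GetOrCreate cannot hand
	// out a stale handle to the removed channel.
	delete(flf.ledgers, channelID)
	return nil
}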

Comment on lines +184 to +189
func (p *BlockStoreProvider) removeLedgerData(ledgerid string) error {
logger.Infof("Removing block data for channel %s", ledgerid)
if err := p.leveldbProvider.Remove(ledgerid); err != nil {
logger.Errorf("Failed to remove block index for channel %s, error: %s", ledgerid, err)
return err
}
@tock-ibm (Contributor) Jun 16, 2020

What happens if someone calls Open(ledgerid) while this method is being executed in a goroutine? The lock is already released, but some keys that belong to the old channel are still in the db, right?

Comment on lines +190 to +193
if err := os.RemoveAll(p.conf.getLedgerBlockDir(ledgerid)); err != nil {
logger.Errorf("Failed to remove blocks for channel %s, error: %s", ledgerid, err)
return err
}
@tock-ibm (Contributor)

Isn't it safer to first move the folder to a different name in the same parent folder, with a prefix that indicates it is about to be removed, say "~my-channel", and only then remove it? The move can be made synchronous, whereas the RemoveAll can be asynchronous. This assumes that the move is within the same file system, and therefore will use the rename() system call, which makes it atomic.
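A minimal sketch of that idea (names are placeholders; it assumes the rename stays on one file system, so os.Rename maps to the atomic rename(2) syscall):

func (p *BlockStoreProvider) removeBlockDir(ledgerid string) error {
	blockDir := p.conf.getLedgerBlockDir(ledgerid)
	trashDir := filepath.Join(filepath.Dir(blockDir), "~"+ledgerid)
	if err := os.Rename(blockDir, trashDir); err != nil { // synchronous and atomic
		return err
	}
	go func() { // lazy cleanup; a crash here only leaves a "~" dir to sweep on restart
		if err := os.RemoveAll(trashDir); err != nil {
			logger.Errorf("Failed to remove %s: %s", trashDir, err)
		}
	}()
	return nil
}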

@wenjianqiao (Author)

Closing this PR due to a design change. The new PR is #1423.

@wenjianqiao wenjianqiao deleted the removechannel branch July 15, 2020 21:14