Skip to content
This repository has been archived by the owner on Sep 11, 2020. It is now read-only.

storage: transactional, new storage with transactional capabilities #1006

Merged
merged 5 commits into from
Feb 2, 2019
Merged

storage: transactional, new storage with transactional capabilities #1006

merged 5 commits into from
Feb 2, 2019

Conversation

mcuadros
Copy link
Contributor

This PR introduces a new storage, called transactional allow to do Rollback any kind of operation over the storage, discarding Objects, Config, References, etc

…WIP)

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>
@mcuadros mcuadros requested a review from smola October 29, 2018 14:45
@smola
Copy link
Collaborator

smola commented Oct 30, 2018

What would be the relation of this storage with our current object.Transaction interface?

return err
}

return c.ConfigStorer.SetConfig(cfg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might not be transactional depending on the ConfigStorer implementation under the hood. Per example, using filesystem implementation SetConfig can fail on Write, causing a partial write.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this can be a problem. At least for our use case, since we don't do a commit per operation, but after a group of operations. That is, we want a Commit on the full storage, not just part of it.

@@ -0,0 +1,63 @@
package transactional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No Commit method here, when the objects from temporal are copied to the main storer?

return obj, err
}

func (o *ObjectStorage) IterEncodedObjects(t plumbing.ObjectType) (storer.EncodedObjectIter, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mixed concepts here. If we are talking about transactional object storage, objects shouldn't be read until Commit is called.

Also, depending on the use case, maybe we want to do the opposite, a Rollback.

Copy link
Collaborator

@smola smola Oct 30, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be ok to read objects in a transaction. Just like in SQL. As long as read in a transaction sees writes in the same transaction, and does not read objects written in a different transaction.

@mcuadros
Copy link
Contributor Author

None, most likely in v5, the old transaction should be deleted.

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>
// deleted, remaining references at this maps are going to be deleted when
// commit is requested, the entries are added when RemoveReference is called
// and deleted if SetReference is called.
deleted map[plumbing.ReferenceName]struct{}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do you think about this @smola?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

However, I'm still concerned that the whole approach is not really usable by borges. If that was the goal, it would actually need its own implementation of Commit...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the implementation of Commit should be ad-hoc for each interface, but I don't think is going to be the current one.

deleted map[plumbing.ReferenceName]struct{}
// packRefs if true PackRefs is going to be called in the based storer when
// commit is called.
packRefs bool
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

…ndSetReference

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>
Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>
@mcuadros mcuadros changed the title storage: transactional, new storage with transactional capabilities (WIP) storage: transactional, new storage with transactional capabilities Dec 10, 2018
@mcuadros
Copy link
Contributor Author

@smola @ajnavarro PTAL

@smola
Copy link
Collaborator

smola commented Dec 11, 2018

I'm thinking about how this fits in borges. It looks like it doesn't, since it would produce just additional copies of everything in filesystem for no additional benefit.

@mcuadros
Copy link
Contributor Author

I don't understand several things:

  1. Why this should be something that works with the current borges implementation
  2. Why you say that produces additional copies

@smola
Copy link
Collaborator

smola commented Dec 11, 2018

Why this should be something that works with the current borges implementation

It does not need to work with current implementation, but I thought this was meant to be used by borges future implementation. So I would expect the result to be equal or better in terms of performance, robustness, etc.

Why you say that produces additional copies

Well, I have looked better to it and it looks like additional copies would not be necessary when borges provides a Commit implementation.

It currently looks like this:

  1. Begin transaction -> a siva file is copied from remote/permanent storage to temporary/local storage.
  2. Perform changes -> appends each change directly to the local siva file.
    3a. Commit -> Move local siva file to remote storage.
    3b or Rollback -> Remove local siva file.

This could be, in theory, optimized as follows:

  1. Begin transaction -> no-op
  2. Perform changes -> append directly to a new siva block in local/temporary storage, possibly reading directly the previous blocks from remote storage if needed.
    3a. Commit -> Concat new block in local storage to remote storage.
    3b or Rollback -> Remove new block from local storage.

With this new implementation, if we integrated it with borges, I see it being something like this:

  1. Begin transaction -> kind of no-op / lazily creating a new temporary local storage, this can be either a full siva copy (as in our current solution) or a new empty siva block (as in the optimized solution)
  2. Perform changes -> perform changes on temporary storage, appended directly to the local siva file or block.
    3a. Commit -> Move local siva file to remote storage or Concat local siva block to remote storage.
    3b or Rollback -> Remove local siva file or block.

Would it be like this? Looks like it could actually work well with the custom Commit for borges, not sure if I'm missing something.

Copy link
Collaborator

@smola smola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation looks good to me, although it would be good to clarify how borges is meant to use this before merging (see my previous comment).

@smola
Copy link
Collaborator

smola commented Dec 12, 2018

@mcuadros What about providing an interface too?

Signed-off-by: Máximo Cuadros <mcuadros@gmail.com>
@mcuadros
Copy link
Contributor Author

mcuadros commented Feb 2, 2019

I marked this package as experimental, is this way we can merged and used in the other projects, and changed it is required.

//cc @smola @jfontan

@mcuadros mcuadros merged commit d1b5bce into src-d:master Feb 2, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants