-
Notifications
You must be signed in to change notification settings - Fork 360
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added tree interface #1043
added tree interface #1043
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! This is very much what I was expecting, it will be exteremly helpful.
I left some comments.
The thing I'm missing the most is documentation. What is the functionality provided by each interface and each method. I know it may be clear to you, but it will help others allow.
forest/tree/tree.go
Outdated
PartName sstable.ID `json:"part_name"` | ||
MaxKey gr.Key `json:"max_path"` | ||
} | ||
type TreeType struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove the word "type", it's confusing.
Also from the treePartType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
forest/tree/tree.go
Outdated
} | ||
|
||
type TreeWriter interface { | ||
HasOpenWriter() bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not really clear to me how the writer works.
Are HasOpenWriter
, ForceCloseCurrentPart
and IsSplitKey
part of the interface, or are they implementation details?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasOpenWriter and ForceClose are interfaces needed by Apply.
I am not sure they will be needed by others. If Merge is implemented without apply, It will probably need it.
IsSplitKey determines if a key should terminate a part. is probably redundant. So unless someone really wants it - it will be removed
return t.partManger | ||
} | ||
|
||
type TreeRepo interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about GetTree?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss it. a Tree is an internal structure of the package. I am not sure it is needed.
There is another one needed, which returns two reduced trees for diff. let's discuss it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So how can one pass a TreeSlice to use the interface?
forest/tree/tree.go
Outdated
treeSlice []treePartType | ||
} | ||
|
||
type treesRepo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an implementation. Did you mean to create an interface from it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be moved
forest/tree/tree.go
Outdated
partManger sstable.Manager | ||
} | ||
|
||
const ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are these? add documentation if they are part of the interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will be moved to the cache part
forest/tree/tree.go
Outdated
NewTreeWriter(splitFactor int, closeAsync sstable.BatchWriterCloser) TreeWriter | ||
NewScannerFromID(treeID gr.TreeID, start gr.Key) (gr.ValueIterator, error) | ||
NewScannerFromTreeParts(treeSlice TreeType, start gr.Key) (gr.ValueIterator, error) | ||
GetPartManger() sstable.Manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the tree interface must expose the inner part manager? It's the sort of complication we really need to avoid if possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently process initialization, and where application parts get trees and sstables is not finalized AFAIK. Once it is - I will happily remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is initialisation part of the interface?
Isn't it a one time operation during setup?
Co-authored-by: itaiad200 <itaiad200@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this interface requires a do-over - it expoes some of the implementation details and is pretty hard to understand without seeing how it is being used, so kind of defeating the purpose of having a interface in the first place.
forest/tree/tree.go
Outdated
type TreeWriter interface { | ||
HasOpenWriter() bool | ||
WriteEntry(record gr.ValueRecord) error | ||
ForceCloseCurrentPart() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This very much looks like implementation details, not an interface. What is a current part? it is not mentioned anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right, removed from interface
forest/tree/tree.go
Outdated
HasOpenWriter() bool | ||
WriteEntry(record gr.ValueRecord) error | ||
ForceCloseCurrentPart() | ||
IsSplitKey(key gr.Key, rowNum int) bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this part of a tree writer? Why not decouple these and have a KeySplitter
interface?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed it from interfaces. currently no need to expose it
forest/tree/tree.go
Outdated
ForceCloseCurrentPart() | ||
IsSplitKey(key gr.Key, rowNum int) bool | ||
FlushIterToTree(iter gr.ValueIterator) error | ||
SaveTree(reuseTree TreeType) (gr.TreeID, error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is a reuseTree? the interface doesn't really explain itself, so at a minimum it requires extensive documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, added explanation
forest/tree/tree.go
Outdated
|
||
import ( | ||
cache "github.com/treeverse/lakefs/forest/cache_map" | ||
gr "github.com/treeverse/lakefs/graveler" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's not alias our own packages, it makes the rest of the implementation harder to read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
…ture/tree-interface # Conflicts: # forest/tree/tree.go
forest/tree/tree.go
Outdated
MaxKey gr.Key `json:"max_path"` | ||
} | ||
type TreeSlice struct { | ||
treeSlice []TreePart |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, if it's a struct including a treeSlice, then the struct type should be called Tree
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after commit
forest/tree/tree.go
Outdated
treeSlice []TreePart | ||
} | ||
|
||
type treesRepo struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change "trees-repo" to tree-repo (singular) in all relevant places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after commit
Codecov Report
@@ Coverage Diff @@
## master #1043 +/- ##
==========================================
+ Coverage 46.17% 46.23% +0.05%
==========================================
Files 152 152
Lines 12316 12318 +2
==========================================
+ Hits 5687 5695 +8
+ Misses 5903 5897 -6
Partials 726 726
Continue to review full report at Codecov.
|
…ange once integrated with rest of tree package
…ange once integrated with rest of tree package
} | ||
|
||
type TreeWriter interface { | ||
WriteEntry(record graveler.ValueRecord) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming suggestions:
WriteEntry >> WriteValue
FlushIterToTree >> WriteIterator
SaveTree: split to 1. Save() which will save and return tree ID.
and 2. Merge(tree) which will create a new tree on top of an existing one and return a tree id as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good ideas. will do after merge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wait?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
package tree | ||
|
||
import ( | ||
// cache "github.com/treeverse/lakefs/forest/cache_map" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove.
"github.com/treeverse/lakefs/graveler/committed/sstable" | ||
) | ||
|
||
type TreePart struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you need json tags?
|
||
type TreePart struct { | ||
PartName sstable.ID `json:"part_name"` | ||
MaxKey graveler.Key `json:"max_path"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
MaxKey graveler.Key `json:"max_path"` | |
MaxKey graveler.Key `json:"max_key"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
PartName sstable.ID `json:"part_name"` | ||
MaxKey graveler.Key `json:"max_path"` | ||
} | ||
type TreeSlice struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you need it? Also - name is confusing as it's parts slice.
|
||
type TreeRepo interface { | ||
// NewTreeWriter returns a writer that uses the part manager to create a new tree | ||
NewTreeWriter(splitFactor int, // average number of keys that we want to stored in a part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use go standard of documentation?
i.e. Everything is about the declaration, not embedded in the args across the lines.
forest/tree/tree.go
Outdated
NewTreeWriter(splitFactor int, closeAsync sstable.BatchWriterCloser) TreeWriter | ||
NewScannerFromID(treeID gr.TreeID, start gr.Key) (gr.ValueIterator, error) | ||
NewScannerFromTreeParts(treeSlice TreeType, start gr.Key) (gr.ValueIterator, error) | ||
GetPartManger() sstable.Manager |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is initialisation part of the interface?
Isn't it a one time operation during setup?
} | ||
|
||
type TreeWriter interface { | ||
WriteEntry(record graveler.ValueRecord) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why wait?
type TreeRepo interface { | ||
// NewTreeWriter returns a writer that uses the part manager to create a new tree | ||
NewTreeWriter(splitFactor int, // average number of keys that we want to stored in a part | ||
// for more detail, look at "IsSplitKey" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where is 'IsSplitKey' ?
|
||
type TreeRepo interface { | ||
// NewTreeWriter returns a writer that uses the part manager to create a new tree | ||
NewTreeWriter(splitFactor int, // average number of keys that we want to stored in a part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is splitFactor changeable between trees? IMO should be a const.
// NewTreeWriter returns a writer that uses the part manager to create a new tree | ||
NewTreeWriter(splitFactor int, // average number of keys that we want to stored in a part | ||
// for more detail, look at "IsSplitKey" | ||
closeAsync sstable.BatchWriterCloser, // component used to close part asynchronously, and wait for all part |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're braking the abstraction layers, why does the caller of the TreeManager need to pass a sstable writer feature?
Initial commit to enable other team members to work against this interface/ will be changed later according to needs |
Just one small file
Yoni - please see if it answers your needs.
Others - please see it match our conventions and golang
This change is