Prune/Retention Policy #3
base: master
Conversation
I'm still figuring out tests, so I've been testing by setting retention periods in my globals and running the prune command with
I don't know yet how I feel about string-ish duration types when we have

Regarding the "Update documentation to match reality" commit:
This is probably a side effect of me rebasing your last PR (#1) onto master (instead of merging it). To resolve this (on your end, in this PR), you need to do something like this:
$# add this repo as "upstream"
$ git remote add upstream git@github.com:digineo/zackup
$# verify remotes
$ git remote -v
origin git@github.com:ngharo/zackup (fetch)
origin git@github.com:ngharo/zackup (push)
upstream git@github.com:digineo/zackup (fetch)
upstream git@github.com:digineo/zackup (push)
$# fetch changes from upstream
$ git fetch upstream master
From github.com:digineo/zackup
* branch master -> FETCH_HEAD
* [new branch] master -> upstream/master
$# where are we now?
$ git status
On branch feature/prune
nothing to commit, working tree clean
$# rebase onto upstream master. this will replace f9b040a with 9398c53
$ git rebase upstream/master
README.md | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
First, rewinding head to replay your work on top of it...
Applying: Laying out snapshot pruning functionality
Applying: Loop over each bucket when finding keepers
$# do a similar thing with your master branch
$ git checkout master
$ git reset --hard upstream/master
$# update your repository
$ git push --force-with-lease origin
I've left some comments, which require a bit of refactoring.
For now, I've avoided reviewing the listKeepers function, because I believe it will look quite different after you've addressed the other comments. I'll also need some time to think about the patterns variable.
for _, ss := range strings.Fields(o.String()) {
	ts, err := time.Parse(time.RFC3339, strings.Split(ss, "@")[1])
strings.Fields(o.String()) and strings.Split produce quite a few allocations. We can avoid them with something like this:
s := bufio.NewScanner(o)
for s.Scan() {
	line := s.Bytes()
	at := bytes.IndexRune(line, '@')
	if at < 0 {
		// malformed snapshot name
		continue
	}
	ts, err := time.Parse(time.RFC3339, string(line[at+1:]))
	...
}
if err := s.Err(); err != nil {
	// could not parse the output
	log(...)
	return nil
}
This parses the output line-by-line (with a bufio.Scanner) and converts only the part after the @ in each line to a string.
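For reference, a more self-contained version of that idea could look like the sketch below. The parseSnapshots name, the snapshot field names, and the package name are assumptions made for illustration only (they are not taken from this PR); o stands for any io.Reader over a snapshot listing, e.g. the output of zfs list -t snapshot -H -o name.

package zackup // package name assumed for this sketch

import (
	"bufio"
	"bytes"
	"io"
	"time"
)

// snapshot is assumed here to carry the full snapshot name plus its
// creation time; the real struct in the PR may differ.
type snapshot struct {
	Name string
	Time time.Time
}

// parseSnapshots is a hypothetical helper (not part of the PR): it scans o
// line by line and parses the RFC3339 timestamp after the '@' in each
// snapshot name, skipping lines it cannot parse.
func parseSnapshots(o io.Reader) ([]snapshot, error) {
	var snaps []snapshot
	s := bufio.NewScanner(o)
	for s.Scan() {
		line := s.Bytes()
		at := bytes.IndexRune(line, '@')
		if at < 0 {
			continue // malformed snapshot name
		}
		ts, err := time.Parse(time.RFC3339, string(line[at+1:]))
		if err != nil {
			continue // unparsable timestamp
		}
		snaps = append(snaps, snapshot{Name: string(line), Time: ts})
	}
	if err := s.Err(); err != nil {
		return nil, err // could not read the output
	}
	return snaps, nil
}

Since the Scanner reuses its internal buffer and s.Bytes() returns a slice into it, the only per-line allocations left are the timestamp substring and the final name string.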
I've been thinking about the patterns variable. What I'd prefer to have instead is a list of (interval, keepCount) tuples to define the retention policy. This brings a few other changes with it. I hope I've covered everything here, but please feel free to ask if you have questions :-)

First, we want/need a few constants to translate the retention intervals into time.Duration values:

const (
	daily   = 24 * time.Hour
	weekly  = 7 * daily
	monthly = 30 * daily
	yearly  = 360 * daily
)
type bucket struct {
	interval time.Duration
	keep     *int
}
func PruneSnapshots(job *config.JobConfig) {
	// keep sorted by interval (or call sort.Slice for safety)
	buckets := []bucket{
		{daily, job.Retention.Daily},
		{weekly, job.Retention.Weekly},
		{monthly, job.Retention.Monthly},
		{yearly, job.Retention.Yearly},
	}
	// ...
}

Then, instead of a function like listKeepers, we could have a retentionPolicy type:

// A retentionPolicy is an ordered list of buckets, which define a
// keep-or-delete policy for a set of snapshots.
type retentionPolicy []bucket
// apply partitions the given input snapshot list into two new slices: one
// containing snapshots to keep and one containing snapshots to delete.
// You need to pass a reference time for the policy's buckets to anchor on.
func (rp retentionPolicy) apply(now time.Time, snapshots []snapshot) (toKeep, toDelete []snapshot) {
	// Iterate over rp and snapshots and sort snapshots[i] into either toKeep or
	// toDelete, depending on whether snapshots[i].Time is after now.Add(-rp[j].interval).
	// A naïve implementation could collect a list of candidates per entry in rp
	// and define toKeep as the unique set of all candidates (and toDelete as
	// snapshots - toKeep). This is easy to implement in more functional languages,
	// but a bit of a hassle in Go.
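	//
	// A hedged sketch of that naïve approach, not part of the original review.
	// It reads a bucket as "retain up to *keep snapshots that are no older
	// than keep * interval", assumes snapshots is sorted newest-first, and
	// assumes a Name field on snapshot; adjust to the real semantics as needed.
	kept := make(map[string]bool)
	for _, b := range rp {
		if b.keep == nil || *b.keep == 0 {
			continue // bucket not configured
		}
		cutoff := now.Add(-time.Duration(*b.keep) * b.interval)
		n := 0
		for _, s := range snapshots {
			if n == *b.keep {
				break // this bucket has collected enough keepers
			}
			if s.Time.After(cutoff) {
				kept[s.Name] = true
				n++
			}
		}
	}
	for _, s := range snapshots {
		if kept[s.Name] {
			toKeep = append(toKeep, s)
		} else {
			toDelete = append(toDelete, s)
		}
	}
	return toKeep, toDelete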
}

This would reduce PruneSnapshots to:

func PruneSnapshots(job *config.JobConfig) {
	policy := retentionPolicy{
		{daily, job.Retention.Daily},
		// ...
	}
	snapshots := listSnapshots(job.Host)
	_, del := policy.apply(time.Now(), snapshots)
	deleteSnapshots(del) // TODO
}

Notes on testing:
A table-driven test as a sample:

func TestRetentionPolicy_apply(t *testing.T) {
	now := time.Now()
	const D = 24 * time.Hour
	intptr := func(i int) *int { return &i }

	cases := map[string]struct {
		subject retentionPolicy
		input   []snapshot
		// only the names of input
		keepNames, deleteNames []string
	}{
		"test demo": {
			subject:     retentionPolicy{{D, intptr(2)}},
			input:       []snapshot{{"a", now}, {"b", now.Add(-1 * D)}, {"c", now.Add(-2 * D)}},
			keepNames:   []string{"a", "b"},
			deleteNames: []string{"c"},
		},
		// ... other, more non-trivial cases ...
	}

	for name := range cases {
		tc := cases[name]
		t.Run(name, func(t *testing.T) {
			keep, del := tc.subject.apply(now, tc.input)
			// assertEqual(tc.keepNames, [s.name for s in keep])
			// assertEqual(tc.deleteNames, [s.name for s in del])
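			// The two pseudo-assertions above could be realised in Go roughly as
			// follows (a hedged sketch, not from the original review; it assumes
			// a Name field on snapshot and needs "reflect" imported):
			names := func(snaps []snapshot) (out []string) {
				for _, s := range snaps {
					out = append(out, s.Name)
				}
				return out
			}
			if got := names(keep); !reflect.DeepEqual(got, tc.keepNames) {
				t.Errorf("keep = %v, want %v", got, tc.keepNames)
			}
			if got := names(del); !reflect.DeepEqual(got, tc.deleteNames) {
				t.Errorf("delete = %v, want %v", got, tc.deleteNames)
			}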
		})
	}
}

In another test, we could define a single
Closes #2
This is a work in progress.
This uses a super simple algorithm to determine which snapshots to keep; see the listKeepers function.

The PruneSnapshots command currently does not perform any destructive operations on your ZFS datasets. The debug logging is how I've been testing. I want to write some tests, but I wanted to get this out there for more discussion around the listKeepers function.

(Side note: I don't know why commit f9b040a is showing up in this PR since it's already in master ???)