Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JSON file for diskUsage.cache and include note on accuracy of value #35

Merged
merged 14 commits into from
Apr 3, 2018

Conversation

kevina
Copy link
Contributor

@kevina kevina commented Mar 23, 2018

Alternative to #33.

Closes #31. Closes #37.

@ghost ghost assigned kevina Mar 23, 2018
@ghost ghost added the status/in-progress In progress label Mar 23, 2018
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I obviously prefer this way :). It gives us a single file that we can store multiple stats in atomically.

flatfs.go Outdated
diskUsage int64
storedValue diskUsageValue
// updateLock must be held when updating storedValue or writing
// to a file, it doesn't need to be held when reading
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to "a file"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean when writing diskUsage to a file.

flatfs.go Outdated
tmp, err := ioutil.TempFile(fs.path, "du-")
if err != nil {
atomic.StoreInt64(&fs.storedValue.diskUsage, origVal)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should definitely be logging these errros.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay.

flatfs.go Outdated
// do not store on errors. just ignore them
return
}
fs.updateLock.Lock()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You know, really, we should probably just do this in a goroutine and signal the goroutine with a channel. We could end up with a lot of blocking here, especially because this the "should we update" check is racy. (old bug, I guess, but we should still fix it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your right, and that was going to be my next step.

func (fs *Datastore) calculateDiskUsage() error {
// Try to obtain a previously stored value from disk
if persDu := fs.readDiskUsageFile(); persDu > 0 {
atomic.StoreInt64(&fs.diskUsage, persDu)
fs.diskUsage = persDu
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually, I'd like to also mark the on-file disk usage as an estimate until we close. That way, if we crash, we know that our estimate may be off by a bit. However, we can probably do that in a future PR (your preference).

This now means that the datastore needs to be properly closed, to clean up
the background thread.
@kevina kevina force-pushed the kevina/diskusage-notes2 branch from c2d5de5 to 3633dfa Compare March 24, 2018 05:21
@Kubuxu
Copy link
Member

Kubuxu commented Mar 24, 2018

Won't this need migration?

@Stebalien
Copy link
Member

Won't this need migration?

We haven't bubbled the previous changes yet.

@kevina kevina requested a review from Stebalien March 25, 2018 03:03
Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits/suggestions.

flatfs.go Outdated

fs.checkpointCh = make(chan bool, 1)
fs.done = make(chan bool)
go func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Can just call go fs.checkpointLoop().

flatfs.go Outdated
@@ -207,6 +241,12 @@ func Open(path string, syncFiles bool) (*Datastore, error) {
// elements in the datastore.
return nil, err
}

fs.checkpointCh = make(chan bool, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: chan struct{} is cheeper.

flatfs.go Outdated
func (fs *Datastore) deactivate() error {
if fs.checkpointCh != nil {
fs.checkpointCh <- true
close(fs.checkpointCh)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: can just close the channel and always checkpoint one more time.

flatfs.go Outdated
}

fs.storedValue.diskUsage = du
fs.dirty = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: may be cleaner to return errors from this function instead of using this shared dirty flag.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried that, and it seamed to make the code more complicated.

flatfs.go Outdated
newDu := float64(du)
lastCheckpointDu := float64(fs.storedValue.diskUsage)
diff := math.Abs(newDu - lastCheckpointDu)
if (lastCheckpointDu * diskUsageCheckpointPercent / 100.0) < diff {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ultra nit: can avoid floats here (lastCheckpointDu * diskUsageCheckpointPercent) < diff * 100

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid the division but not the use of floats. diskUsageCheckpointPercent is a float and I don't want to make it an integer.

// Otherwise insure the value will be written to disk after
// `diskUsageCheckpointTimeout`
if fs.dirty && !timerActive {
timer.Reset(diskUsageCheckpointTimeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't appear to ever turn this off.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops. Forgot to add timerActive = true.

Copy link
Member

@Stebalien Stebalien Mar 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still missing a call to Stop as far as I can tell (and a defered call to Stop).

Copy link
Contributor Author

@kevina kevina Mar 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, there is no need to call Stop here. If timerActive is false than the timer already expired. From the go docs (https://golang.org/pkg/time/#Timer.Reset):

If a program has already received a value from t.C, the timer is known to have expired, and t.Reset can be used directly.

And so that we are clear the expected behavior is for the DiskUsage to be written out every 2 seconds regardless of activity (rather than only do it after idling for 2 seconds). This is more likely to keep the Disk Usage estimate accurate in case of a crash during periods of high activity and a single write every 2 seconds is unlikely to add much overhead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I see.

@kevina kevina requested a review from Stebalien March 26, 2018 22:56
@kevina kevina force-pushed the kevina/diskusage-notes2 branch from 75d2d25 to 6392d21 Compare March 27, 2018 08:08
@kevina
Copy link
Contributor Author

kevina commented Mar 27, 2018

Note. I enhanced the tests and am now getting this failure:

--- FAIL: TestDiskUsage (0.95s)
    --- FAIL: TestDiskUsage/next-to-last (0.32s)
        flatfs_test.go:401: /tmp/test-datastore-flatfs-557319926
        flatfs_test.go:416: duNew: 1238
        flatfs_test.go:433: duPostPut: 4038
        flatfs_test.go:448: duPostDelete: 2038
        flatfs_test.go:451: duFinal: 2038
        flatfs_test.go:470: Unexpected value for diskUsage in diskUsage.cache: 2058 (expected 2038)

I put debug statements in the code and discovered that the value stored in diskUsage.cache is a value from a previous write. I verified that the final write to disk on close is being done in the correct order and has the expected value. Right now my only conclusion is that the o.s. is doing the renaming from the temporary files out of order and an older rename is being done before the final, but I can't seam to find any documentation that this is ever a problem when doing a "write to temp file and rename over the original". I do know that you should probably sync the temporary file before close but I tried that and it didn't seam to help.

I will look more into this tomorrow, but suggestions welcome. Do I need to sync the directory also?

Copy link
Contributor

@hsanjuan hsanjuan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think many checkpoints are discarded because the channel is full (// checkpoint request already pending). Thus you may be discarding the last one with the expected size in the test because previous ones are pending.

@kevina
Copy link
Contributor Author

kevina commented Mar 27, 2018

I think many checkpoints are discarded because the channel is full (// checkpoint request already pending). Thus you may be discarding the last one with the expected size in the test because previous ones are pending.

That is the intended behavior, it keeps unnecessary ones from piling up. The final checkpoint is not discarded because it only happens when the channel is closed. I verified this with debug print statements that I didn't push.

Also sync contents of diskUsage.cache to disk after the initial calculation
and during shutdown.
@kevina kevina force-pushed the kevina/diskusage-notes2 branch from 1ab6972 to 6846908 Compare March 27, 2018 23:00
@kevina
Copy link
Contributor Author

kevina commented Mar 27, 2018

I found the bug, I was doing something stupid. This should be ready for review now.

@kevina kevina force-pushed the kevina/diskusage-notes2 branch from 2e97c91 to 08c78dc Compare March 27, 2018 23:27
@hsanjuan
Copy link
Contributor

Are we sure this is not overkill in terms of code and complexity just to inform the user that the disk usage is approximate (which always is anyway) ?

I can imagine that either users care about accuracy, in which case, given we cannot tell them how far the approximation is from reality, they'll want to manually calculate and fix their diskUsage files. Or don't care so much, in which case it is the same how the approximation was performed. This might be solving a pure documentation issue with complex code and I'm not sure it's the best thing. Just my two cents...

@kevina
Copy link
Contributor Author

kevina commented Mar 28, 2018

@hsanjuan this was mostly @Stebalien idea, it also solves some other problems such as doing the checkpoint asynchronously and ensuring the disk usage is written to disk every 2 seconds.

@Stebalien
Copy link
Member

@hsanjuan basically, the only complex changes here are performance related. The "use a single json file" change should actually simplify things (although my motivation was to allow us to atomically update the accuracy and the size).

flatfs.go Outdated
diskUsage int64
diskUsageCheckpoint int64
// atmoic operations should always be used with diskUsage
diskUsage int64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variables operated by atomics need to be aligned. This isn't guaranteed by structs in go, the solution for this is either embedded struct or placing it as the first element in a struct.

This is not really related to this patchset but should be fixed either way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. That's something that should go a the top of the documentation and in big bold letters everywhere else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kubuxu

This isn't guaranteed by structs in go

To the best I can tell this is only a problem on 32-bit architectures that only guarantee 4 byte alignment on int64. Correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm. That's something that should go a the top of the documentation and in big bold letters everywhere else.

Thats what I said. Its actually at the bottom, in normal font:
https://golang.org/pkg/sync/atomic/#pkg-note-BUG

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kevina no it depends on how Go lays out the structure and x86-32 does not have atomic ops on misaligned 64bit variables.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Kubuxu I think we are saying the same thing. See http://www.tapirgames.com/blog/golang-memory-alignment

flatfs.go Outdated
return exactA
}
return ""
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should probably cover all cases (even if "cover" is return a).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There shouldn't be any other cases. The best I can do is if one is an empty string return the other. If a and b disagree otherwise the is no good value to return so I will just return "".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then I'd return some form of "unknown".

Copy link
Member

@Stebalien Stebalien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two small changes, LGTM otherwise.

flatfs.go Outdated
"strings"
"sync"
"sync/atomic"
"time"

"github.com/ipfs/go-datastore"
"github.com/ipfs/go-datastore/query"
"gx/ipfs/QmeiCcJfDW1GJnWUArudsv5rQsihpi4oyddPhdqo3CfX6i/go-datastore"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't be rewritten.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops! I have been using git gui and hand picking the chunks to include, looks like I wasn't careful enough.

// Otherwise insure the value will be written to disk after
// `diskUsageCheckpointTimeout`
if fs.dirty && !timerActive {
timer.Reset(diskUsageCheckpointTimeout)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. I see.

}

if err := os.Rename(tmp.Name(), filepath.Join(fs.path, DiskUsageFile)); err != nil {
log.Warningf("cound not write disk usage: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be deleting the temporary file on failure (or at least trying to do so).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@kevina kevina force-pushed the kevina/diskusage-notes2 branch from a97ac3b to d7d25ca Compare April 2, 2018 23:29
@kevina kevina force-pushed the kevina/diskusage-notes2 branch from 44bbdf5 to ab36d36 Compare April 3, 2018 00:09
@kevina kevina requested a review from Stebalien April 3, 2018 00:22
@Stebalien Stebalien merged commit 65bd5fb into master Apr 3, 2018
@ghost ghost removed the status/in-progress In progress label Apr 3, 2018
@Stebalien Stebalien deleted the kevina/diskusage-notes2 branch April 3, 2018 00:23
@kevina kevina removed the request for review from Stebalien April 3, 2018 00:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants