Fix some provider subsystem performance issues #319

Stebalien · 2019-04-13T01:30:22Z

- Don't load into the read cache on write.
- Delete on expiration.
- Delete expired records on load.
- ~~Avoid populating the cache when garbage collecting (this just destroys the cache and hurts performance).~~ Just drop the cache.
- Make the GC algorithm not n^2. I believe keys() returns the CID once per provider (and we then iterate over all providers once per provider).
- Use a non-thread-safe cache (we don't need thread safety).
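To make the n^2 point concrete, here is a rough sketch of the work involved, not the actual go-libp2p-kad-dht code. The `entry` type and both counting functions are hypothetical; they only model "the key listing yields a CID once per provider, and each yielded CID triggers a walk over all of that CID's providers" versus a single pass over the stored records.

```go
package main

import "fmt"

// entry is a hypothetical stand-in for one stored provider record,
// i.e. one (CID, provider) pair.
type entry struct{ cid, peer string }

// quadraticVisits models the behaviour described in the PR text: every
// record yields its CID, and each yielded CID causes a walk over all
// providers of that CID.
func quadraticVisits(entries []entry) int {
	perCID := make(map[string]int)
	for _, e := range entries {
		perCID[e.cid]++
	}
	visits := 0
	for _, e := range entries {
		visits += perCID[e.cid]
	}
	return visits
}

// linearVisits models a single pass that looks at each record once.
func linearVisits(entries []entry) int {
	return len(entries)
}

func main() {
	var entries []entry
	for i := 0; i < 100; i++ {
		entries = append(entries, entry{cid: "cid-a", peer: fmt.Sprintf("peer-%d", i)})
	}
	fmt.Println(quadraticVisits(entries), linearVisits(entries))
}
```

With 100 providers for a single CID, the first function counts 10000 record visits and the second counts 100.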

related: #316

Stebalien · 2019-04-13T06:31:04Z

Note: We should be doing GC in a separate thread but it's a little tricky to get this right without race conditions. We can save this for a follow-up PR.

vyzo

LGTM!

vyzo · 2019-04-13T08:09:24Z

providers/providers.go

@@ -30,7 +30,7 @@ var defaultCleanupInterval = time.Hour
 type ProviderManager struct {
 // all non channel fields are meant to be accessed only within
 // the run method
- providers *lru.Cache
+ providers *lru.LRU


What's the difference between these two?

I guess thread-safety.

Are we sure this is safe? It could come back to bite us.

Yes. The old provider set wasn't safe to use from multiple threads anyways as we'd modify the *providerSet structs.
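For readers unfamiliar with the pattern being relied on here, a minimal sketch of single-goroutine ownership follows. It is an illustration under assumptions, not the ProviderManager itself: the `manager`, `addReq`, and `getReq` names are hypothetical, and a plain map stands in for the LRU. The point is only that state touched exclusively inside one run loop needs no lock.

```go
package main

import "fmt"

// addReq and getReq are hypothetical request types for this sketch.
type addReq struct{ key, peer string }

type getReq struct {
	key  string
	resp chan []string
}

// manager hands all state access to a single run goroutine.
type manager struct {
	adds chan addReq
	gets chan getReq
	done chan struct{}
}

func newManager() *manager {
	m := &manager{
		adds: make(chan addReq),
		gets: make(chan getReq),
		done: make(chan struct{}),
	}
	go m.run()
	return m
}

// run owns the cache. Nothing else reads or writes it, so no mutex is needed.
func (m *manager) run() {
	cache := make(map[string][]string)
	for {
		select {
		case a := <-m.adds:
			cache[a.key] = append(cache[a.key], a.peer)
		case g := <-m.gets:
			// Reply with a copy so callers never share the owned slice.
			g.resp <- append([]string(nil), cache[g.key]...)
		case <-m.done:
			return
		}
	}
}

func (m *manager) add(key, peer string) { m.adds <- addReq{key, peer} }

func (m *manager) get(key string) []string {
	resp := make(chan []string, 1)
	m.gets <- getReq{key: key, resp: resp}
	return <-resp
}

func (m *manager) close() { close(m.done) }

func main() {
	m := newManager()
	defer m.close()
	m.add("cid-a", "peer-1")
	m.add("cid-a", "peer-2")
	fmt.Println(m.get("cid-a"))
}
```

The trade-off is the one raised in the review: if a later change touches the cache from outside the run loop, nothing will catch it except the race detector.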

vyzo · 2019-04-13T08:12:39Z

providers/providers.go

- provs.setVal(p, now)
+ if provs, ok := pm.providers.Get(k); ok {
+ provs.(*providerSet).setVal(p, now)
+ } // else not cached, just write through


vyzo · 2019-04-13T08:17:10Z

providers/providers.go

- data, ok := i.([]byte)
- if !ok {
- return time.Time{}, fmt.Errorf("data was not a []byte")
+func readTimeValue(data []byte) (time.Time, error) {


Avoiding the interface{} is a nice little gain! No more allocation for passing byte slices.

(this was left over from the datastore refactor when we replaced all the interfaces with []byte)

vyzo · 2019-04-13T08:20:13Z

providers/providers.go

+ fallthrough
+ case now.Sub(t) > ProvideValidity:
+ // or just expired
+ err = pm.dstore.Delete(ds.RawKey(e.Key))


Shouldn't we also clean the LRU cache here? Alternatively, we can simply flush it completely on GC.

Ah, there is a Purge call before entering gc, good.

vyzo · 2019-04-13T08:21:00Z

providers/providers.go

+ // drop them.
+ //
+ // Much faster than GCing.
+ pm.providers.Purge()


yay for that!

vyzo · 2019-04-13T10:21:05Z

Only concern here is removing the thread-safety from the LRU cache.

ghost assigned Stebalien Apr 13, 2019

ghost added the status/in-progress In progress label Apr 13, 2019

Stebalien force-pushed the fix/provider-mayhem branch 2 times, most recently from 84b9fac to 3ab4992 Compare April 13, 2019 05:12

Stebalien marked this pull request as ready for review April 13, 2019 05:12

Stebalien force-pushed the fix/provider-mayhem branch from 3ab4992 to c095545 Compare April 13, 2019 06:20

Stebalien force-pushed the fix/provider-mayhem branch from c095545 to 7ce5021 Compare April 13, 2019 06:40

Stebalien requested a review from vyzo April 13, 2019 06:44

vyzo approved these changes Apr 13, 2019

Stebalien added 10 commits April 13, 2019 09:18

providers: use raw cids as map keys

95a6c25

providers: don't load into the cache on write

0697369

This is just extra work as we write through anyways.

providers: improve time parsing error checking

2a3899b

providers: delete expired provider records on load

24c3841

providers: add some documentation

3c9d044

providers: make sure to close provider query, even on error

f4e6d42

providers: use the non-locking LRU

a3b9767

We only access it from a single goroutine.

providers: optimize GC

fbb29ea

1. Don't be n^2. 2. Don't bother walking the cache, just drop it.

providers: improve test coverage

b76de45

dep: update go-datastore

7bdc7a5

batches deletes

Stebalien force-pushed the fix/provider-mayhem branch from 6f26a7a to 7bdc7a5 Compare April 13, 2019 16:19

Stebalien merged commit 12c9510 into master Apr 13, 2019

Stebalien deleted the fix/provider-mayhem branch April 13, 2019 16:22

ghost removed the status/in-progress In progress label Apr 13, 2019

anacrolix mentioned this pull request Apr 15, 2019

#319 causes test failures in IPFS #321

Closed
