Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sets with TTL #122

Merged
merged 15 commits into from
Jan 28, 2020
Merged

Sets with TTL #122

merged 15 commits into from
Jan 28, 2020

Conversation

martinmr
Copy link
Contributor

@martinmr martinmr commented Jan 16, 2020

This change is Reviewable

@martinmr martinmr marked this pull request as ready for review January 17, 2020 19:42
@martinmr martinmr mentioned this pull request Jan 21, 2020
Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r1.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @jarifibrahim, @karlmcguire, and @martinmr)


cache.go, line 211 at r1 (raw file):

	now := time.Now()
	expiration := now.Add(ttl)
	if !expiration.After(now) {

if ttl.isNegative() then don't store.
if ttl.IsZero(), then no ttl (store).
if ttl.Positive,then has ttl (store)


cache.go, line 308 at r1 (raw file):

				victims, added := c.policy.Add(i.key, i.cost)
				if added {
					currentExpiration := c.store.Expiration(i.key)

prevExpiry


cache.go, line 310 at r1 (raw file):

					currentExpiration := c.store.Expiration(i.key)
					if !currentExpiration.IsZero() {
						c.ttlMap.Delete(i.key, currentExpiration)

Update in one call.


cache.go, line 314 at r1 (raw file):

					c.store.Set(i)
					c.ttlMap.Add(i.key, i.conflict, i.expiration)

do the update before store.Set. Also, can't store.Set do that internally? Do you need to call it from outside like this?


cache.go, line 319 at r1 (raw file):

				for _, victim := range victims {
					if victimExp := c.store.Expiration(victim.key); !victimExp.IsZero() {
						c.ttlMap.Delete(victim.key, victimExp)

store can just do all these internally.


cache.go, line 335 at r1 (raw file):

			}
		case <-c.cleanupTicker.C:
			c.ttlMap.CleanUp(c.store, c.policy, c.onEvict)

c.store.CleanUp


store.go, line 50 at r1 (raw file):

	// Update attempts to update the key with a new value and returns true if
	// successful.
	Update(*item, *expirationMap) bool

expirationMap should be passed to newStore elsewhere.


store.go, line 144 at r1 (raw file):

}

func (m *lockedMap) Set(i *item) {

I think this would be put on heap. Check if we can keep it on stack.


ttl.go, line 31 at r1 (raw file):

// timeToBucket converts a time into a bucket number that will be used
// to store items in the expiration map.
func timeToBucket(t time.Time) int {

ceil and have another one for floor.

Ceil for storage, floor for deletion.


ttl.go, line 32 at r1 (raw file):

// to store items in the expiration map.
func timeToBucket(t time.Time) int {
	return t.Second() / bucketSizeSecs

math.Ceil(t.Unix() / ...) Has to be round-up.


ttl.go, line 61 at r1 (raw file):

	defer m.Unlock()

	_, ok := m.buckets[bucketNum]

bucket, ok := ...

if !ok { bucket = new(bucket); m.buckets[bucket] = }

bucket[key] = conflict


ttl.go, line 71 at r1 (raw file):

// is needed to be able to find the bucket storing this pair in constant time.
func (m *expirationMap) Delete(key uint64, expiration time.Time) {
	bucketNum := timeToBucket(expiration)

Should be Update. So, it should contain the previous exp time, the new exp time.


ttl.go, line 74 at r1 (raw file):

	m.Lock()
	defer m.Unlock()
	_, ok := m.buckets[bucketNum]

if _, ok := ...


ttl.go, line 88 at r1 (raw file):

	// items in the current bucket that have not expired yet but all the items in
	// the previous bucket should have expired.
	bucketNum := timeToBucket(time.Now()) - 1

bucketNum := floor(time.Now().Unix())


ttl.go, line 96 at r1 (raw file):

	for key, conflict := range keys {
		_, value := store.Del(key, 0)

should you be passing the conflict here?


ttl.go, line 97 at r1 (raw file):

	for key, conflict := range keys {
		_, value := store.Del(key, 0)
		cost := policy.Cost(key)

Can we get the expiration time here?

Can you check if the key should in fact be expired? Only if that matches up, do the deletion?

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @jarifibrahim, @karlmcguire, and @manishrjain)


cache.go, line 211 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if ttl.isNegative() then don't store.
if ttl.IsZero(), then no ttl (store).
if ttl.Positive,then has ttl (store)

Done.j


cache.go, line 308 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

prevExpiry

Done.


cache.go, line 310 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Update in one call.

Done.


cache.go, line 314 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

do the update before store.Set. Also, can't store.Set do that internally? Do you need to call it from outside like this?

Done.


cache.go, line 319 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

store can just do all these internally.

Done.


cache.go, line 335 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

c.store.CleanUp

Done.


store.go, line 50 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

expirationMap should be passed to newStore elsewhere.

Done.


store.go, line 144 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

I think this would be put on heap. Check if we can keep it on stack.

The channel is receiving pointers to an item struct so it's already in the heap. I don't think we gain anything from changing this.


ttl.go, line 31 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

ceil and have another one for floor.

Ceil for storage, floor for deletion.

Done.


ttl.go, line 32 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

math.Ceil(t.Unix() / ...) Has to be round-up.

Done.


ttl.go, line 61 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

bucket, ok := ...

if !ok { bucket = new(bucket); m.buckets[bucket] = }

bucket[key] = conflict

Done.


ttl.go, line 71 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Should be Update. So, it should contain the previous exp time, the new exp time.

I added a new update function. I didn't removed this one because it's still being used in a different place.


ttl.go, line 74 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if _, ok := ...

Done.


ttl.go, line 88 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

bucketNum := floor(time.Now().Unix())

Cannot use floor because there's an edge case (for an integer, the value of the two functions is the same. But I added a function to calculate the bucket to cleanup.


ttl.go, line 96 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

should you be passing the conflict here?

Done.


ttl.go, line 97 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Can we get the expiration time here?

Can you check if the key should in fact be expired? Only if that matches up, do the deletion?

Done.

cache_test.go Outdated
@@ -524,3 +589,9 @@ func TestCacheMetricsClear(t *testing.T) {
c.Metrics = nil
c.Metrics.Clear()
}

func TestMain(m *testing.M) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SA3000: TestMain should call os.Exit to set exit code (from staticcheck)

Copy link
Contributor

@manishrjain manishrjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Please look really carefully at the code and ensure that you have caught the edge cases.

Also, @karlmcguire could you review this code?

Reviewed 4 of 5 files at r2, 1 of 1 files at r3.
Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @jarifibrahim, @karlmcguire, @manishrjain, and @martinmr)


cache.go, line 151 at r3 (raw file):

		stop:          make(chan struct{}),
		cost:          config.Cost,
		cleanupTicker: time.NewTicker(bucketSize),

bucketDuration / 2.0


store.go, line 106 at r3 (raw file):

func (sm *shardedMap) Cleanup(policy policy, onEvict onEvictFunc) {
	sm.em.cleanup(sm, policy, onEvict)

sm.expiryMap


store.go, line 118 at r3 (raw file):

	sync.RWMutex
	data map[uint64]storeItem
	em   *expirationMap

expiryMap


store.go, line 140 at r3 (raw file):

	// Handle expired items.
	if !item.expiration.IsZero() {

if this && that


ttl.go, line 31 at r3 (raw file):

func storageBucket(t time.Time) int64 {
	bucket := int64(math.Ceil(float64(t.Unix()) / bucketSize.Seconds()))

bucket := t.Unix()/bucketSize + 1

Avoid converting to float64. And also avoid calling a func on bucketSize.


ttl.go, line 38 at r3 (raw file):

	// The bucket to cleanup is always behind the storage bucket by one so that
	// no elements in that bucket (which might not have expired yet) are deleted.
	bucket := storageBucket(t) - 1

return ...


ttl.go, line 84 at r3 (raw file):

	}

	if oldExpiration.IsZero() {

It could be zero before, but then they put ttl on it?


ttl.go, line 92 at r3 (raw file):

	defer m.Unlock()

	oldBucketNum := storageBucket(oldExpiration)

oldExpTime


ttl.go, line 132 at r3 (raw file):

	now := time.Now()
	bucketNum := cleanupBucket(now)
	m.Lock()

The now should be calculated inside the lock.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 15 unresolved discussions (waiting on @golangcibot, @jarifibrahim, @karlmcguire, and @manishrjain)


cache.go, line 151 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

bucketDuration / 2.0

Done.


cache_test.go, line 593 at r2 (raw file):

Previously, golangcibot (Bot from GolangCI) wrote…

SA3000: TestMain should call os.Exit to set exit code (from staticcheck)

Done.


store.go, line 106 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

sm.expiryMap

Done.


store.go, line 118 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

expiryMap

Done.


store.go, line 140 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

if this && that

Done.


ttl.go, line 31 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

bucket := t.Unix()/bucketSize + 1

Avoid converting to float64. And also avoid calling a func on bucketSize.

Done.


ttl.go, line 38 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

return ...

Done.


ttl.go, line 84 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

It could be zero before, but then they put ttl on it?

Still need to add a test. will do that in monday.


ttl.go, line 92 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

oldExpTime

Done.


ttl.go, line 132 at r3 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

The now should be calculated inside the lock.

Done.

cache_test.go Outdated
// Set bucketSizeSecs to 1 to avoid waiting too much during the tests.
bucketDurationSecs = 1
m.Run()
os.Exit(0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are supposed to exit with the return value of m.Run, so that the exit status of the test binary reflects whether tests have passed or failed.

But the whole TestMain function can probably be replaced with a func init() { bucketDurationSecs = 1 } in cache_test.go

Copy link
Contributor

@jarifibrahim jarifibrahim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 19 unresolved discussions (waiting on @golangcibot, @karlmcguire, @manishrjain, and @martinmr)


cache.go, line 199 at r4 (raw file):

// SetWithTTL works like Set but adds a key-value pair to the cache that will expire
// after the specified TTL (time to live) has passed. A zero value means the value never
// exexpire, which is identical to calling Set. A negative value is a no-op and the value

nit:
exexpire => expire
A zero value means the value never expire => A zero value means the value will never expire.


cache_test.go, line 404 at r4 (raw file):

	val, ok = c.Get(2)
	require.True(t, ok)
	require.Equal(t, 2, val.(int))

I don't see a test for deletion of entries with ttl. You could add this

+	retrySet(3, 1, 1, 10*time.Second)
+	time.Sleep(1 * time.Second)
+	// Delete the item
+	c.Del(3)
+	// Ensure the key is deleted.
+	val, ok = c.Get(3)
+	require.False(t, ok)
+	require.Nil(t, val)

ttl.go, line 81 at r4 (raw file):

	}

	m.Lock()

We have a top-level lock for the expiration map. I'm wondering if it would be better to split this lock into bucket level. That way we can insert in bucket 1 while an update is being performed on bucket 2 and 3.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 19 unresolved discussions (waiting on @golangcibot, @jarifibrahim, @karlmcguire, @manishrjain, and @martinmr)


ttl.go, line 81 at r4 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

We have a top-level lock for the expiration map. I'm wondering if it would be better to split this lock into bucket level. That way we can insert in bucket 1 while an update is being performed on bucket 2 and 3.

I don't think multiple locks would work unless we have a map for each bucket. There's internal data inside the map that could be corrupted if there's multiple threads accessing the map.

Copy link
Contributor Author

@martinmr martinmr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 18 unresolved discussions (waiting on @dominikh, @golangcibot, @jarifibrahim, @karlmcguire, and @manishrjain)


cache_test.go, line 404 at r4 (raw file):

Previously, jarifibrahim (Ibrahim Jarif) wrote…

I don't see a test for deletion of entries with ttl. You could add this

+	retrySet(3, 1, 1, 10*time.Second)
+	time.Sleep(1 * time.Second)
+	// Delete the item
+	c.Del(3)
+	// Ensure the key is deleted.
+	val, ok = c.Get(3)
+	require.False(t, ok)
+	require.Nil(t, val)

Done.


cache_test.go, line 598 at r4 (raw file):

Previously, dominikh (Dominik Honnef) wrote…

You are supposed to exit with the return value of m.Run, so that the exit status of the test binary reflects whether tests have passed or failed.

But the whole TestMain function can probably be replaced with a func init() { bucketDurationSecs = 1 } in cache_test.go

Done. Thanks for the feedback.


ttl.go, line 84 at r3 (raw file):

Previously, martinmr (Martin Martinez Rivera) wrote…

Still need to add a test. will do that in monday.

Done. Added a test.

@martinmr martinmr merged commit 51e97ad into master Jan 28, 2020
@martinmr martinmr deleted the martinmr/ttl branch January 28, 2020 21:54
jarifibrahim pushed a commit to dgraph-io/badger that referenced this pull request Jun 30, 2020
The update brings following changes from ristretto into badger

```
f66de99 Improve handling of updated items (dgraph-io/ristretto#168)
aec7994 Fix flaky TestCacheDel test (dgraph-io/ristretto#169)
a093fe6 Add benchmark plots to the project repo (dgraph-io/ristretto#166)
49dc42c Add Anurag as codeowner (dgraph-io/ristretto#158)
7a3f2d3 z: use MemHashString and xxhash.Sum64String (dgraph-io/ristretto#153)
9c31bb2 Check conflict key before updating expiration map. (dgraph-io/ristretto#154)
62cb731 Fix race condition in Cache.Clear (dgraph-io/ristretto#133)
9af1934 Docs and whitespace changes for readability. (dgraph-io/ristretto#143)
dbc185e Add changelog. (dgraph-io/ristretto#142)
ff325ad Remove key from policy after TTL eviction (dgraph-io/ristretto#130)
2dd5ff5 Use the require library in all the tests. (dgraph-io/ristretto#128)
7c48141 Use require in all tests in cache_test.go (dgraph-io/ristretto#127)
51e97ad Sets with TTL (dgraph-io/ristretto#122)
d3e7c37 Add benchmarks for math.rand and fastrand (dgraph-io/ristretto#118)
593823e Integrate fixes from PR dgraph-io/ristretto#91. (dgraph-io/ristretto#126)
29b4dd7 Fix comments. (dgraph-io/ristretto#123)
ddf345c Removed workflows directory. (dgraph-io/ristretto#124)
eb104d0 Add martinmr to CODEOWNERS file. (dgraph-io/ristretto#125)
```
jarifibrahim pushed a commit to dgraph-io/badger that referenced this pull request Jul 6, 2020
The update brings following changes from ristretto into badger

```
f66de99 Improve handling of updated items (dgraph-io/ristretto#168)
aec7994 Fix flaky TestCacheDel test (dgraph-io/ristretto#169)
a093fe6 Add benchmark plots to the project repo (dgraph-io/ristretto#166)
49dc42c Add Anurag as codeowner (dgraph-io/ristretto#158)
7a3f2d3 z: use MemHashString and xxhash.Sum64String (dgraph-io/ristretto#153)
9c31bb2 Check conflict key before updating expiration map. (dgraph-io/ristretto#154)
62cb731 Fix race condition in Cache.Clear (dgraph-io/ristretto#133)
9af1934 Docs and whitespace changes for readability. (dgraph-io/ristretto#143)
dbc185e Add changelog. (dgraph-io/ristretto#142)
ff325ad Remove key from policy after TTL eviction (dgraph-io/ristretto#130)
2dd5ff5 Use the require library in all the tests. (dgraph-io/ristretto#128)
7c48141 Use require in all tests in cache_test.go (dgraph-io/ristretto#127)
51e97ad Sets with TTL (dgraph-io/ristretto#122)
d3e7c37 Add benchmarks for math.rand and fastrand (dgraph-io/ristretto#118)
593823e Integrate fixes from PR dgraph-io/ristretto#91. (dgraph-io/ristretto#126)
29b4dd7 Fix comments. (dgraph-io/ristretto#123)
ddf345c Removed workflows directory. (dgraph-io/ristretto#124)
eb104d0 Add martinmr to CODEOWNERS file. (dgraph-io/ristretto#125)
```

(cherry picked from commit 09dfa66)
jarifibrahim pushed a commit to dgraph-io/badger that referenced this pull request Oct 2, 2020
The update brings following changes from ristretto into badger

```
f66de99 Improve handling of updated items (dgraph-io/ristretto#168)
aec7994 Fix flaky TestCacheDel test (dgraph-io/ristretto#169)
a093fe6 Add benchmark plots to the project repo (dgraph-io/ristretto#166)
49dc42c Add Anurag as codeowner (dgraph-io/ristretto#158)
7a3f2d3 z: use MemHashString and xxhash.Sum64String (dgraph-io/ristretto#153)
9c31bb2 Check conflict key before updating expiration map. (dgraph-io/ristretto#154)
62cb731 Fix race condition in Cache.Clear (dgraph-io/ristretto#133)
9af1934 Docs and whitespace changes for readability. (dgraph-io/ristretto#143)
dbc185e Add changelog. (dgraph-io/ristretto#142)
ff325ad Remove key from policy after TTL eviction (dgraph-io/ristretto#130)
2dd5ff5 Use the require library in all the tests. (dgraph-io/ristretto#128)
7c48141 Use require in all tests in cache_test.go (dgraph-io/ristretto#127)
51e97ad Sets with TTL (dgraph-io/ristretto#122)
d3e7c37 Add benchmarks for math.rand and fastrand (dgraph-io/ristretto#118)
593823e Integrate fixes from PR dgraph-io/ristretto#91. (dgraph-io/ristretto#126)
29b4dd7 Fix comments. (dgraph-io/ristretto#123)
ddf345c Removed workflows directory. (dgraph-io/ristretto#124)
eb104d0 Add martinmr to CODEOWNERS file. (dgraph-io/ristretto#125)
```
manishrjain pushed a commit to outcaste-io/outserv that referenced this pull request Jul 6, 2022
The update brings following changes from ristretto into badger

```
f66de99 Improve handling of updated items (dgraph-io/ristretto#168)
aec7994 Fix flaky TestCacheDel test (dgraph-io/ristretto#169)
a093fe6 Add benchmark plots to the project repo (dgraph-io/ristretto#166)
49dc42c Add Anurag as codeowner (dgraph-io/ristretto#158)
7a3f2d3 z: use MemHashString and xxhash.Sum64String (dgraph-io/ristretto#153)
9c31bb2 Check conflict key before updating expiration map. (dgraph-io/ristretto#154)
62cb731 Fix race condition in Cache.Clear (dgraph-io/ristretto#133)
9af1934 Docs and whitespace changes for readability. (dgraph-io/ristretto#143)
dbc185e Add changelog. (dgraph-io/ristretto#142)
ff325ad Remove key from policy after TTL eviction (dgraph-io/ristretto#130)
2dd5ff5 Use the require library in all the tests. (dgraph-io/ristretto#128)
7c48141 Use require in all tests in cache_test.go (dgraph-io/ristretto#127)
51e97ad Sets with TTL (dgraph-io/ristretto#122)
d3e7c37 Add benchmarks for math.rand and fastrand (dgraph-io/ristretto#118)
593823e Integrate fixes from PR dgraph-io/ristretto#91. (dgraph-io/ristretto#126)
29b4dd7 Fix comments. (dgraph-io/ristretto#123)
ddf345c Removed workflows directory. (dgraph-io/ristretto#124)
eb104d0 Add martinmr to CODEOWNERS file. (dgraph-io/ristretto#125)
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

6 participants