Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expire: Add EXP_Reduce() for better softpurges #4085

Merged
merged 2 commits into from
Mar 25, 2024

Conversation

AlveElde
Copy link
Contributor

@AlveElde AlveElde commented Mar 19, 2024

This PR resolves an issue where softpurging an object would reset its expiry timer to the start of the objects grace period. Repeated softpurges of the same object would keep it away from expiry indefinetly, and softpurging an object in keep would bring it back into grace.

The effects of the current softpurge implementation are most severe when the cache contains a high number of objects that are regularly hit by key-based invalidation. This can result in a growing number of objects in cache, increasing load on the expiry thread mailbox, and contention on the exp mutex. Grace hits on stale objects long past their indended lifetime can also occur.

This patch series introduces EXP_Reduce() as an alternative to EXP_Rearm() for softpurging applications. purge.soft() has been changed to use the new function, and the VCC has been updated with the following:

The effect of a softpurge is dependent on the arguments given to the purge and
the current lifetime of the object. Let's consider an object in cache created
with *ttl*, *grace*, and *keep* of 60 seconds each:

``purge.soft(ttl = 0s, grace = -1s, keep = -1s)``

* If the object is **fresh**, the *ttl* is reduced to 0 seconds and the object
  expires after 120 seconds.
* If the object is **stale**, the expiry time is not changed.

``purge.soft(ttl = 0s, grace = 10s, keep = 10s)``

* If the object is **fresh**, the *ttl* is reduced to 0 seconds, *grace* and
  *keep* are reduced to 10 seconds. The object expires after 20 seconds.
* If the object has been **stale** for 5 seconds, *grace* is reduced to 5
  seconds and *keep* is reduced to 10 seconds. The object expires after 15
  seconds.
* If the object has been **stale** for 15 seconds, *grace* is reduced to 0
  seconds and *keep* is reduced to 5 seconds. The object expires after 5
  seconds.
* If the object has been **stale** for 20 seconds or more, the object
  immediately expires.

``purge.soft(ttl = 10s, grace = -1s, keep = -1s)``

* If the object has been **fresh** for 5 seconds, the *ttl* is reduced to 10
  seconds. The object expires after 130 seconds.
* If the object has been **fresh** for 55 seconds, the *ttl* is not changed. The
  object expires after 125 seconds.
* If the object is **stale**, the expiry time is not changed.

When the expiry time of an object is reduced due to a softpurge, an
``EXP_Reduce`` entry is logged under the ``ExpKill`` VSL tag. If a softpurge
does not reduce the expiry time of an object, an ``EXP_Unchanged`` entry is
logged instead.

@nigoroll
Copy link
Member

Can't this be done with just a few lines of changes to EXP_Rearm() to change the logging and a wrapper to avoid the ttl reset? I do see the point, but the patch includes refactoring which I do not see the motivation for and code duplication which I think we should avoid.

@dridi
Copy link
Member

dridi commented Mar 21, 2024

I think one goal is precisely to not prevent EXP_Rearm() from extending the life of an object while ensuring that a soft purge doesn't.

@AlveElde
Copy link
Contributor Author

AlveElde commented Mar 21, 2024

Changing the behavior of EXP_Rearm() is going to break several VMODs, so I figured that providing a new function was the way to go.

Regarding code duplication, I tried to strike a middleground between code reuse and readability. But maybe there is a better way to structure it that I'm not seeing?

@nigoroll
Copy link
Member

How would changing the logging break VMODs?
I did not mean to suggest to change EXP_Rearm() other than the logging. I suggested adding EXP_Reduce() as a trivial wrapper of EXP_Rearm().

@nigoroll
Copy link
Member

But maybe there is a better way to structure it that I'm not seeing?

I had something like this in mind:

diff --git a/bin/varnishd/cache/cache_expire.c b/bin/varnishd/cache/cache_expire.c
index 2b27371ba..2d8b34de4 100644
--- a/bin/varnishd/cache/cache_expire.c
+++ b/bin/varnishd/cache/cache_expire.c
@@ -262,6 +262,23 @@ EXP_Rearm(struct objcore *oc, vtim_real now,
            oc->timer_when, when, oc->flags);
 }
 
+
+void
+EXP_Reduce(struct objcore *oc, vtim_real now,
+    vtim_real ttl, vtim_real grace, vtim_real keep)
+{
+
+       CHECK_OBJ_NOTNULL(oc, OBJCORE_MAGIC);
+
+       if (!isnan(ttl)) {
+               ttl = now + ttl - oc->t_origin;
+               if (ttl >= oc->ttl)
+                       ttl = nan("");
+       }
+
+       EXP_Rearm(oc, now, ttl, grace, keep);
+}
+
 /*--------------------------------------------------------------------
  * Handle stuff in the inbox
  */

@AlveElde
Copy link
Contributor Author

Ah, now I get what you mean, that is a clever solution I had not considered. I will reattempt the patch series based on your example.

Copy link
Member

@dridi dridi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, except one detail.

Comment on lines 227 to 230
EXP_Reduce(struct objcore *oc, vtim_real now,
vtim_real ttl, vtim_real grace, vtim_real keep)
{
float next_ttl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

vtim_dur for everything except now?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I will change ttl, grace and keep to vtim_dur. As for next_ttl, looks like this local variable can be eliminated now

This commit introduces EXP_Reduce(), a function to reduce object timers.
The goal is to provide a function better suited to soft-purging objects,
as EXP_Rearm() has some non-obvious disadvantages when used or this
purpose.

When EXP_Rearm() is used to soft-purge an object by setting its TTL to
0, the expiry is effectively reset to the start of the objects grace
period. This happens because the object TTL includes a time delta
between object insertion time (oc->t_origin) and now. The result is that
a soft-purge extends the lifetime of an already stale object, and
repeated soft-purges can keep the object away from expiry indefinitely.

The EXP_Reduce() function is better suited for soft-purging, as it only
updates an objects TTL if it would reduce the objects lifetime. The
function also has facilities for reducing grace and keep.
When a stale object is soft-purged, the time until the object expires
should not be reset, as repeated soft-purges could keep the object
around indefinitely.
Copy link
Member

@dridi dridi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @nigoroll too!

Copy link
Member

@hermunn hermunn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a good and useful change, and this is the right time to merge it. I have one question, but it can be left for "later".

@nigoroll nigoroll merged commit e676f77 into varnishcache:master Mar 25, 2024
14 checks passed
@nigoroll
Copy link
Member

also approved by @bsdphk during bugwash

@lucaspouzac
Copy link

lucaspouzac commented May 16, 2024

Hi, Is it possible to report this fix on 7.5 branch and release a minor version ? We have a similar behavior when expiring objects that could look like this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants