deadlock between lookup thread / VRT_AddDirector() and VDI_Event(..., VCL_EVENT_COLD) #110

Closed
nigoroll opened this issue Jan 29, 2024 · 6 comments

@delthas reported another issue based on a test case which I could have looked at in more detail earlier (I did not, because I did not want to use example.com, but I really should have):

(I have slightly modified the test case - @delthas, do you agree to add it?)

varnishtest "Reloading after creating backend in init"

feature cmd "getent hosts example.com"

# contributed by delthas, added with minor modifications

shell {
	cat >${tmpdir}/f1 <<-EOF
	vcl 4.1;
	import ${vmod_dynamic};
	import directors;

	backend none none;

	sub vcl_init {
		new dir = directors.fallback();
		new res = dynamic.resolver();
		new dyn = dynamic.director(domain_usage_timeout = 0.1s, resolver = res.use());
		dir.add_backend(dyn.backend(host={"example.com"}, port={"80"}) );
	}
	EOF
}

varnish v1 -cliok "vcl.load vcl1 ${tmpdir}/f1"
varnish v1 -cliok "vcl.use vcl1"
varnish v1 -cliok "start"

varnish v1 -cliok "vcl.load vcl2 ${tmpdir}/f1"
varnish v1 -cliok "vcl.use vcl2"
varnish v1 -cliok "vcl.discard vcl1"

The issue here is that VDI_Event(..., VCL_EVENT_COLD) waits for the lookup threads to finish while holding the vcl_mtx, which is exactly what prevents the lookup threads from ... finishing:

CLI thread, holding vcl_mtx and waiting for the lookup thread:

#5  0x00007fa9c72b0e76 in dom_event (dir=<optimized out>, ev=<optimized out>) at vmod_dynamic.c:1041
#6  0x000055d6f1c1e84d in VDI_Event (d=0x7fa9c6e1e7d0, ev=VCL_EVENT_COLD) at cache/cache_director.c:277
#7  0x000055d6f1c61bfc in vcl_BackendEvent (vcl=0x7fa9c6e451e0, e=VCL_EVENT_COLD) at cache/cache_vcl.c:449
#8  0x000055d6f1c61068 in vcl_set_state (vcl=0x7fa9c6e451e0, state=0x7fa9c6e28390 "0cold", msg=0x7ffc69699e08) at cache/cache_vcl.c:597
#9  0x000055d6f1c62905 in vcl_cli_state (cli=0x7fa9c6ed29d0, av=0x7fa9c6e2c580, priv=0x0) at cache/cache_vcl.c:861
#10 0x000055d6f1d11f22 in cls_dispatch (cli=0x7fa9c6ed29d0, cs=0x7fa9c6e22b30, av=0x7fa9c6e2c580, ac=3) at vcli_serve.c:273
#11 0x000055d6f1d11a0d in cls_exec (cfd=0x7fa9c6ed29a0, av=0x7fa9c6e2c580, ac=3) at vcli_serve.c:324
#12 0x000055d6f1d108a6 in cls_feed (cfd=0x7fa9c6ed29a0, p=0x7ffc69699f94 "\n2.1706557731.695251/vgc.so 1auto\no\nloneTable", 
    e=0x7ffc69699f95 "2.1706557731.695251/vgc.so 1auto\no\nloneTable") at vcli_serve.c:412
#13 0x000055d6f1d1036a in VCLS_Poll (cs=0x7fa9c6e22b30, cli=0x7fa9c6ed29d0, timeout=-1) at vcli_serve.c:617
#14 0x000055d6f1c18c12 in CLI_Run () at cache/cache_cli.c:110

lookup thread, blocked on vcl_mtx:

#4  0x000055d6f1c40ca4 in Lck__Lock (lck=0x55d6f1db57f8 <vcl_mtx>, p=0x55d6f1d45aec "VRT_AddDirector", l=212) at cache/cache_lck.c:124
#5  0x000055d6f1c7bfd3 in VRT_AddDirector (ctx=0x7fa9bbd7df90, m=0x55d6f1da7260 <vbe_methods_noprobe>, priv=0x7fa9ba821780, 
    fmt=0x55d6f1d42f7f "%s") at cache/cache_vrt_vcl.c:212
#6  0x000055d6f1c09c4d in VRT_new_backend_clustered (ctx=0x7fa9bbd7df90, vc=0x0, vrt=0x7fa9bbd7deb0, via=0x0)
    at cache/cache_backend.c:737
#7  0x000055d6f1c0a632 in VRT_new_backend (ctx=0x7fa9bbd7df90, vrt=0x7fa9bbd7deb0, via=0x0) at cache/cache_backend.c:755
#8  0x00007fa9c72b24b4 in ref_add (r=0x7fa9ba827720, ctx=0x7fa9bbd7df90) at vmod_dynamic.c:617
#9  dom_update (now=1706557732.7551613, priv=<optimized out>, res=0x7fa9c72c2ba0 <res_getdns>, dom=0x7fa9c6e44100)
    at vmod_dynamic.c:728
#10 dom_lookup_thread (priv=0x7fa9c6e44100) at vmod_dynamic.c:819
#11 0x00007fa9c77c9044 in start_thread (arg=<optimized out>) at ./nptl/pthread_create.c:442
#12 0x00007fa9c784961c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
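Boiled down, the two stacks form a hold-and-wait cycle. A minimal pthreads sketch of the pattern (the function names map to the frames above, but none of this is the actual varnish-cache code):

#include <pthread.h>

static pthread_mutex_t vcl_mtx = PTHREAD_MUTEX_INITIALIZER;

/* stands in for dom_lookup_thread() -> dom_update() -> VRT_new_backend() */
static void *
lookup_thread(void *priv)
{
	(void)priv;
	pthread_mutex_lock(&vcl_mtx);	/* VRT_AddDirector() needs vcl_mtx */
	/* ... add the new director ... */
	pthread_mutex_unlock(&vcl_mtx);
	return (NULL);
}

int
main(void)
{
	pthread_t lookup;

	/* CLI thread: vcl_BackendEvent() takes vcl_mtx ... */
	pthread_mutex_lock(&vcl_mtx);
	pthread_create(&lookup, NULL, lookup_thread, NULL);
	/* ... then dom_event(VCL_EVENT_COLD) waits for the lookup
	 * thread, which is itself blocked on vcl_mtx above: neither
	 * side can proceed */
	pthread_join(lookup, NULL);	/* deadlocks here */
	pthread_mutex_unlock(&vcl_mtx);
	return (0);
}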
nigoroll commented:

As is, this issue is hard. I could easily add a workaround, but that "workaround" basically already exists in Varnish-Cache: VRT_AddDirector() takes a shortcut if the VCL is cooling, except in vtc mode:

https://github.com/varnishcache/varnish-cache/blob/2511f5364489c8345b3879a5eee21179bf71e9b2/bin/varnishd/cache/cache_vrt_vcl.c#L175-L176
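The shortcut has roughly this shape (paraphrased; the authoritative lines are behind the link):

	if (ctx->vcl->temp == VCL_TEMP_COOLING && !DO_DEBUG(DBG_VTC_MODE))
		return (NULL);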

So for all practical purposes, this issue exists primarily in VTC, but we cannot guarantee that it does not happen in production deployments.

Holding the vcl_mtx goes back to varnishcache/varnish-cache#3094 / varnishcache/varnish-cache@465f2f8, and I now wonder whether holding it while sending VCL_EVENT_COLD is actually a good idea...

nigoroll commented Jan 30, 2024

More pondering: When sending the cold event, we need to ensure that the list of directors operated upon is complete, so holding the vcl_mtx makes sense; otherwise we would need a more elaborate data structure and synchronization to allow additions/deletions while iterating.
The temperature can only change in the CLI thread, so it cannot change again while a cold event is posted. Also, once the temperature goes cold, it cannot change back to something else for as long as the lookup thread is running. Maybe this can help us...

edit: the actual temperature change happens outside the vcl_mtx
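For illustration, the shape of the event iteration that the lock protects, assuming a TAILQ-style director list as in cache_vcl.c (a simplified sketch, not the actual implementation):

#include <pthread.h>
#include <sys/queue.h>

struct director;			/* opaque here */
enum vcl_event_e { VCL_EVENT_COLD };	/* toy subset */

void VDI_Event(struct director *, enum vcl_event_e);

struct vcldir {
	struct director		*dir;
	TAILQ_ENTRY(vcldir)	list;
};
TAILQ_HEAD(vcldir_head, vcldir);

static pthread_mutex_t vcl_mtx = PTHREAD_MUTEX_INITIALIZER;

void
backend_event_sketch(struct vcldir_head *head, enum vcl_event_e e)
{
	struct vcldir *vdir;

	pthread_mutex_lock(&vcl_mtx);
	/* with vcl_mtx held, director additions/deletions cannot
	 * mutate the list under the iterator ... */
	TAILQ_FOREACH(vdir, head, list)
		VDI_Event(vdir->dir, e);	/* ... but this call may
						 * wait on a lookup thread
						 * that in turn wants
						 * vcl_mtx */
	pthread_mutex_unlock(&vcl_mtx);
}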

delthas commented Jan 30, 2024

Hi, I agree to add the test case 😄

nigoroll commented Jan 30, 2024

some notes: we cannot join the resolver thread after the COLD transition has completed, because by then (during discard) the resolver context may have become invalid (for the same reason, we cannot detach it). I am pretty much out of ideas, even including changes to varnish-cache: for example, turning vcl_mtx into an rwlock would at first seem like an option (iterators only need to read), but in VRT_AddDirector() we would need to check the temperature under a read lock and, if the VCL is not cooling, upgrade to a write lock. That upgrade can deadlock, and the moment we unlock instead, vcl_BackendEvent() could run again and we might never get the write lock (see the sketch below).
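To spell out why the rwlock upgrade fails, a minimal sketch: pthread_rwlock_t has no atomic upgrade operation, so "upgrading" means unlock-then-wrlock, and everything here apart from the pthread calls is an illustrative stand-in:

#include <pthread.h>

/* toy stand-ins for the varnish-cache types */
enum vcl_temp { VCL_TEMP_WARM, VCL_TEMP_COOLING, VCL_TEMP_COLD };

struct vcl_sketch {
	enum vcl_temp	temp;
};

static pthread_rwlock_t vcl_rw = PTHREAD_RWLOCK_INITIALIZER;

static int
add_director_sketch(struct vcl_sketch *vcl)
{
	pthread_rwlock_rdlock(&vcl_rw);
	if (vcl->temp == VCL_TEMP_COOLING) {
		pthread_rwlock_unlock(&vcl_rw);
		return (0);	/* the existing shortcut */
	}
	/* there is no atomic upgrade: the read lock must be dropped ... */
	pthread_rwlock_unlock(&vcl_rw);
	/* ... and in this window vcl_BackendEvent() can take the read
	 * lock again (and again), so the write lock below may never be
	 * granted, and vcl->temp may have changed behind our back */
	pthread_rwlock_wrlock(&vcl_rw);
	/* would have to re-check vcl->temp before adding the director */
	pthread_rwlock_unlock(&vcl_rw);
	return (1);
}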
I am really tempted to ignore the problem, because in production code (without vtc_mode) the race window is only https://github.com/varnishcache/varnish-cache/blob/2511f5364489c8345b3879a5eee21179bf71e9b2/bin/varnishd/cache/cache_vrt_vcl.c#L178-L211

nigoroll added a commit to varnishcache/varnish-cache that referenced this issue Jan 31, 2024
In VRT_AddDirector(), we create the new vcldir with an initial
reference, which we need to drop if we cannot add it.

Compare:

	VRT_AddDirector()
		...
		vdir->refcnt++;

	vcldir_free()
		...
		AZ(vdir->refcnt);

Noticed when testing other experimental changes while working on
nigoroll/libvmod-dynamic#110

 #5  0x000055820c8cb845 in VAS_Fail (func=0x55820c904559 "vcldir_free", file=0x55820c903a47 "cache/cache_vrt_vcl.c",
     line=150, cond=0x55820c90459a "(vdir->refcnt) == 0", kind=VAS_ASSERT) at vas.c:67
 #6  0x000055820c83a442 in vcldir_free (vdir=0x7f662aa53140) at cache/cache_vrt_vcl.c:150
 #7  0x000055820c839fe1 in VRT_AddDirector (ctx=0x7f662befe250, m=0x55820c965260 <vbe_methods_noprobe>,
     priv=0x7f662aa20780, fmt=0x55820c900f7f "%s") at cache/cache_vrt_vcl.c:219
 #8  0x000055820c7c7c4d in VRT_new_backend_clustered (ctx=0x7f662befe250, vc=0x0, vrt=0x7f662befdd10, via=0x0)
    at cache/cache_backend.c:737
 #9  0x000055820c7c8632 in VRT_new_backend (ctx=0x7f662befe250, vrt=0x7f662befdd10, via=0x0)
     at cache/cache_backend.c:755
nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Jan 31, 2024
If VRT_AddDirector() was called from handling a VCL_COLD event or,
indirectly, from another thread which the VCL_COLD event handler was
waiting for, varnishd would deadlock and prevent any CLI or director
changes, because VRT_AddDirector() requires the vcl_mtx, which is held
during vcl_BackendEvent() to ensure a consistent view of the director
list. Because of the early return from VRT_AddDirector() this likely
only happened in VTC mode, but the underlying race existed
nevertheless.

This patch _almost_ fixes the issue with the intent of making it
highly unlikely to occur without getting too involved with the vcl
temperature controls: We now check the same conditions under which
vcl_set_state() would transition the temperature to COOLING and, if
they apply, use Lck_Trylock() in a try/wait loop instead of
Lck_Lock(), avoiding the deadlock.

The patch presumably still does not fix the problem entirely, because
the reads of vcl->busy and vcl->temp before the Lck_Trylock() could
still be outdated. With the temperature controls otherwise unchanged,
the only alternative idea I could come up with was to always use a
try/wait loop, which I dismissed due to the performance impact
(overhead and added latency).

Ref nigoroll/libvmod-dynamic#110
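A plain-pthreads paraphrase of that try/wait loop; the predicate, the backoff interval and the types are hypothetical stand-ins (the real patch reads vcl->temp and vcl->busy and uses Lck_Trylock()), and the trylock is assumed to return 0 on success as pthread_mutex_trylock(3) does:

#include <pthread.h>
#include <unistd.h>

struct vcl_sketch;		/* opaque toy stand-in */

static pthread_mutex_t vcl_mtx = PTHREAD_MUTEX_INITIALIZER;

/* hypothetical predicate mirroring the checks under which
 * vcl_set_state() would transition the temperature to COOLING */
int vcl_may_go_cold(const struct vcl_sketch *);

static void
lock_for_add_director(const struct vcl_sketch *vcl)
{
	/* as long as a transition to COOLING could start while we
	 * block, never block on the mutex: try, back off, re-check */
	while (vcl_may_go_cold(vcl)) {
		if (pthread_mutex_trylock(&vcl_mtx) == 0)
			return;		/* locked without deadlock risk */
		(void)usleep(1000);
	}
	/* no cold transition pending: a blocking lock is safe */
	pthread_mutex_lock(&vcl_mtx);
}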
nigoroll commented:

At this point, I think this can only be fixed in varnish-cache; see varnishcache/varnish-cache#4048

nigoroll added a commit to nigoroll/varnish-cache that referenced this issue Feb 5, 2024
nigoroll commented Feb 5, 2024

Fixed via varnishcache/varnish-cache#4048
