Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

go to state closed and wait for ack before removing vif #105

Merged
merged 2 commits into from
Nov 10, 2022

Conversation

palainp
Copy link
Member

@palainp palainp commented Oct 29, 2022

This PR is needed to change the state to closing and wait for the frontend to be ready before removing the vif. This should fix mirage/qubes-mirage-firewall#157.

Copy link
Member

@reynir reynir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I unfortunately don't know this part of xen :/

lib/xenstore.ml Outdated Show resolved Hide resolved
@palainp palainp changed the title go to state closing and wait for ack before removing vif go to state closed and wait for ack before removing vif Nov 7, 2022
@hannesm
Copy link
Member

hannesm commented Nov 9, 2022

What I can confirm just now is that in Qubes dom0, when I busy-loop reading the backend/vif/5/0/state of the personal VM while in a separate shell executing qvm-prefs personal netvm sys-net, the state changes from 4 (connected) to 6 (closed) to be not present.(NB: sys-firewall is domain ID 5, sys-net is 3)

I still want to see what the current mirage-firewall is doing in that scenario, and whether this fix fixes it.

Any pointers to the state machine transitions of xen devices would be appreciated -- I've read chapter 6 in Chisnall's "The Definitive Guide to the Xen Hypervisor" (thanks to the pointer from @reynir), but still fail to understand what the desired interaction of the state should be.

@talex5
Copy link
Contributor

talex5 commented Nov 9, 2022

Any pointers to the state machine transitions of xen devices would be appreciated

See: mirage/mirage-qubes#25 (comment) (and https://github.com/xen-project/xen/blob/e61a78981364925a43c9cc24dc77b62ff7b93c9f/xen/include/public/io/blkif.h#L482)

@hannesm
Copy link
Member

hannesm commented Nov 9, 2022

Thanks @talex5, unfortunately that refered diagram is only for startup, and stops at Connected.

@palainp
Copy link
Member Author

palainp commented Nov 9, 2022

I unfortunately have no documentation, it's just from the following commands:

$ xl -vvv network-detach perso 0
libxl: debug: libxl_nic.c:509:libxl_device_nic_remove: Domain 27:ao 0x765c30: create: how=(nil) callback=(nil) poller=0x765cd0
libxl: debug: libxl_event.c:813:libxl__ev_xswatch_register: watch w=0x7653c0 wpath=/local/domain/26/backend/vif/27/0/state token=3/0: register slotnum=3
libxl: debug: libxl_nic.c:509:libxl_device_nic_remove: Domain 27:ao 0x765c30: inprogress: poller=0x765cd0, flags=i
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0x7653c0 wpath=/local/domain/26/backend/vif/27/0/state token=3/0: event epath=/local/domain/26/backend/vif/27/0/state
libxl: debug: libxl_event.c:1055:devstate_callback: backend /local/domain/26/backend/vif/27/0/state wanted state 6 still waiting state 5
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0x7653c0 wpath=/local/domain/26/backend/vif/27/0/state token=3/0: event epath=/local/domain/26/backend/vif/27/0/state
libxl: debug: libxl_event.c:1044:devstate_callback: backend /local/domain/26/backend/vif/27/0/state wanted state 6 but it was removed
libxl: debug: libxl_event.c:850:libxl__ev_xswatch_deregister: watch w=0x7653c0 wpath=/local/domain/26/backend/vif/27/0/state token=3/0: deregister slotnum=3
libxl: debug: libxl_device.c:1133:device_backend_callback: Domain 27:calling device_backend_cleanup
libxl: debug: libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0x7653c0: deregister unregistered
libxl: error: libxl_device.c:1146:device_backend_callback: Domain 27:unable to remove device with path /local/domain/26/backend/vif/27/0
libxl: debug: libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0x7654c0: deregister unregistered
libxl: error: libxl_device.c:1445:device_addrm_aocomplete: Domain 27:Unable to remove vif with id 0
libxl: debug: libxl_event.c:2066:libxl__ao_complete: ao 0x765c30: complete, rc=-6
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0x765c30: destroy
libxl_device_nic_del failed.
xencall:buffer: debug: total allocations:27 total releases:27
xencall:buffer: debug: current allocations:0 maximum allocations:2
xencall:buffer: debug: cache current size:2
xencall:buffer: debug: cache hits:13 misses:2 toobig:12
xencall:buffer: debug: total allocations:0 total releases:0
xencall:buffer: debug: current allocations:0 maximum allocations:0
xencall:buffer: debug: cache current size:0
xencall:buffer: debug: cache hits:0 misses:0 toobig:0

The main motivation for setting to state Closed is the line backend /local/domain/26/backend/vif/27/0/state wanted state 6 still waiting state 5 as it later complains that the directory is removed backend /local/domain/26/backend/vif/27/0/state wanted state 6 but it was removed.

But when setting to state closed before removing I have:

$ xl -vvv network-detach perso 0
libxl: debug: libxl_nic.c:509:libxl_device_nic_remove: Domain 29:ao 0xec4c30: create: how=(nil) callback=(nil) poller=0xec4cd0
libxl: debug: libxl_event.c:813:libxl__ev_xswatch_register: watch w=0xec43c0 wpath=/local/domain/28/backend/vif/29/0/state token=3/0: register slotnum=3
libxl: debug: libxl_nic.c:509:libxl_device_nic_remove: Domain 29:ao 0xec4c30: inprogress: poller=0xec4cd0, flags=i
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xec43c0 wpath=/local/domain/28/backend/vif/29/0/state token=3/0: event epath=/local/domain/28/backend/vif/29/0/state
libxl: debug: libxl_event.c:1055:devstate_callback: backend /local/domain/28/backend/vif/29/0/state wanted state 6 still waiting state 5
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xec43c0 wpath=/local/domain/28/backend/vif/29/0/state token=3/0: event epath=/local/domain/28/backend/vif/29/0/state
libxl: debug: libxl_event.c:1052:devstate_callback: backend /local/domain/28/backend/vif/29/0/state wanted state 6 ok
libxl: debug: libxl_event.c:850:libxl__ev_xswatch_deregister: watch w=0xec43c0 wpath=/local/domain/28/backend/vif/29/0/state token=3/0: deregister slotnum=3
libxl: debug: libxl_device.c:1133:device_backend_callback: Domain 29:calling device_backend_cleanup
libxl: debug: libxl_event.c:864:libxl__ev_xswatch_deregister: watch w=0xec43c0: deregister unregistered
libxl: debug: libxl_device.c:1187:device_hotplug: Domain 29:Backend domid 28, domid 0, assuming driver domains
libxl: debug: libxl_event.c:813:libxl__ev_xswatch_register: watch w=0xec44c0 wpath=/local/domain/28/backend/vif/29/0 token=3/1: register slotnum=3
libxl: debug: libxl_event.c:750:watchfd_callback: watch w=0xec44c0 wpath=/local/domain/28/backend/vif/29/0 token=3/1: event epath=/local/domain/28/backend/vif/29/0
libxl: debug: libxl_event.c:850:libxl__ev_xswatch_deregister: watch w=0xec44c0 wpath=/local/domain/28/backend/vif/29/0 token=3/1: deregister slotnum=3
libxl: debug: libxl_event.c:2066:libxl__ao_complete: ao 0xec4c30: complete, rc=0
libxl: debug: libxl_event.c:2035:libxl__ao__destroy: ao 0xec4c30: destroy
xencall:buffer: debug: total allocations:27 total releases:27
xencall:buffer: debug: current allocations:0 maximum allocations:2
xencall:buffer: debug: cache current size:2
xencall:buffer: debug: cache hits:13 misses:2 toobig:12
xencall:buffer: debug: total allocations:0 total releases:0
xencall:buffer: debug: current allocations:0 maximum allocations:0
xencall:buffer: debug: cache current size:0
xencall:buffer: debug: cache hits:0 misses:0 toobig:0

This now introduce the line backend /local/domain/28/backend/vif/29/0/state wanted state 6 ok.

The wait_for_frontend_closing call may not be needed however.

@hannesm
Copy link
Member

hannesm commented Nov 10, 2022

Thanks, I verified that this patch works with qubes-mirage-firewall. The xenstore-read is also as expected. Changing forth and back between sys-firewall and mirage-firewall works nicely with this patch applied.

@hannesm hannesm merged commit 52d0109 into mirage:master Nov 10, 2022
hannesm added a commit to hannesm/opam-repository that referenced this pull request Nov 10, 2022
CHANGES:

* netback: go to closed state before removing vif from xenstore (@palainp,
  mirage/mirage-net-xen#105, addresses mirage/qubes-mirage-firewall#157)
@palainp palainp deleted the wait_disco_before_rm branch November 10, 2022 15:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Detach netvm fails for mirage v0.8.2
4 participants