Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Interactive shell session hangs on exit if a background command is running #12210

Closed
6 tasks
dmitry-lyfar opened this issue Sep 4, 2023 · 5 comments · Fixed by #12381
Closed
6 tasks

Interactive shell session hangs on exit if a background command is running #12210

dmitry-lyfar opened this issue Sep 4, 2023 · 5 comments · Fixed by #12381
Assignees
Labels
Bug Confirmed to be a bug
Milestone

Comments

@dmitry-lyfar
Copy link

Required information

  • Distribution: ubuntu
  • Distribution version: 22.04
  • The output of "lxc info":
config:
 core.https_address: '[::]'
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- macaroon_authentication
- network_sriov
- console
- restrict_devlxd
- migration_pre_copy
- infiniband
- maas_network
- devlxd_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- devlxd_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- candid_authentication
- backup_compression
- candid_config
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- candid_config_key
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- rbac
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- storage_api_remote_volume_snapshot_copy
- zfs_delegate
- operations_get_query_all_projects
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: dmitry
auth_user_method: unix
environment:
 addresses:
 - 192.168.1.137:8443
 - '[fde1:b138:eb6:4a6b:8e63:750f:55b2:dc10]:8443'
 - '[fde1:b138:eb6:4a6b:5a0:a38:a74a:2f19]:8443'
 - 10.153.192.1:8443
 - 10.0.3.1:8443
 - 172.17.0.1:8443
 - 10.80.59.1:8443
 - '[fd42:d4fc:b3b6:b133::1]:8443'
 - 10.53.33.1:8443
 - '[fd42:34ad:d512:7a35::1]:8443'
 - 10.240.117.1:8443
 - 172.20.10.3:8443
 architectures:
 - x86_64
 - i686
 certificate: |
   -----BEGIN CERTIFICATE-----
   MIICFjCCAZygAwIBAgIRANNdYgBthcUKHAZXJuK0zAUwCgYIKoZIzj0EAwMwOjEc
   MBoGA1UEChMTbGludXhjb250YWluZXJzLm9yZzEaMBgGA1UEAwwRcm9vdEBkbWl0
   cnktbHlmYXIwHhcNMjIwNzExMDg0MjQxWhcNMzIwNzA4MDg0MjQxWjA6MRwwGgYD
   VQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRowGAYDVQQDDBFyb290QGRtaXRyeS1s
   eWZhcjB2MBAGByqGSM49AgEGBSuBBAAiA2IABF8WGnKU42/rTmrjmCzGQkgXdPPE
   9ZJpousdeRzcTD8Hj3xbJRZASZWRCYLMJEcHOOE2jQhh1FlikF7BhzzmiTI4trZ/
   3vu41nGs4rIQ9D3Mcyx9JRiXd4UFlF8yD0+VCaNmMGQwDgYDVR0PAQH/BAQDAgWg
   MBMGA1UdJQQMMAoGCCsGAQUFBwMBMAwGA1UdEwEB/wQCMAAwLwYDVR0RBCgwJoIM
   ZG1pdHJ5LWx5ZmFyhwR/AAABhxAAAAAAAAAAAAAAAAAAAAABMAoGCCqGSM49BAMD
   A2gAMGUCMF+X2Fh7GZvIjKbqlvUf/ZZ/Bq6OxSgiwWWptcznwcX60wLN38DSjywZ
   uKDfWr4BAgIxAL9Sv+2vY4ol37rnAQw0DxkEeV2B1gCJCmz1BR1mCmg2P9xCjrgq
   V31VvdRAgLUdrQ==
   -----END CERTIFICATE-----
 certificate_fingerprint: 7c716873391f2d1cd2354ab45182632c2d38ecb0463e6925bab5634319166f3f
 driver: lxc | qemu
 driver_version: 5.0.0 | 8.0.4
 firewall: nftables
 kernel: Linux
 kernel_architecture: x86_64
 kernel_features:
   idmapped_mounts: "true"
   netnsid_getifaddrs: "true"
   seccomp_listener: "true"
   seccomp_listener_continue: "true"
   shiftfs: "false"
   uevent_injection: "true"
   unpriv_fscaps: "true"
 kernel_version: 5.19.0-50-generic
 lxc_features:
   cgroup2: "true"
   core_scheduling: "true"
   devpts_fd: "true"
   idmapped_mounts_v2: "true"
   mount_injection_file: "true"
   network_gateway_device_route: "true"
   network_ipvlan: "true"
   network_l2proxy: "true"
   network_phys_macvlan_mtu: "true"
   network_veth_router: "true"
   pidfd: "true"
   seccomp_allow_deny_syntax: "true"
   seccomp_notify: "true"
   seccomp_proxy_send_notify_fd: "true"
 os_name: Ubuntu
 os_version: "22.04"
 project: workspace.dmitry
 server: lxd
 server_clustered: false
 server_event_mode: full-mesh
 server_name: dmitry-lyfar
 server_pid: 5536
 server_version: "5.17"
 storage: dir | zfs | btrfs
 storage_version: 1 | 2.1.5-1ubuntu6 | 5.16.2
 storage_supported_drivers:
 - name: dir
   version: "1"
   remote: false
 - name: lvm
   version: 2.03.11(2) (2021-01-08) / 1.02.175 (2021-01-08) / 4.47.0
   remote: false
 - name: zfs
   version: 2.1.5-1ubuntu6
   remote: false
 - name: btrfs
   version: 5.16.2
   remote: false
 - name: ceph
   version: 17.2.6
   remote: true
 - name: cephfs
   version: 17.2.6
   remote: true
 - name: cephobject
   version: 17.2.6
   remote: true

Issue description

A shell session does not handle Ctrl+D properly and hangs if there is a background command running.

Steps to reproduce

  1. lxc launch ubuntu:20.04 u1
  2. lxc shell u1
  3. ping google.com 2>&1 > /dev/null &
  4. exit OR Ctrl-D will hang and not return to the host's shell session. If a separate shell session is launched and the process running in p.3 is killed, the original session terminates immediately

Information to attach

  • Any relevant kernel output (dmesg)
  • Container log (lxc info NAME --show-log)
  • Container configuration (lxc config show NAME --expanded)
  • Main daemon log (at /var/log/lxd/lxd.log or /var/snap/lxd/common/lxd/logs/lxd.log)
  • Output of the client with --debug
  • Output of the daemon with --debug (alternatively output of lxc monitor while reproducing the issue)
@dmitry-lyfar dmitry-lyfar changed the title Interactive shell session hangs if a background command is running Interactive shell session hangs on exit if a background command is running Sep 4, 2023
@tomponline tomponline added the Bug Confirmed to be a bug label Sep 6, 2023
@tomponline
Copy link
Member

tomponline commented Sep 6, 2023

Reproduced and LXD debug log shows this after exit command sent:

DEBUG  [2023-09-06T09:14:28+01:00] Instance process stopped                      PID=54554 err="<nil>" exitStatus=0 instance=c1 interactive=true project=default
DEBUG  [2023-09-06T09:14:28+01:00] Exec control handler finished                 PID=54554 instance=c1 interactive=true project=default

And then when the ping command is manually killed we see:

DEBUG  [2023-09-06T09:15:37+01:00] Websocket: Sending barrier message            address=@
DEBUG  [2023-09-06T09:15:37+01:00] Websocket: Stopped read mirror                address=@
DEBUG  [2023-09-06T09:15:37+01:00] Websocket: Stopped write mirror               address=@
DEBUG  [2023-09-06T09:15:37+01:00] Exec mirror websocket finished                PID=54554 instance=c1 interactive=true number=0 project=default
DEBUG  [2023-09-06T09:15:37+01:00] Updated metadata for operation                class=websocket description="Executing command" operation=1c628329-00c1-4f1a-b89f-a8198c6dd41c project=default
DEBUG  [2023-09-06T09:15:37+01:00] Success for operation                         class=websocket description="Executing command" operation=1c628329-00c1-4f1a-b89f-a8198c6dd41c project=default
DEBUG  [2023-09-06T09:15:37+01:00] Event listener server handler stopped         listener=cdc5433d-aaaa-4c6d-9890-1cacc3f17787 local=/var/lib/lxd/unix.socket remote=@

@tomponline
Copy link
Member

@MusicDin @mihalicyn I've been doing a fair bit of research and experimentation around this.

The key problem is that when the shell exits with a background process that is still holding the pty open, but not producing output the any reads or polls on that pty will block.

Currently we are using io.Copy to read from the pty and write to the websocket, and that is what is blocking.

I have tried reintroducing the approach Christian used with poll that was removed in 54e3da8 and it still blocked, but this time on poll via GetPollRevents.

I wanted to see if the approach Christian originally used with the poll syscall had actually ever worked and whether it was just me who had inadvertently broken his implementation later on.

I found two things out:

  1. If you use 4.0/stable channel then the interactive exec session correctly exits even if there are background tasks still running and not generating output. So it seems at least it did used to work (but see later on for troubling news on that).
  2. If you use 4.0/edge (still very old) it doesn't work anymore and blocks as it does today.

I'm pretty sure that the breakage between 1. and 2. is caused by 06ada13 because rather than closing ch channel in first go routine when GetPollRevents returns the relevant events, it instead changes it to cancelling the context. This leaves the only place where the ch channel is closed being in the 2nd go routine, but this is the one that blocks on GetPollRevents.

Whoops, my bad.

However I then looked at the earlier implementation in more detail to figure out how it worked and it dawned on me that really the "working" version in 4.0/stable wasn't really working properly either. It wasn't unblocking the blocked GetPollRevents as even after I exited the interactive session I can still see a Go routine being leaked blocked on poll:

E.g. this is from 4.0/stable after I've exited an interactive session successfully with background tasks that don't produce output:

goroutine 3392 [syscall]:
syscall.Syscall6(0x10f, 0xc000aa86e0, 0x1, 0x0, 0x0, 0x0, 0x0)
	/snap/go/9991/src/syscall/asm_linux_amd64.s:43 +0x5
golang.org/x/sys/unix.ppoll(0xc000938000?, 0xc000aa86c0?, 0x41ace5?, 0xc000aa86a8?)
	/build/lxd/parts/lxd/src/.go/pkg/mod/golang.org/x/sys@v0.0.0-20220114195835-da31bd327af9/unix/zsyscall_linux.go:124 +0x58
golang.org/x/sys/unix.Ppoll({0xc000aa86e0?, 0x1010100004f3aa0?, 0x7f48846a77b0?}, 0x0?, 0x7f48ce701a68?)
	/build/lxd/parts/lxd/src/.go/pkg/mod/golang.org/x/sys@v0.0.0-20220114195835-da31bd327af9/unix/syscall_linux.go:147 +0x45
golang.org/x/sys/unix.Poll({0xc000aa86e0?, 0xc000aa86e8?, 0x45ad92?}, 0x20000?)
	/build/lxd/parts/lxd/src/.go/pkg/mod/golang.org/x/sys@v0.0.0-20220114195835-da31bd327af9/unix/syscall_linux.go:156 +0x99
github.com/lxc/lxd/shared.GetPollRevents(0xc0029a7ae0?, 0xc000918000?, 0x203b?)
	/build/lxd/parts/lxd/src/shared/util_linux.go:638 +0x52
github.com/lxc/lxd/shared.ExecReaderToChannel.func3()
	/build/lxd/parts/lxd/src/shared/util_linux.go:517 +0x12c
created by github.com/lxc/lxd/shared.ExecReaderToChannel
	/build/lxd/parts/lxd/src/shared/util_linux.go:506 +0x21a

So even when it did work (as in return from lxc exec command) it was leaking go routines.

What I have found that works is to specify a timeout on the poll call (I've used 1s) and so that it always returns eventually.
This then seems to simplify the function a lot as we just need to call poll with a 1s timeout looking for (unix.POLLIN | unix.POLLPRI) (indicating a read is ready) and then perform the read.

Then we can check if the parent process (the shell) has exited the next time a read is requested and if it has, then don't even call poll.

My POC is here:

#12381

This doesn't leak go routines and returns within 1s of the parent process exiting if there are silent background tasks running.

@mihalicyn
Copy link
Member

Great investigation, Tom!

@tomponline
Copy link
Member

@mihalicyn do you see any problem with keeping poll()ing every 1s?

@mihalicyn
Copy link
Member

@mihalicyn do you see any problem with keeping poll()ing every 1s?

No, I don't. I'll take a look closer on your PR.

tomponline added a commit to tomponline/lxd that referenced this issue Oct 16, 2023
Fixes canonical#12210

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
tomponline added a commit to tomponline/lxd that referenced this issue Oct 16, 2023
…ntainer exec

Fixes canonical#12210

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
tomponline added a commit to tomponline/lxd that referenced this issue Nov 30, 2023
Fixes canonical#12210

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
tomponline added a commit to tomponline/lxd that referenced this issue Nov 30, 2023
…ntainer exec

Fixes canonical#12210

Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Confirmed to be a bug
Projects
None yet
4 participants