Incorrect arguments passed to instance placement scriptlet on instance move #1299

Closed
victoitor opened this issue Oct 9, 2024 · 3 comments

@victoitor

When incus move is used to move an instance between projects in a cluster, the arguments used to call the placement scriptlet are incorrect.

I have 3 projects with the following cluster group restrictions.

victoitor@bastion:~$ incus project get intel-12700 restricted.cluster.groups
intel-12700
victoitor@bastion:~$ incus project get amd-5700g restricted.cluster.groups
amd-5700g
victoitor@bastion:~$ incus project get auxiliar restricted.cluster.groups
amd-5700g

And the following cluster groups.

victoitor@bastion:~$ incus cluster group show amd-5700g
description: ""
members:
- amd01
- amd02
- amd03
- amd04
config: {}
name: amd-5700g
victoitor@bastion:~$ incus cluster group show intel-12700
description: ""
members:
- intel01
- intel02
- intel03
config: {}
name: intel-12700
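
For reference, here is a small sketch of the candidate set one might expect for each project, derived only from the two listings above. It is plain Python rather than anything Incus runs; `expected_candidates` is a hypothetical helper, and it assumes `restricted.cluster.groups` is a comma-separated list of group names and that every group member is eligible.

```python
# Cluster groups and their members, copied from `incus cluster group show`.
CLUSTER_GROUPS = {
    "amd-5700g": ["amd01", "amd02", "amd03", "amd04"],
    "intel-12700": ["intel01", "intel02", "intel03"],
}

# restricted.cluster.groups per project, copied from `incus project get`.
PROJECT_GROUPS = {
    "intel-12700": "intel-12700",
    "amd-5700g": "amd-5700g",
    "auxiliar": "amd-5700g",
}


def expected_candidates(project):
    """Hypothetical helper: members of every group the project may target."""
    members = []
    # Assumption: the config value is a comma-separated list of group names.
    for group in PROJECT_GROUPS[project].split(","):
        for member in CLUSTER_GROUPS.get(group.strip(), []):
            if member not in members:
                members.append(member)
    return members
```

Under these assumptions, `expected_candidates("auxiliar")` gives the four amd members, which is the full set I expected the scriptlet to receive for that project.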

I have a scriptlet with the following part for logging the input.

def instance_placement(request, candidate_members):
    project = get_project( request.project )
    log_error("SCRIPTLET DEBUG Request: ", request, "\nSCRIPTLET DEBUG Candidade members: ", candidate_members, "\nSCRIPTLET DEBUG Project: ", project)

I created an instance in project auxiliar and used incus move to move it between all possible pairs of projects, which produced the following sequence of commands and logs. The set of candidate members always includes just one node instead of the full cluster group. Furthermore, the reported project is sometimes incorrect: when moving from amd-5700g to auxiliar, the request carries project amd-5700g with reason relocation, which is quite odd.

victoitor@bastion:~$ incus move incus-test --project auxiliar --target-project amd-5700g
ERROR  [2024-10-09T14:34:41-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n  groups: sudo, video, render\n  name: pargo\n  lock_passwd: true\n  sudo: ALL=(ALL) NOPASSWD:ALL\n  shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-5,8-13", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "create", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:c0:aa:18"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "auxiliar", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "amd-5700g"}
SCRIPTLET DEBUG Candidade members: [{"roles": [], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd02", "url": "https://10.11.16.12:8443", "database": False, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-5,8-13", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas amd-5700g", "name": "amd-5700g", "used_by": []} 
victoitor@bastion:~$ incus move incus-test --project amd-5700g --target-project auxiliar
ERROR  [2024-10-09T14:35:53-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n  groups: sudo, video, render\n  name: pargo\n  lock_passwd: true\n  sudo: ALL=(ALL) NOPASSWD:ALL\n  shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-5,8-13", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "create", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.cloud-init.instance-id": "58659f9e-ee14-44dc-9669-4f6857d581d3", "volatile.eth0.hwaddr": "00:16:3e:c0:aa:18", "volatile.idmap.base": "0", "volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]", "volatile.last_state.idmap": "[]", "volatile.uuid": "3777865f-e0ec-4e0f-a4db-88fd54d93623", "volatile.uuid.generation": "3777865f-e0ec-4e0f-a4db-88fd54d93623"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": [], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "", "live": False, "instance_only": False, "refresh": False, "project": "", "allow_inconsistent": False}, 
"instance_type": "", "type": "", "start": False, "reason": "relocation", "project": "amd-5700g"}
SCRIPTLET DEBUG Candidade members: [{"roles": [], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd02", "url": "https://10.11.16.12:8443", "database": False, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-5,8-13", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas amd-5700g", "name": "amd-5700g", "used_by": []} 
victoitor@bastion:~$ incus move incus-test --project auxiliar --target-project intel-12700
ERROR  [2024-10-09T14:37:47-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n  groups: sudo, video, render\n  name: pargo\n  lock_passwd: true\n  sudo: ALL=(ALL) NOPASSWD:ALL\n  shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-15", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:dc:e3:a7"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "auxiliar", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "intel-12700"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-15", "user.experimentos.limits.memory": "24GB"}, "groups": ["intel-12700"], "server_name": "intel01", "url": "https://10.11.16.31:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "intel-12700", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-15", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas intel-12700", "name": "intel-12700", "used_by": []} 
victoitor@bastion:~$ incus move incus-test --project intel-12700 --target-project amd-5700g
ERROR  [2024-10-09T14:39:27-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n  groups: sudo, video, render\n  name: pargo\n  lock_passwd: true\n  sudo: ALL=(ALL) NOPASSWD:ALL\n  shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-5,8-13", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:dc:e3:a7"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "intel-12700", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "amd-5700g"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database-leader", "database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd01", "url": "https://10.11.16.11:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-5,8-13", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas amd-5700g", "name": "amd-5700g", "used_by": []} 
victoitor@bastion:~$ incus move incus-test --project amd-5700g --target-project intel-12700
ERROR  [2024-10-09T14:40:29-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n  groups: sudo, video, render\n  name: pargo\n  lock_passwd: true\n  sudo: ALL=(ALL) NOPASSWD:ALL\n  shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "0-15", "limits.memory": "24GB", "security.nesting": "true", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.eth0.hwaddr": "00:16:3e:dc:e3:a7"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": "amd-5700g", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "intel-12700"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-15", "user.experimentos.limits.memory": "24GB"}, "groups": ["intel-12700"], "server_name": "intel02", "url": "https://10.11.16.32:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.cluster.groups": "intel-12700", "restricted.cluster.target": "allow", "restricted.containers.nesting": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "0-15", "user.node.limits.cpu.unique": "true", "user.node.limits.memory": "24GB", "user.node.represented": "true", "user.node.represented.unique": "true"}, "description": "Experimentos - máquinas intel-12700", "name": "intel-12700", "used_by": []} 
victoitor@bastion:~$ incus move incus-test --project intel-12700 --target-project auxiliar
ERROR  [2024-10-09T14:42:03-03:00] Instance placement scriptlet: SCRIPTLET DEBUG Request: {"architecture": "x86_64", "config": {"cloud-init.vendor-data": "#cloud-config\npackage_update: true\npackage_upgrade: true\npackage_reboot_if_required: true\ntimezone: America/Fortaleza\nusers:\n- gecos: Default pargo user\n  groups: sudo, video, render\n  name: pargo\n  lock_passwd: true\n  sudo: ALL=(ALL) NOPASSWD:ALL\n  shell: /bin/bash\n", "image.architecture": "amd64", "image.description": "Debian bookworm amd64 (20241009_05:24)", "image.os": "Debian", "image.release": "bookworm", "image.serial": "20241009_05:24", "image.type": "squashfs", "image.variant": "default", "limits.cpu": "6-7,14-15", "limits.cpu.allowance": "100%", "limits.memory": "1GiB", "user.responsavel": "Incus Test", "volatile.apply_template": "copy", "volatile.base_image": "bea0f1696dc17d7a8002d8d0dd408ad51fa89212e24db1593831b0bed583a5a3", "volatile.cloud-init.instance-id": "d6bc666a-e91e-48cd-a9aa-bc226762e2be", "volatile.eth0.hwaddr": "00:16:3e:df:b8:14", "volatile.idmap.base": "0", "volatile.idmap.next": "[{\"Isuid\":true,\"Isgid\":false,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000},{\"Isuid\":false,\"Isgid\":true,\"Hostid\":1000000,\"Nsid\":0,\"Maprange\":1000000000}]", "volatile.last_state.idmap": "[]", "volatile.uuid": "539d2bf4-1264-4e8d-a7d6-9ff58d098646", "volatile.uuid.generation": "539d2bf4-1264-4e8d-a7d6-9ff58d098646"}, "devices": {"eth0": {"name": "eth0", "nictype": "bridged", "parent": "br0", "type": "nic"}, "root": {"path": "/", "pool": "local", "type": "disk"}}, "ephemeral": False, "profiles": ["default"], "restore": "", "stateful": False, "description": "", "name": "incus-test", "source": {"type": "copy", "certificate": "", "alias": "", "fingerprint": "", "properties": {}, "server": "", "secret": "", "protocol": "", "base-image": "", "mode": "", "operation": "", "secrets": {}, "source": "incus-test", "live": False, "instance_only": False, "refresh": False, "project": 
"intel-12700", "allow_inconsistent": False}, "instance_type": "", "type": "container", "start": False, "reason": "new", "project": "auxiliar"}
SCRIPTLET DEBUG Candidade members: [{"roles": ["database-leader", "database"], "failure_domain": "default", "description": "", "config": {"user.experimentos.limits.cpu": "0-5,8-13", "user.experimentos.limits.memory": "24GB"}, "groups": ["default", "amd-5700g"], "server_name": "amd01", "url": "https://10.11.16.11:8443", "database": True, "status": "Online", "message": "Fully operational", "architecture": "x86_64"}]
SCRIPTLET DEBUG Project: {"config": {"features.images": "false", "features.profiles": "true", "features.storage.buckets": "true", "features.storage.volumes": "true", "restricted": "true", "restricted.backups": "allow", "restricted.cluster.groups": "amd-5700g", "restricted.cluster.target": "allow", "restricted.containers.lowlevel": "allow", "restricted.containers.nesting": "allow", "restricted.devices.disk": "allow", "restricted.devices.nic": "allow", "restricted.snapshots": "allow", "user.node.limits.cpu": "6-7,14-15", "user.node.represented": "true"}, "description": "Montagem e estacionamento", "name": "auxiliar", "used_by": []} 
@stgraber stgraber added the Bug Confirmed to be a bug label Oct 18, 2024
@stgraber stgraber added this to the incus-6.7 milestone Oct 18, 2024
@stgraber stgraber self-assigned this Oct 18, 2024
@stgraber
Member

Things actually seem consistent here, just not particularly ideal:

ERROR  [2024-11-15T02:27:20Z] [server04] Instance placement scriptlet: [stgraber][relocation] project=restrict-s03, instance=test, candidates=["server01"] 
ERROR  [2024-11-15T02:27:20Z] [server04] Instance placement scriptlet: [stgraber][new] project=restrict-s01, instance=test, candidates=["server01"] 

and then:

ERROR  [2024-11-15T02:28:04Z] [server01] Instance placement scriptlet: [stgraber][relocation] project=restrict-s01, instance=test, candidates=["server04", "server03"] 
ERROR  [2024-11-15T02:28:04Z] [server01] Instance placement scriptlet: [stgraber][new] project=restrict-s03, instance=test, candidates=["server04"] 

I don't know why in your case you're only seeing the new reason and not the relocation one.

Basically, during the move, Incus uses the relocation call to determine where the instance should go. At that point the instance still exists in the source project, which is why the scriptlet receives the source project. The set of candidates passed during relocation is the set of allowed candidates for the target project, and this is the point at which the scriptlet can actually make a decision.

After that decision is made, Incus internally handles the cross-project move, which is effectively a copy+delete. That's why we get the new call into the scriptlet again, this time with the new project as the target and with no flexibility on the target member, as it has already been decided.
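
The two calls described above can be replayed outside Incus with a rough sketch like the one below. It is plain Python, not a scriptlet: `log_error` and `set_target` are local stubs standing in for helpers the scriptlet runtime is assumed to provide, `simulate_move` and `member` are hypothetical, and the project and member names are taken from the second pair of log lines.

```python
from types import SimpleNamespace

LOG = []     # captured log lines (stand-in for the daemon log)
TARGET = []  # members chosen by the scriptlet (stand-in for the placement decision)


def log_error(*messages):
    # Local stub; inside Incus the scriptlet runtime supplies log_error.
    LOG.append("".join(str(m) for m in messages))


def set_target(member_name):
    # Local stub; assumed to mirror the helper used to pick a cluster member.
    TARGET.append(member_name)


def instance_placement(request, candidate_members):
    names = [m.server_name for m in candidate_members]
    log_error("[", request.reason, "] project=", request.project,
              ", instance=", request.name, ", candidates=", names)
    # Only the relocation call offers a real choice between members.
    if request.reason == "relocation" and names:
        set_target(names[0])


def member(name):
    return SimpleNamespace(server_name=name)


def simulate_move():
    # Call 1: relocation, reported against the source project, with the
    # candidates allowed for the target project.
    instance_placement(
        SimpleNamespace(reason="relocation", project="restrict-s01", name="test"),
        [member("server04"), member("server03")],
    )
    # Call 2: the internal copy shows up as a new instance in the target
    # project, with the already chosen member as the only candidate.
    instance_placement(
        SimpleNamespace(reason="new", project="restrict-s03", name="test"),
        [member(TARGET[-1])],
    )
    return LOG
```

Calling `simulate_move()` records one relocation line followed by one new line, matching the order in the logs above. A scriptlet that only wants to decide placement once can branch on `request.reason` in the same way.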

@stgraber
Member

Now ideally we'd be able to:

  • Eliminate the following new call entirely in this scenario, finding a way to detect that this is an internal move and not a new instance being created
  • Alter the call for relocation to indicate the target project rather than source

I'll take a look into this now. The project name part should be pretty trivial, eliminating the new event will likely be a bit trickier.

stgraber added a commit to stgraber/incus that referenced this issue Nov 15, 2024
Closes lxc#1299

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
stgraber added a commit to stgraber/incus that referenced this issue Nov 15, 2024
Closes lxc#1299

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
stgraber added a commit to stgraber/incus that referenced this issue Nov 15, 2024
Closes lxc#1299

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
stgraber added a commit to stgraber/incus that referenced this issue Nov 15, 2024
Closes lxc#1299

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>
@hallyn hallyn closed this as completed in 51a8dc4 Nov 15, 2024
@victoitor
Author

Thank you very much. When I have the time, I'll update, test everything out, and report if I find anything odd.

stgraber added a commit that referenced this issue Dec 4, 2024
Closes #1299

Signed-off-by: Stéphane Graber <stgraber@stgraber.org>