RFC: revisit how disks are added to domains #46
Comments
I am not sure what you mean here. In our project we taint the disk whenever we change the domain or the disk, and that actually triggers the rebuild of both the disk and the domain, which makes total sense given Terraform concepts. Eg.:
We use Salt for provisioning so it is safe to execute provisioners multiple times. So if we change the domain (eg. change the amount of RAM) the provisioners will run but nothing bad happens; if we change the disk the whole domain will be rebuilt from scratch, which is the expected behaviour.
That's already possible, see below.
In our project, as you know, we have the problem that, IOW, users can taint the domain or the
Sharing of disks is actually a useful feature in some (albeit uncommon) clustered filesystem setups. I blogged about the manual steps to implement a test Oracle RAC cluster, and that's something that will at some point come to our project. From my blog:
About this last point:
We solve this with Terraform modules. Bottom line: I am personally happy with the current implementation; it does all I need. I am still not sure whether and why you have specific corner cases which are not presently covered. I am not against this proposal, although other items would be more important FMPOV, e.g. bridged networking!
This is a command you issue by hand. I was talking about extending the

When using terraform against a cloud you don't have to do this extra step, because the volume is a "built-in" attribute of the instance (like CPU/RAM). Hence a rebuild of the instance automatically invalidates the disk.
This is not the expected behaviour for someone managing cloud instances with terraform (see the explanation above). As for rerunning the provisioners, I agree that Salt is the way to go (I use it too, as you know), but that must not be a requirement for everyone. That also leads to the big issue I mentioned in the issue description: if you change a linked volume, the domain is not rebuilt and hence the provisioners are not run. Finally, there are some domain attributes (like RAM) that, once changed, require the node to be powered off and restarted to take effect. This is how libvirt works; there's no way to circumvent it.
Sorry, I missed the
The question was about the new semantics we would have to introduce to have ephemeral and persistent disks. Right now my POC implements ephemeral disks by duplicating part of the volume attributes inside of the disk resource. On the other hand the persistent disks can be implemented by creating a disk that points to a volume (which is the current implementation). However I'm not particularly fond of it. Hence the request for feedback and ideas ;)
I'm really happy about this project too! I would have liked to work on bridged networking, but @inercia was already doing it :) The point is: terraform was designed with cloud instances in mind, which are slightly different from VMs. We have to try to reach a good compromise between the two worlds, and I think @dmacvicar did a really good job at recreating the same "feeling"/user experience. I'm just exposing some corner cases that might affect someone used to managing cloud instances with terraform.
I do agree this is a tricky issue, and at some point I thought about making the full disk definition part of the domain. I am not completely against it, but I don't see you completely convinced that this is the right solution either. If there is a way to reuse part of the code from the volume handling that is more or less elegant, I would be OK with allowing the volume definition to be "embedded" in the disk definition so that it is completely part of the domain. Now, where I disagree with you is in the proposal:
What is the point of
In both cases the user may want to say in which pool the volume is, but I am not even sure about this. The provider could create a private volume only for these disks if we want to make things simpler. So only the
Closing this issue since we didn't come up with a real pragmatic implementation. Feel free to reopen in case we have a plan in the short term.
The current state
Right now to add a disk to a domain we must:
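The list of steps did not survive in this copy, but the two-step workflow (declare a standalone volume, then link it into the domain) can be sketched roughly like this; resource and attribute names reflect my reading of the provider at the time and may differ:

```hcl
# 1. declare the volume as its own resource
resource "libvirt_volume" "os" {
  name   = "os.qcow2"
  pool   = "default"
  source = "http://example.com/opensuse-leap.qcow2"
}

# 2. link it into the domain by id
resource "libvirt_domain" "vm" {
  name = "vm"
  disk {
    volume_id = "${libvirt_volume.os.id}"
  }
}
```

Note that the volume and the domain are two independent resources with independent life cycles, which is the root of the issues described below.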
The problems
These are the minor issues/annoyances:
- `count` on a domain introduces quite some complexity: you have to use `count` also inside of the volume definition, and you have to be careful when writing the link (the terraform `element` syntax is not exactly user friendly)

The biggest issue from my POV is that the volume life cycle is kinda disconnected from the one of the domain.
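The `count` coupling mentioned above looks roughly like this (a sketch with illustrative names):

```hcl
resource "libvirt_volume" "disk" {
  count  = 3
  name   = "disk-${count.index}.qcow2"
  pool   = "default"
  source = "http://example.com/opensuse-leap.qcow2"
}

resource "libvirt_domain" "node" {
  count = 3
  name  = "node-${count.index}"
  disk {
    # the element() interpolation the user has to get right by hand
    volume_id = "${element(libvirt_volume.disk.*.id, count.index)}"
  }
}
```

Changing the domain `count` without mirroring the change in the volume `count` (or mistyping the `element()` link) silently breaks the setup.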
Doesn't play well when rebuilding a domain
When you change a domain in a way that forces its rebuild, the following actions happen: the old domain is destroyed, a new one is created reusing the linked volumes (which are left untouched), and finally the provisioners are run again.
Bad things can happen during the last step, because most of the time the provisioners assume they are running on a "vanilla" system, while they already ran against that volume.
Doesn't play well with changes
Imagine a domain has already been provisioned and the user changes its disks. The next apply will unlink the old volumes from the domain and link the freshly created ones.
This has two problems:

- … the `.tf` file) or when the whole infra is brought to ashes

At the same time, changing a linked volume won't cause the domain provisioners to be executed again. This is IMHO a pretty big issue.
How to fix that
Terraform original design
Terraform was designed to handle cloud workloads. In this context there are two types of storage devices: ephemeral and persistent ones.
Each instance always has an ephemeral disk. The life cycle of this disk is coupled with the instance: a change that forces the rebuilding of the instance will cause the disk to be deleted, and the new instance will start with a clean slate.
The persistent storage is usually called a volume and can be attached to an instance. It is not tied to the life cycle of an instance, hence it can be used to store really important data that has to be preserved between instance rebuilds.
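For comparison, a hedged sketch of how the AWS provider models the two kinds of storage (all values illustrative): the root/ephemeral storage goes away with the instance, while a separately declared volume survives rebuilds.

```hcl
resource "aws_instance" "web" {
  ami           = "ami-123456"   # illustrative
  instance_type = "t2.micro"
  # the root/ephemeral storage is part of the instance itself:
  # destroy the instance and it is gone
}

resource "aws_ebs_volume" "data" {
  availability_zone = "us-east-1a"
  size              = 10
}

# the attachment, not the volume, is tied to the instance;
# rebuilding the instance leaves the volume's data intact
resource "aws_volume_attachment" "data" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.data.id}"
  instance_id = "${aws_instance.web.id}"
}
```

This is the split the proposal below tries to recreate on top of libvirt.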
The proposal
There's no way to "taint" a linked resource: we cannot mark all the volumes linked to a domain for deletion whenever the domain is rebuilt. I think the feature is missing because it would not make sense in the original context Terraform was designed for.
We basically have to implement ephemeral and persistent storage with libvirt.
The persistent one is the simplest: we just have to use the `volume` resource to implement it. However, we should introduce new directives to allow the user to create a volume from nothing (like by specifying just the size). This can be useful also outside of the context of this issue.

The ephemeral one is a bit tricky, especially from the user's point of view. We could extend the disk type to have the following attributes:

- … uploaded to the given `pool` and used directly (no linked disk underneath)

Defining a domain would be something like that:
This would cause the `opensuse-leap.qcow2` image to be uploaded to the `default` pool as a volume named `<domain name>.img`, hence `chameleon.img`.
Everything is still handled as a volume from the libvirt POV; however, the domain resource has full responsibility for managing this volume. It's just a matter of extending the create, delete and update functions of the domain resource.
The usability question
Ephemeral and persistent volumes have still to be added as disks to a domain. How can we differentiate between what is going to survive domain rebuilds and what is going to be nuked every time?
We could define persistent volumes in this way:
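The snippet did not survive here; a hypothetical sketch of how the two kinds of disk could sit side by side, reusing the existing `volume` resource for the persistent one (names illustrative, the `source`/`pool` attributes are the ones proposed above):

```hcl
resource "libvirt_volume" "data" {
  name = "data.qcow2"
  pool = "default"
  size = 10737418240  # 10 GiB; survives domain rebuilds
}

resource "libvirt_domain" "chameleon" {
  name = "chameleon"
  disk {
    volume_id = "${libvirt_volume.data.id}"  # persistent
  }
  disk {
    source = "opensuse-leap.qcow2"           # ephemeral (proposed)
    pool   = "default"
  }
}
```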
However I think it could lead to confusion among users...
Feedback
What do you think about the whole thing?
If you are interested, I can share the POC I made (it supports create and delete of ephemeral disks created with `source`).