Proposal: State Encryption #9556
Comments
What is the status here? From my point of view, there should be no link between config parameters (passwords) and state files. I also don't understand people "sharing" the state file... If you have a need to share something, maybe that's something to be added to Terraform. The state file is an "internal" view of the currently running architecture, it's not a config file. I totally agree on having provider resources to pull/push sensitive data from (e.g. passwords). Using Vault as an endpoint for it sounds great to me, as it would allow Terraform to use an ENV token to gain access to this data, then use it to deploy, replacing the pointers from the remote state file with the values from Vault... |
The forthcoming version 0.9 contains some reworking of Terraform's handling of states that will, amongst other things, make this easier to implement in a future release. I can't say exactly when that will be (I don't have visibility into the official roadmap) but the technical blockers on this will be much diminished once 0.9 is released. I suppose it's worth noting that the usage examples in my original proposal here are no longer valid with the changes in 0.9. Instead of configuring encryption on the
terraform {
  backend "consul" {
    # ...
    # HYPOTHETICAL ENHANCEMENT -- NOT YET SUPPORTED
    encryption "vault" {
      address     = "https://vault.example.com/"
      mount_point = "foobaz" # "transit" by default
      key         = "terraform"
    }
  }
} |
Can Vault generic secret store become a separate Terraform backend? We could remove a dependency to Consul then. |
@mkuzmin in principle that is possible, but I've seen the Vault team recommend against storing non-trivially-sized things in Vault's generic backend, and instead to use the transit backend to encrypt for storage elsewhere. That recommendation is what this design was based on. |
@apparentlymart #9556 (comment) is this supported in 0.9 release |
as discussed at Hashidays NY with @phinze https://github.com/agilebits/sm Manual workflow could be:
on second machine:
|
i was considering writing a consul http proxy that you could use as a consul backend for tf. encrypted/decrypted all the values through vault transit. the consul sharing works (should work - good idea!) for my team. but consul could be (or seem like) a barrier to entry in a cross-team situation. if an upstream team already has consul deployed, and remains aware that consul has a rest api, a but being able replace the this new way to think about remote state is almost burying the lead when it comes to enterprise though. |
I'd like to see this feature implemented in core, along with other encryption efforts. In the mean time, I came across this tool: terrahelp |
@apparentlymart What's the status on this? Would this get accepted into terraform if I would implement it? Or are there any technical blockers on this? |
Hi @simonre! The architecture of "backends" in Terraform changed significantly since I originally proposed this (which was before I was a HashiCorp employee), so I expect we'll need to do another round of design work before deciding what is the right thing to do here. There has also been some disagreement in subsequent discussions about whether whole-state encryption is actually what's needed here, or whether encryption of specific sensitive values is actually the requirement: whole-state encryption is a pretty blunt instrument, requiring that any particular individual either have access to the entire state (which is required to run Terraform at all) or none of it. With more precision, it may be possible to have different sensitivity levels for different information, and to permit certain operations to complete without access to the sensitive information at all. In practice, many users have implemented a system functionally equivalent to what I proposed here by selecting a backend that has its own built-in support for encryption at rest, such as S3. If using S3 with its built-in encryption is not sufficient, I doubt that what I proposed here would be sufficient either, since it has much the same security characteristics. However, if there are some ways that the S3 backend (or any other backends that similarly talk to a store with native support for encryption at rest) could better support that use-case, a good nearer-term change would be to add additional capabilities directly to those backends to better exploit those built-in features. If anyone has any ideas about that I'd encourage opening a separate issue to discuss them before implementation, since for security-related changes in particular it's good to talk through the design to inform the implementation. |
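As a rough illustration of the "backend with built-in encryption at rest" approach mentioned above, an S3 backend configuration might look like the following sketch. The bucket name, object key, region, and KMS key ARN are placeholders invented for this example, not values from the discussion.

```hcl
terraform {
  backend "s3" {
    # Placeholder names; substitute your own bucket, key path, and region.
    bucket = "example-terraform-state"
    key    = "prod/terraform.tfstate"
    region = "us-east-1"

    # Ask S3 to encrypt the state object at rest (server-side encryption).
    encrypt = true

    # Optional: use a specific KMS key instead of the default S3-managed key.
    # This ARN is a made-up placeholder.
    kms_key_id = "arn:aws:kms:us-east-1:111122223333:key/EXAMPLE-KEY-ID"
  }
}
```

With this setup, access to the decrypted state is governed by IAM and KMS key policies rather than by anything Terraform itself enforces, which is the "same security characteristics" point made in the comment above.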
@Roxyrob I think you make a good point. But I don't think it matters much, because if I were to venture a guess I would say that in HashiCorp's philosophy they "trust" cloud providers. To a degree I can understand that point of view: if you are already running everything in their cloud (VMs, etc.), encrypting your state file before uploading it isn't going to give you that much extra security. So from their point of view it might help against such a one-of-a-kind Amazon incident, but other than that not much. Now obviously this is a lot of conjecture on my part, but it is the only way I can explain seeing state file encryption at rest as a low priority, meaning there is a lack of incentive for them to invest time in this PR or the issue at all. Which is fine, it's open source and they don't owe anyone anything. |
@siepkes I understand what you suppose, but I think data security is a primary concern for every tool, and Terraform is such a great one that it cannot neglect it even if there isn't a simple implementation solution. Probably I'm wrong, but IMHO any piece of data, especially if sensitive by nature, as much of the data in tfstate is or potentially can be, should be managed following Without this basic assumption Terraform will be a great tool to provision infrastructure only as long as it can work without secrets. I cannot see any cloud solution (HCP included) that can provide applicable safeness that allows data to be saved I'm anyway interested to know whether a sops-like logic (with tfstate JSON values selectively encrypted) can be viable or not. |
You can use other backends other than s3. |
The issue is that ALL of these are making a "copy" of credentials. That is just wrong right out of the gate. Especially when it's using something like vault - it should be storing a reference to the credentials to be loaded when applied. If Terraform were operating as a client-server type implementation, where the state had to be independently accessed by a service on some other box in order to apply the state - then yeah, I could understand the current behavior -- but it's being 100% driven by the terraform executable and modules on the client system. The suggestion to use an encrypted backing store doesn't change the fundamental issue that terraform is duplicating the actual credentials and saving them elsewhere. |
@FernandoMiguel consul or any other solution does not change the context. Tfstate is always in cleartext somewhere, and someone can access the file and so the secrets inside (at least if you do not keep everything on a server in a private room, detached from networks and always watched). Sops-like logic instead allows you to save a JSON file (and so potentially the tfstate JSON too) with only the values (all or some) encrypted using e.g. an AWS KMS CMK. Such an approach increases security (and is probably sufficient risk mitigation), allowing encryption/decryption of JSON values only, as a service, with master encryption keys never known by anyone and accessible by means of IAM and KMS key policy configurations. Nothing is perfect in the All is possible, but this is much less risky than having cleartext secrets in a file on cloud storage. |
@nneul mitigation of the tfstate cleartext problem can be reached if we do not undermine what are probably basic principles of Terraform behavior. Also, having encrypted values in the tfstate file (or in another, better solution for that purpose, like consul and so on) can be potentially useful (to share between different Terraform configs or different DevOps tools in the pipeline). I think that in the cloud era we cannot avoid that secrets can be "a little out of control" (e.g. we will never have total control of all cloud storage solutions involved in a complex infra/process). What we can do instead is to do our best on data security. |
Frankly, I'm quite shocked and surprised that Hashicorp haven't placed a higher priority on providing a mechanism for encryption of the tfstate at rest. In many organisations, these state files contain the 'keys to the kingdom' and a comprehensive map of their infrastructure and should be considered highly sensitive. Knowing that most large companies use Terraform, and that there's a good chance that many/most will use the Amazon S3 backend, if I were intent on compromising one of them I would expect a pretty good chance of success if I could simply gain read-access to their state bucket. If I were a malicious actor, I would probably start by focussing my efforts there. And it probably wouldn't be hard to find a few companies that have incorrectly configured their bucket ACL, or that have staff with read access inadvertently configured for their user/group or whatever. However, if I suspected that all I'd find were encrypted state files, and that I'd need to obtain additional keys to get to reveal the full set of 'keys to the kingdom', then that would certainly act as a deterrent at least and I'd probably look for an alternative attack vector. Another way to look at it is - how would you feel if your plaintext tfstate file got leaked on the DarkWeb? 😱 |
It's certainly a lot more complex than what this PR does, at least if you want to do it in a secure fashion. TF providers keep adding fields in the state, and if you miss one that's sensitive, you are again leaking credentials. This would have a high probability of giving a false sense of security IMHO, which is often actually very bad for security. That being said, there's nothing to stop someone from implementing a different encryption backend once my PR makes it in. I just personally don't feel like writing it. |
Depending on the providers you use, this is true, but it's also very much out of scope for this github issue. |
Hi all! It's been a while. Firstly, I just want to note that this is one of those weirder issues where I started it while I was a Terraform user in my previous job, and then subsequently joined HashiCorp to work on Terraform, so my role in this issue effectively changed partway through. This issue has been at a bit of an impasse for a while because, although it has broad support in the form of upvotes, whenever we had a more specific and detailed discussion with someone about it the conclusion was often that encrypting the entire state at rest wasn't really the desired result. Some wanted to encrypt specific parts of the state, while others asked if Terraform could just avoid storing copies of their secrets at all, since those secrets are already readily available in other locations such as HashiCorp Vault, AWS Secrets Manager, etc. Encrypting that information doesn't really help because anyone using Terraform would need to have access to decrypt it anyway, and so all it would take is one compromise of the Terraform execution environment to indirectly access the secrets, thus defeating the benefit of storing secrets in a specialized store. (This is the phenomenon sometimes called Secret Sprawl, and some earlier comments in this issue already drew attention to this.) Therefore, over the years while thinking about this I began looking at it in a different way: I think there's a significant gap in the Terraform language in representing values that exist only in memory during a single phase (e.g. plan or apply), that never get persisted in plan or state files, and that might even represent objects that won't exist at all once Terraform's work is complete. For now I'm calling this concept "ephemeral values", where "ephemeral" is intended to mean "lives only for the duration of one phase". 
This is a new cross-cutting concern in the Terraform language that starts with the observation that provider configurations and provisioner/connection configurations are already "ephemeral" in this sense, but we didn't previously have a general sense of that idea that other language features could rely on. We can then build some specific features on top of that concept. The ones I'm currently thinking about are:
I'm currently prototyping for this over in #35078 but this is still very early and subject to significant changes. I'm sharing this today only because we've not shared anything in this issue for a while and I wanted to let you all know that we've not forgotten about it, just that what I originally proposed here was invalidated by further research and so we're taking a different approach. If successful, these new features would allow avoiding having any secrets in the Terraform plan files and state snapshots at all. Of course that doesn't mean that there would be no value in encrypting the state -- the information there is still valuable in the sense of being a pretty detailed map of your infrastructure that might be useful to an attacker -- but the ephemeral values features would significantly lower the stakes, potentially making it sufficient to rely on your chosen state storage service's features for encryption at rest, which means far less operational complexity for everyday Terraform use. (As others noted above, it turns it into largely an RBAC problem rather than an encryption problem, which I understand that some consider a significant disadvantage but the idea here is to lower the stakes by removing the secrets so that the state is effectively just a description of the same remote objects that are already protected exclusively by RBAC in the target system anyway, rather than exposing additional secret information that would not normally be exposed by the target system's APIs.) |
@apparentlymart thank you for sharing the current proposal, IMHO this is a great step forward! That said, it doesn't cover all cases. For example there are many cases we generate passwords in terraform, so the password is kept in the state. I think a good complementary solution would be to be able to opt in to encrypt sensitive values with a pgp key. This way there is another second factor required to access the secret when it's needed in the state. |
Hi @simonweil, To generate a new password directly inside Terraform using this proposal you'd need:
Here's a hypothetical configuration using some imaginary resource types:
variable "regenerate_password" {
  type     = bool
  default  = false
  nullable = false
}

# Imagine this as the same as resource "random_password" "example"
# but treated ephemerally instead of persistently.
ephemeral "random_password" "example" {
  count = var.regenerate_password ? 1 : 0
  # ...
}

resource "thing_with_password" "example" {
  # ...
  # This is intended as a "write-only attribute" and so we want
  # to arrange for it to be null unless var.regenerate_password
  # is set, which means that by default the existing password
  # would be left unchanged.
  new_password = one(random_password.example[*].generated)
}
In normal use you'd just run If you want to change the password then you'd run In a real example I expect you'd actually need two managed resources with write-only attributes: one would be something like Importantly, no-one ever needs to know what the old password was in order to generate a new one: they just need to have sufficient access to write the newly-generated password into everywhere it needs to go. They explicitly set the input variable when they want to change the password, so there's no need for Terraform to record the old password to compare with. It is admittedly a different workflow than is often used for password generation today, but fetching and providing encryption keys to Terraform would also be a different workflow. |
@apparentlymart instead of toggles, hashes could be used. |
It would be possible for a particular resource type to store a hash of the value sent to a write-only attribute, but that would be appropriate only if the underlying API can return a hash of the current secret during read. That would then allow the provider to notice that the new password is the same as the old password and decline to change it. However, we want to be able to support running For a value that can only be written and not read in any form -- not even a checksum -- the null vs. not null rule serves as a trigger that doesn't require knowing anything about the old value, which I was assuming as the case for the example I shared in my most recent comment. Storing even a checksum of a password can potentially be problematic, particularly if it isn't salted in some way, so a design for this must not rely on checksums, but a provider for a system that chooses to allow retrieving secret checksums (or other similar derived values, like a key fingerprint) can make use of that information to avoid proposing spurious updates. |
It occurs to me that I wasn't clear about one point in my earlier comments, and I only gestured at it in the most recent comment. An important requirement in solving this is that we don't want to force an operator to re-supply the same password (or other secret) on every run just to affirm that it doesn't need to change, because that means that everyone who needs to make any change to a particular configuration will need to know the password. The design goal is that if you aren't doing anything related to the password then you don't need to provide any information related to it and neither does the provider. Only when you're actually trying to change the password do you need to provide the new password value, and even then you might do it only indirectly via an ephemeral resource as I showed in my earlier example. (A simpler variant of that example would be to specify the password directly as an ephemeral+sensitive input variable, but I was trying to specifically address generating passwords with Terraform, so I wrote the more complex variant where the input variable only represents the intent to change the password and then the actual generation happens internally so that the operator doesn't need to ever know the new password.) |
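The simpler variant mentioned in the parenthetical above, where the operator supplies the new password directly, might be sketched as follows. This assumes input variables accept an `ephemeral` argument as the proposal describes (Terraform 1.10 and later accept it); `thing_with_password` and its `new_password` write-only attribute are the same imaginary resource type used in the earlier example, not a real provider resource.

```hcl
# Hypothetical sketch: the operator sets this only on runs that should
# change the password. Left unset, it is null and nothing is written.
variable "new_password" {
  type      = string
  default   = null
  sensitive = true # redacted in CLI output
  ephemeral = true # never persisted to plan files or state snapshots
}

# Imaginary resource type from the earlier example in this thread.
resource "thing_with_password" "example" {
  # ...
  # Write-only attribute: null means "leave the existing password alone".
  new_password = var.new_password
}
```

Unlike the generated-password variant, the operator necessarily knows the new password here, but anyone making unrelated changes still does not need to supply or know it.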
👋 hey all, to add to Martin's comments above, we've been iterating on this problem for over a year and we believe this will address a lot of problems with how 'secrets' are handled in Terraform. We'd love to setup time to show folks how this would work and answer any questions/concerns live. Or please respond here if you feel more comfortable. If you are interested, please email me at oismail@hashicorp.com and we can set up some time! Talk soon :D |
I think this really goes a long way, however I would be curious on your take how to (taking your example) toggle the password creation on once when creating a resource for the first time. In other words, how would a One way I could imagine would be to allow for ephemeral resources to be partly ephemeral, or rather, allow a resource's attributes to be ephemeral. This would allow the Another idea would be to somehow expose the plan (for a given resource) as an ephemeral value itself – that way, the toggle could literally be "are we creating this resource", e.g.
ephemeral "random_password" "example" {
  count = creating(thing_with_password.example) ? 1 : 0
  # ...
}
or something like it. |
Hi @matthiasr, While exposing metadata like the planned action for a specific resource instance is an attractive idea (I also considered it in earlier work) we need to keep in mind that such a thing would imply a dependency on the object whose planned action we're interested in -- otherwise it won't have a planned action yet -- and in most realistic scenarios that implies a dependency cycle because the thing using the password already naturally depends on the thing generating the password. Part of the idea of using an input variable to explicitly say that you want to change the password is to improve the ergonomics of the existing pattern of using the |
Hi all, As I mentioned above, I wrote this proposal before I joined HashiCorp, and even ignoring the change in employment relationship I've also personally soured on the idea a bunch since I originally proposed it and have been leaving it open only because it had upvotes associated with it. However, someone reminded me today that we also have the older issue #516 that represents the general problem of how to deal with secrets in the state, without implying any specific solution, and it has considerably more votes. Therefore I'm going to close this issue to represent that I'm retracting specifically what I proposed, but I intend that we'll keep using #516 as a single location to represent that the problem still exists. I've also written Ephemeral Values in Terraform as my own personal perspective on this (not an official HashiCorp statement), which in part describes more about why I think what I proposed in 2016 was pulling in the wrong direction and why I think the idea of ephemeral values (which I described in my other comments above) is the superior solution. If you previously voted 👍 on this issue and haven't already voted on #516 then I'd encourage you to transfer your votes to that other issue. Thanks! |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Currently we have several resources that retrieve or generate secrets, and for any where these secrets are used to populate other resources or configure other providers these secrets must necessarily be stored in the state.
Such resources include:
aws_db_cluster (password attribute)
azurerm_virtual_machine (machine login passwords)
tls_private_key
vault_generic_secret (both managed resource and data source) (Vault Provider #9158)
This causes some conflict, because Terraform's design originally assumed that the state was just a local cache of some remote data, and was fine to e.g. check into a git repo alongside the configuration, or to publish somewhere for consumption in other downstream Terraform configurations. It can be surprising and troublesome for secret values to show up in Terraform states that are being used in these ways.
Proposal: Encrypt the state at rest
To address this issue in a way that does not significantly increase Terraform's core complexity, I propose that we address this by allowing Terraform state to optionally be encrypted, as a whole, at rest. That is to say that the state file stored on local disk and on the remote storage target would be some sort of ciphertext of the state, and for each operation Terraform would retrieve this and decrypt it in memory only for use during that operation, re-encrypting it before writing any changes.
Encrypting the entire state is a rather blunt instrument, but it has the advantage of allowing the encryption to be orthogonal to other concerns in Terraform, and thus makes it easy to reason about its behavior and understand what is and is not encrypted: all, or nothing.
State Encryption Backends
Terraform already has the concept of a remote state storage backend. This proposal introduces a similar but orthogonal concept alongside that: a state encryption backend.
An encryption backend is responsible for translating from a cleartext state file to an encrypted one and vice-versa. The backend defines exactly what format the encrypted state file is stored in, and its contents are opaque to the rest of Terraform.
This would be enabled along with remote state storage in terraform remote config:
Since encryption backend is orthogonal to storage backend, it's possible to mix and match these as desired. In the above example the data is encrypted using Vault's "transit" secret backend and stored in Consul. Another useful encryption backend would be for Amazon KMS, which serves a similar purpose to Vault's transit backend, and could make a good companion to the "s3" storage backend in an AWS-centric environment.
Introducing such a concept would be a pretty isolated change that would affect only the state management portions of Terraform:
terraform remote config needs to learn two new options: -encrypt and -encrypt-config
application/json as the Content-Type for cleartext, but might be better to use application/octet-stream for encrypted data.
Storing State in Git
Historically the Terraform docs suggested that storing state files in git was a reasonable way to share them within a team. The "remote state" mechanism has subsequently superseded that, and this proposal considers remote state as the primary mechanism for collaboration and supports encryption of state only in conjunction with remote state.
Moving down this path would be a good opportunity to officially deprecate the suggestion of storing state files in git repositories, and strongly encourage the use of remote state with encryption.
Effect on Remote State Workflow
Terraform maintains a local copy of the remote state as a "cache". With state encryption enabled, this local copy would also be encrypted at rest, and so Terraform would need to retrieve and decrypt both the local cache and the remote persistent storage in order to do comparison/sync operations, including the terraform remote push and terraform remote pull commands along with the various similar implicit actions taken during other commands that read and write Terraform state.
Effect on Remote State as a Collaboration Tool
The terraform_remote_state data source has encouraged the use of state files as a means of passing data from one Terraform configuration to another, effectively creating a DAG of separately-maintained Terraform configs.
This can be a powerful tool for managing complex environments, where one large configuration and associated state would be unwieldy. However, it has the rather odd consequence that the entire state is shared merely to allow another config to retrieve the outputs; downstream consumers of the state necessarily have access to all of the gory details of how these resources are created, even though Terraform entirely ignores them.
A while back I'd proposed #3164 to address a related concern around sharing states for collaboration: that the data I wanted to share was at a different level of abstraction than the configurations that produced it. As @phinze rightfully pointed out in that discussion, that proposal (and indeed the terraform_remote_state data source itself) are really just trying to pass an arbitrary bag of key/value pairs by smuggling it inside a larger data structure.
In the mean time we have implemented the concept of data sources, which make retrieving data a first-class idea in Terraform. My suggestion is that we move away from terraform_remote_state as the primary suggested collaboration tool, and instead use more general intermediaries for passing such data.
Per the discussion in #3164, I've subsequently transitioned all of the "tree of configs" stuff in my employer's world over to using Consul resources, and no longer use terraform_remote_state at all. Instead, the "parent" configurations use the consul_key_prefix resource to write sets of data conveniently to Consul as discrete keys, and then the "child" configurations use the consul_keys data source to retrieve those keys. This change also had the positive consequence of making the same data visible to other consumers beyond Terraform, such as in our use of consul-template to configure applications.
Consul is currently the most compelling way to do this sharing due to the good usability of Terraform's Consul provider. The aws_s3_bucket_object resource and data source could be used similarly, though could perhaps benefit from an analog of consul_key_prefix to enable managing multiple related keys in a convenient and robust way. Similar such backends could include etcd, Google Cloud Storage, and (for things that are secret in nature despite being shared between configs) Vault.
We might choose to allow encrypt and encrypt_config as attributes of the terraform_remote_state data source so that encrypted state can still be read by those who have access to the relevant credentials. I expect the use-cases for such a thing would be pretty narrow and fraught with gotchas, so personally I would always prefer to expose more carefully only the specific attributes that need to be exposed, in a manner that is most appropriate for each attribute.
Effect on "state surgery" to work around Terraform issues
I'm sure most teams managing non-trivial configurations with Terraform have at least once resorted to manually tweaking the contents of a Terraform state to work around some sort of tangle that has either been created by outside config drift or by Terraform itself. On my team we call this "state surgery" and have indeed needed to do it several times over the years for one reason or another.
This sort of process will be made much more difficult with encrypted state files, since it'd no longer be possible (or at least, straightforward) to edit in-place the local state cache to "trick" Terraform.
Fortunately, in 0.7 the new terraform state family of commands has significantly reduced the need for manual state surgery. Continued investment in this area to cover other "state surgery" use-cases should remove the need for such manual tweaking, allowing changes to be made somewhat more safely.
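For reference, the Consul-based sharing pattern described under "Effect on Remote State as a Collaboration Tool" (a "parent" configuration writing discrete keys, a "child" configuration reading them) might be sketched as follows. The key paths and values are invented for illustration, and the two halves would live in separate configurations.

```hcl
# --- "Parent" configuration: publish selected values as discrete keys ---
resource "consul_key_prefix" "network" {
  # Illustrative prefix; every key under it is managed by this resource.
  path_prefix = "terraform/network/"

  subkeys = {
    # Literal placeholder values; in practice these would normally be
    # references to attributes of resources in the parent configuration.
    "vpc_id"    = "vpc-0123456789abcdef0"
    "subnet_id" = "subnet-0123456789abcdef0"
  }
}

# --- "Child" configuration: read back only the keys it needs ---
data "consul_keys" "network" {
  key {
    name = "vpc_id"
    path = "terraform/network/vpc_id"
  }
}

output "vpc_id" {
  # Values read by consul_keys are exposed under the "var" attribute.
  value = data.consul_keys.network.var.vpc_id
}
```

The child configuration sees only the keys it explicitly asks for, rather than the whole upstream state, and the same keys are readable by non-Terraform consumers such as consul-template.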