Support directories in file* crypto functions or introduce directory equivalents #26091

Closed
EamonHetherton opened this issue Sep 2, 2020 · 8 comments
Labels
enhancement new new issue not yet triaged

Comments

@EamonHetherton
Contributor

Current Terraform Version

Terraform v0.13.0

Use-cases

Triggering resources when the contents of a folder change, in a way that is deterministic across platforms.

Attempted Solutions

The common solution today is to declare a data.archive_file data source with source_dir set and use its output_sha to trigger dependencies. This has problems, especially cross-platform, that are unrelated to the intent, which is to trigger if the contents of any file change (see References).

I shouldn't need to create a zip file that I am never going to use just to detect changes to a folder.

Proposal

Either allow the file* functions to accept directory parameters or introduce directory analogues for the filemd5, filesha1, filesha256 and filesha512 that consider only the content and ignore timestamps and file permissions. Options could be added if needed to allow a user to consider timestamp and file permissions in the hash, but should be off by default so that only the file contents are used.

References

hashicorp/terraform-provider-archive#53
hashicorp/terraform-provider-archive#34
hashicorp/terraform-provider-archive#47
hashicorp/terraform-provider-archive#41

@EamonHetherton EamonHetherton added enhancement new new issue not yet triaged labels Sep 2, 2020
@apparentlymart
Contributor

Hi @EamonHetherton! Thanks for sharing this feature request.

If I'm understanding your goals correctly, I think it's possible to achieve a result like what you are looking for using some building blocks already present in the Terraform language. For example:

locals {
  source_directory = "${path.module}/example"

  # Every file under the directory, as paths relative to it.
  source_directory_files = fileset(local.source_directory, "**")

  # Map of relative path => hash of that file's contents.
  source_file_hashes = {
    for path in local.source_directory_files :
    path => filebase64sha512("${local.source_directory}/${path}")
  }

  # Single hash covering all paths and all file contents.
  overall_hash = base64sha512(jsonencode(local.source_file_hashes))
}

output "source_file_hashes" {
  value = local.source_file_hashes
}

output "overall_hash" {
  value = local.overall_hash
}

This produces the following result for me:

Outputs:

overall_hash = +LA4s2SjZElqrTCtsVrJXA0rkyVLOBeDwUeOytk6fHTneP0MmMTCdq0wdc6fvq4eJpYIQImjAyn4XYSc3ICUqA==
source_file_hashes = {
  "thingy/example" = "4cES/5CP68O5ixaTps01ZOr45ebKYp0ITZ8OupkkfKzdcuNp/4lBOXwoB0Cf9mvmS+kI2hete4pJoqJsDoCGqg=="
  "thingy/foo" = "z4PhNX7vuL3xVChQ1m2AB9Yg5AULVxXcg/SpIdNs6c5H0NE8XYXysP+DGNKHfuwvY7kxvUdBeoGlODJ6+SfaPg=="
}

For this example I assumed that it doesn't really matter how the overall hash input string is formatted, as long as it changes whenever any of the file paths or file contents in the directory change. So I just hashed the JSON serialization of local.source_file_hashes, which includes both the file paths and the hashes of the individual files' contents. The same principle works for any other string serialization of path+hash pairs, as long as you can generate it using a Terraform function or string template.
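As one illustration of "any other string serialization", here is a minimal sketch that builds a newline-separated manifest of path:hash lines and hashes that instead of JSON. The local names (manifest_directory, manifest_lines, manifest_hash) are hypothetical and chosen only so they do not clash with the locals above; SHA-256 is an arbitrary choice.

```hcl
locals {
  # Hypothetical name; points at the same example directory used above.
  manifest_directory = "${path.module}/example"

  # One "relative/path:hash" entry per file. fileset returns a set, so it is
  # converted to a list and sorted to make the ordering explicit.
  manifest_lines = [
    for path in sort(tolist(fileset(local.manifest_directory, "**"))) :
    format("%s:%s", path, filesha256("${local.manifest_directory}/${path}"))
  ]

  # Hash of the joined manifest; changes if any path or file content changes.
  manifest_hash = sha256(join("\n", local.manifest_lines))
}

output "manifest_hash" {
  value = local.manifest_hash
}
```

The resulting value differs from overall_hash above because the serialization and algorithm differ, but it should respond to the same changes: an added, removed, renamed, or modified file.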

Does that get closer to the result you were looking for?

@EamonHetherton
Contributor Author

EamonHetherton commented Sep 2, 2020

Thanks, I'll give this a go.

Two potential issues come to mind that might make it unreliable, especially cross-platform:

  1. Is the order of the files in source_directory_files going to be deterministic across platforms (Windows, Mac and Linux)?
  2. Will the path that is included in the hash use a platform-dependent separator: '\' for Windows and '/' for Mac and Linux? (edit: I see that fileset already handles platform independence for paths)

but I'll try it out and see.

@EamonHetherton
Contributor Author

So far so good. It appears to work across Windows and Mac, where the archive_file method would fail :)

@apparentlymart
Contributor

Hi @EamonHetherton,

I'm sorry I didn't reply sooner. I've had some disruptions to my routine due to climate change. 😖

It seems like you already answered at least one of your questions, but I just wanted to explicitly confirm that the order of files is guaranteed to be lexical by unicode codepoint in all cases, and that all of Terraform's path manipulation functions -- fileset included -- use / as the path separator on all platforms for consistency, relying on the fact that Windows supports both \ and / equally as long as we are consistent within a particular path.

Now that you've tried it out, would you say that the solution I proposed has sufficiently addressed your use-case? If so, I'd move that we close this and say that combining several smaller functions is how to achieve this result in Terraform.

@EamonHetherton
Contributor Author

Yes, this combination of functions has given me exactly what I was after, if a little more verbose than a single function would be.
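For the original use case of triggering a resource when a folder's contents change, the combined hash can be fed into a resource's trigger argument. The following is a sketch only: it assumes the hashicorp/null provider's null_resource, and the local names, resource name, and command are hypothetical placeholders rather than anything taken from this thread.

```hcl
locals {
  # Hypothetical names, chosen so they do not clash with the locals above.
  watched_directory = "${path.module}/example"

  # Map of relative path => hash of file contents.
  watched_file_hashes = {
    for path in fileset(local.watched_directory, "**") :
    path => filesha256("${local.watched_directory}/${path}")
  }

  watched_directory_hash = sha256(jsonencode(local.watched_file_hashes))
}

# Assumes the hashicorp/null provider is available. The resource is replaced,
# and its provisioner re-run, whenever the directory hash changes.
resource "null_resource" "on_directory_change" {
  triggers = {
    directory_hash = local.watched_directory_hash
  }

  provisioner "local-exec" {
    # Placeholder command for illustration only.
    command = "echo 'directory contents changed'"
  }
}
```

No zip file is produced at any point, so the trigger depends only on file paths and contents.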

@apparentlymart
Contributor

Thanks for confirming, @EamonHetherton!

Although we do offer some specialized functions for common cases, what you're doing here feels a little too you-specific to have a specialized function for it -- or indeed, multiple functions to cover all of the different serialization formats and hash algorithms Terraform supports. For that reason, I'm going to close this in the sense that there seems to be sufficient existing Terraform functionality to meet the need, while giving flexibility about the details of what exactly is included in the hash and how it is calculated.

Thanks again for sharing the feature request, and for working with me to investigate how we might achieve the result with existing Terraform functionality.

@ntrp

ntrp commented Oct 16, 2020

I don't really think this is specific to his use case; there are thousands of users who are having the same non-deterministic zip issue. I believe Terraform should give a proper solution to the issue, either by implementing an option to create a deterministic zip with the data.archive_file data source or by providing a directory hash function. If you think about it, what is the purpose of outputting a SHA from the zip generation if it's not deterministic?
The solution presented definitely works, but as a common use case this should be handled directly rather than through a "hack".

@ghost

ghost commented Oct 28, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked as resolved and limited conversation to collaborators Oct 28, 2020

3 participants