Make content hash consistent across machines #1085


elyalvarado
Contributor

The metadata that Webpack uses to calculate the content/chunk hash included the absolute paths of all the definition files and additional dependencies, so the calculated hash depended on the absolute path of the root context. As a result, the hash changed between machines even when there were no changes in the code. This causes issues in deployment environments where the build is executed on more than one server, as is the case in the default Rails deployments in AWS OpsWorks.

This is similar to the issue described in #1048

This PR changes the paths added to the build metadata to be relative to the root context, which makes the content/chunk hash calculation consistent across builds on different machines.
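As a rough illustration of the idea (not the code in this PR), the sketch below rewrites absolute dependency paths relative to a root context. The helper name `toContextRelative` and all the paths are hypothetical; `path.posix` is used only so the example behaves the same on every operating system.

```typescript
import * as path from "path";

// Hypothetical helper, not ts-loader's actual code: express each dependency
// path relative to the root context so the result no longer depends on
// where the project is checked out.
function toContextRelative(rootContext: string, absolutePaths: string[]): string[] {
  return absolutePaths.map((p) => path.posix.relative(rootContext, p));
}

// The same project checked out in two different (made-up) locations.
const fromProject1 = toContextRelative("/home/me/project1", [
  "/home/me/project1/src/types.d.ts",
  "/home/me/project1/node_modules/@types/node/index.d.ts",
]);
const fromProject2 = toContextRelative("/srv/releases/20200429/project", [
  "/srv/releases/20200429/project/src/types.d.ts",
  "/srv/releases/20200429/project/node_modules/@types/node/index.d.ts",
]);

console.log(fromProject1);
console.log(fromProject2);
```

Both calls yield the same list of relative paths, which is the property the hash calculation needs.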

@johnnyreilly
Member

Cool - thanks for contributing! I'd like to land this, probably once the project references PR has been merged. Please bear with us!

@johnnyreilly
Member

v7.0.0 is now merged - I think you may have some merge conflicts I'm afraid!

@elyalvarado
Contributor Author

@johnnyreilly rebased

@johnnyreilly
Member

Okay looks good! Could you update the package.json version to 7.0.2 and add an entry to the CHANGELOG.md too please?

> This caused the hash to change between different machines even if there were no changes in the code, which in turn causes issues in certain deployment environments where the build is executed in different server, as is the case in the default Rails deployments in AWS OpsWorks.

Could you expand on what problems the issue causes please? I'd really like to understand well the problem we're seeking to fix ☺️

@weiwei-lin

weiwei-lin commented Apr 29, 2020

@elyalvarado Thanks for rebasing the PR.

> Could you expand on what problems the issue causes please? I'd really like to understand well the problem we're seeking to fix ☺️

I'm running into the same issue as well.
In my case, the content hash depends on the absolute path of the project.
Here are the steps to reproduce the issue:

  1. Let's say the project is under '~/project1'.
  2. Run webpack.
  3. The produced bundle comes with a content hash, 'abc'.
  4. Run cp -r ~/project1 ~/project2
  5. Run webpack in '~/project2'.
  6. Now the produced bundle has a different content hash, 'cde'.

I was hoping webpack-plugin-hash-output would fix the issue, but it doesn't. Moving the project also causes the build to emit slightly different code (in my case, some variable names in the minified code can be different).

After manually applying the fix in this PR, both issues are resolved.

@elyalvarado force-pushed the make-content-hash-consistent-across-machines branch from 4150017 to e4372fb on April 29, 2020 02:19
@elyalvarado
Contributor Author

elyalvarado commented Apr 29, 2020

@johnnyreilly, the issue this fixes is exactly the one @weiwei-lin describes. When Webpack calculates a content/chunk hash during a build, it also takes into account the metadata passed to it. Before this fix, the metadata added by ts-loader included the absolute paths of all the definition files and additional dependencies. With only relative paths in the metadata, the build hash calculated by Webpack is consistent across environments where the project root path differs, as it should be when nothing else in the project has changed.

You can replicate the failing behavior yourself by copying any project that uses ts-loader to another folder and running Webpack again; you'll see two different hashes, as @weiwei-lin explained above.
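To make the effect concrete, here is a toy model of a hash that mixes path metadata into the digest. It is a sketch only: `toyContentHash` is a made-up function that does not reproduce Webpack's real hashing, and the project roots and file name are assumptions.

```typescript
import { createHash } from "crypto";
import * as path from "path";

// Toy stand-in for a content hash: a digest of the module source plus
// whatever metadata strings accompany it. NOT webpack's real algorithm.
function toyContentHash(source: string, metadata: string[]): string {
  const hash = createHash("sha256");
  hash.update(source);
  for (const entry of metadata) hash.update(entry);
  return hash.digest("hex").slice(0, 8);
}

const source = "export const answer = 42;";
const roots = ["/home/me/project1", "/home/me/project2"]; // made-up locations

// Absolute paths in the metadata: the digest changes with the project root.
const absoluteHashes = roots.map((root) =>
  toyContentHash(source, [path.posix.join(root, "src/types.d.ts")])
);

// Relative paths in the metadata: the digest is identical for both roots.
const relativeHashes = roots.map((root) =>
  toyContentHash(source, [
    path.posix.relative(root, path.posix.join(root, "src/types.d.ts")),
  ])
);

console.log(absoluteHashes[0] === absoluteHashes[1]); // false
console.log(relativeHashes[0] === relativeHashes[1]); // true
```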

This is critical in some environments, for example when using AWS OpsWorks to deploy a Ruby on Rails application that uses the Webpacker gem. During a deploy, the webpack build is executed on each of the application servers where the code is deployed. Because the build happens in a timestamped folder, with multiple servers you are very likely to have at least one server whose timestamp is offset by a few milliseconds and which therefore calculates a different hash, leaving the JavaScript file unreachable if the HTML is loaded from a different server.

Such an environment is not the only one affected. If a developer using a different project root path rebuilds a project that uses ts-loader, the hashes will change, and deploying that build, even without changes in the project, will unnecessarily invalidate any existing cache layer.

@johnnyreilly
Member

Thanks for the explanations - I feel like I half understand it and half don't. I completely get that by performing builds in different locations you'll get different content hashes and I'm convinced solving that is a good idea because consistent behaviour is helpful.

What I'm less clear on is what bad side effects you're both experiencing. Maybe it helps if I talk about how I use ts-loader. I run webpack --config webpack.prod.js and pump out HTML / CSS and JS.

The CSS / JS have hashes in their file names and all of the above are copied into a content directory that's served up on a web server. That's it. The fact that the hashes would be different if I did the webpack build on the server doesn't present a problem, as the moment I've done my build it's done for all time.

It feels like your workflow is different from this and I don't quite follow what you're bumping into. Forgive me for not quite getting this so far. Being able to understand this is very important to me as I want to be aware of the different ways in which people use ts-loader in order that decisions I make around it serve the whole community.

Care to have another go at educating me? ☺️

@weiwei-lin

In our case, the project lives in a mono repo, and it's automatically built and deployed by bots.
The project will only deploy if the produced bundle is different from the last one, and the repo can be cloned into different paths when run by the bot.
Whenever there's a new commit in the mono repo, all projects get built, and the bundles/binaries that differ from the last ones get deployed.
Because ts-loader produces a different hash and different content when the project path changes, any commit in the mono repo will then trigger a deployment of the project.
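A minimal sketch of that kind of deploy gate, assuming the bot compares a digest of the new bundle with the digest of the last deployed one. The `shouldDeploy` function and the bundle strings are hypothetical; the 'abc' and 'cde' hashes echo the reproduction steps earlier in this thread.

```typescript
import { createHash } from "crypto";

const digestOf = (bundle: string): string =>
  createHash("sha256").update(bundle).digest("hex");

// Hypothetical deploy gate: redeploy only when the bundle bytes changed.
function shouldDeploy(lastDeployedDigest: string | null, bundle: string): boolean {
  return digestOf(bundle) !== lastDeployedDigest;
}

// Digest recorded after the previous deployment.
const lastDeployed = digestOf("main.abc.js:console.log(1)");

// Same commit, byte-identical output: nothing to deploy.
const sameOutput = shouldDeploy(lastDeployed, "main.abc.js:console.log(1)");

// Same commit, but the clone path leaked into the hash: a needless deploy.
const pathDependentOutput = shouldDeploy(lastDeployed, "main.cde.js:console.log(1)");

console.log(sameOutput); // false
console.log(pathDependentOutput); // true
```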

@elyalvarado
Contributor Author

In the AWS OpsWorks/Rails case:

  • Upon commit a deployment is triggered in all the application servers
  • Each application server creates a timestamped folder where it clones the repo
  • Each application server executes the build process, and because the timestamp can be different by a few seconds the resulting hash might differ between servers.
  • After the build is done, the timestamped folder is symlinked into the current folder which is the one served.
  • All app servers are behind a load balancer
  • If a request comes in, Rails generates the index.html with a script tag pointing to the hashed JavaScript file that the server has in its public folder, but if the JavaScript request goes to a different server (where the build was done with a different timestamp), then the page breaks because the browser will get a 404 for the JavaScript resource.
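
The failure mode in the list above can be modelled in a few lines. This is only a simulation under stated assumptions: the `AppServer` shape, the release folder names, and the hashed file names are all made up for illustration.

```typescript
// Each server builds in its own timestamped folder, so (before the fix) it
// may emit a differently hashed file name. All values here are made up.
interface AppServer {
  releaseDir: string;
  publicFiles: string[];
}

// The HTML points at whatever hashed file the rendering server built.
function renderScriptSrc(server: AppServer): string {
  return server.publicFiles[0];
}

// A server can only serve files that exist in its own public folder.
function serveAsset(server: AppServer, file: string): number {
  return server.publicFiles.indexOf(file) !== -1 ? 200 : 404;
}

const serverA: AppServer = {
  releaseDir: "/srv/releases/20200429021901",
  publicFiles: ["main-abc.js"],
};
const serverB: AppServer = {
  releaseDir: "/srv/releases/20200429021903",
  publicFiles: ["main-cde.js"],
};

// Server A renders the page; the load balancer may route the script
// request to either server.
const scriptSrc = renderScriptSrc(serverA);
console.log(serveAsset(serverA, scriptSrc)); // 200
console.log(serveAsset(serverB, scriptSrc)); // 404
```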

@johnnyreilly merged commit bbc6d81 into TypeStrong:master Apr 30, 2020
@johnnyreilly
Member

johnnyreilly commented Apr 30, 2020

@weiwei-lin that completely makes sense - thanks!

@elyalvarado

Thanks for the explanation. Though it sounds like this fix isn't going to resolve the issue you face?

From what you've said, it comes down to timestamps in the folder name, which (I'm assuming each application server is still going to be running the build) could still be a problem for you?

Shipping https://github.com/TypeStrong/ts-loader/releases/tag/v7.0.2 now!

berickson1 pushed a commit to berickson1/ts-loader that referenced this pull request May 22, 2020
* Make content hash consistent across machines

* Update Changelog and package.json to v7.0.2