Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adds logic to remove http query params from destination file name #273

Merged

Conversation

semmet95
Copy link
Contributor

@semmet95 semmet95 commented Jul 28, 2023

Summary

This PR adds changes to ensure that downloaded http type dependency artifacts are given a filename that does not include the query parameters that might be present in the corresponding URI.

  • parse the uri using net/url's Parse function
  • get the clean uri without the query parameters using the Path property
  • use this clean uri to get the name for the downloaded artifact

Use Cases

fixes #274

Checklist

  • I have viewed, signed, and submitted the Contributor License Agreement.
  • I have linked issue(s) that this PR should close using keywords or the Github UI (See docs)
  • I have added an integration test, if necessary.
  • I have reviewed the styleguide for guidance on my code quality.
  • I'm happy with the commit history on this PR (I have rebased/squashed as needed).

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Jul 28, 2023

CLA Signed

The committers listed above are authorized under a signed CLA.

@dmikusa dmikusa added type:bug A general bug semver:patch A change requiring a patch version bump labels Jul 28, 2023
@semmet95 semmet95 marked this pull request as ready for review July 29, 2023 14:20
@semmet95 semmet95 requested a review from a team as a code owner July 29, 2023 14:20
@dmikusa
Copy link
Contributor

dmikusa commented Aug 4, 2023

Hi, I took a look. A couple of things on this:

  1. It needs test coverage. There might be existing tests covering HTTP and file downloads, but we need to make sure there's one also with query params and confirm that the expected file name is produced.

  2. Thinking through test scenarios, I also worry about the case where you have a URL like https://www.example.com/download.php?give-me=something. If we throw out the query params, you could have collisions with downloads and that wouldn't be good. It might make more sense to just sha256 hash the full URL and use that for the file name. I don't like that it creates meaningless file names, but it will ensure a unique file name for the download URL. Is this a change you'd be willing to make?

@semmet95
Copy link
Contributor Author

semmet95 commented Aug 4, 2023

Thanks for the suggestions @dmikusa
I'll look into them and update the PR accordingly

@semmet95
Copy link
Contributor Author

semmet95 commented Aug 4, 2023

I checked just to be sure that the apache-tomcat buildpack doesn't need the file name to have the proper extension and even with the sha256 as its name the buildpack managed to download and extract the tomcat configuration tar file.
We are working on the suggested changes now.

@semmet95
Copy link
Contributor Author

semmet95 commented Aug 4, 2023

@dmikusa pushed the changes. Please let me know if any improvements can be made.

@semmet95 semmet95 marked this pull request as draft August 18, 2023 08:42
@semmet95 semmet95 marked this pull request as ready for review August 18, 2023 08:43
@pranavek
Copy link

@dmikusa The changes are implemented as per your suggestions. Could you take a look at it?

NB: Some Github workflows are in Waiting for status state.

@anthonydahanne anthonydahanne force-pushed the fix/filename-remove-query-params branch from db6cda3 to 6a1c758 Compare September 4, 2023 16:08
@anthonydahanne
Copy link
Member

I've allowed the action to run and I rebased on main.

Looks good to me, @dmikusa WDYT?

dependency_cache.go Outdated Show resolved Hide resolved
@semmet95 semmet95 requested a review from dmikusa September 7, 2023 16:26
Signed-off-by: Jayashree O <jaishu138@gmail.com>
…ponding test

Signed-off-by: Amit Singh <singhamitch@outlook.com>
…or http(s) uri

Signed-off-by: Amit Singh <singhamitch@outlook.com>
Signed-off-by: Amit Singh <singhamitch@outlook.com>
@anthonydahanne anthonydahanne force-pushed the fix/filename-remove-query-params branch from f5b426f to aaf573f Compare September 14, 2023 12:44
@dmikusa dmikusa merged commit fda98b4 into paketo-buildpacks:main Sep 15, 2023
@dmikusa
Copy link
Contributor

dmikusa commented Sep 15, 2023

Thanks for this PR & sticking with it. If you're interested, we'll need to port this to the release-2.x branch as well.

@c0d1ngm0nk3y
Copy link
Contributor

Some feedback: This change was breaking. I wonder why this was brought into v1. Just admitting it in the release notes is not the best user experience. At least I do not read the release notes carefully for each minor bump.

@dmikusa
Copy link
Contributor

dmikusa commented Sep 29, 2023

@c0d1ngm0nk3y Apologizes. We try to screen out any breaking changes, but this one slipped through. We wanted to get this into 1.x because it impacted users. In hindsight, we'd have held it for 2.x.

@c0d1ngm0nk3y
Copy link
Contributor

@dmikusa But doesn't make it clear that some integration test was missing. I mean it was not the case that one very specific edge case was broken, but offline buildpacks, didn't work in general, right? It feels to me that this should be tested somehow.

What wonders me more is not the fact that this was overseen - mistakes happen, that normal imho. But rather the reaction. Knowing that this is breaking but still keep the code and not reverting the change was strange to me.

My POV: I discovered that the integration test for one of our use cases broke. After some debugging, I found out that it is nothing we changed, but that offline buildpacks do not work in general. When checking the release notes of create-package, I saw that this problem is known and there there is even a notice buried in the release notes of a minor patch. The experience was not the best and it felt a bit unnecessary to me.

@dmikusa
Copy link
Contributor

dmikusa commented Oct 2, 2023

But doesn't make it clear that some integration test was missing.

100%. We have stories to add integration testing to the composite buildpacks, but the work is incomplete. We'd be happy to work with you on that, if it's something you'd like to contribute.

What wonders me more is not the fact that this was overseen - mistakes happen, that normal imho. But rather the reaction. Knowing that this is breaking but still keep the code and not reverting the change was strange to me.

What you're seeing is absolute transparency and us working through this problem in real-time. We could have taken the bug report, sat on this for a week or so while we investigated the full extent of the problem, made a decision, and then finally communicated that to the community. My approach for the community is to communicate early and often. There are some drawbacks to that because we're working through the issue in real-time, but I think the benefit of being transparent with everything outweighs the approach of keeping things private until they've been fully sorted out.

That said. here's the latest update. We're reevaluating this. Another bug case has popped up. Previously, the decision was made to try and keep moving forward because we wanted to be able to retain the original fix that #273 provided. Additional failure scenarios have popped up. We're presently discussing those, the potential impact on the user base, and the work required to resolve them. We'll keep folks updated when a course of action has been decided upon.

The recommendation is still to continue using the older version of create-package and libpak. This should avoid issues because both are using the pre-#273 code.

@pivotal-david-osullivan
Copy link
Contributor

Having investigated the bug raised in #288 as well as further failures seen, we've decided to revert the change in this PR to minimise impact and avoid introducing other issues by attempting to patch the new logic in the short term.

Hopefully the problem scenario faced in #274 can be resolved in v2 of Libpak (#287)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
semver:patch A change requiring a patch version bump type:bug A general bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Avoid including http query parameters in downloaded (http) artifact's filename
6 participants