Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Poetry 1.5.1 poetry.lock with absence of category key #7418

Closed

Conversation

phillipuniverse
Copy link
Contributor

@phillipuniverse phillipuniverse commented Jun 9, 2023

Follow-up to the original Poetry 1.5 merge at #7350 and the smoke test updates at dependabot/smoke-tests#75. Removal of the category key happened in the 1.5.0 release at https://github.com/python-poetry/poetry/releases/tag/1.5.0.

Fixes #7389
Fixes #7641

@phillipuniverse phillipuniverse requested a review from a team as a code owner June 9, 2023 14:02
@phillipuniverse
Copy link
Contributor Author

phillipuniverse commented Jun 9, 2023

Another thought I had looking at this. In requirements.txt we specify that we want to install a flexible version of Poetry:

But Dependabot isn't actually operating against any version other than 1.5.1, so this requirements.txt is a little misleading. I think it would make more sense to have this line be:

poetry==1.5.1

To make it clear which version Dependabot is operating with. Then update PRs like #7350 don't need to widen, they just need to update this explicit version.

Thoughts?

@phillipuniverse
Copy link
Contributor Author

phillipuniverse commented Jun 9, 2023

The test failures are what I was afraid of. Looks like Dependabot has special logic on the category key existing in the poetry.lock file and Poetry has removed that functionality:

def subdep_type
category =
TomlRB.parse(lockfile.content).fetch("package", []).
find { |dets| normalise(dets.fetch("name")) == dependency.name }.
fetch("category")
category == "dev" ? "dev-dependencies" : "dependencies"
end

So I think that the current Dependabot code will not work with a Poetry 1.5 lockfile and more needs to be done to make Dependabot compatible.

EDIT - looks like more details on the category removal were reported at #7389

@phillipuniverse
Copy link
Contributor Author

I’m not exactly sure how this needs to change. It seems like this was designed to work in the absence of a pyproject.toml file, but now the dependency group information is only managed there. Should we move the check to just look at pyproject.toml instead?

@shooit shooit mentioned this pull request Jun 14, 2023
1 task
@phillipuniverse phillipuniverse changed the title Update spec lock files to match Poetry 1.5.1 format Support Poetry 1.5.1 poetry.lock with absence of category key Jun 21, 2023
@phillipuniverse
Copy link
Contributor Author

I looked a bit more into this and we are butting up against multiple tech debt issues here with Poetry changing things. Since the category is removed from poetry.lock, the only way we can obtain this information is via pyproject.toml. The current code paths I believe allow only a poetry.lock, so making changes to that core assumption could break other workflows.

Determining the category from pyproject.toml is also a little tricky considering older Poetry versions. Older versions of Poetry (< 1.2.0) had you specify dependencies as dev-dependencies, whereas Poetry > 1.2.0 uses an arbitrary group.

Here is a good example that illustrates the different ways to specify dependencies (dev-dependencies is deprecated but not removed in Poetry yet):

[tool.poetry]
name = "PoetryGroups"
version = "2.0.0"
homepage = "https://github.com/roghu/py3_projects"
license = "MIT"
readme = "README.md"
authors = ["Dependabot <support@dependabot.com>"]
description = "Various small python projects."
[tool.poetry.dependencies]
python = ">=3.10"
[tool.poetry.dev-dependencies]
black = "^22.12.0"
[tool.poetry.group.dev.dependencies]
pytest = "*"
[tool.poetry.group.docs.dependencies]
sphinx = "6.1.3"

Support for group dependencies was added in this PR but works around the issue by continuing to rely on the category in poetry.lock. That will also need to be reworked a bit to rely on pyproject.toml.

Finally, there is some core GitHub integration at play to be able to parse out dependencies in the 'Insights' tab. This was called out at #6659 (comment) and something might have to change there depending on the reliance on poetry.lock to get through those insights.


This PR needs additional guidance from the Dependabot maintainers for how to proceed.

@pavera
Copy link
Contributor

pavera commented Jul 6, 2023

As far as the requirements.txt specification the range is there to allow support under python 3.6. If dependabot detects that a given repository is using python 3.6 it will install a python 3.6 virtualenv and then pip install the requirements.txt. The result is that on python 3.6 poetry 1.1.15 will be installed. We are hoping to remove this soonish as we look to remove python 3.6 support.

As to the issue with the category key being removed, the dilemma is that we can only really support 1 version of the lockfile currently. If we update our code to support the 1.5 lockfile format users still on < 1.5 will see updates that remove the category key and locally their tooling will add the key back in. That being said, we would like to move in the direction of supporting the latest package manager version more quickly, even if that means breaking backwards compatibility. With that in mind, if you could take a stab at moving the category detection to using pyproject.toml on this PR I think that would be the preferred direction from our side.

@phillipuniverse
Copy link
Contributor Author

@pavera thanks for the additional information! Specifically the Python 3.6 nuance, I had not considered how that worked but it makes perfect sense now how someone would get Poetry 1.1.15.

With that in mind, if you could take a stab at moving the category detection to using pyproject.toml on this PR I think that would be the preferred direction from our side.

Definitely, I think that is the overall more stable approach. I can take this on.

One clarifying question - how does Dependabot generally support package managers that have many categories? Since we were relying on the 'category' behavior before this treated all group dependencies as "dev" dependencies. Do we want to keep that behavior or is there a better way to mark different group dependencies as different categories in the Dependabot language?

@phillipuniverse
Copy link
Contributor Author

phillipuniverse commented Jul 20, 2023

Started in on a candidate solution and ran into a problem - transitive dependencies. Transitive dependencies are in the poetry.lock file but are not in the pyproject.toml file (by definition). I think in this case we'll need to traverse up the dependency tree until we find a dependency within a group in pyproject.toml.

@jonathangreen
Copy link

jonathangreen commented Jul 20, 2023

@phillipuniverse I think the right solution is exactly what you are proposing, to actually work out the category from the transitive dependencies. However, maybe a first pass solution to this is to put all transitive dependencies into the main category, with a further todo for more sophisticated transitive dependency resolution?

According to the comment here on the poetry PR that removed the category key, defaulting to main in the absence of a category key is what poetry itself is doing.

@allanlewis
Copy link

If you want to know which category a dependency is in, could poetry export help? I'm thinking you could export the prod and dev dependencies and see which are only dev.

@phillipuniverse
Copy link
Contributor Author

phillipuniverse commented Jul 22, 2023

@jonathangreen thanks for additional info. I pushed up a relatively naive solution as you suggested, simply checking the existence of the dependency in pyproject.toml. If we didn't find it (meaning, transitive) I defaulted treating them as main dependencies, as indicated by the PR you linked.

As an aside, I think once the Python 3.6 (and potentially Python 3.7?) support is dropped in #7610, it's a good opportunity to look at the dependency category handling overall for Poetry support. There is still a lot of reliance on dev-dependencies (even in my latest changes) and the "group" support was bolted on. Since it's such an easy migration to go from dev-dependencies -> group.dev.dependencies, I think it would simplify some code paths to only rely on groups.

Remaining TODOs on this PR

  • Fix the few failing tests
  • Clean up the changes to all the .lock files that I rewrote for Poetry 1.5, I'll put this back where they were
  • Make sure we have tests around a 1.5 lock file. I'm thinking I will modify the pyproject.toml added to test group support at
    context "with group dependencies" do
    let(:pyproject_fixture_name) { "poetry_group_dependencies.toml" }
    subject(:dependency_names) { dependencies.map(&:name) }
    it "includes dev-dependencies and group.dev.dependencies" do
    expect(dependency_names).to include("black")
    expect(dependency_names).to include("pytest")
    end
    it "includes other group dependencies" do
    expect(dependency_names).to include("sphinx")
    end
    end
    end
    , generate a 1.5 lockfile from here and do verifications in the updator

The code as written is ready to review, I'm not planning on making many modifications. I'm pretty new to Ruby so I would appreciate readability suggestions!

Comment on lines 139 to 141
# This is how Poetry treats transitive dev dependencies, see discussion at https://github.com/python-poetry/poetry/pull/7637#issuecomment-1494272266
# and https://github.com/dependabot/dependabot-core/pull/7418#issuecomment-1644012926
is_expected.to eq([{ production: true }])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jonathangreen FYI this test was failing when there was no category key with the changes I made to put everything in the "main" dependencies. It looks like that even though from your comment in Poetry they mix it all together, we had a test that ensured this behavior.

Options:

  1. Finish out the search process where we walk the dependency tree trying to figure out if a transitive dependency belongs in dev or not, preserving this test case
  2. Have a divergence between when there is or is not a category key. What this looks like to me is that Poetry used to put all the transitive dependencies in the correct category, which is why the version with the category key marked it as production: false
  3. Ignore the category key altogether and move all dependencies into production: true. In the current code changes, if there is a category key in the metadata I use it. If not, only then do I go and check pyproject.toml

I think (3) is the most consistent with how Poetry treats the situation and is more forward-looking since Poetry has decided to ignore the category altogether.

Thoughts?

Copy link
Contributor Author

@phillipuniverse phillipuniverse Aug 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the latest review from @bdragon, option (2) is the expected behavior and this test should not have changed. Reverting.

Small nuance. For transitive dependencies locked with Poetry 1.5+, those should be treated as "production" dependencies, but if there is a category key then treat as that category.

Copy link
Member

@bdragon bdragon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@phillipuniverse thanks for raising this!

Here is my understanding:

  • if the dependency has a category key in the lockfile, it was generated with Poetry <v1.5 and we use its value to determine the dependency group
  • if the dependency is in the main dependency group (i.e., tool.poetry.dependencies) in pyproject.toml, it stays that way
  • if the dependency is in the [legacy] dev dependency group (i.e., tool.poetry.dev-dependencies) in pyproject.toml, it stays that way
  • if the dependency is included in a dependency group other than the main one (e.g., tool.poetry.group.test.dependencies) in pyproject.toml, it goes in the dev dependencies group
  • if the dependency is transitive, it goes in the main dependencies group

tool.poetry.dev-dependencies has been deprecated since Poetry v1.2 and tool.poetry.group.dev.dependencies is preferred, but this is a separate concern.

Based on the conversation here, the changes in python-poetry/poetry#7637, and the Poetry docs—and given that we need to straddle Poetry versions 1.1 and 1.5—I think the solution proposed here looks pretty good.

I think we'd prefer to avoid traversing the dependency tree for every transitive dependency if possible, so defaulting to dependencies seems reasonable.

I went ahead and updated this branch with the latest main. It looks like you have one failing spec—if you don't mind getting that to pass, it'd be great to see this merged. Thanks!

@jeffwidman
Copy link
Member

jeffwidman commented Aug 4, 2023

and given that we need to straddle Poetry versions 1.1 and 1.5

We may not need to do that once we drop support for Python 3.6/3.7...

I am planning to look at this once August 17th rolls around and I can merge my long string of Python PR's... Can we hold off on merging this until then?

There's a broader discussion we're starting to have internally about how old of versions we'll promise to support for package managers. So we'll also be trying to get some clarity there on how much we choose to support multiple versions vs telling users "we're not going to support older than N versions of poetry because it's not hard to update to the newer version of poetry, please do that instead"... again, a broader conversation that we're working through internally.

So let's just put this on pause til August 17th and then re-open the discussion.

@G-Rath
Copy link

G-Rath commented Aug 4, 2023

So let's just put this on pause til August 17th and then re-open the discussion.

keeping in mind that without this dependabot just flat out falls over on new poetry.lock files - we've already had a few patch updates on our projects that've needed to be done manually due to this; is it possible get something out to fix the current hard error in parallel to your longer term discussions?

@phillipuniverse
Copy link
Contributor Author

phillipuniverse commented Aug 4, 2023

@G-Rath yup - @jeffwidman today, all automated security updates in GitHub fail for projects that use Poetry >=1.5 without this PR. This has been exacerbated by Dependabot being updated to use Poetry 1.5 when emitting normal, non-security-related updates.

Now that I have a bit more clarity on the final state, I can address the feedback that @bdragon suggested, we can fix the breakage in the GitHub/Dependabot ecosystem that exists today, then go for broader sweeping changes in tandem with the Python 3.8+ change.

@jeffwidman does that change your mind at all or still think we need to wait?

@jeffwidman
Copy link
Member

Thanks for the feedback. I didn't realize this was breaking everything for you folks, that sucks.

Yeah, given that I'd also lean towards shipping this sooner, and then later cleaning up any cruft it introduces for supporting older versions of poetry.

Let me take a deeper look at this on Monday before committing for sure though.

@jeffwidman
Copy link
Member

FYI: I haven't forgotten, just got buried yesterday. My head is still deep in a different section of code but as soon as I wrap that up I'll be switching to coming up to speed on poetry and looking at this.

Copy link
Contributor Author

@phillipuniverse phillipuniverse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Latest changes has tests around the 1.5 lockfile and should be compatible with changes requested from @bdragon

@@ -298,12 +298,32 @@ def update_dependency_requirement(toml_node, requirement)
end

def subdep_type
Copy link
Contributor Author

@phillipuniverse phillipuniverse Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just happened to find this function unexpectedly searching around for subdep_type. Looks like this logic was always shared with the file updater, but it was small enough such that it made more sense to copy it vs extract into a utility function.

Since the logic for determining a "subdep_type" is getting larger, maybe it makes sense to refactor to a separate function? Another set of options:

  1. Refactor into a utility function and call from both places. Opinion on where this should go?
  2. Remove this modification since it's not specifically tackling the core issue and we are trying to be pretty targeted per other conversations
  3. Keep as-is

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You forgot to pass poetry_object around, and a real update seems to be still failing because of that. So this needs to be fixed before merging, and also signals some lack of coverage here.

I'll look into a good place for factoring out this method.

Copy link
Contributor

@deivid-rodriguez deivid-rodriguez Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I proposed #7826 to refactor this. We can rebase and adapt this PR to it, and should be all good.

@deivid-rodriguez
Copy link
Contributor

We just drop old Python support, and will now unconditionally run Poetry 1.5.

However, I guess we'll need to keep the compatibility code in case we find an old lockfile anyways? We may not generate an ideal diff but that's an existing problem and I'm not sure completely dropping support for lockfiles generated with old Poetry versions would be an improvement. So I'd lean towards keeping it?

@deivid-rodriguez
Copy link
Contributor

Ok, so in the end I decided to go with a different approach at #7834, which I find cleaner and I believe should work better.

@phillipuniverse I'm trying to add attribution to the commit but it just won't pick it up. Not sure what I'm doing wrong.

@phillipuniverse
Copy link
Contributor Author

We may not generate an ideal diff but that's an existing problem and I'm not sure completely dropping support for lockfiles generated with old Poetry versions would be an improvement. So I'd lean towards keeping it?

Hm maybe... in the current situation, Dependabot will read the old version and then rewrite with 1.5+ anyway. Seems about the same as just erroring out with a message about only supporting Poetry 1.5+ 🤷

Ok, so in the end I decided to go with a different approach at #7834, which I find cleaner and I believe should work better.

Definitely agree. I looked it over and this looks to be a more sustainable solution than what I had here. I really like the poetry show --only main part of it, very simple solution for discovering dev vs non-dev.

I'm trying to add attribution to the commit but it just won't pick it up. Not sure what I'm doing wrong.

@deivid-rodriguez I appreciate the attempt at attribution but honestly, nbd on my side. I'm just excited for my automated security updates to start working again!!

Thanks for the care and attention! I'm going to go ahead and close this in favor of #7834.

@phillipuniverse phillipuniverse deleted the poetry-1.5-lockfile branch August 18, 2023 03:44
@deivid-rodriguez
Copy link
Contributor

Hm maybe... in the current situation, Dependabot will read the old version and then rewrite with 1.5+ anyway. Seems about the same as just erroring out with a message about only supporting Poetry 1.5+ 🤷

That's a good point, not necessarily better to keep opening PRs changing the lockfile format to 1.5+. However, that's an existing issue so I prefer to keep any fix focused on adding support 1.5+ lockfile format but keep the current behavior other than that. I also have the feeling that many users are aware of this issue but may be using Dependabot PRs as a way to upgrade their Poetry version to stop the format from changing back and forth.

Definitely agree. I looked it over and this looks to be a more sustainable solution than what I had here. I really like the poetry show --only main part of it, very simple solution for discovering dev vs non-dev.

Thanks for the kind words!

@deivid-rodriguez I appreciate the attempt at attribution but honestly, nbd on my side. I'm just excited for my automated security updates to start working again!!

That's great, I'm still confused since it's the first time adding Co-authored-by: information to a commit message is not picked up by GitHub 🤷.

@jeffwidman
Copy link
Member

👋 I just want to say thanks again to everyone who chimed in on this issue providing additional context. I spent my afternoon reading poetry docs and learning it, and also understanding the issue here, and it was super helpful to have all the pointers everyone provided to historical context.

I'm deploying #7834 shortly, and as long as that looks green I'll merge it to main. So your poetry :dependabot: PR's are hopefully fixed by the time you read this.

@phillipuniverse
Copy link
Contributor Author

@deivid-rodriguez @jeffwidman - just wanted to report that thanks to you guys, all of my automated security updates have started working again!! Thank you so much!!

@jeffwidman
Copy link
Member

Awesome thanks for circling back to confirm!

@esev
Copy link

esev commented Aug 19, 2023

Same here. Thank you!

Though I noticed one odd indirect dependency update where a package was downgraded. I hadn't seen that before, so just checking, is that expected?
pywemo/pywemo@d2d03e8 (rstcheck: 6.1.2 -> 3.5.0)

Edit: A few more cases

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants