Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: implement settings driven download and high level resolver customization #207

Merged
merged 4 commits into from
Jul 26, 2024

Conversation

shubhbapna
Copy link
Collaborator

@shubhbapna shubhbapna commented Jul 19, 2024

fixes #191
fixes #185
fixes #184

@shubhbapna shubhbapna force-pushed the settings-downloader branch from 6ba4bae to b1fdf38 Compare July 19, 2024 19:53
@shubhbapna shubhbapna marked this pull request as ready for review July 19, 2024 20:05
@shubhbapna shubhbapna requested a review from dhellmann July 19, 2024 20:05
@shubhbapna shubhbapna marked this pull request as draft July 19, 2024 20:12
@shubhbapna
Copy link
Collaborator Author

shubhbapna commented Jul 19, 2024

Have to update the overrides list

Edit: done

@shubhbapna shubhbapna force-pushed the settings-downloader branch from 8e8111d to c98eaf0 Compare July 19, 2024 20:29
@shubhbapna shubhbapna marked this pull request as ready for review July 19, 2024 20:30
@shubhbapna shubhbapna changed the title implement settings driven download and high level resolver customization feat: implement settings driven download and high level resolver customization Jul 19, 2024
src/fromager/settings.py Outdated Show resolved Hide resolved
src/fromager/settings.py Outdated Show resolved Hide resolved
@shubhbapna shubhbapna force-pushed the settings-downloader branch from 2364012 to c83e0d1 Compare July 22, 2024 19:51
@shubhbapna shubhbapna requested a review from dhellmann July 22, 2024 20:26
@shubhbapna shubhbapna force-pushed the settings-downloader branch 2 times, most recently from 1dfc9ce to 79bda31 Compare July 24, 2024 16:11
@tiran
Copy link
Collaborator

tiran commented Jul 25, 2024

Config parsing and handling is getting a bit convoluted. We could look into Pydantic or attrs to do some validation, typing, and nice wrapping for us. I tried dataclasses in PR #241, but that's too limited. InstructLab uses Pydantic, but that's pretty heavy weight. attrs is a bit lighter.

PS: In a future PR, let's not delay this PR longer than necessary.

@shubhbapna shubhbapna force-pushed the settings-downloader branch from 79bda31 to 0aea9f7 Compare July 25, 2024 15:47
Copy link
Member

@dhellmann dhellmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One real question and several nits for log messages.

src/fromager/overrides.py Show resolved Hide resolved
@@ -141,23 +141,35 @@ def default_download_source(
sdist_server_url: str,
) -> tuple[pathlib.Path, str, str]:
"Download the requirement and return the name of the output path."
url_template = ctx.settings.sdist_download_url(req.name)
rename_to_template = ctx.settings.sdist_local_filename(req.name)
logger.debug(f"{req.name}: loading resolver parameters from settings")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this log message will help someone debug.

Suggested change
logger.debug(f"{req.name}: loading resolver parameters from settings")

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well the hope is that it comes right before: https://github.com/python-wheel-build/fromager/pull/207/files#diff-e99a2837a7d36f9f14de5c4b54a1db688c08f148f62b8e3d5cb82937419dbca9R153

I was not able to cleanly combine these two statements hence the separate logs. Another option could be to log settings._get_package_settings and making that public

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be a philosophical difference.

I generally want to log before something that may take some time, so if it looks like the program is "stuck" a user can reason about whether maybe it's just taking a while to compute or download something.

Then I also want to log when non-default behaviors are triggered, like an override is being used (or not).

This message is coming just before a lookup that should be very fast, and the results are logged later. So I think we can drop this message entirely and then fix up the phrasing of the other one and the user will have all the info they need.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep we can drop the first log statement. If we want to fix the phrasing of the second log statement to indicate whether a value came from settings or not, then I will have to add additional checks just for that. It won't be as clean. If we want to indicate if it came from settings then I would prefer to do it they way we are discussing below

src/fromager/sources.py Outdated Show resolved Hide resolved
src/fromager/sources.py Outdated Show resolved Hide resolved
src/fromager/sources.py Outdated Show resolved Hide resolved
src/fromager/sources.py Outdated Show resolved Hide resolved
@shubhbapna
Copy link
Collaborator Author

@dhellmann regarding the logging, I think the way to go might be to log whatever settings._get_package_settings returns (so making that method public and calling it in download_source only for the purposes of logging)

@dhellmann
Copy link
Member

@dhellmann regarding the logging, I think the way to go might be to log whatever settings._get_package_settings returns (so making that method public and calling it in download_source only for the purposes of logging)

How many different places will that function be called? Are we going to log the same info a lot?

Does logging all of the package settings at once make it easier to debug than just logging the values that are being used as they are used?

@shubhbapna
Copy link
Collaborator Author

How many different places will that function be called? Are we going to log the same info a lot?

just in the default_download_source function

Does logging all of the package settings at once make it easier to debug than just logging the values that are being used as they are used?

It will just log the package setting for the particular requirement we are working with currently. It might be useful so as to not surprise the user when the see a value that they were not expecting

@dhellmann
Copy link
Member

Does logging all of the package settings at once make it easier to debug than just logging the values that are being used as they are used?

It will just log the package setting for the particular requirement we are working with currently. It might be useful so as to not surprise the user when the see a value that they were not expecting

But it would be all of the settings for that package, right?

@shubhbapna
Copy link
Collaborator Author

But it would be all of the settings for that package, right?

yes, correct

@dhellmann
Copy link
Member

But it would be all of the settings for that package, right?

yes, correct

OK, let's do that. Ask for the settings, if we get any back log something like "found package build configuration settings" and dump all of them. Then later, just report things like "doing X using value Y" and don't report each time whether it's a setting, a CLI value, or the default.

@shubhbapna shubhbapna force-pushed the settings-downloader branch from 8b65067 to 8cf56e8 Compare July 25, 2024 19:03
@shubhbapna
Copy link
Collaborator Author

Done I removed the other log statements which logged the values because the resolver function and the download_url function already log them.

I just noticed 2 things as well:

@dhellmann
Copy link
Member

Done I removed the other log statements which logged the values because the resolver function and the download_url function already log them.

I just noticed 2 things as well:

No, I don't think so. In that case we're looking at the wheel server we've been given to see if we've already built the wheel we're being asked to download. We only want to find a wheel, and we only want to look at that wheel server.

  • Do the package names being passed to the methods in settings have to be canonicalized?

Yes. The values read from the file and the values from the Requirement both need to be canonicalized before you compare them or use them as keys in a dict.

@shubhbapna shubhbapna requested a review from dhellmann July 25, 2024 19:20
@mergify mergify bot merged commit ab7749a into python-wheel-build:main Jul 26, 2024
54 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
3 participants