Fix Bootstrap.get_bootstrap_from_recipes() so it's smarter and deterministic, fixes #1875 #1887

Merged
merged 1 commit into from Jul 9, 2019

Conversation

ghost

@ghost ghost commented Jun 26, 2019

This should fix Bootstrap.get_bootstrap_from_recipes() not being deterministic, which fixes #1875

@ghost ghost added the bug label Jun 26, 2019
@ghost ghost requested a review from AndreMiras June 26, 2019 07:34
@AndreMiras
Member

Thanks @Jonast for the pull request 🙏
I think @opacam was suggesting to maybe go a bit deeper than just sorting it, see his comment there #1872 (comment)
Also we may want to unit test this change

@ghost
Author

ghost commented Jun 26, 2019

@AndreMiras added it to the test! If I understand correctly, @opacam's suggestion is basically "don't return the first alphabetical one but something better". Strictly speaking, that is unrelated to the determinism issue, and it doesn't concern Bootstrap.list_bootstraps but rather calls for additional logic in Bootstrap.get_bootstrap_from_recipes. So IMHO it makes sense to add this change anyway.

["empty", "sdl2", "service_only", "webview"]
expected_bootstraps = sorted(
    ["empty", "service_only", "webview", "sdl2"]
)
Member

minor suggestion to improve the readability of this block:

listdir_ret = ["empty", "service_only", "webview", "sdl2"]
mock_listdir.return_value = listdir_ret
expected_bootstraps = sorted(listdir_ret)

That way it's clear that we're talking about the same list, once sorted and once not. Here the two lists are shuffled differently, so at first sight it isn't obvious that they hold the same entries, or why they're in a different order.


# Make sure order remains stable even if listdir() order varies:
mock_listdir.return_value = \
    ["empty", "webview", "sdl2", "service_only"]
Member

Minor: then you could use listdir_ret.reverse(), based on the suggestion above.
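
A minimal sketch of the test shape suggested in these two comments, assuming a simplified stand-in for the listing function. The `list_bootstraps` helper below and its `bootstraps_dir` parameter are hypothetical, not the project's actual code; only the list contents and the `forbidden_dirs` tuple come from the thread.

```python
import os
from unittest import mock


def list_bootstraps(bootstraps_dir):
    """Hypothetical stand-in: return bootstrap directory names, sorted."""
    forbidden_dirs = ('__pycache__', 'common')
    names = [
        name for name in os.listdir(bootstraps_dir)
        if name not in forbidden_dirs
        and os.path.isdir(os.path.join(bootstraps_dir, name))
    ]
    return sorted(names)


def check_order_is_stable():
    # One source list, so it is obvious both checks use the same entries.
    listdir_ret = ["empty", "service_only", "webview", "sdl2"]
    expected_bootstraps = sorted(listdir_ret)
    with mock.patch("os.listdir") as mock_listdir, \
            mock.patch("os.path.isdir", return_value=True):
        mock_listdir.return_value = listdir_ret
        assert list_bootstraps("bootstraps") == expected_bootstraps
        # Same entries in reverse listdir() order: result must not change.
        mock_listdir.return_value = list(reversed(listdir_ret))
        assert list_bootstraps("bootstraps") == expected_bootstraps
    return expected_bootstraps
```

Deriving both the mocked return value and the expectation from one list keeps the relationship between them visible to the reader.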

@AndreMiras
Member

Yes @Jonast I think you have a fair point, thanks

AndreMiras
AndreMiras previously approved these changes Jun 26, 2019
@opacam
Member

opacam commented Jun 26, 2019

IMHO, it does not make sense to return the results sorted in this case. I think we would only be adding useless code, so I would rather do it once and do it well (with some logic, I mean), based on the recipes. Something like:

  • if flask is in recipes, I would expect a webview bootstrap
  • if kivy is in recipes, I would expect an sdl2 bootstrap
  • for all other combinations, I would fall back to the service_only bootstrap, unless someone thinks we should add some other special match

If we apply the sorted way proposed in here, it would only lead to inconsistencies (like we have now), but if all of you want to go this sorted way, then let's do it...we can apply some kind of logic later.
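
The three rules proposed above could be sketched roughly as follows. The function name `guess_bootstrap_name` is hypothetical, and checking flask before kivy when both are present is an assumption, since the comment does not say which should win.

```python
def guess_bootstrap_name(recipe_names):
    """Pick a bootstrap name from recipe names using the proposed rules."""
    names = set(recipe_names)
    if "flask" in names:
        return "webview"
    # Assumption: flask takes precedence if both flask and kivy are listed.
    if "kivy" in names:
        return "sdl2"
    # Every other combination falls back to service_only.
    return "service_only"
```

Because the result depends only on which recipes are present, it no longer varies with directory listing order.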

@ghost
Author

ghost commented Jun 26, 2019

Yeah @opacam I can see your point, maybe list_bootstraps should actually return a set... I'll try something else hang on

@ghost ghost changed the title Fix Bootstrap.list_bootstraps() order not being deterministic, fixes #1875 [WIP] [Don't merge] Fix Bootstrap.list_bootstraps() order not being deterministic, fixes #1875 Jun 26, 2019
@ghost
Author

ghost commented Jun 26, 2019

@opacam ok I made something new which fixes Bootstrap.get_bootstrap_from_recipes, is that more like what you had in mind?

@ghost ghost changed the title [WIP] [Don't merge] Fix Bootstrap.list_bootstraps() order not being deterministic, fixes #1875 [WIP] Fix Bootstrap.get_bootstrap_from_recipes() so it's smarter and deterministic, fixes #1875 Jun 26, 2019
@AndreMiras
Member

Yes agree guys, I also thought this functionally should be a set. But then I started imagining all the refactor that would require and I thought "OK maybe not for now" 😛
Anyway if you feel like giving it a try let's do it 😄
I'll take a quick first look now and continue reviewing later

'''Find all the available bootstraps and return them.'''
forbidden_dirs = ('__pycache__', 'common')
bootstraps_dir = join(dirname(__file__), 'bootstraps')
result = set()
Member

Yes, I also thought "this should be a set". The fact that it was an iterator before made me more cautious, but looking at the code I don't see why dropping the iterator would be an issue. The result set wouldn't be anything enormous, right? So there's no risk in loading it all into memory, correct?
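
For illustration, a set-returning variant can be exercised against a temporary directory. This is only a sketch: taking `bootstraps_dir` as a parameter and the `demo` helper are assumptions made so the example is self-contained.

```python
import os
import tempfile


def list_bootstraps(bootstraps_dir):
    """Return the bootstrap directory names as a set (order is irrelevant)."""
    forbidden_dirs = ('__pycache__', 'common')
    result = set()
    for name in os.listdir(bootstraps_dir):
        if name in forbidden_dirs:
            continue
        if os.path.isdir(os.path.join(bootstraps_dir, name)):
            result.add(name)
    return result


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("sdl2", "webview", "common", "__pycache__"):
            os.mkdir(os.path.join(tmp, name))
        # A plain file must be ignored as well.
        open(os.path.join(tmp, "README"), "w").close()
        return list_bootstraps(tmp)
```

With only a handful of directory names involved, building the whole set up front costs next to nothing compared with yielding lazily.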

# Test all alternatives are listed (they won't have dependencies
# expanded since expand_dependencies() is too simplistic):
expanded_result_2 = expand_dependencies([("pysdl2", "kivy")], self.ctx)
assert([["pysdl2"], ["kivy"]] == expanded_result_2)
Member

Minor: assert is still a keyword, not a function 😄
If you like the function style you can also use self.assertEqual() or self.assertTrue(). That would also make it more consistent with the style used in this file.
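
A short sketch of why the distinction matters. Parentheses around a single expression, as in the snippet under review, are harmless; the pitfall appears once a message is added inside them, because the statement then asserts a tuple. The test class and the `run_examples` helper are hypothetical.

```python
import unittest


class AssertStyleExample(unittest.TestCase):
    def test_tuple_is_always_truthy(self):
        # assert(condition, "message") would assert a two-element tuple,
        # and a non-empty tuple is truthy even if the condition is False.
        self.assertTrue(bool((False, "message")))

    def test_assert_equal_style(self):
        expected = [["pysdl2"], ["kivy"]]
        result = [["pysdl2"], ["kivy"]]  # placeholder for a real call
        # On failure, assertEqual reports both values.
        self.assertEqual(expected, result)


def run_examples():
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(AssertStyleExample)
    outcome = unittest.TextTestRunner(verbosity=0).run(suite)
    return outcome.wasSuccessful()
```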


# Special rule: return "webview" if we depend on common web recipe:
for possible_web_dep in known_web_packages:
    if have_dependency_in_recipes(possible_web_dep):
Member

Just some thoughts without deep diving into it. So I'm not sure if we could get any better, but the time complexity of this one is concerning me.
Let me know if I'm seeing this correctly or not.
have_dependency_in_recipes() itself seems to be O(n²) because we have an "in" test inside a for loop.
Then we call have_dependency_in_recipes() itself in another for loop so that gives us a O(n³).
If by any chance dep_list is a set then the if dep in dep_list: is not O(n) anymore, but O(1). So then the overall complexity would be O(n²).
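
The set-based variant mentioned in the last sentence might look like this sketch. The helper name `has_any_dependency` and the package names in the usage note are hypothetical; the point is only that converting the dependency list to a set once makes each membership test O(1) on average.

```python
def has_any_dependency(candidates, dep_names):
    """Return True if any candidate appears among dep_names."""
    # Built once in O(n); each lookup below is then O(1) on average,
    # instead of an O(n) scan through a list.
    dep_set = set(dep_names)
    return any(candidate in dep_set for candidate in candidates)
```

For example, `has_any_dependency({"flask", "django"}, ["python3", "flask"])` returns `True` after a single pass over each collection.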

  for name in listdir(bootstraps_dir):
      if name in forbidden_dirs:
          continue
      filen = join(bootstraps_dir, name)
      if isdir(filen):
-         yield name
+         result.add(name)
+ return result

@classmethod
def get_bootstrap_from_recipes(cls, recipes, ctx):
Member

feel like this method is big now and could be split furthermore by concerns, what do you think?

Author

@ghost ghost Jun 28, 2019

sorry I don't see it 😂 (this seems to be my default response to this 😂 😂 ) the previous one you suggested I finally split was due to nesting not to length, I think "it's big" is still a somewhat poor criteria (on its own) for a split. UNLESS you want to be able to test the sub parts separately of course, although I'm not sure that is necessary...? I would personally leave it as is 🤷‍♀️

Member

@AndreMiras AndreMiras Jun 28, 2019

Yes, "big" is definitely a poor and arbitrary criterion, just like the 80-character line limit in the PEP. It probably started from a feeling and was adjusted over time and with research. That "it's big" feeling will vary from one person to another, of course.
I wanted to share the feeling before looking for solution to see maybe luckily something super clever would have come to your or other people mind 😄

Edit:
So yes took a look at the metrics quickly. Before that method was 40 lines, and at the time of my comment it was 100. Hence my surprise. More specifically if I have to scroll to see the entirety of a method, I start to feel like something is off 😛
So definitely taking the inner function out would help I think. But probably there's more that could be done

Member

Taking a quick look, I already see a possible split, since the comments and the logs say it all. For instance:
# Find out which bootstraps are acceptable:
Then a big block and then:
info('Found {} acceptable bootstraps: {}
For me that definitely feels like a split. Not only will it make things more readable:
acceptable_bootstraps = find_acceptable_bootstraps(...)
But also make it more unit testable 😉

Author

Yeah and it also obscures the control flow and introduces interdependencies and more interfaces making changes later harder. I definitely see your point but I don't think chasing that far after tests is necessarily good, so if you ask me I think this is better off as it is UNTIL we actually want to use it separately, or it reaches a length where this is more warranted. It is also not such a deeply complex and core functionality that I'd think this hurts us a lot not being tested more granularly.

But as usual, I'm happy to leave the final decision to you guys

Member

That seemed like an easy refactor to me, but OK 😢
As usual I'm nitpicking, but won't block a PR for something like this. So if you don't feel like it let's leave it for now. Maybe at some point it bugs so much that I'll do it myself.
Thanks for discussing it 👍

Author

oh no I haven't convinced you even in the slightest have I 😱 oh dear. maybe I'm wrong then, I'll split it up 👍 I mean I can see your side a little, and in the long run I feel like you had so far a slightly better grasp at writing nice code than I have

Member

That's so nice of you 😊
Thank you @Jonast !


if prioritized_acceptable_bootstraps:
    info('Using the highest ranked/first of these: {}'
         .format(prioritized_acceptable_bootstraps[0].name))
Member

Minor: the message says "of these", so I was expecting to see the list. We actually show the one being picked, which is fine, but then the log wording should change a bit.

prioritized_acceptable_bootstraps = sorted(
    list(acceptable_bootstraps),
    key=functools.cmp_to_key(bootstrap_priority)
)
Member

very interesting helper function. I understand you didn't want to expose it because it's a util function only being used in get_bootstrap_from_recipes(), but then:

  1. it makes the overall get_bootstrap_from_recipes() method larger
  2. the bootstrap_priority() helper is not unit testable

Maybe we should define it as a private top-level function in bootstrap.py? I have the feeling it would make the whole thing more readable and more testable.

for entry in recipes:
    if not isinstance(entry, (tuple, list)) or len(entry) == 1:
        if isinstance(entry, (tuple, list)):
            entry = entry[0]
Member

There's something fishy here 🐟 😄
First it doesn't seem covered, second it means expand_dependencies() would be a bit polymorphic in a way which can lead to unexpected behavior or code not so easy to follow.

Author

@ghost ghost Jun 28, 2019

What do you mean with polymorphic? This is the usual behavior of how all graph lists work, in fact I did not introduce this at all but just stuck to it as it already behaved (see the lower loop.) So no, pretty sure this is as it should be. (Dependencies can be a string or a tuple of alternatives, that is how things are used everywhere)

The coverage is a good point though, let me look into that
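
The "string or tuple of alternatives" convention described in this reply can be sketched as below. This is not the real expand_dependencies(), which is described elsewhere in this thread as too simplistic to expand the dependencies of alternatives; the helper names are hypothetical and the sketch only enumerates alternative combinations.

```python
import itertools


def as_alternatives(entry):
    """Normalize an entry to a tuple of alternatives."""
    if isinstance(entry, (tuple, list)):
        return tuple(entry)
    return (entry,)  # a plain string is a single, mandatory choice


def expand_alternatives(recipes):
    """Return one flat recipe list per combination of alternatives."""
    choices = [as_alternatives(entry) for entry in recipes]
    return [list(combo) for combo in itertools.product(*choices)]
```

With this normalization, a one-element tuple and a bare string are treated identically, which is the case the reviewed branch handles.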

Member

Thanks for the reply. Great that you will look at the coverage, who knows maybe we will come across another old bug or dead code 😄

@AndreMiras
Member

I made a first review round, your approach looks promising, thank you for giving it a try 👏
I hope I can dig deeper and also test it later this week

@opacam
Member

opacam commented Jun 28, 2019

@opacam ok I made something new which fixes Bootstrap.get_bootstrap_from_recipes, is that more like what you had in mind?

@Jonast, oh yeah, this looks far better, thanks for doing this 😄

Note: I took a superficial look and it looks good. I only miss one thing: I think we should add some documentation about this new way of finding the right bootstrap from the user's recipes. Maybe a note in doc/source/bootstraps.rst explaining this new behaviour?



def _cmp_bootstraps_by_priority(a, b):
    global default_recipe_priorities
Member

This is confusing me, why do you need the global keyword if you're not updating the reference of the object?

Author

Ohhh then I need to remove it!! I somehow thought it would be less confusing to make sure people realize it's not a local var, but I guess this is more confusing isn't it? I'll remove it 😀
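
A small sketch of the rule behind this exchange: reading a module-level name needs no declaration, while rebinding it does. The variable and function names are hypothetical.

```python
default_priorities = ["sdl2", "webview"]  # hypothetical module-level list


def read_without_global():
    # Reading (or mutating in place) a module-level name needs no `global`.
    return default_priorities[0]


def rebind_with_global(new_value):
    # Rebinding the module-level name itself is the case that needs it.
    global default_priorities
    default_priorities = new_value
    return default_priorities
```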

self.assertTrue(_cmp_bootstraps_by_priority(
    Bootstrap.get_bootstrap("service_only", self.ctx),
    Bootstrap.get_bootstrap("sdl2", self.ctx)
) < 0)
Member

Awesome thanks!
Then we can write a simple test covering the alphabetic sorting: return (a.name - b.name) # alphabetic sort for determinism
So if I understood correctly, we just need to compare two bootstraps that are not in the default_recipe_priorities list. And if we don't have any, we can just patch the list to make it empty.
Would that make sense?

Author

Ah, neat idea! I'll add that, thanks 😍

@ghost
Author

ghost commented Jul 9, 2019

Okay so I think I worked pretty much everything in (if not I probably forgot, poke me). In my tests it worked, so maybe it would be a good time to check out if this is worth integrating, after all it is quite self-contained and does some nice little local code improvements

@ghost ghost changed the title [WIP] Fix Bootstrap.get_bootstrap_from_recipes() so it's smarter and deterministic, fixes #1875 Fix Bootstrap.get_bootstrap_from_recipes() so it's smarter and deterministic, fixes #1875 Jul 9, 2019
Member

@AndreMiras AndreMiras left a comment

Nice work, thanks for your efforts and patience ❤️

@AndreMiras AndreMiras merged commit 108d49c into kivy:develop Jul 9, 2019
@ghost ghost deleted the fix_unpredictable_listbootstraps branch July 9, 2019 08:01
Successfully merging this pull request may close these issues.

get_bootstrap_from_recipes() result consistency
2 participants