Fix Bootstrap.get_bootstrap_from_recipes() so it's smarter and deterministic, fixes #1875 #1887

ghost · 2019-06-26T07:33:59Z

This should fix Bootstrap.get_bootstrap_from_recipes() not being deterministic, which fixes #1875

AndreMiras · 2019-06-26T08:25:43Z

Thanks @Jonast for the pull request 🙏
I think @opacam was suggesting to maybe go a bit deeper than just sorting it, see his comment there #1872 (comment)
Also we may want to unit test this change

ghost · 2019-06-26T08:48:13Z

@AndreMiras added it to the test! If I understand @opacam 's suggestion basically is "don't return the first alphabetical one but something better", but this is strictly speaking not related to the determinism issue and also something that doesn't concern Bootstrap.list_bootstraps but rather should be additional logic in Bootstrap.get_bootstrap_from_recipes. So IMHO it makes sense to add this change anyway

AndreMiras · 2019-06-26T10:28:15Z

tests/test_bootstrap.py

+            ["empty", "sdl2", "service_only", "webview"]
+        expected_bootstraps = sorted(
+            ["empty", "service_only", "webview", "sdl2"]
+        )


minor suggestion to improve the readability of this block:

listdir_ret = ["empty", "service_only", "webview", "sdl2"] mock_listdir.return_value = listdir_ret expected_bootstraps = sorted(listdir_ret)

That way we make it clear that we're talking about the same list once sorted once not. Because here you somehow shuffled your lists and at first sight we don't see they're the same or why they're shuffled.

AndreMiras · 2019-06-26T10:29:22Z

tests/test_bootstrap.py

+
+        # Make sure order remains stable even if listdir() order varies:
+        mock_listdir.return_value = \
+            ["empty", "webview", "sdl2", "service_only"]


Minor: then you could use listdir_ret.reverse() base don the suggestion above

AndreMiras · 2019-06-26T10:30:49Z

Yes @Jonast I think you have a fair point, thanks

opacam · 2019-06-26T11:00:25Z

IMHO, It does not make sense to me to return the results sorted in this case. I think that we will only add useless code...so I would rather do it once and well done (with some logic I mean), based on recipes, something like:

if flask in recipes I would expect an webview bootstrap
if kivy in recipes I would expect an sdl2 bootstrap
all the other combinations I would make it go to service_only bootstrap...unless someone thinks about that we should do some other special match

If we apply the sorted way proposed in here, it would only lead to inconsistencies (like we have now), but if all of you want to go this sorted way, then let's do it...we can apply some kind of logic later.

ghost · 2019-06-26T16:56:16Z

Yeah @opacam I can see your point, maybe list_bootstraps should actually return a set... I'll try something else hang on

ghost · 2019-06-26T19:13:49Z

@opacam ok I made something new which fixes Bootstrap.get_bootstrap_from_recipes, is that more like what you had in mind?

AndreMiras · 2019-06-26T19:43:33Z

Yes agree guys, I also thought this functionally should be a set. But then I started imagining all the refactor that would require and I thought "OK maybe not for now" 😛
Anyway if you feel like giving it a try let's do it 😄
I'll take a quick first look now and continue reviewing later

AndreMiras · 2019-06-26T19:49:27Z

pythonforandroid/bootstrap.py

        '''Find all the available bootstraps and return them.'''
        forbidden_dirs = ('__pycache__', 'common')
        bootstraps_dir = join(dirname(__file__), 'bootstraps')
+        result = set()


Yes also thought "this should be a set". But the fact that it was an iterator before made me more cautious. But seeing the code I'm not seeing why not being an iterator would be an issue. The result set would be anything enormous, right, so no risk to load it on the stack, correct?

AndreMiras · 2019-06-26T19:53:36Z

tests/test_bootstrap.py

+        # Test all alternatives are listed (they won't have dependencies
+        # expanded since expand_dependencies() is too simplistic):
+        expanded_result_2 = expand_dependencies([("pysdl2", "kivy")], self.ctx)
+        assert([["pysdl2"], ["kivy"]] == expanded_result_2)


Minor: assert is still a keyword, not a function 😄
If you like the function stile you can also use the self.assertEqual() or self.assertTrue(). That would also make it more consistent with the style used in this file

tests/test_bootstrap.py

AndreMiras · 2019-06-26T20:31:41Z

pythonforandroid/bootstrap.py

+
+        # Special rule: return "webview" if we depend on common web recipe:
+        for possible_web_dep in known_web_packages:
+            if have_dependency_in_recipes(possible_web_dep):


Just some thoughts without deep diving into it. So I'm not sure if we could get any better, but the time complexity of this one is concerning me.
Let me know if I'm seeing this correctly or not.
have_dependency_in_recipes() itself seems O(n²) because we have a in in a for loop.
Then we call have_dependency_in_recipes() itself in another for loop so that gives us a O(n³).
If by any chance dep_list is a set then the if dep in dep_list: is not O(n) anymore, but O(1). So then the overall complexity would be O(n²).

pythonforandroid/bootstrap.py

AndreMiras · 2019-06-26T20:37:54Z

pythonforandroid/bootstrap.py

        for name in listdir(bootstraps_dir):
            if name in forbidden_dirs:
                continue
            filen = join(bootstraps_dir, name)
            if isdir(filen):
-                yield name
+                result.add(name)
+        return result

    @classmethod
    def get_bootstrap_from_recipes(cls, recipes, ctx):


feel like this method is big now and could be split furthermore by concerns, what do you think?

sorry I don't see it 😂 (this seems to be my default response to this 😂 😂 ) the previous one you suggested I finally split was due to nesting not to length, I think "it's big" is still a somewhat poor criteria (on its own) for a split. UNLESS you want to be able to test the sub parts separately of course, although I'm not sure that is necessary...? I would personally leave it as is 🤷‍♀️

Yes definitely big is a poor and arbitrary criteria, just like the 80 char lines of the pep. It probably started from some feelings and adjusted over time/researches. That "it's big" feeling would vary from one to another of course.
I wanted to share the feeling before looking for solution to see maybe luckily something super clever would have come to your or other people mind 😄

Edit:
So yes took a look at the metrics quickly. Before that method was 40 lines, and at the time of my comment it was 100. Hence my surprise. More specifically if I have to scroll to see the entirety of a method, I start to feel like something is off 😛
So definitely taking the inner function out would help I think. But probably there's more that could be done

Then see taking a quick look I already see some possible split since the comments and the logs say it all, for instance:
# Find out which bootstraps are acceptable:
Then a big block and then:
info('Found {} acceptable bootstraps: {}
For me that definitely feels like a split. Not only it will make things more readable:
acceptable_bootstraps = find_acceptable_bootstraps(...)
But also make it more unit testable 😉

Yeah and it also obscures the control flow and introduces interdependencies and more interfaces making changes later harder. I definitely see your point but I don't think chasing that far after tests is necessarily good, so if you ask me I think this is better off as it is UNTIL we actually want to use it separately, or it reaches a length where this is more warranted. It is also not such a deeply complex and core functionality that I'd think this hurts us a lot not being tested more granularly.

But as usual, I'm happy to leave the final decision to you guys

That's seemed like an easy refactor to me, but OK 😢
As usual I'm nitpicking, but won't block a PR for something like this. So if you don't feel like it let's leave it for now. Maybe at some point it bugs so much that I'll do it myself.
Thanks for discussing it 👍

oh no I haven't convinced you even in the slightest have I 😱 oh dear. maybe I'm wrong then, I'll split it up 👍 I mean I can see your side a little, and in the long run I feel like you had so far a slightly better grasp at writing nice code than I have

That's so nice of you 😊
Thank you @Jonast !

AndreMiras · 2019-06-26T20:45:25Z

pythonforandroid/bootstrap.py

+
+        if prioritized_acceptable_bootstraps:
+            info('Using the highest ranked/first of these: {}'
+                 .format(prioritized_acceptable_bootstraps[0].name))


Minor: The message say of these so I was expecting to see the list. But we actually show the one being picked up, which is fine, but then the log wording should change a bit

AndreMiras · 2019-06-26T20:49:13Z

pythonforandroid/bootstrap.py

+        prioritized_acceptable_bootstraps = sorted(
+            list(acceptable_bootstraps),
+            key=functools.cmp_to_key(bootstrap_priority)
+        )


very interesting helper function. I understand you didn't want to expose it because it's a util function only being used in get_bootstrap_from_recipes(), but then:

it makes the overall get_bootstrap_from_recipes() method larger

the bootstrap_priority() helper is not unit testable

Maybe we should define it as a private top function inside the file bootstrap.py file? I have the feeling it would make the overall thing more readable and more testable

AndreMiras · 2019-06-26T21:00:05Z

pythonforandroid/bootstrap.py

+    for entry in recipes:
+        if not isinstance(entry, (tuple, list)) or len(entry) == 1:
+            if isinstance(entry, (tuple, list)):
+                entry = entry[0]


There's something fishy here 🐟 😄
First it doesn't seem covered, second it means expand_dependencies() would be a bit polymorphic in a way which can lead to unexpected behavior or code not so easy to follow.

What do you mean with polymorphic? This is the usual behavior of how all graph lists work, in fact I did not introduce this at all but just stuck to it as it already behaved (see the lower loop.) So no, pretty sure this is as it should be. (Dependencies can be a string or a tuple of alternatives, that is how things are used everywhere)

The coverage is a good point though, let me look into that

Thanks for the reply. Great that you will look at the coverage, who knows maybe we will come across another old bug or dead code 😄

AndreMiras · 2019-06-26T21:03:51Z

I made a first review round, your approach looks promising, thank you for giving it a try 👏
I hope I can dig deeper and also test it later this week

opacam · 2019-06-28T08:48:58Z

@opacam ok I made something new which fixes Bootstrap.get_bootstrap_from_recipes, is that more like what you had in mind?

@Jonast, oh yeah, this looks far better, thanks for doing this 😄

Note: I have took a superficial look and it looks good, I only miss one thing, I think that maybe we should add some documentation about this new method to find the right bootstrap via user recipes...maybe a note in doc/source/bootstraps.rst explaining this new behaviour?

AndreMiras · 2019-06-29T17:11:35Z

pythonforandroid/bootstrap.py

+
+
+def _cmp_bootstraps_by_priority(a, b):
+    global default_recipe_priorities


This is confusing me, why do you need the global keyword if you're not updating the reference of the object?

Ohhh then I need to remove it!! I somehow thought it would be less confusing to make sure people realize it's not a local var, but I guess this is more confusing isn't it? I'll remove it 😀

AndreMiras · 2019-06-29T17:18:26Z

tests/test_bootstrap.py

+        self.assertTrue(_cmp_bootstraps_by_priority(
+            Bootstrap.get_bootstrap("service_only", self.ctx),
+            Bootstrap.get_bootstrap("sdl2", self.ctx)
+        ) < 0)


Awesome thanks!
Then we can write a simple test for covering the alphabetic sorting return (a.name - b.name) # alphabetic sort for determinism
So if basically if I understood correctly, we just need to comp two bootstraps that're not in default_recipe_priorities list. And if we don't have any, we can just patch the list to make it empty.
Would that make sense?

Ah, neat idea! I'll add that, thanks 😍

ghost · 2019-07-09T02:23:40Z

Okay so I think I worked pretty much everything in (if not I probably forgot, poke me). In my tests it worked, so maybe it would be a good time to check out if this is worth integrating, after all it is quite self-contained and does some nice little local code improvements

AndreMiras

Nice work, thanks for your efforts and patience ❤️

ghost added the bug label Jun 26, 2019

ghost requested a review from AndreMiras June 26, 2019 07:34