Use alpine equivalent of Docker Compose services to reduce size and startup time #4329

mschwager · 2018-07-20T15:55:17Z

Hi all,

This is my first PR against warehouse! Let me know if anything is out of order.

To reduce the size of dependencies pulled in for Docker Compose services I've changed the image's to be their Alpine equivalent. This should save something like ~400MB when downloading the images:

$ docker images | grep -E "postgres|redis|rabbitmq" | grep -Ev "<none>|latest" | sort
postgres                                        10.1-alpine          e6c5e6a76255        6 months ago        38.2MB
postgres                                        10.1                 ec61d13c8566        7 months ago        287MB
rabbitmq                                        3.7-alpine           23917a8e0e82        9 days ago          48MB
rabbitmq                                        3.7                  dc1744b237f9        3 days ago          125MB
redis                                           4.0-alpine           80581db8c700        9 days ago          28.6MB
redis                                           4.0                  f06a5773f01e        3 days ago          83.4MB

One thing I noticed, though, is a couple of tests fail after the move from postgres:10.1 -> postgres:10.1-alpine. Perhaps this is just an oddity with my dev environment, so I'll be curious to see if the TravisCI tests fail. Here are the failures in question:

============================================== FAILURES ===============================================
____________________________________ TestProjectList.test_no_query ____________________________________

self = <tests.unit.admin.views.test_projects.TestProjectList object at 0x7f4697e716a0>
db_request = <pyramid.testing.DummyRequest object at 0x7f4694c8f1d0>

    def test_no_query(self, db_request):
        projects = sorted(
            [ProjectFactory.create() for _ in range(30)],
            key=lambda p: p.normalized_name,
        )
        result = views.project_list(db_request)
    
>       assert result == {"projects": projects[:25], "query": None}
E       AssertionError: assert {'projects': ...'query': None} == {'projects': [...'query': None}
E         Common items:
E         {'query': None}
E         Differing items:
E         {'projects': <paginate.Page: Page 1/2>} != {'projects': [Project(name='buGkrZztOVdJ'), Project(name='CUSRSHwyrWYS'), Project(name='DQYmKgwGbDWN'), Project(name='EZSwnoLaZDBP'), Project(name='fGYqTjDlvGzs'), Project(name='fhRNAXlQsSPb'), ...]}
E         Full diff:
E         - {'projects': <paginate.Page: Page 1/2>, 'query': None}
E         + {'projects': [Project(name='buGkrZztOVdJ'),
E         +               Project(name='CUSRSHwyrWYS'),
E         +               Project(name='DQYmKgwGbDWN'),
E         +               Project(name='EZSwnoLaZDBP'),
E         +               Project(name='fGYqTjDlvGzs'),
E         +               Project(name='fhRNAXlQsSPb'),
E         +               Project(name='GLinWsOkojkz'),
E         +               Project(name='gOrRfJDhcYfq'),
E         +               Project(name='HEFdmcDNAyrG'),
E         +               Project(name='hGINavLJVZma'),
E         +               Project(name='hIenNBvaBTCF'),
E         +               Project(name='lioPBgmNzTGj'),
E         +               Project(name='mglUjgSIumKu'),
E         +               Project(name='MNwUhtYjNBTX'),
E         +               Project(name='MPPpqZvLqrwG'),
E         +               Project(name='mPVlJqlpJXWO'),
E         +               Project(name='mzpiabwWpmkw'),
E         +               Project(name='oMDIktqhGEqJ'),
E         +               Project(name='OTEYViBAKZfk'),
E         +               Project(name='PzpfZMoMsAgf'),
E         +               Project(name='QWKRstFHveKi'),
E         +               Project(name='SelaTfofhpsB'),
E         +               Project(name='tKRYkXvtVlDW'),
E         +               Project(name='tXgrnOXiYWKc'),
E         +               Project(name='tYlRgnNjegiv')],
E         +  'query': None}

tests/unit/admin/views/test_projects.py:38
___________________________________ TestProjectList.test_with_page ____________________________________

self = <tests.unit.admin.views.test_projects.TestProjectList object at 0x7f468f899ac8>
db_request = <pyramid.testing.DummyRequest object at 0x7f468f899d68>

    def test_with_page(self, db_request):
        projects = sorted(
            [ProjectFactory.create() for _ in range(30)],
            key=lambda p: p.normalized_name,
        )
        db_request.GET["page"] = "2"
        result = views.project_list(db_request)
    
>       assert result == {"projects": projects[25:], "query": None}
E       AssertionError: assert {'projects': ...'query': None} == {'projects': [...'query': None}
E         Common items:
E         {'query': None}
E         Differing items:
E         {'projects': <paginate.Page: Page 2/2>} != {'projects': [Project(name='UYEuGqrPAsdI'), Project(name='VGjnNowaWLFS'), Project(name='VJSpPMJWvdha'), Project(name='wBPbUpitYyCJ'), Project(name='WSuUehNTmuFp')]}
E         Full diff:
E         - {'projects': <paginate.Page: Page 2/2>, 'query': None}
E         + {'projects': [Project(name='UYEuGqrPAsdI'),
E         +               Project(name='VGjnNowaWLFS'),
E         +               Project(name='VJSpPMJWvdha'),
E         +               Project(name='wBPbUpitYyCJ'),
E         +               Project(name='WSuUehNTmuFp')],
E         +  'query': None}

tests/unit/admin/views/test_projects.py:48: AssertionError
================================= 2 failed, 27 passed in 7.38 seconds =================================

I'm especially confused because tests/unit/admin/views/test_users.py has very similar tests, but those pass.

Thoughts?

…tartup time

ewdurbin · 2018-07-20T16:09:12Z

I get the same test failures locally when applying this diff, super bizarre.

We don't run test suites in Docker for CI, though I believe we could on travis. It may be worth investigating this.

ewdurbin · 2018-07-20T16:13:26Z

Seems there are a handful of issues with this interchange noted, see docker-library/postgres#276 and related issues for some sparse details about an inconsistency with musl (alpine's libc implementation).

mschwager · 2018-07-20T16:25:44Z

Huh, interesting. It seems reasonable that there can be slight differences between postgres:10.1 and postgres:10.1-alpine, however, the weird part to me is that tests/unit/admin/views/test_users.py:TestUsersList.test_no_query continues to pass while tests/unit/admin/views/test_projects.py:TestProjectList.test_no_query fails.

mschwager · 2018-07-20T17:48:11Z

I found it! If I change key=lambda p: p.normalized_name to key=lambda p: p.name in test_no_query and test_with_page it fixes the tests. I noticed a discrepancy where user_list orders on username and sorts on username, but project_list orders on name and sorts on normalized_name.

I still don't fully understand why this fixes it since test_basic_query and test_wildcard_query aren't failing, but seem to have the same issue.

dstufft · 2018-07-20T19:10:47Z

I'm going to guess that this comes down to differences in locale.

mschwager · 2018-07-20T19:40:42Z

I can change the lambdas to use p.name instead of p.normalized_name to fix the tests, but I'm not necessarily sure that's what we want? At least, not without understanding the problem better. Do you have any thoughts or ideas on where to look next?

ewdurbin · 2018-07-20T23:41:33Z

musl is sufficiently different from libc that for databases in particular it seems unwise to diverge too far from what we will see in CI (travis/ubuntu) and production (AWS RDS, AWS Elasticache, CloudAMQP). All of these targets are (currently) libc based.

As detailed in https://wiki.musl-libc.org/functional-differences-from-glibc.html, locales aren’t supported which is the likely reason for the weird difference.

I appreciate and understand the advantages of smaller images and faster startup times, but am concerned that these images will cause more headaches down the line than they will alleviate.

Given that, I’m -1 on this PR.

mschwager · 2018-07-23T15:55:27Z

Thanks for the additional resources, this is all starting to make a lot more sense to me. I'm willing to concede this PR since I agree that keeping the CI and production environments similar is a worthwhile goal, but I'd like to consider one more take on these changes.

To dig a bit deeper into what's actually breaking, let's take a look at what postgres is returning before and after.

Before:

ipdb> pprint.pprint(list(result["projects"]))
[Project(name='BIMBlgoOkHka'),
 Project(name='bMmmSpgHmHIh'),
 Project(name='byAXNNAdlSqO'),
 Project(name='cBqRuPRfikfn'),
 Project(name='CvRGYyPGJKAl'),
 ...]

After:

ipdb> pprint.pprint(list(result["projects"]))
[Project(name='CJeowjOyVFBg'),
 Project(name='EKxpPcfLjHOf'),
 ...
 Project(name='aFCsScRCyzKS'),
 Project(name='dFMoMkuHlAtk'),
 Project(name='dUvxzRgRVYFQ')]

As you've both correctly deduced, the locale is affecting the sort order returned by postgres. Before, the sort order was solely based on the letter, regardless of case. After, the sort order first sorted upper case then sorted lower case letters. I'm not sure which is preferred from a UX perspective, I'd guess that, for example, users would expect the former in the case of a search box. However, if we ignore UX for a minute, I'd argue that the tests may be testing the wrong thing.

As mentioned earlier, we could change tests/unit/admin/views/test_users.py:TestUsersList.test_no_query to key sorted off of p.name instead of p.normalized_name. I'd argue that this would be more correct since the project_list results are ordered by name. That would fix the tests failing from this PR, but perhaps that still doesn't matter due to the CI vs. production environments issue. Maybe updating the tests is worthwhile in and of itself? Although, the most correct solution is probably something involving matching postgres' locale using locale when sorting, but that's probably more trouble than it's worth.

Thoughts?

dstufft · 2018-07-23T17:17:59Z

The name field is a case insensitive field, so I think that's why it generally sorts case insensitively.

dstufft · 2018-07-28T02:08:39Z

Thanks for this PR, however after thinking it over, I'm going to agree with @ewdurbin here and say we don't want to diverge this much from our production deployment. While we can't get 1:1 parity with our production deployment, the more differences we get, the more likely we are to miss errors until they get into production.

Thanks again, and sorry that we couldn't accept it!

Use alpine equivalent of Docker Compose services to reduce size and s…

a4d3f43

…tartup time

dstufft closed this Jul 28, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Use alpine equivalent of Docker Compose services to reduce size and startup time #4329

Use alpine equivalent of Docker Compose services to reduce size and startup time #4329

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

ewdurbin commented Jul 20, 2018

Uh oh!

ewdurbin commented Jul 20, 2018 •

edited

Loading

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

dstufft commented Jul 20, 2018

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

ewdurbin commented Jul 20, 2018 •

edited

Loading

Uh oh!

mschwager commented Jul 23, 2018 •

edited

Loading

Uh oh!

dstufft commented Jul 23, 2018

Uh oh!

dstufft commented Jul 28, 2018

Uh oh!

Uh oh!

Use alpine equivalent of Docker Compose services to reduce size and startup time #4329

Use alpine equivalent of Docker Compose services to reduce size and startup time #4329

Uh oh!

Conversation

mschwager commented Jul 20, 2018

Uh oh!

ewdurbin commented Jul 20, 2018

Uh oh!

ewdurbin commented Jul 20, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

dstufft commented Jul 20, 2018

Uh oh!

mschwager commented Jul 20, 2018

Uh oh!

ewdurbin commented Jul 20, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

mschwager commented Jul 23, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dstufft commented Jul 23, 2018

Uh oh!

dstufft commented Jul 28, 2018

Uh oh!

Uh oh!

ewdurbin commented Jul 20, 2018 •

edited

Loading

ewdurbin commented Jul 20, 2018 •

edited

Loading

mschwager commented Jul 23, 2018 •

edited

Loading