Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent SQS from defaulting to localhost hostname #733

Merged
merged 2 commits into from
May 5, 2017
Merged

Prevent SQS from defaulting to localhost hostname #733

merged 2 commits into from
May 5, 2017

Conversation

alukach
Copy link
Contributor

@alukach alukach commented May 4, 2017

Commit cdbfe9a64e1884dc99fc9806a834f23cf8398a3b added tooling to retrieve the hostname from the SQS Transport. If no hostname is provided (for example, if you only configure an app with broker_transport = 'sqs'), the hostname is set to a default value from the Transport baseclass' default_connection_params property. This is problematic, as the default is 'localhost', which is likely not correct for an SQS queue and results in failed message retrieval. Interestingly, a Celery worker is still able to successfully start, retrieve the list of queues, and retrieve messages being that Celery provides the Transport a default url value of None.

This PR omits hostname from the SQS Transport's default_connection_params property, thereby letting Boto3 use its default logic if no hostname is manually provided to the SQS Transport.

@alukach
Copy link
Contributor Author

alukach commented May 4, 2017

Pinging the author of the first commit, @georgepsarakis.

@georgepsarakis
Copy link
Contributor

@alukach thanks, perhaps I missed this case. Would you mind expanding the unit tests, probably here ?

@georgepsarakis
Copy link
Contributor

Also, do you think this is actually breaking backwards compatibility?

@alukach
Copy link
Contributor Author

alukach commented May 4, 2017

@georgepsarakis Looking into this bug, I've realized that the true issue is that when the ConnectionPool generates new connections (by cloning the connection instance originally sent to the pool from celery.app.connection_for_read()), the NoneType value is being overwritten by the hostname value from SQS.Transport.default_connection_params. I think the test best exemplifies the core of the issue. This tests fails on master.

While this could technically break backwards functionality, I don't think this should be an issue as:

  • I can't think of any reason why a user would want a boto3 connection to localhost
  • Cloned connections maintaining the same hostname as its parent seems to be completely expected behaviour, hard to imagine people assuming otherwise

@georgepsarakis
Copy link
Contributor

I was mostly asking if you think that this will cause an issue for anyone that uses SQS and upgrading to the upcoming release of 4.0.3. If yes, then perhaps we should include this fix as well.

@alukach
Copy link
Contributor Author

alukach commented May 5, 2017

@georgepsarakis Ah, I do think that this is an issue introduced by the commit referenced in the PR's description which would break anyone's system that doesn't explicitly define the SQS endpoint (which, I'd be surprised if anyone actually does specify the endpoint rather than simply specifying the region).

Admittedly, I'm really only looking at this from the perspective of code from #693, I would recommend against deploying any release that includes #693 and #715 without this PR.

@georgepsarakis
Copy link
Contributor

@alukach thanks for your efforts! @thedrow @auvipy do you think we can merge just this fix for the 4.0.3 release?

@georgepsarakis
Copy link
Contributor

By the way, @alukach if you happen to have any comments for this Pull Request, they will be more than welcome 😃 !

Copy link
Member

@auvipy auvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a release note is also needed

@alukach
Copy link
Contributor Author

alukach commented May 5, 2017

@auvipy is that something I could help out with or are those done in-house? Would it be a new entry in the CHANGELOG?

@auvipy
Copy link
Member

auvipy commented May 5, 2017

yes

@auvipy
Copy link
Member

auvipy commented May 5, 2017

from now on we shouldn't merged any pr without release note [if there is any]

@alukach
Copy link
Contributor Author

alukach commented May 5, 2017

@auvipy The last entry in the Changelog is:

.. _version-4.0.2:

4.0.2

:release-date: 2016-12-15 03:31 P.M PST
:release-by: Ask Solem

  • Now depends on :mod:amqp 2.1.4

    This new version takes advantage of TCP Keepalive settings on Linux,
    making it better at detecting closed connections, also in failover
    conditions.

  • Redis: Priority was reversed so, e.g. priority 0 became priority 9.

.. _version-4.0.1:

I'm assuming you're suggesting adding an entry for 4.0.3. This PR is a bug fix for and issue raised on master. It feels strange to report both the creation and resolution of a bug within the same release (we may as well just pretend it never happened). There may be other features that will go into 4.0.3, however I think organizing and recording those changes is a bit out of scope for this PR.

@auvipy auvipy merged commit 25a9e76 into celery:master May 5, 2017
@georgepsarakis
Copy link
Contributor

@auvipy @alukach thanks 😄 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants