Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Postgrex.Notifications does not receive notifications from hosted PostgreSQL DB on Azure #375

Closed
petersenlance opened this issue Mar 26, 2018 · 18 comments

Comments

@petersenlance
Copy link

We are using Microsoft Azure with our app running on VMs and our PostgreSQL DB on a separate VM. Using Postgrex.Notifications we are able to emit events from our DB and respond accordingly in our app. We have been exploring moving to a hosted DB in Azure, but in our testing we have been unable to receive any events through Postgrex.Notifications.

Using the psql client, I have verified that our database is emitting the pg_notify events and I have confirmed that Postgrex.Notifications.start_link and Postgrex.Notifications.listen return the {:ok, _} tuples that are expected.

@josevalim
Copy link
Member

Thanks @petersenlance for the heads up!

We don't have resources to look into this, so I would recommend you to dig deeper and let us know of anything you find or if we can help with something.

@petersenlance
Copy link
Author

Thanks for the response, @josevalim. We'll keep working on this and let you know if we make progress.

If you do end up having some resources and it would help, we could set up a dummy hosted DB through our Azure account and DM you the credentials for you to connect to it for testing.

@fishcakez
Copy link
Member

Could this be duplicate of #359 ?

@petersenlance
Copy link
Author

It is definitely connected, since #359 was opened by one of my co-workers. However, that issue was different. In that case, we had a VM with our own PostgreSQL DB set up. When our app first connected to our DB we would receive notifications through Postgrex.Notifications.

We are looking into moving to a hosted solution with Azure, but when we connect to the hosted DB we never receive notifications.

@josevalim
Copy link
Member

Hi @petersenlance, any news here?

Which PGSQL version are you running in production? If you can create a minimal app that reproduces the issue by connecting to Azure, I can take a look at it. Ideally the app would:

  1. have a process sending notifications through azure
  2. have a process receiving those notifications through azure

Since the app may have private credentials, it is ok for it to be private. You can give it access to my GH account.

@slashdotdash
Copy link

@josevalim I've encountered the exact same issue with a Postgres (v9.6) database hosted in Azure.

Running the postgrex notification tests, via a fork of this repo and by adding this commit with a hard-coded connection to an Azure hosted Postgres instance, results in the following failures:

$ mix test test/notification_test.exs 

  1) test listening, notify, then receive (using registered names) (NotificationTest)
     test/notification_test.exs:51
     No message matching {:notification, ^receiver_pid, ^ref, "channel", ""} after 1000ms.
     The following variables were pinned:
       receiver_pid = #PID<0.189.0>
       ref = #Reference<0.1943451657.2336751618.90446>
     The process mailbox is empty.
     code: assert_receive {:notification, ^receiver_pid, ^ref, "channel", ""}
     stacktrace:
       test/notification_test.exs:58: (test)

  2) test listening x2, unlistening, notify, receive (NotificationTest)
     test/notification_test.exs:70
     No message matching {:notification, ^pid, ^ref1, "channel", ""} after 1000ms.
     The following variables were pinned:
       pid = #PID<0.192.0>
       ref1 = #Reference<0.1943451657.2336751618.90560>
     The process mailbox is empty.
     code: assert_receive {:notification, ^pid, ^ref1, "channel", ""}, 1_000
     stacktrace:
       test/notification_test.exs:80: (test)
.
  3) test listening, notify, then receive (without payload) (NotificationTest)
     test/notification_test.exs:43
     No message matching {:notification, ^receiver_pid, ^ref, "channel", ""} after 1000ms.
     The following variables were pinned:
       receiver_pid = #PID<0.199.0>
       ref = #Reference<0.1943451657.2336751618.90753>
     The process mailbox is empty.
     code: assert_receive {:notification, ^receiver_pid, ^ref, "channel", ""}
     stacktrace:
       test/notification_test.exs:48: (test)
.
  4) test listening, notify, then receive (with payload) (NotificationTest)
     test/notification_test.exs:35
     No message matching {:notification, ^receiver_pid, ^ref, "channel", "hello"} after 1000ms.
     The following variables were pinned:
       receiver_pid = #PID<0.205.0>
       ref = #Reference<0.1943451657.2336751618.90946>
     The process mailbox is empty.
     code: assert_receive {:notification, ^receiver_pid, ^ref, "channel", "hello"}
     stacktrace:
       test/notification_test.exs:40: (test)
...

Finished in 62.6 seconds
9 tests, 4 failures

I've looked through the Postgres server parameters that are available to configure in the Azure portal but cannot see anything that is noticeably different than on my local Postgres instance which is working fine.

@josevalim
Copy link
Member

@slashdotdash can you ping me on IRC or somewhere where we can chat? :)

@josevalim
Copy link
Member

Basically, I cannot try your commit out because of "(invalid_authorization_specification) no pg_hba.conf entry for host "188.146.238.184", user "postgrex", database "postgrex_test", SSL on". If you can share that privately it would be welcome. :)

@slashdotdash
Copy link

@josevalim I've just sent you an email.

@josevalim
Copy link
Member

Thanks @slashdotdash!

I was able to reproduce the issue.

I am thinking this is an issue with Azure. If the connection that is listening is idle, then the notification is never received. However, if you issue commands in the connection, then the notifications do arrive.

Take the following test:

  test "listening, notify, then receive (without payload)", context do
    assert {:ok, ref} = PN.listen(context.pid_ps, "channel")

    assert {:ok, %Postgrex.Result{command: :notify}} = P.query(context.pid, "NOTIFY channel", [])
    receiver_pid = context.pid_ps
    Process.sleep(1000)
    # _ = PN.listen(context.pid_ps, "channel2")
    assert_receive {:notification, ^receiver_pid, ^ref, "channel", ""}
  end

If you uncomment the line above, then the test pass, meaning that you receive the notification as long as you are writing to the connection, which defeats the whole purpose of LISTEN/NOTIFY.

@timbuchwaldt
Copy link

Hey!
We had the same issue and triggered @slashdotdash as it occurred in Commanded.

We raised the issue with Azure support, current status being:

I replicated the behavior outside the Elixir environment.
I engaged the Azure Database for PostgreSQL engineering team to analyze the problem.

and

The Azure Database for PostgreSQL engineering team is investigating the issue.
I will provide an update as soon as possible.

@petersenlance
Copy link
Author

@slashdotdash I'm glad you were you able to set up something that reproduces this.

We have been working with Azure support. They pointed me at tcp_keepalives (https://www.postgresql.org/docs/9.5/static/runtime-config-connection.html and https://stackoverflow.com/questions/20856599/mysql-ping-hangs-with-amazon-rds) but I have not had a chance to test this out yet.

@josevalim
Copy link
Member

@petersenlance not sure if keep alive will make a difference because those things are happening in a matter of miliseconds and we still do not receive any data.

@michalmuskala
Copy link
Member

Do we know anything more here? From reading the discussion it looks like the problem is at the azure side not ours? @timbuchwaldt @petersenlance could you confirm this?

@SkipMike
Copy link

SkipMike commented Jun 9, 2018

So, is there a build of postgrex out there that will we can try to use to solve our problems?

@josevalim
Copy link
Member

@SkipMike if you use the master branch, we have added periodic pings. you can change the ping frequency and that will specify how frequent you receive messages. That may workaround the issue but I still think the behaviour is broken on the Azure side of things.

@SkipMike
Copy link

SkipMike commented Jun 9, 2018

@josevalim Thanks. we'll try it out.

@josevalim
Copy link
Member

Closing this as we believe this issue is fixed in master. We hope a new release will be out in the weeks. Thanks!

mbuhot added a commit to mbuhot/ecto_job that referenced this issue Oct 4, 2018
This is primarily to support EctoJob in Azure where pg_notify is known to not work.
elixir-ecto/postgrex#375

Additionally, this change ensures that event if there is no demand,
poll messages still activate scheduled / expired jobs.
mbuhot added a commit to mbuhot/ecto_job that referenced this issue Oct 4, 2018
This is primarily to support EctoJob in Azure where pg_notify is known to not work.
elixir-ecto/postgrex#375

Additionally, this change ensures that even if there is no demand,
poll messages still activate scheduled / expired jobs.
mbuhot added a commit to mbuhot/ecto_job that referenced this issue Oct 4, 2018
This is primarily to support EctoJob in Azure where pg_notify is known to not work.
elixir-ecto/postgrex#375

Additionally, this change ensures that even if there is no demand,
poll messages still activate scheduled / expired jobs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants