Stuck in Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #348

dblock · 2015-02-08T12:35:52Z

Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #<Moped::Cluster:128953180 @seeds=[<Moped::Node resolved_address="10.95.128.244:27017">, <Moped::Node resolved_address="10.184.156.102:27017">]>

…avity/ruby/2.0.0/gems/moped-2.0.3/lib/moped/cluster.rb: 254:in `with_primary'
…ty/ruby/2.0.0/gems/moped-2.0.3/lib/moped/collection.rb: 124:in `insert'
…by/2.0.0/gems/mongoid-4.0.0/lib/mongoid/query_cache.rb: 117:in `insert_with_clear_cache'
…ems/mongoid-4.0.0/lib/mongoid/persistable/creatable.rb:  79:in `insert_as_root'

Occasionally we see a machine or two stuck in this. I am not sure when this happens, but about 10% of nodes end up in this state every 24 hours. The MongoDB cluster is doing fine.

This issue could probably use more detail; please tell me what to look for next time I have a machine in this state.

wandenberg · 2015-02-13T23:22:52Z

Hi @dblock, could you check whether the code in #352 solves this problem?

niedfelj · 2015-02-20T15:56:52Z

@dblock Please see my PR #338
We were having these errors too, and I'm guessing that you are actually having a pool saturation problem and not primary node connection issues. In general, the logging in mongoid is pretty terrible. Are you running Puma? And have you tuned pool_size and pool_timeout?

steve-rodriguez · 2015-02-20T16:12:21Z

How do you go about tuning those? How do you know what to set them to? Are there guidelines?

niedfelj · 2015-02-20T16:30:27Z

In general, you should have a pool_size that is equal to or greater than the number of threads you are running. You shouldn't need to tune pool_timeout. Here is an update submitted to mongoid for generating the mongoid.yml that gives more details on those configs:

https://github.com/mongoid/mongoid/pull/3883/files
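
As a rough illustration of the rule of thumb above (pool_size at least equal to the number of threads), here is a minimal Ruby sketch. The helper names `recommended_pool_size` and `pool_saturated?` and the `headroom` parameter are hypothetical and are not part of Mongoid or Moped; they only encode the comparison described in this comment.

```ruby
# Hypothetical helpers (not Mongoid/Moped API): they only encode the
# "pool_size >= number of threads" guideline from this thread.

# Suggest a pool size for one process running `thread_count` threads.
# `headroom` is an assumed optional safety margin.
def recommended_pool_size(thread_count, headroom: 0)
  raise ArgumentError, "thread_count must be positive" unless thread_count > 0
  thread_count + headroom
end

# True when the pool has fewer connections than threads, so some
# threads may have to wait for a connection under load.
def pool_saturated?(pool_size:, thread_count:)
  pool_size < thread_count
end

puts recommended_pool_size(16)                        # 16 threads, no margin
puts pool_saturated?(pool_size: 5, thread_count: 16)  # pool smaller than thread count
```

Each process has its own pool, so the comparison is per process, not across the whole fleet.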

niedfelj · 2015-02-20T16:31:17Z

These PRs might also be useful to you; they add more/better logging in error situations and give per-request metrics in Rails:

https://github.com/mongoid/mongoid/pull/3885
https://github.com/mongoid/mongoid/pull/3884

dblock · 2015-02-20T17:30:14Z

#352 has so far been good to us in production (72 hours), so I'd say it has improved things.

fedenusy · 2015-02-23T16:57:32Z

I'm seeing this error as well.

ajsharp · 2015-03-08T03:23:45Z

+1. We see this a couple of times per day, seemingly on a random basis.

wnkz · 2015-03-10T09:16:54Z

+1 also seeing this.

InvisibleMan · 2015-03-13T21:38:06Z

I think Moped also has incorrect thread-safety code.

#353 (comment)

ajsharp · 2015-03-13T21:59:21Z

Interesting. Does anyone see this behavior with unicorn? I've seen it with puma (threads), but don't have anything in production with unicorn.

Wondering if switching the app server to unicorn might be an easy "fix", because it seems like the real fix could take a bit of time.

ajsharp · 2015-03-13T22:00:10Z

@arthurnn any thoughts on this issue?

InvisibleMan · 2015-03-16T09:33:37Z

I'm using the sidekiq gem, so I have no choice.

glebtv · 2015-03-21T14:13:05Z

I just spent 20 minutes debugging an issue with this error message, and I found that when calling .find(nil) in moped it results in this (incorrect) error message.

> session[:test].find(nil).first
Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #<Moped::Cluster:69729780 @seeds=[<Moped::Node resolved_address="127.0.0.1:27017">]>

Whereas without arguments it's ok:

> session[:test].find().first
=> nil

The expected error message would be something along the lines of InvalidFind.
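
Until the library reports this more clearly, one possible workaround is to validate the selector before it reaches Moped. The sketch below is an assumption, not Moped behavior: `checked_find` is a hypothetical wrapper, and it relies only on the collection responding to `find(selector)`, as `session[:test]` does in the console example above.

```ruby
# Hypothetical guard (not part of Moped): fail fast with a clear error
# when the selector is not a Hash, instead of letting a nil selector
# surface later as a misleading ConnectionFailure.
def checked_find(collection, selector = {})
  unless selector.is_a?(Hash)
    raise ArgumentError, "selector must be a Hash, got #{selector.inspect}"
  end
  collection.find(selector)
end

# Usage, assuming `session` is a Moped session as in the example above:
#   checked_find(session[:test]).first        # same as session[:test].find.first
#   checked_find(session[:test], nil).first   # raises ArgumentError
```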

sahin · 2015-06-06T14:36:17Z

+1

wandenberg · 2015-06-06T19:03:32Z

Who is still having this problem and can help me set up an environment and describe how to reproduce it?

sahin · 2015-06-07T13:51:14Z

+1 @wandenberg, we still have this problem in production. It is simple to reproduce: shut down one of the servers in the replica set or close its port.

nofxx · 2015-06-07T23:10:34Z

+1. Monkey-patching to increase POOL_SIZE seems to give more time between errors.
Also, it looks like sidekiq is playing a major role.
I have 90 sidekiq workers across 3 servers, plus 10 or so unicorns. I still don't get the pool size of 5...

davidleroy · 2015-07-01T16:09:32Z

+1

brand-it · 2015-07-01T16:52:05Z

👍

mhuggins · 2015-10-09T17:41:27Z

We're seeing this error crop up in some sidekiq jobs.

chenqiangzhishen · 2016-07-26T10:56:35Z

+1, still see the issue

Moped::Errors::ConnectionFailure

Could not connect to a primary node for replica set #<Moped::Cluster:50526920 @seeds=[<Moped::Node resolved_address="10.23.84.206:27018">, <Moped::Node resolved_address="10.23.84.207:27018">]>

traceback

vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/cluster.rb:254:in `with_primary'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/read_preference/primary.rb:55:in `block in with_node'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:30:in `call'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:30:in `with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:39:in `rescue in with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:29:in `with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:39:in `rescue in with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/retryable.rb:29:in `with_retry'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/read_preference/primary.rb:54:in `with_node'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/cursor.rb:139:in `load_docs'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/query_cache.rb:234:in `block in load_docs'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/query_cache.rb:135:in `with_cache'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/query_cache.rb:234:in `load_docs'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/cursor.rb:28:in `each'
vendor/bundle/ruby/gems/moped-2.0.7/lib/moped/query.rb:78:in `each'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/contextual/mongo.rb:122:in `each'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/contextual.rb:20:in `each'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:107:in `entries'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:107:in `from_database'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:75:in `multiple_from_db'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:19:in `execute_or_raise'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/criteria/findable.rb:40:in `find'
vendor/bundle/ruby/gems/mongoid-4.0.2/lib/mongoid/findable.rb:90:in `find'
....
....

dennislysenko · 2016-11-30T16:03:29Z

How do you properly set up a moped pool if not using mongoid? Here is how I'm doing it, and still occasionally getting these errors:

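require "connection_pool"
require "moped"

# Note: connection_pool's timeout is given in seconds.
$mongo_pool = ConnectionPool.new(size: 30, timeout: 3000) do
  mongo_client = Moped::Session.new(Moped::Uri.new(uri_string).hosts)
  mongo_client.use(dbname)
  mongo_client # return the session explicitly rather than the result of #use
end

# have one main one open
$mongo = Moped::Session.new(Moped::Uri.new(uri_string).hosts)
$mongo.use(dbname)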

where uri_string is in the format: mongodb://1.2.3.4:27017/desired_db_name

Might end up just dropping moped as I'm not even using mongoid and that seems to be the biggest use/support case :/

elenatanasoiu · 2017-03-29T13:47:43Z

It could be that mongo is not running. Have you tried:

sudo rm /var/lib/mongodb/mongod.lock
sudo service mongodb start

deepthawtz · 2017-04-18T19:49:21Z

@elenatanasoiu the problem is that mongod is running and replica set is healthy but these error messages crop up nevertheless

bastoune · 2017-08-01T09:52:03Z

Hey, did you find a solution?

shivamv · 2017-09-11T18:15:44Z

@bastoune we used sidekiq for background jobs and puma. Both are multithreaded, supporting 25 and 16 threads by default.
Mongoid's default pool size is 5, so evidently there were situations where the pool got exhausted, resulting in:
Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set #<Moped::Cluster:1223353180 @seeds=[<Moped::Node resolved_address="xx.xxx.xxx.xxx:27017">, <Moped::Node resolved_address="xx.xxx.xxx.xxx:27017">]>

Fixed it by tuning the pool size and the sidekiq + puma thread counts.
Here is an article about SQL databases, though I suppose it clarifies the fundamentals
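
To make the arithmetic behind pool exhaustion concrete, here is a toy Ruby model. It is only a sketch: a plain `Queue` stands in for the connection pool, the function name `exhausted_checkouts` is made up for illustration, and it assumes every worker tries to check out a connection before any connection is returned. It does not reproduce how Moped or Mongoid actually manage their pools.

```ruby
require "thread"

# Toy model only: a Queue plays the role of the connection pool.
# Returns how many of `concurrent_workers` find the pool empty when all
# of them try to check out a connection before any is checked back in.
def exhausted_checkouts(pool_size, concurrent_workers)
  pool = Queue.new
  pool_size.times { |i| pool << "conn-#{i}" }

  concurrent_workers.times.count do
    begin
      pool.pop(true) # non-blocking checkout; raises ThreadError when empty
      false
    rescue ThreadError
      true
    end
  end
end

puts exhausted_checkouts(5, 25)  # => 20 (25 sidekiq threads, pool of 5)
puts exhausted_checkouts(5, 16)  # => 11 (16 puma threads, pool of 5)
puts exhausted_checkouts(25, 25) # => 0
```

In a real pool the extra workers would wait up to the pool timeout rather than fail immediately, but the shortfall is the same.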

bastoune · 2017-09-28T14:42:31Z

@shivamv Thanks for the reply, going to spend more time to understand this ;)

yanghoxom · 2018-08-27T03:26:21Z

@shivamv thanks, that's helpful. Maybe somebody forgot to turn on the Docker container that has mongoid inside?

dblock mentioned this issue Feb 8, 2015

Moped is utilizing 'bad' Connections from the connection Pool #346

Open

fedenusy mentioned this issue Mar 3, 2015

Writes fail with ConnectionPool::PoolShuttingDownError #345

Closed

nofxx mentioned this issue Jun 7, 2015

ConnectionPool::PoolShuttingDownError #378

Open

teambundledore mentioned this issue Oct 10, 2017

Stuck in Moped::Errors::ConnectionFailure: Could not connect to a primary node for replica set AFDC/Platinum#144

Open
