
esclient error: No Living connections #17

Closed
vesameskanen opened this issue May 2, 2016 · 11 comments

@vesameskanen

It seems that in some environments, the Pelias dbclient fails to add part of the imported data to the ES index. The error shows up after importing about 1 million documents. The exact error message is:

2016-05-02T12:03:14.304Z - error: [dbclient] esclient error Error: No Living connections
at sendReqWithConnection (/root/.pelias/openaddresses/node_modules/pelias-dbclient/node_modules/elasticsearch/src/lib/transport.js:211:15)
at next (/root/.pelias/openaddresses/node_modules/pelias-dbclient/node_modules/elasticsearch/src/lib/connection_pool.js:213:7)
at process._tickCallback (node.js:355:11)
2016-05-02T12:03:14.304Z - error: [dbclient] invalid resp from es bulk index operation
2016-05-02T12:03:14.304Z - info: [dbclient] retrying batch [500]
2016-05-02T12:03:14.304Z - error: [dbclient] transaction error reached max retries

@missinglink
Member

missinglink commented May 2, 2016

hi @vesameskanen, I've not seen that message myself during an import.

The No Living connections error message means that the API codebase could not connect to the elasticsearch cluster; it usually indicates that the cluster has been shut down or is unreachable.

I sometimes get the same message on my local machine: if I shut down elasticsearch and then npm start the API code, it outputs loads of No Living connections messages.

Before we investigate further, can you please confirm that, at the time you receive these messages, the elasticsearch cluster you're using is still responding to curl requests such as curl localhost:9200/?

Is it possible that you have a script which is shutting down the server before the import has been completed?
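
For reference, the same check can be scripted with the elasticsearch client library (a minimal sketch; the host and timeout values below are just examples):

var elasticsearch = require('elasticsearch');

var client = new elasticsearch.Client({
  host: 'localhost:9200'
});

// ping the cluster; an error here matches the 'No Living connections' symptom
client.ping({ requestTimeout: 1000 }, function (err) {
  if (err) {
    console.error('elasticsearch is down or unreachable:', err.message);
  } else {
    console.log('elasticsearch is up');
  }
});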

@orangejulius
Member

I think I've seen this before. Our dbclient has a bug where no rate limiting is applied to batches that are being retried, which has the unhelpful effect of hammering elasticsearch just as it comes under heavy load from the import. It also means you'll have to scroll up really far in the console output, to the first batch failure, to see whether it was a timeout or some other kind of error.
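
For illustration only, rate-limited retries could look something like this hypothetical sketch (not the actual pelias-dbclient code):

var MAX_RETRIES = 5;

function retryBatch(sendBatch, batch, attempt, done) {
  sendBatch(batch, function (err) {
    if (!err) { return done(); }
    if (attempt >= MAX_RETRIES) {
      return done(new Error('transaction error reached max retries'));
    }
    // back off 1s, 2s, 4s, ... so a struggling cluster gets room to recover
    var delay = 1000 * Math.pow(2, attempt);
    setTimeout(function () {
      retryBatch(sendBatch, batch, attempt + 1, done);
    }, delay);
  });
}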

@vesameskanen
Author

Import succeeds here on my native debian box, but fails every time when run to build a docker image of an ES instance with Pelias data. It is possible that the running ES service somehow fails and shuts down inside the docker environment, but that is quite unlikely. I suspect there is a problem with the KeepAlive setting or something else related to socket reuse, leading to exhaustion of available sockets. I will study the problem in more detail and report my findings.
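
For reference, the knobs I plan to experiment with on the legacy elasticsearch client look like this (a hypothetical sketch; the values are guesses, not verified fixes):

var elasticsearch = require('elasticsearch');

var client = new elasticsearch.Client({
  host: 'localhost:9200',
  keepAlive: true,        // reuse sockets instead of opening a new one per request
  maxSockets: 10,         // cap concurrent sockets to avoid ephemeral port exhaustion
  requestTimeout: 120000  // give slow bulk requests more headroom
});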

@vesameskanen
Author

Some further observations: the errors show up in the last step of our data import, OpenAddresses. Part of it runs OK, but then indexing starts failing consistently. The first error is:

ERROR: 2016-05-03T05:40:51Z
Error: Request error, retrying
POST http://localhost:9200/_bulk => read ECONNRESET
at Log.error (/root/.pelias/openaddresses/node_modules/pelias-dbclient/node_modules/elasticsearch/src/lib/log.js:226:60)
at checkRespForFailure (/root/.pelias/openaddresses/node_modules/pelias-dbclient/node_modules/elasticsearch/src/lib/transport.js:244:18)

After that, I get another 'read ECONNRESET' error, then a 'socket hang up' error, and after that lots of 'Unable to revive connection' and 'No living connections' errors. It seems that all remaining indexing actions fail. At the end of the import, ES has indexed 1,022,226 documents. When any of the data sources is dropped, so that the number of documents is clearly smaller, the errors do not show up.

By the way, failures do not stop the import process; maybe they should, because the errors can get buried in the massive log file.

@vesameskanen
Author

I just verified that at the end of the import, after those 'no living connections' errors, ES is not reachable via curl either. This could be a purely environment-specific problem; Google turns up lots of similar issues with other ES client apps, too. I guess it might be best to close this issue, as it seems to be a general problem, not specific to Pelias.

@easherma

easherma commented Jun 14, 2016

Having the same problem, especially as @orangejulius describes it.

@orangejulius
Member

orangejulius commented Jun 14, 2016

This is interesting, and might be worth keeping open if Pelias can reliably trigger issues with Elasticsearch. Is there any info in the Elasticsearch log that looks helpful?

@easherma

I got those 'No living connections' errors.

I checked, and it seemed like the ES service ran out of memory and stopped running. I wasn't sure what else I could do besides restart the service. After that it seemed to work for a while, but then I ran into some more errors:

[2016-06-14 00:09:36,603][WARN ][indices.breaker ] [Headlok] [FIELDDATA] New used memory 663753694 [633mb] from field [phrase.he] would be larger than configured breaker: 633785548 [604.4mb], breaking
[2016-06-14 00:09:36,603][WARN ][index.warmer ] [Headlok] [pelias][0] failed to warm-up global ordinals for [he]
org.elasticsearch.ElasticsearchException: org.elasticsearch.ElasticsearchException: org.elasticsearch.common.breaker.CircuitBreakingException: [FIELDDATA] Data too large, data for [phrase.he] would be larger than limit of [633785548/604.4mb]
at org.elasticsearch.index.fielddata.plain.AbstractIndexOrdinalsFieldData.loadGlobal(AbstractIndexOrdinalsFieldData.java:73)
at org.elasticsearch.index.fielddata.plain.AbstractIndexOrdinalsFieldData.loadGlobal(AbstractIndexOrdinalsFieldData.java:41)
at org.elasticsearch.search.SearchService$FieldDataWarmer$3.run(SearchService.java:966)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Which eventually turned into:

2016-06-14T06:59:58.886Z - error: [dbclient] [503] UnavailableShardsException[[pelias][0] Primary shard is not active or isn't assigned to a known node. Timeout: [1m], request: org.elasticsearch.action.bulk.BulkShardRequest@553a2c]

Nothing seemed to make it into ES; if I spin up the API and try to run a search, I can't find anything.

Some other thoughts I'll stick here until I find a better place for them:

  • Assuming the root cause of these issues has more to do with ES configuration for large imports, it might be worth having some logic that looks at the size of your pbf import and gives you a flag with some options/suggestions before confirming the import (I think part of the vagrant environment did something similar?)
  • Logging/reporting on progress could be clearer. It would be good feedback for the user to see where in the process we ran into problems, though I realize it may not be that simple to show a % progress bar.
  • Mentioned elsewhere, but the ability to update/merge would be great. And/or split large imports into bundled transactions that can be committed independently of each other.

@orangejulius
Member

Yup, these are the circuit breaker issues I remember from doing a North America import a while back. Your suggestions for how to make things clearer are all excellent.

The fix is to adjust the ES_HEAP_SIZE variable in the script that starts up Elasticsearch; on Linux (at least for me) it's at /etc/init.d/elasticsearch. The guideline for the heap size is the minimum of 50% of your RAM and 31GB, so on my 16GB laptop I have it set to 8GB. (If I remember right, the 604.4mb limit in your log is just the default fielddata breaker, 60% of the heap, applied to a roughly 1GB default heap, which a large import blows right past.) Let me know if adjusting that is enough to fix things for you.
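
For example (a hedged sketch: the exact file varies by distro, and on Debian-style installs the variable may live in /etc/default/elasticsearch instead):

# give elasticsearch half of this 16GB machine
ES_HEAP_SIZE=8g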

@ktjaco

ktjaco commented Jul 14, 2016

Just ran into some of the errors listed above. I've been testing out different Elasticsearch configuration settings.

http://kufli.blogspot.ca/2014/11/elasticsearch-advanced-settings-and.html
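
One example of the kind of setting I've been trying (a hedged sketch using the elasticsearch JS client; disabling the index refresh during a bulk import is a common tuning, though I can't yet say whether it helps here):

var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({ host: 'localhost:9200' });

// turn off refresh while importing; restore it (e.g. '1s') once the import is done
client.indices.putSettings({
  index: 'pelias',
  body: { index: { refresh_interval: '-1' } }
}, function (err) {
  if (err) { console.error(err); }
});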

@thucnc

thucnc commented Apr 4, 2018

I got the same issue, but in a different use case. I created an API to import a single address into pelias, as below:

streams.importPipe = function (stream) {
  streams.elasticsearch = require('pelias-dbclient');
  return stream
    .pipe( streams.addressNormalizerAfterDedup() )
    .pipe( streams.adminLookup() )
    .pipe( streams.dbMapper() )
    .pipe( streams.elasticsearch({ batchSize: 1 }) );
};
module.exports = streams;

and

app.post("/v1/pelias/import", (req, res) => {
	if (!req.body.address || !req.body.lat || !req.body.lng) return res.status(401).send({title: 'Missing required parameters: address/lat/lng'})
	var address = req.body,
			stream  =  streams.importPipe(streamify([address]));
	
	stream.on('finish', (data) => {
		console.log(address || 'done')
		res.send({})
	})
})

The first call to the API works well, but the second try gets the mentioned error. So I don't think it relates to memory, as the load is very lightweight.
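
A hedged guess at a workaround I may try next: if pelias-dbclient tears down its elasticsearch connections when the per-request stream ends, building the pipeline once and keeping it open might avoid the error (hypothetical, untested sketch):

var stream = require('stream');

// one long-lived object-mode input, piped through the import pipeline once
var input = new stream.PassThrough({ objectMode: true });
streams.importPipe(input);

app.post('/v1/pelias/import', (req, res) => {
  if (!req.body.address || !req.body.lat || !req.body.lng) {
    return res.status(400).send({ title: 'Missing required parameters: address/lat/lng' });
  }
  // no per-request 'finish' event here; respond once the write is queued
  input.write(req.body);
  res.send({});
});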
