-
Notifications
You must be signed in to change notification settings - Fork 75
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Elasticsearch self repair and handling of "valid" error responses #1471
Comments
@svenoe
So the first line of investigation would be adding a |
The problem is, that if I understand correctly the real error message does not reach this point because of otobo/Kernel/GenericInterface/Transport/HTTP/REST.pm Lines 880 to 883 in b6845c9
$ResponseError contains some not so useful information. But probably(?) you are right and we should use this method, but give it useful data.
As to provoking errors, yes, shutting down the ES container is one option. In the end you probably should delete a ticket from the index, too, and update its queue, or so, anyways. I propose copying a normal request out of the Webservice debugger, altering it and sending it via curl. Maybe the copying part is not even needed - it looks relatively easy: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete.html (Or the more complicated version: https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html). |
Investigated which clients can be used for manipulating Elasticsearch. Curator, https://www.elastic.co/guide/en/elasticsearch/client/curator/current/about.html, does not do the job, as it is intended for the maintenance of indices. Fell back to using a small script based on Search::Elasticsearch, https://metacpan.org/pod/Search::Elasticsearch. Checked what happens when updating the queue of a ticket that has been deleted in Elasticsearch.
The error message is useful for readding the relevant ticket to Elasticsearch. |
While looking into this, I noticed that there is a mixup of message for the generic interface debugger. When a mapping module can't be initialized then the error message of a previous function result is written into the debug log. Creating an extra issue for this, as this is a bug. |
Digging deep into the Generic Interface code I find that there are at lease three opportunities for error handling.
For the case 1 there is currently only
The case 2 is used for dynamic fields. In this case HandleError() is mapped to HandleResponse(). My impression is that this intended as a more straight forward callback mechanism. The case 3 is a bit confusing. It seems to be redundant with regard to the case 2. The goal of this issue seems to be more like case 1. It is mostly a retry with a twist. But there can be some nasty cases. When Elasticsearch is down and then restarted, then a lot of tasks can be queued up. When there are many independent changes to a ticket, then each change would cause a reindexing of the ticket in Elasticsearch. An alternative approach would be:
TODO;
|
As a quick comment - I do not see it as a real "redo with a twist", yes you are right for the case of rebuilding the complete ticket it definitely is (and yes, care has to be taken to not run into loops), but it has other purposes, too, e.g. if the ingest plugin which extracts data from an attachment returns that it cannot handle encrypted attachments, I don't want an error message in the logs, maybe an info or debug statement, but the plugin complaining is totally fine. So in the end I see it more as a way to silently exit and just potentially start some additional action, as the rebuild. |
It looks the search in Elasticsearch and the search in the database are pretty much independent of each other. The database search, especially the article word index, is active per default. The Elasticsearch indexing must be set up as a web service. There is the table attribute |
Discussed this issue with @svenoe and we concluded:
|
Back to the drawing board as this is fairly confusing.
in
It is expected that error handling modules that suppress the message also set StopAfterMatch. |
The tentative names for the new error handling modules are:
|
Started working on the FailureAccept error handling module. Found that adding a new module means a lot of duplication, especially in the admin interface. suppose it makes sense to make the error handling modules more generic.
Edit: the TODO items won't be done |
Please have a quick look at my message... ;) |
for custom callback that assess the validty of a request response. For now this is only available for HTTP::REST transport objects. There should be no functional changes yet.
Discussed this with @svenoe . We found that using Error modules is not worth it, especially as it would not affect the first message in |
Index attachment by attachment, so that error handling is easier
…handler Issue #1471 webservice error handler
The current state of the implementation appears to be stable and working. But it is expected that further instances of unhelpful errors will crop up. These should be handled in new issue. Closing this issue. |
If for some reason the creation of a ticket in Elasticsearch fails, all subsequent article creations and other ticket updates will fail. We receive a clear error response though, and should use it to try to create the ticket a second time. While at it, we could filter out errors from not being able to handle specific files (like encrypted files, etc.) from the ingest plugin, and maybe put an info text to the log, but not more.
For both to work, first the invokers must be extended to handle errors (task five of #772). The important places for this would be:
$FunctionResult = $TransportObject->RequesterPerformRequest(
$RestClient->$RestCommand(@RequestParam);
I'm not sure how to do this in a nice way, without looking further into this. In principle we don't want the error handling right after the second, but the communication with the invoker usually is done in the first (e.g. line 206:
my $FunctionResult = $InvokerObject->PrepareRequest(
and especially interesting since we would probably need something similar line 226:elsif ( $FunctionResult->{StopCommunication}
).For the self repair of the ticket index, line 539-552 of Kernel/System/Console/Command/Maint/Elasticsearch/Migration.pm
# create the ticket
could be taken, which is quite simple in itself, some measurement has to be taken, though, to not cause infinite loops! :)The text was updated successfully, but these errors were encountered: