When T-Engine throws an Exception, the document gets indexed with no content #395

hi-ko · 2022-04-04T08:31:25Z

This is a reference ticket for SEARCH-2974, since I am not able to comment on alfresco.atlassian.net.

Details provided by @binduwavell:

What we discovered is that if the T-Engine throws, the node gets indexed with no content, and the only way to update the content cache is to update the content on the node or to PURGE/INDEX the node so the transform runs again.

We extended the NodeContentGet webscript so that it throws if the transformer fails; Solr then marks the node as having a content transform error. Once we've grabbed our OCR text, it is written as a rendition, which moves the node into a new transaction, so Solr re-indexes the node automatically and re-attempts the content transform...

I think that if a T-Engine fails, Solr should probably not cache empty content... I don't think Solr necessarily needs to retry getting the content, but the next time the node is indexed, Solr should not trust the empty content cache and should re-attempt NodeContentGet.

This leads to an unreliable index, because technical errors inevitably produce an incomplete index, and it is not possible to fix these errors afterwards.
Writing empty content to the index when the node content is not empty should always be treated as incorrect index content, and therefore as a bug.
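
The behavior requested above could be sketched roughly as follows. This is only an illustration in Python, not Solr or Alfresco code: `ContentCache`, `CacheEntry`, `TransformError`, and `fetch_text` are hypothetical names standing in for the Solr content cache, a T-Engine failure, and the NodeContentGet call. The idea is that a failed transform is recorded as a failure rather than as valid empty content, so the next indexing pass re-attempts the fetch instead of trusting the cache.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


class TransformError(Exception):
    """Hypothetical stand-in for a T-Engine failure."""


@dataclass
class CacheEntry:
    text: str
    transform_failed: bool = False  # True means the cached text must not be trusted


@dataclass
class ContentCache:
    # fetch_text is an assumed callable playing the role of NodeContentGet.
    fetch_text: Callable[[int], str]
    entries: Dict[int, CacheEntry] = field(default_factory=dict)

    def text_for_indexing(self, node_id: int) -> str:
        entry = self.entries.get(node_id)
        # Only a successful earlier transform is served from the cache.
        if entry is not None and not entry.transform_failed:
            return entry.text
        try:
            text = self.fetch_text(node_id)
        except TransformError:
            # Remember the failure, but do not retry immediately.
            self.entries[node_id] = CacheEntry(text="", transform_failed=True)
            return ""
        self.entries[node_id] = CacheEntry(text=text)
        return text


# Usage: the first transform fails, the next indexing pass re-attempts it.
calls = {"n": 0}


def flaky_fetch(node_id: int) -> str:
    calls["n"] += 1
    if calls["n"] == 1:
        raise TransformError("T-Engine failed")
    return "ocr text"


cache = ContentCache(fetch_text=flaky_fetch)
first = cache.text_for_indexing(42)   # transform fails, entry marked as failed
second = cache.text_for_indexing(42)  # failed entry is not trusted, fetch re-attempted
third = cache.text_for_indexing(42)   # served from the cache, no further fetch
```

In this sketch no retry loop is involved: the re-attempt only happens when the node is indexed again, which matches the suggestion that Solr does not necessarily need to retry on its own.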

hi-ko mentioned this issue Apr 4, 2022

mechanism to temporarily prevent text retrieval #396

Open
