Skip to content

reuse IndexReader objects for multi-project searches #1186

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Sep 30, 2016

Conversation

vladak
Copy link
Member

@vladak vladak commented Sep 16, 2016

This fixes #1132 and speeds up multi-project search by doing the right thing w.r.t. IndexSearcher object reuse as stressed in just every book about Lucene.

There are 2 problems with this change:

  • the destroy() method should be called after each SearchHelper or SearchEngine is used for searching, ideally in try-finally block. However, this is not easy to do, namely in search.jsp where the code is interspersed with include blocks etc. It is important to call destroy to make sure all IndexSearcher objects are properly returned to their corresponding SearcherManager objects. I briefly considered to use Finalizer however all references dissuade everyone from using it so I chickened out.
  • the relationship between IndexSearcher/SeracherManager is a bit clumsy. The SearcherManager's are kept in a project map in RuntimeEnvironment and the consumers (SearchHelper or SearchEngine) have to keep their own IndexSearcher map for multi-project searches so that they can return the IndexSearcher objects in destroy(). Also, invalidating SearcherManager objects when corresponding project disappears after configuration change is not done to keep things simple. As a result, the map grows unbounded.

@vladak
Copy link
Member Author

vladak commented Sep 16, 2016

Another problem is partial reindex - usually it is done from the OpenGrok shell script with environment variable OPENGROK_WEBAPP_CFGADDR=none so the webapp has no way of knowing that refresh is needed. This will be taken care of after some time (more precisely no later than after getIndexRefreshPeriod() timeout) thanks to the indexReopenThread background thread periodically calling maybeRefresh() on all SearcherManager objects. I filed #1187 to track this.


/**
* Unit tests for the {@code SearchHelper} class.
*/
public class SearchHelperTest {
TestRepository repository;
private final String ctagsProperty = "org.opensolaris.opengrok.analysis.Ctags";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dup?
can we take it from a final variable from code instead of redeclaring?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filed #1198 to track this


@Before
public void setUp() throws IOException {
repository = new TestRepository();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we also duplicate this on quite a few places, dedup might be nice (but for this dup not in this changeset, just for future)

@tarzanek
Copy link
Contributor

ship it!
let's get our hands on it and let's get dirty! :)

@tarzanek tarzanek added this to the 0.13 milestone Sep 16, 2016
@vladak
Copy link
Member Author

vladak commented Sep 30, 2016

I tried to address the destroy() issue as suggested by @tulinkry by adding a hook into requestDestroyed() and by wrapping the search engine in JSONSearchServlet into try/finally block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

consider reusing IndexSearcher objects
2 participants