indexer in 1.7.x still consumes more CPU than in 1.5 #4003


Open
vladak opened this issue Jul 22, 2022 · 5 comments

vladak commented Jul 22, 2022

I was hoping this issue had been resolved in the past year; unfortunately, it is still the case:
[Screenshot: screenshot-20220720-135334]
I guess we'll stay on the last 1.5 release forever, as it works just fine. I'm not sure whether we need to configure or tune anything; I was hoping for an easy out-of-the-box solution. I even tried disabling the suggester in the read-only configuration, but either it didn't work or it didn't help.

Originally posted by @der-eismann in #3585 (comment)


vladak commented Jul 22, 2022

@der-eismann It looks like we need to properly root-cause the change. Could you share details about the indexer run and the overall environment? Also, which exact 1.7.x version are you using at the moment?

vladak added the indexer label Jul 22, 2022
@chenchuanliang

I also encountered the same problem. 1.7 indexing is too slow.


GCSimba commented Mar 29, 2023

Hi, do you know how to increase the number of threads in OpenGrok? I think it takes a lot of time to build an index.


vladak commented Mar 29, 2023

> Hi, do you know how to increase the number of threads in opengrok? I think it takes a lot of time to build an index

Please create a separate discussion for that question.

@der-eismann

@vladak Sorry for the late reply. We continued running on 1.5 since it was just working™. I tried to investigate a bit again with 1.12, as the CPU usage is still high. It seems to be related to the per-project management that was introduced in 1.6.

Our setup consists of 240 Git repositories that are updated externally (automatic removal of archived repos, adding new ones...). Back in 1.5 we only ran it with NOMIRROR=true and it was fine. We had a project overview on the default page with the dates of the last commits, and everything was good.
It's still fine if I run 1.12 with AVOID_PROJECTS=true: CPU usage is similar to what it was in 1.5 and indexing is done in ~3 minutes. However, it is no longer possible to select specific projects on the main page, since projects are avoided, and there's no Git history or annotation, which sucks a bit.

Now, when I remove AVOID_PROJECTS and replace it with WORKERS=1, I can see in the startup logs that it takes ~4 minutes to add all projects, and after that the log only shows "Sync starting" without any further output. Even after 20 minutes the CPU usage is consistently high (about 2-3 CPU cores), and I still don't have a project overview on the main page. Is something misconfigured?

All I want is to index our 240 Git repos, re-index them every 10 minutes without updating or syncing them, and ideally have an overview on the main page. So basically the 1.5 behavior. Is there any way to achieve that?
