
A way to reduce index db size #3416


Closed
dpilin opened this issue Feb 5, 2021 · 3 comments

Comments

@dpilin

dpilin commented Feb 5, 2021

Hello!

Could you please advise whether there is any way to reduce the size of the index db files? I was indexing a very large SVN repo this week (each project is ~200 GB) and ran into high disk space usage. After investigating, I found that for each project the indexer had created index db files almost as large as the project itself.

For example, one project directory is 191 GB, and the indexer created a group of files totaling 149 GB. Is this expected behaviour? If so, is there a way to reduce the size of these files?

Thanks in advance

@vladak
Member

vladak commented Feb 8, 2021

It really depends on the input data. In general, the ratio of index size to input data size is well below 1. Examining the contents of the index might be the first step, e.g. using Luke, the Lucene index inspection tool.
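As a rough illustration of the ratio mentioned above, here is a minimal Python sketch that sums file sizes under a directory and divides index size by source size. The helper names `dir_size` and `index_ratio` are hypothetical and not part of OpenGrok; the numbers in the last line are the ones reported in this issue.

```python
import os


def dir_size(root):
    """Return the total size in bytes of regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so linked files are not counted twice.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def index_ratio(index_bytes, source_bytes):
    """Return index size divided by source size (hypothetical helper)."""
    if source_bytes <= 0:
        raise ValueError("source size must be positive")
    return index_bytes / source_bytes


# Figures from the report above: 149 GB of index files for a 191 GB project.
print(round(index_ratio(149, 191), 2))  # 0.78
```

In practice the two arguments would come from `dir_size()` called on the project's source directory and on its index directory, wherever those live in a given setup.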

@vladak
Member

vladak commented Feb 8, 2021

Also, if history is enabled, the index will contain entries from the repository's history. That can contribute significantly to the index size if the history is long or rich.
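One way to check whether history-related data dominates is to break the indexer's output directory down by top-level subdirectory and see where the space goes. The sketch below is generic Python and assumes nothing about how the indexer lays out its files; the names `size_by_subdir` and `report` are hypothetical.

```python
import os


def size_by_subdir(root):
    """Map each immediate subdirectory of root to its total size in bytes."""
    sizes = {}
    for entry in os.scandir(root):
        if not entry.is_dir(follow_symlinks=False):
            continue
        total = 0
        for dirpath, _dirnames, filenames in os.walk(entry.path):
            for name in filenames:
                path = os.path.join(dirpath, name)
                # Skip symlinks so linked files are not counted twice.
                if not os.path.islink(path):
                    total += os.path.getsize(path)
        sizes[entry.name] = total
    return sizes


def report(sizes):
    """Return one line per subdirectory, largest first."""
    ordered = sorted(sizes.items(), key=lambda item: item[1], reverse=True)
    return [f"{name}\t{size / 1e9:.2f} GB" for name, size in ordered]
```

Calling `report(size_by_subdir(path))` on the directory where the indexer writes its data, and printing the resulting lines, gives a quick view of which part is largest before changing any configuration.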

@vladak vladak added the indexer label Feb 8, 2021
@dpilin
Author

dpilin commented Feb 8, 2021

That seems to be the case: we have a large amount of historical data in our repository. Thank you, we'll try tuning the indexer's configuration to avoid indexing unnecessary files.
