No document can be fed because ostensibly there isn't enough disk space #499
Comments
I have updated https://pyvespa.readthedocs.io/en/latest/troubleshooting.html#full-disk, which now links to a new example in https://pyvespa.readthedocs.io/en/latest/application-packages.html where one can set a higher limit. I know this is a bit cumbersome; please give it a try and let me know. As https://docs.vespa.ai/en/proton.html#proton-maintenance-jobs explains, Vespa needs disk space for compaction jobs. How much is schema-dependent; 75% is a conservative number that helps operators avoid index corruption due to a full disk.
Thanks a lot for your effort! If I were making design choices regarding the architecture, I would fix the issue at its root, which I believe is the fact that the disk space Vespa requires for "compaction jobs" is proportional to the space already occupied by the storage. It has absolutely nothing to do with the total disk space. An empty storage doesn't need an extra 100 GB of free space in order to add ten 5 KB documents. I wonder what other convenient choices were made by the Vespa authors, but I'd rather not spend my time satisfying my curiosity so long as there are properly designed alternatives that just work.
Some time ago, in the distant past, the conclusion was that Vespa was made for large systems spanning multiple distributed machines, focusing on scalability and ease of operation. From this it followed that, apart from monitoring and other minor services, Vespa would be the only service running. This still holds true. If you want to present a smaller portion of your machine to Vespa, Docker/Podman is the better solution for that.
@baldersheim , thank you for the explanation! If you don't mind:
I tried using docker. However, specifying |
@ch3rn0v 2 - It is slightly more complicated, as it is a variable number that will initially be very small and not enough to cover the fixed overhead. Doing that would also allow other services using disk/memory to run on the same node, which would be a feature for development nodes but would make running a production system much more complicated. And since there are other ways to solve this issue, we have not prioritized making our own solution.
Now that I see the reasoning behind this choice, I understand it much better. Thanks a lot for clarifying this! Unfortunately, I didn't find any easy and clean way to run and test Vespa locally w/o the Docker GUI, so perhaps I'll try it out some other time out of curiosity. For now, running Weaviate seemed simple enough, but trying Vespa and Faiss is something I'd like to do in order to compare things.
I followed this guide step by step: https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa.html
When I try to add even a single document, it results in an error:
ReturnCode(NO_SPACE, External feed is blocked due to resource exhaustion: disk on node 0 [textsearch] (0.9 > 0.750))
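For anyone who wants to detect this condition programmatically, here is a minimal sketch that pulls the resource name, node, cluster, measured utilization, and limit out of the message. The regular expression is inferred only from the single error string quoted above, so it is an assumption that other Vespa versions use the same wording; `parse_feed_block` is a hypothetical helper, not part of pyvespa.

```python
import re

# Pattern inferred from the one error message quoted in this issue.
# Assumption: other Vespa versions may format the message differently.
_FEED_BLOCK = re.compile(
    r"(?P<resource>\w+) on node (?P<node>\d+) \[(?P<cluster>[^\]]+)\] "
    r"\((?P<used>[\d.]+) > (?P<limit>[\d.]+)\)"
)


def parse_feed_block(message):
    """Return the parsed fields of a feed-block message, or None if no match."""
    match = _FEED_BLOCK.search(message)
    if match is None:
        return None
    return {
        "resource": match["resource"],
        "node": int(match["node"]),
        "cluster": match["cluster"],
        "used": float(match["used"]),
        "limit": float(match["limit"]),
    }


message = (
    "ReturnCode(NO_SPACE, External feed is blocked due to resource "
    "exhaustion: disk on node 0 [textsearch] (0.9 > 0.750))"
)
info = parse_feed_block(message)
```

Here `info["used"]` is the fraction of the disk in use and `info["limit"]` is the configured threshold, so the feed is blocked because the former exceeds the latter.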
Suppose I have a 100 TB disk with 90 TB being occupied. Adding a 3kb document fails because 10 TB isn't enough for vespa. Does a ratio (instead of an absolute value) sound reasonable as a default threshold to anyone?
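To make the arithmetic concrete, here is a small sketch of a ratio-based check like the one the error message implies (utilization compared against a limit). The function names are hypothetical and this is not Vespa's actual implementation; it only reproduces the comparison `0.9 > 0.750` from the message.

```python
TB = 10**12  # decimal terabyte, used only for this illustration


def feed_blocked(used_bytes, total_bytes, limit=0.75):
    """Return True when the disk utilization ratio exceeds the limit."""
    return used_bytes / total_bytes > limit


def free_bytes_required(total_bytes, limit=0.75):
    """Free space that must remain for utilization to stay at or below the limit."""
    return total_bytes * (1 - limit)


# The scenario from the question: a 100 TB disk with 90 TB occupied.
blocked_default = feed_blocked(90 * TB, 100 * TB)            # 0.9 > 0.75
blocked_raised = feed_blocked(90 * TB, 100 * TB, limit=0.95)  # 0.9 > 0.95
required_free = free_bytes_required(100 * TB)                 # 25 TB at the 0.75 limit
```

Note that the size of the document being fed never enters the check: with a ratio threshold, the result depends only on how full the disk already is.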
How do I change this 0.75 to another value? I found this approach on SO, but I don't see how it can be used with pyvespa. Should I use vespa.package.ApplicationConfiguration? If yes, what is the key? Should the value be a string or a float? I don't see it in the docs. These options didn't work:
content.tuning.resource-limits.disk
tuning.resource-limits.disk
resource-limits.disk
disk
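The dotted key content.tuning.resource-limits.disk reads like a path of nested elements inside the content section of services.xml rather than a flat configuration key. As a hedged sketch under that assumption, the snippet below builds the corresponding XML fragment with Python's standard library. `resource_limits_fragment` is a hypothetical helper; how the fragment gets into a pyvespa application package is not shown here and depends on the pyvespa version.

```python
import xml.etree.ElementTree as ET


def resource_limits_fragment(disk_limit):
    """Build a <tuning> fragment that sets the disk resource limit.

    Assumption: the fragment belongs inside the <content> element of
    services.xml, mirroring the path content.tuning.resource-limits.disk.
    """
    if not 0.0 < disk_limit <= 1.0:
        raise ValueError("disk_limit must be a fraction in (0, 1]")
    tuning = ET.Element("tuning")
    limits = ET.SubElement(tuning, "resource-limits")
    disk = ET.SubElement(limits, "disk")
    disk.text = str(disk_limit)
    return ET.tostring(tuning, encoding="unicode")


# Raise the limit from 0.75 to 0.90 (an arbitrary example value).
fragment = resource_limits_fragment(0.90)
print(fragment)
```

The value is written as text inside the XML element, so in this form the string-versus-float question does not arise.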