Refresh BaseX DB at given intervals, but not during startups #45
Comments
Split off from #14 as agreed in the 4th SG meeting on 2018-09-04.
After some research, we think this change would require a major refactoring of the code. Many controllers in the webapp code (TestDriverController, TestResultController...) rely on an initialized DataStorageService, which contains a BaseX instance via the class BsxDataStorage. The application cannot deploy these controllers, which are necessary to run the services, without starting this data storage. We also think this operation makes more sense during startup than running it while users are working with the ETF. Looking at the class BsxDataStorage in the etf-bsxds module, we do not see any refactoring that would improve the startup time; all the tasks executed on initialization seem necessary to us. In our experience, the most time-consuming task during deployment is the download of ETS files from GitHub. We may run some more tests to assess startup times thoroughly.
Based on some rough startup tests, BaseX DB initialisation has not turned out to be an excessively time-consuming part of the startup in absolute terms. A mean startup of the ETF validator takes about 60 seconds, of which we estimate about 30 seconds are consumed by the BaseX DB initialisation. Even if the BaseX initialisation accounts for roughly 50% of the total, it is still a very small amount of time. Thus, considering that:
Closed as agreed in the SG meeting on 2020-01-21
Background and Motivation:
Performance optimisation is required to reduce startup time and validation time. Reducing startup time will simplify horizontal scaling in cloud deployments, while reducing validation time will help when integrating ETF with the INSPIRE Geoportal or any other metadata-related workflow or pipeline.
One identified performance issue is that BaseX contains a cache of all validation results and of all tests, and the DB is re-initialised each time Tomcat is started or restarted. The related data is persisted on the file system under /home/tomcat/.etf/ .
Proposed change
Refresh BaseX DB at given intervals, but not during startups.
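As a rough illustration of the idea, the refresh could be driven by a scheduler whose first run is delayed by one full interval, so nothing executes while the application is starting. The sketch below assumes the refresh can be expressed as a plain `Runnable`; the class name `DeferredRefreshScheduler` is hypothetical and not part of ETF, and how the task would hook into BsxDataStorage is left open.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/**
 * Hypothetical helper (not part of ETF): runs a refresh task at a fixed
 * interval, with the first run delayed so that it never runs at startup.
 */
final class DeferredRefreshScheduler implements AutoCloseable {

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

    DeferredRefreshScheduler(Runnable refreshTask, Duration interval) {
        Runnable guarded = () -> {
            try {
                refreshTask.run();
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all later runs,
                // so log it and keep the schedule alive.
                System.err.println("Refresh failed: " + e.getMessage());
            }
        };
        long millis = interval.toMillis();
        // Initial delay equals the interval: no refresh during startup.
        executor.scheduleWithFixedDelay(guarded, millis, millis, TimeUnit.MILLISECONDS);
    }

    @Override
    public void close() {
        // Stop scheduling further refreshes, e.g. on webapp shutdown.
        executor.shutdownNow();
    }
}
```

A caller would create one instance after the data storage is available, for example `new DeferredRefreshScheduler(task, Duration.ofHours(6))`, and close it on shutdown. The six-hour interval is only a placeholder.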
Alternatives
A parameter could be introduced to deactivate some time-consuming consistency checks during the startup.
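A minimal sketch of how such a parameter might be read, assuming it is supplied as a boolean configuration property. The property name `etf.bsxds.skipStartupChecks` and the class `StartupOptions` are hypothetical and do not exist in ETF; the default keeps the checks enabled.

```java
import java.util.Properties;

/** Hypothetical option holder (not part of ETF). */
final class StartupOptions {

    // Assumed property name, for illustration only.
    static final String SKIP_CHECKS_PROPERTY = "etf.bsxds.skipStartupChecks";

    private StartupOptions() {
    }

    /** Returns true only if the property is explicitly set to "true". */
    static boolean skipConsistencyChecks(Properties config) {
        return Boolean.parseBoolean(config.getProperty(SKIP_CHECKS_PROPERTY, "false"));
    }
}
```

The initialisation code would then guard the time-consuming checks with `if (!StartupOptions.skipConsistencyChecks(config)) { ... }`, so existing deployments keep their current behaviour unless the parameter is set.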
Funding
JRC will be ready to fund within its current development contract.
Additional information
n/a