Solr Container Scaling #4762

danmcp · 2018-06-20T12:50:38Z

There has been some initial work to allow scaling for postgres and glassfish.

A similar project needs to be undertaken for Solr. I would expect the implementation to use a StatefulSet and perhaps an Operator.

More details on scaling solr:

https://lucene.apache.org/solr/guide/6_6/introduction-to-scaling-and-distribution.html

pdurbin · 2018-07-17T14:32:45Z

From discussion at http://irclog.iq.harvard.edu/dataverse/2018-07-17 we might want to watch https://www.youtube.com/watch?v=PC8mYweMgV4&t=434s

thaorell · 2018-07-19T17:12:36Z

After some trial and error, I am now working on making Solr a headless service with 2 nodes (a master and a slave).
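
For context on why a headless service helps here: in Kubernetes, each pod of a StatefulSet behind a headless service gets a stable DNS name of the form `<statefulset>-<ordinal>.<service>.<namespace>.svc.<cluster-domain>`, so a slave can address the master by a fixed hostname. A minimal sketch follows; the names `solr`, `solr-headless` and `dataverse` are hypothetical and not taken from the actual configuration, and treating ordinal 0 as the master is only an assumption.

```python
def statefulset_pod_hosts(name, service, namespace, replicas,
                          cluster_domain="cluster.local"):
    """Return the stable DNS name of each StatefulSet pod behind a headless service."""
    return [
        f"{name}-{ordinal}.{service}.{namespace}.svc.{cluster_domain}"
        for ordinal in range(replicas)
    ]


# Hypothetical names; ordinal 0 is assumed to be the master, ordinal 1 the slave.
master, slave = statefulset_pod_hosts("solr", "solr-headless", "dataverse", replicas=2)
print(master)
print(slave)
```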

pdurbin · 2018-07-23T16:30:33Z

@thaorell and I discussed this issue a bit at http://irclog.iq.harvard.edu/dataverse/2018-07-20 and my quick take is that the Solr docs recommend SolrCloud, but there are concerns that adding ZooKeeper to the mix will complicate things.

thaorell · 2018-08-03T18:08:38Z

Currently, the following work has been done:

Making Solr a StatefulSet
Configuring master-slave replication for better scalability (for more information, see https://lucene.apache.org/solr/guide/6_6/index-replication.html#index-replication)
Setting up backup and restoration for the master pod in OpenShift
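
The thread does not show how replication status or backups are triggered, so the following is only a sketch. It assumes Solr's replication handler (`/replication`) is enabled on the core and uses its `details` and `backup` commands. The host `solr-0`, the core name `collection1` and the backup name `nightly` are hypothetical.

```python
from urllib.parse import urlencode


def replication_url(solr_base, core, command, **params):
    """Build a URL for a replication handler command on one core."""
    query = urlencode({"command": command, **params})
    return f"{solr_base.rstrip('/')}/{core}/replication?{query}"


# Hypothetical host, core, and backup name.
details = replication_url("http://solr-0:8983/solr", "collection1", "details")
backup = replication_url("http://solr-0:8983/solr", "collection1", "backup", name="nightly")
print(details)
print(backup)
```

Sending a GET request to the first URL reports the replication state of the core; the second asks the node to snapshot its index.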

pdurbin · 2018-08-07T14:27:09Z

@thaorell great! Are you close to making a pull request? Are you blocked in any way? Please let us know how we can help.

thaorell · 2018-08-07T14:29:51Z

@pdurbin I think I am ready for a pull request. One question though, I have two new files solrconfig_master.xml and solrconfig_slave.xml, should these be in conf/solr or conf/docker/solr?

thaorell · 2018-08-07T14:30:03Z

As I mentioned to @pdurbin, I also wrote some docs on how to configure persistent volumes on Kubernetes so Solr can back up and restore its index (Glassfish and Postgres will follow suit later if needed).

pdurbin · 2018-08-07T14:34:11Z

> I have two new files solrconfig_master.xml and solrconfig_slave.xml, should these be in conf/solr or conf/docker/solr?

Let's have @matthew-a-dunlap comment on this because he's actively working on Solr config files for #4836.

Awesome to hear about the backup and restore! As a developer, I run Solr in a very non-fancy way. I'm quick to reinstall it entirely. I have so little data on my laptop that for me it's quick to delete all the data out of Solr and reindex my installation of Dataverse. Real backup and restore sounds like a great feature for production installations of Dataverse.

matthew-a-dunlap · 2018-08-07T15:14:22Z

I've started making some changes to our solr setup in #4836. solrconfig.xml has changed somewhat (I went back to a clean slate) and is definitely going to change more.

More importantly, I changed our Solr installation steps based on recommendations from folks in the Solr IRC channel. Our code in develop points to the installation folder for its templates, which could lead to unforeseen consequences.

I'm not sure what the best next step is. I can probably update the configs you've created @thaorell as I go, but you may need to test them in the end. Hopefully the solrconfig.xml won't change again much after this story (and if maintaining it becomes a pain we can start using the programmatic API configurations to consolidate what we do).

thaorell · 2018-08-07T15:21:20Z

Thanks @matthew-a-dunlap, I will create a PR soon so you can see the files. Ideally these files (for either standalone or distributed deployment) should be very similar.

matthew-a-dunlap · 2018-08-07T15:52:45Z

Sounds great! @thaorell, I realized I didn't answer your initial question about the config placement. Maybe put them in the docker folder for now; if we bring scaling into our normal deployments, we can move them then.

thaorell · 2018-08-08T14:29:42Z

@matthew-a-dunlap when you finish with #4836, I would appreciate it if you could let me know so I can fix my solrconfig_master.xml and solrconfig_slave.xml.

matthew-a-dunlap · 2018-08-08T21:44:44Z

@thaorell We decided for #4836 to keep it simple and only fix highlighting in schema.xml. The boosting fix and the Solr best-practice changes are being put off for #4938. That should mean we don't have any conflicts, as you don't seem to be touching schema.xml.

djbrooke · 2018-08-09T21:28:10Z

@thaorell - sending this back your way after talking with @matthew-a-dunlap. Let us know when @danmcp's feedback is implemented and we'll take a look in code review. Thanks!

thaorell · 2018-08-13T15:36:04Z

@djbrooke I have implemented the changes according to the feedback.

pdurbin · 2018-08-14T14:15:29Z

@thaorell I just added a review to pull request #4924 and requested some minor changes, removing comments provided that I understand what you've implemented. Overall, this looks great! Thanks!

pdurbin · 2018-08-14T15:12:06Z

Looking good as of d538fac. Moving to QA. Thanks!

Himanshusoni9 · 2023-11-27T06:33:09Z

One more issue: backing up Solr collection data based on a condition or data filter. There is no provision for that, because the Solr backup API does not work with a query:

http://localhost:8983/solr/admin/collections?action=RESTORE&name=myBackupName&location=C:\Users\DELL\Downloads\SOLR_BACKUP&collection=myCondCollection&query=text:cellphone

I am trying to back up our Solr data with a particular condition in mind.

To provide some context, let's say a Solr collection consists of 100 records, of which 70 contain the text "mobile" and the remaining 30 contain the text "cellphone". My objective is to take a backup of the collection that contains only the records with the text "cellphone"; essentially, we want a backup file that reflects only these 30 records.

I would greatly appreciate it if you could share insights on the best practices or methods to achieve this selective backup based on a condition. If there are specific parameters or commands we should be utilizing, kindly provide the necessary guidance. Additionally, any documentation or references you could point us to would be immensely helpful.

Thank you in advance for your time and assistance. We value your expertise and look forward to implementing an efficient solution based on your recommendations.

poikilotherm · 2023-11-27T07:09:40Z

This is not possible via the Backup/Restore API.

Please find a list of supported command options for RESTORE here:
https://solr.apache.org/guide/solr/latest/deployment-guide/collection-management.html#restore
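
To illustrate the point, here is a minimal sketch that builds a RESTORE request using only the `action`, `name`, `location` and `collection` parameters from the URL quoted above, leaving out the unsupported `query` parameter. The backup location `/var/solr/backups` is a hypothetical path; `urlencode` percent-encodes it so that slashes, backslashes and colons survive in the query string.

```python
from urllib.parse import urlencode


def restore_url(solr_base, backup_name, location, collection):
    """Build a Collections API RESTORE URL; there is no per-document filter."""
    params = {
        "action": "RESTORE",
        "name": backup_name,
        "location": location,
        "collection": collection,
    }
    return f"{solr_base.rstrip('/')}/admin/collections?{urlencode(params)}"


# Hypothetical backup location; the other values come from the question above.
url = restore_url("http://localhost:8983/solr", "myBackupName",
                  "/var/solr/backups", "myCondCollection")
print(url)
```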

Himanshusoni9 · 2023-11-27T11:04:50Z

> This is not possible via the Backup/Restore API.
>
> Please find a list of supported command options for RESTORE here: https://solr.apache.org/guide/solr/latest/deployment-guide/collection-management.html#restore

@poikilotherm
One more question on backup and restore: can we restore Solr 8.11 data to Solr 9.4, considering the change in Lucene version from 8.9.0 to 9.8.0? Do we need to reindex it?

pdurbin · 2023-11-28T20:22:48Z

@Himanshusoni9 hi! This question is probably better asked at https://groups.google.com/g/dataverse-community or https://chat.dataverse.org instead of an old, closed issue. ❤️

pdurbin mentioned this issue Jun 28, 2018

Support multi-core Solr instances #3212

Closed

djbrooke assigned thaorell Jul 17, 2018

djbrooke added the communitydev label Jul 17, 2018

pdurbin self-assigned this Jul 20, 2018

pdurbin removed their assignment Jul 23, 2018

thaorell mentioned this issue Aug 1, 2018

Remote Solr service not working #4910

Closed

thaorell mentioned this issue Aug 7, 2018

4762 solr scaling #4924

Merged

pdurbin added Status: Code Review and removed communitydev labels Aug 7, 2018

pdurbin unassigned thaorell Aug 7, 2018

djbrooke assigned matthew-a-dunlap Aug 8, 2018

djbrooke added communitydev and removed Status: Code Review labels Aug 9, 2018

djbrooke assigned thaorell and unassigned matthew-a-dunlap Aug 9, 2018

matthew-a-dunlap added Status: Code Review and removed communitydev labels Aug 13, 2018

matthew-a-dunlap assigned matthew-a-dunlap and unassigned thaorell and matthew-a-dunlap Aug 13, 2018

pdurbin self-assigned this Aug 14, 2018

pdurbin added communitydev and removed Status: Code Review labels Aug 14, 2018

pdurbin assigned thaorell and unassigned pdurbin Aug 14, 2018

pdurbin added Status: QA and removed communitydev labels Aug 14, 2018

pdurbin unassigned thaorell Aug 14, 2018

kcondon self-assigned this Aug 14, 2018

kcondon closed this as completed Aug 16, 2018

kcondon removed the Status: QA label Aug 16, 2018

djbrooke added this to the 4.10 - Additional Data Transfer Options milestone Aug 20, 2018

thaorell mentioned this issue Aug 20, 2018

Sphinx errors in containers.rst #4983

Closed

djbrooke modified the milestones: 4.10 - Additional Data Transfer Options, 4.9.3 - Optional File PIDs, Initial Internationalization Work Sep 18, 2018
