6633 update solr 772 #6631

Merged: 9 commits merged on Feb 13, 2020

Conversation

poikilotherm
Contributor

@poikilotherm poikilotherm commented Feb 10, 2020

What this PR does / why we need it:
We should update to the latest supported version of Solr in the 7.x release train to be on an upstream supported release.

According to this list, there have been no minor or major security problems.

Which issue(s) this PR closes:

Closes #6633

Special notes for your reviewer:

2020-02-10: this is a WIP
Basically, I just did a grep and sed to replace any "7.3.1" with "7.7.2":

rg "7\.3\.1" --files-with-matches | xargs sed -i '' -e 's/7\.3\.1/7.7.2/g'
mv conf/solr/7.3.1 conf/solr/7.7.2
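
A minimal sketch of the same replacement, run against a scratch directory so it can be tried safely before touching a real checkout. It assumes only POSIX `grep`, `sed`, and `mktemp`; `grep -rlF` stands in for `rg --files-with-matches`, and `sed -i.bak` is used because that form works with both GNU and BSD sed (the `sed -i ''` form above is the BSD/macOS spelling). The sample file names are made up for illustration.

```shell
# Build a throwaway tree that mimics the layout touched by this PR (hypothetical files).
workdir="$(mktemp -d)"
mkdir -p "$workdir/conf/solr/7.3.1"
printf 'solr.version=7.3.1\n' > "$workdir/conf/solr/7.3.1/example.properties"
printf 'Install Solr 7.3.1 first.\n' > "$workdir/README.txt"
cd "$workdir"

# Collect the file list first so that backup files created by sed are never matched.
files="$(grep -rlF '7.3.1' .)"
printf '%s\n' "$files" | while IFS= read -r f; do
  # -i.bak is accepted by both GNU and BSD sed; drop the backup once the edit is done.
  sed -i.bak -e 's/7\.3\.1/7.7.2/g' "$f"
  rm -f "$f.bak"
done

# Rename the versioned config directory, as in the PR.
mv conf/solr/7.3.1 conf/solr/7.7.2

# Count files that still mention the old version in their contents.
remaining="$(grep -rlF '7.3.1' . | wc -l | tr -d ' ')"
echo "files still mentioning 7.3.1: $remaining"
```

Running `grep -rnF '7.3.1' .` before the loop gives a quick preview of every line that would change.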

Do we need to add more notes, add docs, ...?

We should discuss the deprecation warnings (see #6599) and maybe get rid of 'em to prepare for the future. If we change that, we will need to check about reindexing advice for RLN.

Suggestions on how to test this:
We still need to figure out how to test all of this.

Components affected:

Does this PR introduce a user interface change?:
Nope. Nada.

Is there a release notes update needed for this change?:
Definitely. Still needs to be taken care of. A first draft is included in the commits.

Additional documentation:
None so far.

@coveralls

coveralls commented Feb 10, 2020


Coverage decreased (-0.01%) to 19.45% when pulling b1fb21d on poikilotherm:6599-update-solr-772 into 9ce4aa0 on IQSS:develop.

Member

@pdurbin pdurbin left a comment


From what I can tell all instances of 7.3.1 have been replaced with 7.7.2. But like @poikilotherm said, a release note needs to be added.

@pdurbin
Member

pdurbin commented Feb 10, 2020

We should discuss the deprecation warnings (see #6599)

How frequent are these warnings? Only on startup? Or are they emitted constantly and there's a risk of filling up a disk?

@poikilotherm
Contributor Author

I only see them at startup, when config is parsed.

Member

@pdurbin pdurbin left a comment


Looks good! (I haven't tested anything.) Thanks, @poikilotherm !

@kcondon
Contributor

kcondon commented Feb 10, 2020

Quick q: has anyone in Dev reviewed the Solr release notes to see whether there are any obvious points of concern prior to testing?

@pdurbin
Member

pdurbin commented Feb 10, 2020

@kcondon I haven't. @poikilotherm, have you?

@kcondon
Contributor

kcondon commented Feb 10, 2020

@pdurbin I guess it was really a request if it hasn't been done. Thanks!

@poikilotherm
Contributor Author

I have. I couldn't find anything relevant for us, but this should be verified in-depth by someone else. 4 eyes see more than 2.

@pdurbin
Member

pdurbin commented Feb 10, 2020

I just looked at the "upgrade notes" at https://lucene.apache.org/solr/7_7_2/changes/Changes.html for all the versions (screenshot below) and I don't see anything I'm concerned about. We don't use fancy features in Solr. I don't think any of these upgrade notes pertain to our use of Solr.

[Screenshot: upgrade notes from the Solr 7.7.2 Changes page]

@kcondon
Contributor

kcondon commented Feb 10, 2020

@poikilotherm @pdurbin Thanks for checking!

@kcondon kcondon self-assigned this Feb 10, 2020
@donsizemore
Contributor

I recompiled v4.19 with @poikilotherm's change to pom.xml, deployed it to https://payara5.odum.unc.edu and upgraded Solr to 7.7.2 on that host, FWIW.

Only entry in server.log so far:

The web application [unknown] registered the JDBC driver [org.apache.solr.client.solrj.io.sql.DriverImpl] but failed to unregister it when the web application was stopped. To prevent a memory leak, the JDBC Driver has been forcibly unregistered.

@poikilotherm
Contributor Author

poikilotherm commented Feb 10, 2020

I see the same error message with current SolrJ 7.3.1, so no blocker 😌

@poikilotherm poikilotherm changed the title from "6599 update solr 772" to "6633 update solr 772" Feb 12, 2020
@kcondon
Contributor

kcondon commented Feb 13, 2020

@poikilotherm Quick q: would it not be simpler to just install the new version rather than upgrade?

@poikilotherm
Contributor Author

poikilotherm commented Feb 13, 2020

That might depend on how much time it takes to reindex and how much downtime is OK for your installation. I have no experience with big installations with 100,000+ datasets. Maybe @4thikonov can share his experience with his 50,000-dataset import to give us an idea?

On second thought: as long as we don't touch the schema, a reindex shouldn't be necessary. Also, reading the upgrade docs again, the process is all about installing the new version, moving your collection into place, and starting Solr.
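
A rough sketch of the "install the new version, put the collection in place, start Solr" sequence, simulated in a scratch directory. Every path and the `collection1` name are placeholders chosen for illustration, not taken from this PR or from any installation guide, and the stop/start steps appear only as comments because they need a real Solr install.

```shell
# Simulate two side-by-side Solr installs under a temp directory (hypothetical layout).
root="$(mktemp -d)"
old="$root/solr-7.3.1/server/solr"   # placeholder for the old install
new="$root/solr-7.7.2/server/solr"   # placeholder for the new install
mkdir -p "$old/collection1/conf" "$old/collection1/data/index" "$new"
echo 'name=collection1' > "$old/collection1/core.properties"
echo 'placeholder segment' > "$old/collection1/data/index/segments_1"

# On a real host, Solr would be stopped here (for example "bin/solr stop").

# Copy rather than move, so the old install stays intact as a rollback option.
cp -R "$old/collection1" "$new/"

# On a real host, the new Solr would be started here (for example "bin/solr start").
ls "$new/collection1"
```

Copying instead of moving costs disk space, but it keeps a known-good index around until the new version has been verified.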

@kcondon
Contributor

kcondon commented Feb 13, 2020

@poikilotherm Thanks. It takes ~18hrs to fully reindex Harvard. What I was trying to figure out is:

  1. Whether this PR requires installations to upgrade because of a changed (and incompatible?) SolrJ client.
  2. Whether the index is also incompatible and so needs migrating, or clearing and rebuilding.
  3. Whether your reindex instructions could be done incrementally, or whether you were calling for a clean and reindex.
  4. Our Solr installation instructions are relatively simple, but I noticed that the upgrade link you'd posted mentions, in Taking Solr to Production, a Solr installation script that they seem to recommend. This is likely beyond the scope of this ticket, but I wondered whether our instructions could use a review/update.

@poikilotherm
Contributor Author

Ad 1) I have no idea if SolrJ 7.7 is compatible with Solr 7.3, and I don't even know where to look for such a statement. IMHO it might be better to upgrade, as our old SolrJ version has a CVE. We could also test this, but upgrading is always better IMHO...

Ad 2) We are only doing minor upgrades. Index is compatible from what I see in the release docs and upgrade notes.

So there is no need to reindex unless we change the schema config or changes in a metadata block like citation.tsv make a reindex necessary.

If you're asking the paranoid me, I'd do an update test with sample data...

Ad 3) I'm pretty sure that neither Dataverse nor Solr supports partial reindexes. An in-place reindex should be fine; no need to drop and rebuild. That should be much faster too, shouldn't it?

@pdurbin
Member

pdurbin commented Feb 13, 2020

@poikilotherm heads up that I tweaked your release note at b1fb21d. @kcondon and I worked on it together.

@pdurbin
Member

pdurbin commented Feb 13, 2020

I'm pretty sure that neither Dataverse nor Solr supports partial reindexes.

Well, you can reindex individual datasets, for example: http://guides.dataverse.org/en/4.19/admin/solr-search-index.html#reindexing-datasets

@kcondon kcondon merged commit 317e4ff into IQSS:develop Feb 13, 2020
@poikilotherm
Contributor Author

Thank you to everyone involved for the quick solution and for all the effort you put in so eagerly and willingly! Very much appreciated!

@kcondon
Contributor

kcondon commented Feb 19, 2020

@poikilotherm @pdurbin After some additional testing, post merge, I'm finding some strange behavior:
performance is really poor on our production db copy, and the pattern of indexing and logging has changed. After two hours it had indexed 5 dataverses, 27 datasets, and 399 files. The current system indexes all the dataverses first, then the datasets, and indexes all 6000+ dataverses in 15 minutes. After I switched back to the prod release and Solr (same db, same gfish, Solr 7.3.0 on the same box as 7.7.2), good performance returned. I'm not seeing anything in the server logs to indicate the issue, and gfish/java is showing 100% CPU when running with the new solr/war. I'm wondering if I have something misconfigured? I will try the new war with the current Solr config to eliminate that from the equation. Any help or suggestions are appreciated.

@poikilotherm
Contributor Author

You are testing with full text indexing turned on, right? If so, could you try again with fulltext indexing turned off and see if performance is back again?

Any chance we can create a representative test data set so this could be part of a load test in CI? If you have some time to spare: any chance to test this with the current dataverse-sample-data to check if this happens with it, too? Reproducible problems are so much easier to debug... 😉

@kcondon
Contributor

kcondon commented Feb 19, 2020

@poikilotherm I am not testing with full text indexing and I am using the same db for both current and new solr test scenarios.

Ok, using the process of elimination, I deployed your war file using the existing production Solr version that had been performing well and now see the same performance issues. So it appears to be something in the PR? Apologies for not catching it sooner. It passed basic functional testing, and I had some config issues with my prod/volume test but saw this as lower risk based on the small code change.

Last update: I am able to use the production 4.19 war against solr 7.7.2 and it works as expected in terms of performance and logging.

I will try a build from develop just in case I have a bad build for some reason.

Still working on the develop approach; I hit a snag with another PR. Will ask @djbrooke for input. We do not have any logging to speak of for this problem, so adding some logging might help.

@kcondon
Contributor

kcondon commented Feb 20, 2020

@poikilotherm @pdurbin So it looks like it is neither the SolrJ client version nor the Solr server, but something in the develop branch related to indexing a particular dataverse we have in production, Murray Research Archive. I'm not sure yet what the problem might be, but: 1. the dv fails to index with no error; 2. batch indexing is impacted in a weird way but still partly functions; 3. logging is impacted and does not help with understanding the problem. I think 2 and 3 are preexisting, and 1 might be related to dv metadata changes in this branch?

Effectively I can reproduce part of the behavior by trying to index the last attempted dv in the logs:
curl http://localhost:8080/api/admin/index/dataverses/10

I've created a separate issue for this: #6665. I do not think it has to do with this pr at this point until we learn more.
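
A small sketch for timing that single-dataverse request, so a slow or hanging index call stands out even when the logs say nothing. The `/api/admin/index/dataverses/<id>` path is the one from the curl line above; `BASE_URL`, the 60-second limit, the `DRY_RUN` switch, and the helper names `index_url` and `time_index` are assumptions made for this illustration. It defaults to a dry run, so nothing is sent unless `DRY_RUN=0` is set.

```shell
# Base URL of the Dataverse instance (assumed default; override via the environment).
BASE_URL="${BASE_URL:-http://localhost:8080}"

index_url() {
  # $1 = object type (e.g. dataverses), $2 = database id
  printf '%s/api/admin/index/%s/%s\n' "$BASE_URL" "$1" "$2"
}

time_index() {
  url="$(index_url "$1" "$2")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run: only show what would be executed.
    echo "would run: curl --max-time 60 $url"
  else
    # Print the HTTP status and total request time; give up after 60 seconds.
    curl -s -o /dev/null --max-time 60 -w '%{http_code} %{time_total}s\n' "$url"
  fi
}

time_index dataverses 10
```

Running it with `DRY_RUN=0` against a test instance prints one status-and-duration line per call, which makes it easy to compare the same id across builds.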

@pdurbin
Member

pdurbin commented Feb 25, 2020

I do not think it has to do with this pr at this point until we learn more.

I agree. Judging from investigation by @sekmiller the problems mentioned above seem to have been introduced in pull request #6564. #6665 is the issue to watch for updates.

Successfully merging this pull request may close these issues.

Upgrade from Solr 7.3.1 to 7.7.2
6 participants