update solr to prepare for 9.7 merge #2

Open
wants to merge 180 commits into
base: feature/SOLR-17334

@ellaeln ellaeln commented Aug 2, 2024

Bringing this branch forward so it can be merged, hopefully into 9.7.

patsonluk and others added 30 commits June 17, 2024 22:43
…to ZK (apache#2438)

Removed the logic that registers the synthetic collection/core on a Coordinator node in ZooKeeper.
This simplifies the code flow and avoids confusing external tools that mistake the synthetic collection for an actual collection.

This is achieved by the introduction of SyntheticSolrCore which is created by CoordinatorHttpSolrCall.
SyntheticSolrCore provides a shortcut creation route that bypasses ZK registration.
The changes made by SOLR-16962 to ZkController and MoveReplicaCmd
resolve the conflation of ulog and tlog dir, and remove some
related hacky compensations for this conflation. The nightly test
MoveReplicaHdfsFailoverTest was still compensating for this issue
when it no longer needs to, and was thus failing.
The IndexUpgraderTool doesn't have a clear use case for Solr users, and
after some discussion on the mailing list we decided documenting it
isn't worth the confusion it causes for users.

This commit removes the related page from our ref-guide.
…pache#2501)

* Switch to URI 
* Added URL to the forbidden API 

-------------

Co-authored-by:  Uwe Schindler <uschindler@apache.org>
Co-authored-by: David Smiley <dsmiley@apache.org>
Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
---------

Co-authored-by: Eric Pugh <epugh@opensourceconnections.com>
…che#1557)

Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
Co-authored-by: Michael Gibney <michael@michaelgibney.net>
…on commands (apache#1768)

This goes through all the CLI tools and moves them to the same long-form option style, replacing a mishmash of older formats. For example, -zkHost is now --zk-host. The pattern applies to both the Windows and Linux bin/solr commands. The old forms are now deprecated, though they continue to work. The -u short option for --credentials is now used everywhere, which means the Assert tool's "--same-user" option no longer has a "-u" short form.
---------

Co-authored-by: Jan Høydahl <jh@cominvent.com>
Co-authored-by: Solr Bot <125606113+solrbot@users.noreply.github.com>
Co-authored-by: Jan Høydahl <janhoy@apache.org>
Co-authored-by: Jason Gerlowski <gerlowskija@apache.org>
This migrates the remainder of Solr's "filestore" APIs to the JAX-RS framework.
No cosmetic changes were made in the process, with the small exception of
folding the internal "local delete" functionality into the existing delete API using
a new `localDelete` boolean query param.
QueryResult refactoring so that it's only returned from SolrIndexSearcher instead of
being provided to it.
Deprecated APIs in 9x; should be removed later.
Deprecated old methods in SolrIndexSearcher that are less clear than using QueryCommand.
Deprecated related old methods in QueryPluginUtils; some should move to DebugComponent.
Removed "env" methods because Solr code should almost always access global configuration via system properties.  Properties are augmented with the "env" nonetheless.
Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
1. fix live-update of configuration
2. fix slot borrowing bug
3. cleaner state tracking via try-with-resources
4. fix refguide documentation to mention limitations and request header requirement
An alternative to the "Prometheus Exporter" in which each Solr node can return the Prometheus format natively from the MetricsHandler using `wt=prometheus` param.  It's much faster and architecturally simpler, albeit less flexible.

Co-authored-by: mbiscocho <mbiscocho@bloomberg.net>
Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
Co-authored-by: David Smiley <dsmiley@salesforce.com>
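
As a rough illustration of the `wt=prometheus` option described in that commit, the sketch below builds the URL a scraper might request. The host, port, and class name are assumptions for illustration; the `/solr/admin/metrics` path is assumed to be the node's MetricsHandler endpoint.

```java
import java.net.URI;

// Illustrative helper only; not part of Solr or SolrJ.
final class PrometheusMetricsUrl {

    /** Builds the metrics URL for one node, asking for Prometheus-formatted output. */
    static URI metricsUri(String nodeBaseUrl) {
        // Drop a trailing slash so the path is not doubled up
        String base = nodeBaseUrl.endsWith("/")
                ? nodeBaseUrl.substring(0, nodeBaseUrl.length() - 1)
                : nodeBaseUrl;
        // wt=prometheus is the parameter named in the commit message
        return URI.create(base + "/solr/admin/metrics?wt=prometheus");
    }

    public static void main(String[] args) {
        // Assumed local node address
        System.out.println(metricsUri("http://localhost:8983"));
    }
}
```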
…eld loading overhead if documentCache==null (apache#2551)

* SOLR-17349: (adjusted) SolrDocumentFetcher should always skip lazy field loading overhead if documentCache==null

this also reverts the code change from 390c30f,
which is obviated by directly making `enableLazyFieldLoading` conditional
on presence of `documentCache`.
Prefix queries have a memory cost that increases in proportion to the
number of indexed terms that start with that prefix.  This can cause
stability problems, especially for short prefixes, which can match a
large number of indexed terms.

This commit introduces a new solrconfig.xml setting,
`minPrefixQueryTermLength`, similar to the existing
`maxBooleanClauses`.  This setting causes an error to be thrown any
time a prefix query is created with a prefix shorter than the configured
minimum.

The limit is set at '-1' in the default configset, essentially disabling it.  But
users may change the value in their own configs or use the
'solr.query.minPrefixLength' property as an override.  The limit may also be
overridden on a per-query basis by providing a 'minPrefixQueryTermLength'
local param.
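
A minimal sketch of the check this commit message describes, not the actual Solr implementation. The class and method names are hypothetical; only the `-1` default and the per-query override come from the text above.

```java
// Hypothetical illustration of a minimum-prefix-length guard; names are not Solr's.
final class PrefixLengthCheck {

    /** A per-query override (for example from a local param) wins over the configured limit. */
    static int effectiveMinLength(int configuredMin, Integer perQueryOverride) {
        return perQueryOverride != null ? perQueryOverride : configuredMin;
    }

    /** Throws if the prefix is shorter than the limit; a limit of -1 never rejects anything. */
    static void ensurePrefixAllowed(String prefix, int minLength) {
        if (prefix.length() < minLength) {
            throw new IllegalArgumentException(
                    "Prefix '" + prefix + "' is shorter than the minimum length " + minLength);
        }
    }

    public static void main(String[] args) {
        ensurePrefixAllowed("a", -1);                     // limit disabled: accepted
        ensurePrefixAllowed("ab", effectiveMinLength(3, 2)); // per-query override of 2: accepted
        try {
            ensurePrefixAllowed("a", 2);                  // too short: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```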
iamsanjay and others added 20 commits August 30, 2024 12:10
… command. (apache#2675)

the repetition of the word 'solr' looks awkward, but that is how it is!
Instead of custom code that accomplishes the same, sometimes more slowly.
Delete unused BytesOutputStream.
Avoids a minor security matter of exposing a host & port in certain cases.

jetty/jetty.project#11014

Not needed in Jetty 12 (Solr 10).
"You can repeat this process until you’ve fetched as many docs as you want, or until the nextCursorMark returned matches the cursorMark you’ve already specified — indicating that there are no more results."

The pseudo code misses the first part. The extra if also prevents an unnecessary extra request.
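
The two stop conditions quoted above can be sketched as follows. This is an illustration with an in-memory stand-in for Solr, not the ref-guide's pseudo code: `PageFetcher`, `Page`, and `inMemory` are hypothetical names, and the fake cursor is simply an offset.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a cursorMark loop; the fetcher stands in for a real Solr query.
final class CursorPaging {

    /** One page of results plus the cursor to send with the next request. */
    static final class Page {
        final List<String> docs;
        final String nextCursorMark;

        Page(List<String> docs, String nextCursorMark) {
            this.docs = docs;
            this.nextCursorMark = nextCursorMark;
        }
    }

    interface PageFetcher {
        Page fetch(String cursorMark, int rows);
    }

    static List<String> fetchUpTo(PageFetcher fetcher, int maxDocs, int rows) {
        List<String> collected = new ArrayList<>();
        String cursorMark = "*"; // starting cursor value
        // Stop condition 1: we already have as many docs as we want, so no further request is sent
        while (collected.size() < maxDocs) {
            Page page = fetcher.fetch(cursorMark, Math.min(rows, maxDocs - collected.size()));
            collected.addAll(page.docs);
            // Stop condition 2: the cursor did not advance, so there are no more results
            if (page.nextCursorMark.equals(cursorMark)) {
                break;
            }
            cursorMark = page.nextCursorMark;
        }
        return collected;
    }

    /** In-memory fetcher whose cursor is just an offset encoded as a string. */
    static PageFetcher inMemory(List<String> all) {
        return (cursorMark, rows) -> {
            int start = "*".equals(cursorMark) ? 0 : Integer.parseInt(cursorMark);
            int end = Math.min(all.size(), start + rows);
            String next = start == end ? cursorMark : String.valueOf(end);
            return new Page(new ArrayList<>(all.subList(start, end)), next);
        };
    }

    public static void main(String[] args) {
        List<String> all = List.of("d1", "d2", "d3", "d4", "d5");
        System.out.println(fetchUpTo(inMemory(all), 3, 2));
    }
}
```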
…nd returned partial results in some situations
See JIRA for motivation and discussion of particular descriptors.
The default ShardHandler implementation, HttpShardHandler, sends all
shard-requests serially, only parallelizing the waiting and parsing of
responses.  This works great for collections with few shards, but as the
number of shards increases the serialized sending of shard-requests adds
a larger and larger overhead to the overall request (especially when
auth and PKI are done at request-sending time).

This commit fixes this by introducing an alternate ShardHandler
implementation, geared towards collections with many shards.  This
ShardHandler uses an executor to parallelize both request sending and
response waiting/parsing.  This consumes more CPU, but greatly
reduces the latency/QTime observed by users querying many-shard
collections.
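
The general idea of parallelizing both sending and response handling can be sketched with a plain executor. This is not the new ShardHandler's code: `ParallelFanOut`, `queryAll`, and the `sendAndParse` function are hypothetical stand-ins for the per-shard work (auth, sending, waiting, parsing).

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical sketch of executor-based fan-out; not Solr's ShardHandler API.
final class ParallelFanOut {

    /** Submits one task per shard so that sending, waiting, and parsing all run on the executor. */
    static <R> List<R> queryAll(List<String> shardUrls,
                                Function<String, R> sendAndParse,
                                ExecutorService executor) {
        // Submit every shard request first so that none waits on another being sent
        List<CompletableFuture<R>> futures = shardUrls.stream()
                .map(url -> CompletableFuture.supplyAsync(() -> sendAndParse.apply(url), executor))
                .collect(Collectors.toList());
        // Then gather the results in the original shard order
        return futures.stream().map(CompletableFuture::join).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // The lambda fakes a shard response instead of doing real HTTP
            System.out.println(queryAll(List.of("shard1", "shard2", "shard3"),
                    url -> url + ":ok", pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

The trade-off matches the one stated in the commit message: more threads are busy per request, in exchange for lower end-to-end latency when there are many shards.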
Solr uses different API roots for its v1 and v2 APIs: "/solr" for v1,
and "/api" for v2.  Adding a method for each SolrRequest to indicate
which API version it belongs to will allow SolrJ to build the full
request path more reliably without relying on fragile 'instanceof'
checks.
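
A sketch of how a per-request API version could replace 'instanceof' checks when building the path. The `/solr` and `/api` roots come from the commit message; the enum, interface, and method names below are assumptions for illustration and may not match SolrJ.

```java
// Hypothetical sketch; type and method names are illustrative, not necessarily SolrJ's.
final class ApiVersionSketch {

    /** Each API version carries its own URL root. */
    enum ApiVersion {
        V1("/solr"),
        V2("/api");

        private final String root;

        ApiVersion(String root) {
            this.root = root;
        }

        String root() {
            return root;
        }
    }

    /** Stand-in for a request type that reports which API version it targets. */
    interface Request {
        String getPath();

        ApiVersion getApiVersion();
    }

    /** Builds the full URL from the request's own version, with no type checks. */
    static String buildUrl(String baseUrl, Request request) {
        return baseUrl + request.getApiVersion().root() + request.getPath();
    }

    public static void main(String[] args) {
        Request v2 = new Request() {
            public String getPath() { return "/collections"; }
            public ApiVersion getApiVersion() { return ApiVersion.V2; }
        };
        System.out.println(buildUrl("http://localhost:8983", v2));
    }
}
```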
@tboeghk
Collaborator

tboeghk commented Sep 10, 2024

Sorry, didn't see this PR in time. As work is continuing in the new PR I assume it's safe to close this PR?

epugh and others added 9 commits September 10, 2024 12:35
…ions (apache#2684)

* move -a and -addlopts to deprecated and use --jvm-opts for long format.
* this option is only called by scripts, it's not called by end users, so we don't need deprecation in RunExampleTool
…as Solr. (apache#2698)

* Introduce new header for help text to clarify when you must run on the same server as Solr.

* small formatting tweak to improve whitespace in cli output

* use similar text for similar meaning
Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
Co-authored-by: Christine Poerschke <cpoerschke@apache.org>