In a wide crawl, we appear to be hitting URLs that end with `*`, which leads to queries to OutbackCDX that look like:
/dc?limit=1&sort=reverse&url=https%3A%2F%2Fhips.hearstapps.com%2Ftoc.h-cdn.co%2Fassets%2F16%2F46%2F3200x1600%2Flandscape-1479498518-cindy-crawford-rande-gerber-house.jpg%3Fresize%3D1200%3A*
The `*` on the end forces the `matchType` to be `PREFIX`. This is true even if you specify a `matchType` parameter, and even if the `*` is encoded as `%2A`.
For now, I'll work around it but I'd like to know how best to handle this situation in the future.
Thanks!
a2c4158
Oops. Looks like that's a bit of a gotcha in the design of the CDX server API.
I've implemented the solution you alluded to. Specifying `matchType=exact` will now stop wildcards from being expanded.
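For anyone building these lookups by hand, here is a rough sketch of a client-side query that passes `matchType=exact` explicitly. The base address `http://localhost:8080/dc` and the helper names are assumptions for illustration only; the `url`, `limit`, `sort` and `matchType` parameters are the ones mentioned in this thread.

```python
from urllib.parse import urlencode


def ends_with_wildcard(target_url: str) -> bool:
    """Return True if the captured URL literally ends with '*'."""
    return target_url.endswith("*")


def build_exact_query(base: str, target_url: str, limit: int = 1) -> str:
    """Build a CDX lookup URL that asks for an exact match.

    `base` is a hypothetical OutbackCDX collection endpoint. urlencode
    percent-encodes the target URL, so a trailing '*' becomes %2A.
    """
    params = {
        "url": target_url,
        "matchType": "exact",  # ask the server not to expand the trailing '*'
        "limit": limit,
        "sort": "reverse",
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    target = "https://example.com/a.jpg?resize=1200:*"
    if ends_with_wildcard(target):
        print(build_exact_query("http://localhost:8080/dc", target))
```

Against a server without the fix, the trailing `*` would still be treated as a prefix query as described above, so a crawler may also want to check `ends_with_wildcard` and handle those URLs separately.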