archived https URLs aren't replaying in Ubuntu with some Tomcat versions #398

Closed
ldko opened this issue Apr 24, 2019 · 0 comments

Comments

@ldko
Member

ldko commented Apr 24, 2019

Discussion about this happened on IIPC Slack. For reference, I am putting some of the details in this issue to go along with PR #397, opened by @peveikko.

On Ubuntu (not an issue on CentOS or RHEL), with at least some Tomcat versions, OpenWayback returns "Resource Not in Archive" for archived URIs with the https scheme and suggests searching under http://https/www. The same pages do work with the http scheme.

@peveikko noted: For https URLs, everything works fine on CentOS/RHEL, but got this behaviour on 3 different Ubuntu machines. Also tried with different Tomcat/Java versions.

@anjackson supplied the following:
Okay, so I think this is to do with a CVE https://nvd.nist.gov/vuln/detail/CVE-2015-5174 -- I think Tomcat have added some URL clean-up/normalisation, meaning that later versions of Tomcat 6/7/8 may all have the same problem. This doesn't affect http URLs, perhaps because this code reinserts any stripped slash?

// This looks a little confusing: We're trying to fixup an incoming
// request URL that starts with:
// "http:/www.archive.org"
// so it becomes:
// "http://www.archive.org"
// (note the missing second "/" in the first)
//
// if that is not the case, then see if the incoming scheme
// is known, adding an implied "http://" scheme if there doesn't appear
// to be a scheme..
// TODO: make the default "http://" configurable.
if (!urlStr.startsWith(UrlOperations.HTTP_SCHEME)) {
    if (urlStr.startsWith("http:/")) {
        urlStr = UrlOperations.HTTP_SCHEME + urlStr.substring(6);
    } else {

...Easiest thing might be to modify the WaybackRequest to explicitly support /https:/host/... (assuming I've got this right of course)
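
To make the suggestion above concrete, here is a minimal sketch of the kind of fix-up being discussed, extended to cover https as well as http. It is not the OpenWayback code and not necessarily what PR #397 does: the class `SchemeSlashFix` and the method `restoreSchemeSlashes` are hypothetical names, and the sketch assumes the only damage done by the container is collapsing `//` to `/` directly after the scheme.

```java
public final class SchemeSlashFix {
    // Hypothetical helper for illustration only; not part of OpenWayback.
    private static final String[] SCHEMES = {"http", "https"};

    private SchemeSlashFix() {}

    /**
     * Restores the second slash after a known scheme when "//" has been
     * collapsed to "/" (assumed here to happen in the servlet container).
     * Any other input is returned unchanged.
     */
    public static String restoreSchemeSlashes(String urlStr) {
        for (String scheme : SCHEMES) {
            String collapsed = scheme + ":/";  // e.g. "https:/"
            String full = scheme + "://";      // e.g. "https://"
            if (urlStr.startsWith(collapsed) && !urlStr.startsWith(full)) {
                // Keep everything after the single slash and prepend the full scheme.
                return full + urlStr.substring(collapsed.length());
            }
        }
        return urlStr;
    }

    public static void main(String[] args) {
        System.out.println(restoreSchemeSlashes("https:/www.archive.org/"));
        System.out.println(restoreSchemeSlashes("http:/www.archive.org/"));
        System.out.println(restoreSchemeSlashes("https://www.archive.org/"));
    }
}
```

Deriving the prefix length from the scheme string, instead of hard-coding `substring(6)` as in the http-only snippet, means the same branch handles both schemes. URLs that already have both slashes pass through untouched.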
