Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ProxyUrl: Determine whether to proxy via web service #3199

Merged
merged 18 commits into from
Nov 13, 2023
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
12 changes: 12 additions & 0 deletions config/vufind/config.ini
Original file line number Diff line number Diff line change
Expand Up @@ -1449,6 +1449,18 @@ replace_other_urls = true
; to false.
;prefixLinks = true

; Use a web service to determine whether to prefix each link, ignoring prefixLinks
; above. Query the configured URL via HTTP GET and a single query parameter, 'url',
; set to the link domain. The web service should return a body with a single number,
; '1' to prefix and '0' if not. The web service should be hosted closely such that
; query time is minimal.
; See https://github.com/lehigh-university-libraries/ezproxy-url-checker for an OSS
; implementation of this protocol that extracts the answers from EZproxy config.
;prefixLinksWebServiceUrl = http://localhost:8888/proxyUrl

; Duration (seconds) to cache web service response data. Default is 600.
;prefixLinksWebServiceCacheLifetime = 600

; Uncomment the following line and change the password to something secret to enable
; EZproxy ticket authentication.
;secret = "verysecretpassword"
Expand Down
78 changes: 74 additions & 4 deletions module/VuFind/src/VuFind/View/Helper/Root/ProxyUrl.php
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,11 @@

namespace VuFind\View\Helper\Root;

use Exception;
use Laminas\Cache\Storage\Adapter\AbstractAdapter as CacheAdapter;

use function intval;

/**
* Proxy URL view helper
*
Expand All @@ -38,8 +43,14 @@
* @license http://opensource.org/licenses/gpl-2.0.php GNU General Public License
* @link https://vufind.org/wiki/development Wiki
*/
class ProxyUrl extends \Laminas\View\Helper\AbstractHelper
class ProxyUrl extends \Laminas\View\Helper\AbstractHelper implements
\Laminas\Log\LoggerAwareInterface,
\VuFindHttp\HttpServiceAwareInterface
{
use \VuFind\Cache\CacheTrait;
use \VuFind\Log\LoggerAwareTrait;
use \VuFindHttp\HttpServiceAwareTrait;

/**
* VuFind configuration
*
Expand All @@ -51,10 +62,13 @@ class ProxyUrl extends \Laminas\View\Helper\AbstractHelper
* Constructor
*
* @param \Laminas\Config\Config $config VuFind configuration
* @param CacheAdapter $cache Cache for web service responses
*/
public function __construct($config = null)
public function __construct($config = null, CacheAdapter $cache = null)
{
$this->config = $config;
$this->setCacheStorage($cache);
$this->cacheLifetime = intval($config->EZproxy->prefixLinksWebServiceCacheLifetime ?? 600);
}

/**
Expand All @@ -66,10 +80,66 @@ public function __construct($config = null)
*/
public function __invoke($url)
{
$usePrefix = !isset($this->config->EZproxy->prefixLinks)
|| $this->config->EZproxy->prefixLinks;
$useWebService = $this->config->EZproxy->prefixLinksWebServiceUrl ?? false;
if ($useWebService) {
$usePrefix = $this->checkUrl($url) ?? $this->checkConfig();
maccabeelevine marked this conversation as resolved.
Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On closer examination, I'm not really sure if it makes sense to use checkConfig() as the fallback here. It looks to me like the main purpose of the prefixLinks setting is to allow prefixing to be globally disabled. I think if somebody configures both prefixLinks and prefixLinksWebServiceUrl at the same time, that's essentially a configuration error. I suppose it doesn't necessarily hurt anything to leave this as it is, but it just feels to me like it might not be the best option. Maybe we should just fall back to true instead of falling back to checkConfig, for example...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree on this, although not strongly. That fallback to checkConfig() is going to happen on any exception checking the web service, such as a connection error, which itself is likely enough from time to time (even if the services are co-hosted; things break). So people should think about what the fallback default should be, i.e. it's better as a configured param than hard-coded, and so we may as well reuse prefixLinks. Admittedly the config.ini isn't clear that it's a fallback, so I just fixed that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that makes sense -- I was thinking about it differently, but tweaking the comments is a good solution. :-)

} else {
$usePrefix = $this->checkConfig();
}

return ($usePrefix && isset($this->config->EZproxy->host))
? $this->config->EZproxy->host . '/login?qurl=' . urlencode($url)
: $url;
}

/**
* Return the configured prefixLinks setting.
*
* @return bool The configured setting, or the default
*/
protected function checkConfig()
{
return $this->config->EZproxy->prefixLinks ?? true;
}

/**
* Check whether the given URL requires the proxy prefix. Cache the response.
*
* @param string $url The raw URL to check
*
* @return mixed Whether the URL should be prefixed, or null if it can't be determined
*/
protected function checkUrl($url)
{
$domain = parse_url($url, PHP_URL_SCHEME)
. '://'
. parse_url($url, PHP_URL_HOST);
$cacheKey = "proxyUrl-domainToUsePrefix-$domain";
$usePrefix = $this->getCachedData($cacheKey);
if (null === $usePrefix) {
$usePrefix = $this->queryWebService($domain);
$this->putCachedData($cacheKey, $usePrefix);
}
return $usePrefix;
}

/**
* Query the web service on whether to prefix URLs to a given domain.
*
* @param $domain The domain
*
* @return mixed Whether the URL should be prefixed, or null if it can't be determined
*/
protected function queryWebService($domain)
{
$prefixLinksWebServiceUrl = $this->config->EZproxy->prefixLinksWebServiceUrl;
try {
$response = $this->httpService->get($prefixLinksWebServiceUrl, ['url' => $domain]);
$responseData = $response->getContent();
} catch (Exception $ex) {
demiankatz marked this conversation as resolved.
Show resolved Hide resolved
$this->logError('Exception during EZproxy web service request: ' . $ex->getMessage());
return null;
maccabeelevine marked this conversation as resolved.
Show resolved Hide resolved
}
return '1' === $responseData;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it might be a good idea to trim $responseData? I tested this by putting a text file in VuFind's public directory and pointing the configuration at that -- I could then edit the file to change it to 0 or 1 to test both cases. However, my editor initially added line breaks, and 1\n is interpreted as "not 1." This caused me a little bit of confusion, and I imagine it's possible if somebody builds their own custom web service that the responses might end up having stray whitespace in them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, done.

}
}
Original file line number Diff line number Diff line change
Expand Up @@ -70,6 +70,8 @@ public function __invoke(
}
$config = $container->get(\VuFind\Config\PluginManager::class)
->get('config');
return new $requestedName($config);
$cache = $container->get(\VuFind\Cache\Manager::class)
->getCache('object');
return new $requestedName($config, $cache);
}
}