Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixes undefined offset by adding an isset test #1103

Merged
merged 2 commits into from
Feb 17, 2021

Conversation

kouralex
Copy link
Contributor

@kouralex kouralex commented Dec 7, 2020

Whilst checking php error log for a Skosmos instance I noticed that the line 467 below

/**
* Invokes the alphabetical listing for a specific vocabulary.
*/
public function invokeAlphabeticalIndex($request)
{
$lang = $request->getLang();
$this->setLanguageProperties($lang);
$template = $this->twig->loadTemplate('alphabetical-index.twig');
$vocab = $request->getVocab();
$offset = ($request->getQueryParam('offset') && is_numeric($request->getQueryParam('offset')) && $request->getQueryParam('offset') >= 0) ? $request->getQueryParam('offset') : 0;
if ($request->getQueryParam('limit')) {
$count = $request->getQueryParam('limit');
} else {
$count = ($offset > 0) ? null : 250;
}
$contentLang = $request->getContentLang();
$allAtOnce = $vocab->getConfig()->getAlphabeticalFull();
if (!$allAtOnce) {
$letters = $vocab->getAlphabet($contentLang);
$letter = $request->getLetter();
if ($letter === '') {
$letter = $letters[0];
}
$searchResults = $vocab->searchConceptsAlphabetical($letter, $count, $offset, $contentLang);
} else {
$letters = null;
$searchResults = $vocab->searchConceptsAlphabetical('*', null, null, $contentLang);
}
can cause Undefined offset: 0 PHP Notice.

By following the code it calls it seems like the NULL value is then passed here:

/**
* Generates sparql query clauses used for creating the alphabetical index.
* @param string $letter the letter (or special class) to search for
* @return array of sparql query clause strings
*/
private function formatFilterConditions($letter, $lang) {
$useRegex = false;
if ($letter == '*') {
$letter = '.*';
$useRegex = true;
} elseif ($letter == '0-9') {
$letter = '[0-9].*';
$useRegex = true;
} elseif ($letter == '!*') {
$letter = '[^\\\\p{L}\\\\p{N}].*';
$useRegex = true;
}
# make text query clause
$lcletter = mb_strtolower($letter, 'UTF-8'); // convert to lower case, UTF-8 safe
if ($useRegex) {
$filtercondLabel = $lang ? "regex(str(?label), '^$letter$', 'i') && langMatches(lang(?label), '$lang')" : "regex(str(?label), '^$letter$', 'i')";
$filtercondALabel = $lang ? "regex(str(?alabel), '^$letter$', 'i') && langMatches(lang(?alabel), '$lang')" : "regex(str(?alabel), '^$letter$', 'i')";
} else {
$filtercondLabel = $lang ? "strstarts(lcase(str(?label)), '$lcletter') && langMatches(lang(?label), '$lang')" : "strstarts(lcase(str(?label)), '$lcletter')";
$filtercondALabel = $lang ? "strstarts(lcase(str(?alabel)), '$lcletter') && langMatches(lang(?alabel), '$lang')" : "strstarts(lcase(str(?alabel)), '$lcletter')";
}
return array('filterpref' => $filtercondLabel, 'filteralt' => $filtercondALabel);
}

where at line 1207 it is again (luckily) changed back to an empty string. This will, however, cause a filter clause that has a STRSTARTS() with an empty argument, which is unnecessary and will always evaluate to True.

Whilst this PR fixes the PHP notice, it won't change the behavior (i.e., the query as it is, now). I wonder if it would be more appropriate to, in this case, change the letter value to "*", for which there is a special treatment (unlike for empty string). However, now that I look into that, it seems like the STRSTARTS part is unnecessary for the aforementioned "*" value.

Additionally, I just noticed that this also affects the JenaTextSparql:

/**
* Generates the jena-text-specific sparql query used for rendering the alphabetical index.
* @param string $letter the letter (or special class) to search for
* @param string $lang language of labels
* @param integer $limit limits the amount of results
* @param integer $offset offsets the result set
* @param array|null $classes
* @param boolean $showDeprecated whether to include deprecated concepts in the result (default: false)
* @param \EasyRdf\Resource|null $qualifier alphabetical list qualifier resource or null (default: null)
* @return string sparql query
*/
public function generateAlphabeticalListQuery($letter, $lang, $limit = null, $offset = null, $classes = null, $showDeprecated = false, $qualifier = null)
{
if ($letter == '*' || $letter == '0-9' || $letter == '!*') {
// text index cannot support special character queries, use the generic implementation for these
return parent::generateAlphabeticalListQuery($letter, $lang, $limit, $offset, $classes, $showDeprecated, $qualifier);
}
$gc = $this->graphClause;
$classes = ($classes) ? $classes : array('http://www.w3.org/2004/02/skos/core#Concept');
$values = $this->formatValues('?type', $classes, 'uri');
$limitandoffset = $this->formatLimitAndOffset($limit, $offset);
# make text query clause
$lcletter = mb_strtolower($letter, 'UTF-8'); // convert to lower case, UTF-8 safe
$langClause = $this->generateLangClause($lang);
$textcondPref = $this->createTextQueryCondition($letter . '*', 'skos:prefLabel', $langClause);
$textcondAlt = $this->createTextQueryCondition($letter . '*', 'skos:altLabel', $langClause);
$orderbyclause = $this->formatOrderBy("LCASE(?match)", $lang) . " STR(?s) LCASE(STR(?qualifier))";
$qualifierClause = $qualifier ? "OPTIONAL { ?s <" . $qualifier->getURI() . "> ?qualifier }" : "";
$filterDeprecated="";
if(!$showDeprecated){
$filterDeprecated="FILTER NOT EXISTS { ?s owl:deprecated true }";
}
$query = <<<EOQ
SELECT DISTINCT ?s ?label ?alabel ?qualifier
WHERE {
$gc {
{
$textcondPref
FILTER(STRSTARTS(LCASE(STR(?match)), '$lcletter'))
FILTER EXISTS { ?s skos:prefLabel ?match }
BIND(?match as ?label)
}
UNION
{
$textcondAlt
FILTER(STRSTARTS(LCASE(STR(?match)), '$lcletter'))
FILTER EXISTS { ?s skos:altLabel ?match }
BIND(?match as ?alabel)
{
?s skos:prefLabel ?label .
FILTER (langMatches(LANG(?label), '$lang'))
}
}
?s a ?type .
$qualifierClause
$filterDeprecated
} $values
}
ORDER BY $orderbyclause $limitandoffset
EOQ;
return $query;
}
}

Before merging it should be discussed whether something else should be done to both of these SPARQL parts, namely, if there should be a more special treatment for "*" and whether or not the letter value should be changed to that very same value.

@sonarqubecloud
Copy link

sonarqubecloud bot commented Dec 7, 2020

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@codecov
Copy link

codecov bot commented Dec 7, 2020

Codecov Report

Merging #1103 (254c2c6) into master (254c2c6) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff            @@
##             master    #1103   +/-   ##
=========================================
  Coverage     60.30%   60.30%           
  Complexity     1592     1592           
=========================================
  Files            32       32           
  Lines          4431     4431           
=========================================
  Hits           2672     2672           
  Misses         1759     1759           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 254c2c6...5293aca. Read the comment docs.

@kouralex kouralex marked this pull request as draft December 7, 2020 21:34
@osma
Copy link
Member

osma commented Feb 17, 2021

The situation where this can happen is when a vocabulary has been configured to have a certain language, but there are actually no prefLabels with that language tag. The code in WebController tries to find the first alphabet letter for which prefLabels exist, but when there are no such prefLabels, the list is empty. This is basically a configuration error but these things happen, even in production as you've noticed by checking the logs.

I think we need a unit test to check for this situation. The suggested fix may be OK, I need to double-check. The sane result in this not-so-sane situation would be to show an empty list of alphabet letters as well as an empty alphabetical listing of concepts.

@osma osma marked this pull request as ready for review February 17, 2021 12:23
@sonarqubecloud
Copy link

SonarCloud Quality Gate failed.

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
23.1% 23.1% Duplication

@osma
Copy link
Member

osma commented Feb 17, 2021

I will merge this assuming the Travis CI tests pass. The SonarCloud complaint is irrelevant, it complains about identical blocks in unit tests (true, but harmless).

@osma osma self-assigned this Feb 17, 2021
@osma osma modified the milestones: Next Tasks, 2.9 Feb 17, 2021
@osma
Copy link
Member

osma commented Feb 17, 2021

Can't wait for Travis, but the tests pass locally, so merging this.

@osma osma merged commit d02bcfe into master Feb 17, 2021
@osma osma deleted the fix-undefined-offset-in-webcontroller branch February 17, 2021 14:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants