Problem with SimpleXML processing COUNTER report namespaces. #485

Closed
vinny75 opened this issue Sep 13, 2018 · 17 comments

Comments

@vinny75
Contributor

vinny75 commented Sep 13, 2018

PHP Warning: simplexml_load_string(): namespace error : Namespace prefix rr on Report is not defined in /var/www/html/coral3/usage/admin/classes/domain/SushiService.php on line 611, referer: /coral3/usage/sushi.php

PHP Fatal error: Call to a member function attributes() on null in /var/www/html/coral3/usage/admin/classes/domain/SushiService.php on line 630, referer: /coral3/usage/sushi.php

Traced these errors back to:
610 $clean_xml = str_ireplace(['s:','SOAP-ENV:','SOAP:'],'',$string);
611 $xml = simplexml_load_string($clean_xml);
and
629 $report = $xml->Body->ReportResponse->Report->Report;
630 $reportTypeName = $report->attributes()->Name;

These changes seem to fix it for me
610 (line deleted)
611 $xml = simplexml_load_string($string);
612 $xml->registerXPathNamespace("s", "http://schemas.xmlsoap.org/soap/envelope/");
613 $xml->registerXPathNamespace("r", "http://www.niso.org/schemas/counter");
and
629 $report = $xml->xPath("s:Body//r:Report")[0];
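
For anyone who wants to try the XPath approach in isolation, here is a minimal sketch. The sample envelope is a hypothetical, trimmed-down SUSHI response (real vendor payloads differ), and the null check is an addition that is not in the change above. The prefixes passed to registerXPathNamespace() only have to match the namespace URIs, not the prefixes used in the document.

```php
<?php
// Hypothetical sample response; the element layout is an assumption for illustration.
$string = <<<XML
<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">
  <s:Body>
    <ReportResponse xmlns="http://www.niso.org/schemas/sushi/counter">
      <Report>
        <rr:Report xmlns:rr="http://www.niso.org/schemas/counter" Name="JR1" Version="4"/>
      </Report>
    </ReportResponse>
  </s:Body>
</s:Envelope>
XML;

$xml = simplexml_load_string($string);

// These prefixes are local to the XPath queries; the document itself uses "rr", not "r".
$xml->registerXPathNamespace('s', 'http://schemas.xmlsoap.org/soap/envelope/');
$xml->registerXPathNamespace('r', 'http://www.niso.org/schemas/counter');

// xpath() returns an array of matches; guard against an empty result
// instead of calling attributes() on null.
$matches = $xml->xpath('s:Body//r:Report');
$report = $matches ? $matches[0] : null;
$reportTypeName = $report !== null ? (string) $report->attributes()->Name : null;

echo $reportTypeName, PHP_EOL;
```

This only matches when the vendor uses exactly these namespace URIs.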

@vinny75
Contributor Author

vinny75 commented Dec 10, 2018

This fails if a namespace is used within the Report element. The only reliable solution I found was to get all the namespaces used in the document, then remove them.

609  $string = file_get_contents($fName);
//        $clean_xml = str_ireplace(['s:','SOAP-ENV:','SOAP:'],'',$string);
        $xml = simplexml_load_string($string);
        $namespaces = $xml->getDocNamespaces(TRUE);
        $clean_xml = $string;
        foreach ($namespaces as $namespace => $uri) {
          // Strip from $clean_xml (not $string) so each pass builds on the previous one
          if (!empty($namespace)) $clean_xml = str_ireplace($namespace . ":", '', $clean_xml);
        }
        $xml = simplexml_load_string($clean_xml);

This seems like a bit of a hack but I couldn't get anything else to work.
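
As a rough illustration of what getDocNamespaces(TRUE) hands back, the sketch below lists the prefixes declared anywhere in a hypothetical sample document (the XML is made up for the example). The default namespace shows up under an empty-string key, which is why the loop above skips empty prefixes.

```php
<?php
// Hypothetical sample; real SUSHI responses declare more namespaces than this.
$string = '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body>'
        . '<Report xmlns="http://www.niso.org/schemas/sushi/counter">'
        . '<rr:Report xmlns:rr="http://www.niso.org/schemas/counter" Name="JR1"/>'
        . '</Report></s:Body></s:Envelope>';

$xml = simplexml_load_string($string);

// TRUE makes the lookup recursive, so declarations on child elements are included.
$namespaces = $xml->getDocNamespaces(true);

// Keys are prefixes, values are URIs; sort the prefixes for stable output.
$prefixes = array_keys($namespaces);
sort($prefixes, SORT_STRING);

foreach ($prefixes as $prefix) {
    echo ($prefix === '' ? '(default)' : $prefix), ' => ', $namespaces[$prefix], PHP_EOL;
}
```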

@haozeng0

Same error, 3.0.1

PHP Fatal error: Call to a member function attributes() on null in C:\Apache24\htdocs\coral\usage\admin\classes\domain\SushiService.php on line 630

@vinny75
Contributor Author

vinny75 commented Mar 18, 2019

Did you try my second fix? If it works for you, I'll submit a pull request to get it included in the code

@haozeng0

Hi, your second fix gives me this error:

[Mon Mar 18 10:33:17.887759 2019] [:error] [pid 5268:tid 960] [client 10.32.48.31:60819] PHP Notice: Trying to get property of non-object in C:\Apache24\htdocs\coral\usage\admin\classes\domain\SushiService.php on line 639, referer: https://archives.yu.edu/coral/usage/sushi.php
[Mon Mar 18 10:33:17.887759 2019] [:error] [pid 5268:tid 960] [client 10.32.48.31:60819] PHP Notice: Trying to get property of non-object in C:\Apache24\htdocs\coral\usage\admin\classes\domain\SushiService.php on line 639, referer: https://archives.yu.edu/coral/usage/sushi.php
[Mon Mar 18 10:33:17.887759 2019] [:error] [pid 5268:tid 960] [client 10.32.48.31:60819] PHP Fatal error: Call to a member function attributes() on null in C:\Apache24\htdocs\coral\usage\admin\classes\domain\SushiService.php on line 640, referer: https://archives.yu.edu/coral/usage/sushi.php

And my line 639: $report = $xml->Body->ReportResponse->Report->Report;

@haozeng0

The first fix works for me.
Line 611 also needs to be deleted.
Line 611: $xml = simplexml_load_string($clean_xml);

@vinny75
Contributor Author

vinny75 commented Mar 18, 2019

Which vendor were you using when you got this error?

@haozeng0

EBSCO.

@vinny75
Contributor Author

vinny75 commented Mar 18, 2019

I just tried EBSCO BR2 and JR1 and it worked fine with the second changes. Can't remember which vendor was failing on the first set.

@haozeng0

haozeng0 commented Mar 18, 2019 via email

@nkuitse
Contributor

nkuitse commented Apr 10, 2019

The idea of stripping strings like "s:" from a random XML document strikes me as utter insanity. For example, if the data includes a title like "Enemies: a love story" then the title will be changed to "Enemie a love story". Not to mention that a namespace prefix such as "reports:" will be changed to "report" without the colon.
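
To make that concrete, here is a small sketch that runs the same str_ireplace() call from line 610 over two made-up fragments (the sample strings and the mangle() helper name are only for illustration):

```php
<?php
// Same replacement list as line 610 of SushiService.php.
function mangle($s) {
    return str_ireplace(['s:', 'SOAP-ENV:', 'SOAP:'], '', $s);
}

// Element content that happens to contain "s:" loses characters.
$title = mangle('<Title>Enemies: a love story</Title>');

// A prefix that merely ends in "s" is fused with the local name.
$tag = mangle('<reports:Item/>');

echo $title, PHP_EOL; // <Title>Enemie a love story</Title>
echo $tag, PHP_EOL;   // <reportItem/>
```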

The right solution is beyond me -- I don't know PHP well enough, least of all the quirks of SimpleXML -- but couldn't we at least make this bug slightly less nasty by registering some namespaces and using XPath? Something along the following lines:

$ns = array(
    'xsi' => 'http://www.w3.org/2001/XMLSchema-instance',
    'soapenv' => 'http://schemas.xmlsoap.org/soap/envelope/',
    'counter' => 'http://www.niso.org/schemas/counter',
    'sushi' => 'http://www.niso.org/schemas/sushi',
    'sushicounter' => 'http://www.niso.org/schemas/sushi/counter',
);
$xml = simplexml_load_string($string);
foreach ($ns as $pfx => $uri) {
    $xml->registerXPathNamespace($pfx, $uri);
}
$reports = $xml->xpath('//counter:Report');
if ($reports && $reports[0]) {
    $reportTypeName = $reports[0]->attributes()->Name;
    ...
}

And so on?

@vinny75
Contributor Author

vinny75 commented Apr 10, 2019

The namespaces are not standard across vendors. I tried reading all the namespaces and registering them but that didn't work.
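
One way to sidestep the vendor differences without touching the raw string might be to match on local-name() in XPath, which ignores prefixes and namespace URIs altogether. This is only a sketch against two hypothetical vendor responses, not something tested against real SUSHI data and not necessarily what any eventual fix does. The [@Name] predicate is an assumption that the inner COUNTER Report element is the one carrying a Name attribute.

```php
<?php
// Two hypothetical responses: different SOAP prefixes, one prefixed Report and one using a default namespace.
$samples = [
    '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body>'
  . '<rr:Report xmlns:rr="http://www.niso.org/schemas/counter" Name="JR1"/>'
  . '</s:Body></s:Envelope>',
    '<SOAP-ENV:Envelope xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/"><SOAP-ENV:Body>'
  . '<Report xmlns="http://www.niso.org/schemas/counter" Name="BR2"/>'
  . '</SOAP-ENV:Body></SOAP-ENV:Envelope>',
];

$names = [];
foreach ($samples as $string) {
    $xml = simplexml_load_string($string);

    // local-name() compares only the part after the prefix, so no namespaces need registering.
    // [@Name] skips any wrapper <Report> element that has no Name attribute.
    $matches = $xml->xpath("//*[local-name()='Report'][@Name]");

    $names[] = $matches ? (string) $matches[0]['Name'] : null;
}

print_r($names);
```

The trade-off is that this would also match a Report element from an unrelated namespace, so it is looser than registering the exact URIs.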

@nkuitse
Contributor

nkuitse commented Apr 10, 2019

I agree, but I'm not trying to fix the bug, I'm just trying to mitigate the harm that it causes. The current code is just so entirely wrong -- the hack on line 610 would, for example, completely mangle a SUSHI response that happens to use a namespace prefix ending in "s" (other than the prefix "s:" itself):

$clean_xml = str_ireplace(['s:','SOAP-ENV:','SOAP:'],'',$string);

And registering the namespaces from an associative array as my code does should make it easier to add namespaces in the future. In fact, if you have a corpus of sushistore/*.xml that you're able to share, I would be happy to identify all namespaces used in them and adjust the array accordingly. You can just grep the files for : and send me that output -- I don't need anything else.

@vinny75
Contributor Author

vinny75 commented Apr 11, 2019

I tried adding the namespaces, it didn't work for me.

@nkuitse
Contributor

nkuitse commented Apr 12, 2019

Perhaps we should just strip all namespaces from the XML, then proceed from there. You can strip namespaces like this:

function stripNamespaces($xml) {
    $xsl = new DOMDocument();
    $xsl->loadXML('
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="no"/>
  <xsl:template match="/|comment()|processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates/>
    </xsl:copy>
  </xsl:template>
  <xsl:template match="*">
    <xsl:element name="{local-name()}">
      <xsl:apply-templates select="@*|node()"/>
    </xsl:element>
  </xsl:template>
  <xsl:template match="@*">
    <xsl:attribute name="{local-name()}">
      <xsl:value-of select="."/>
    </xsl:attribute>
  </xsl:template>
</xsl:stylesheet>
');
    $proc = new XSLTProcessor();
    $proc->importStylesheet($xsl);
    // Parse the XML string that was passed in (the draft loaded an undefined $fName here)
    $raw = new DOMDocument();
    $raw->loadXML($xml);
    $doc = $proc->transformToDoc($raw);
    return $doc->saveXML();
}

Then you just make a small change in lines 609-611:

$string = file_get_contents($fName);
$clean_xml = stripNamespaces($string);
$xml = simplexml_load_string($clean_xml);

Hopefully that does the trick -- I haven't tested it yet.

@nkuitse
Contributor

nkuitse commented Apr 12, 2019

I have a sandbox instance where I can test the code in the pull request, but I have to set up a SUSHI connection in it first (and have never done that before) -- so if you have an instance you can test it in now, that would be great.

@nkuitse
Contributor

nkuitse commented May 23, 2019

@queryluke has a much better fix in #579

@andyp-uk
Contributor

I have merged @queryluke's fix for this issue, PR #611
