Skip to content

Commit

Permalink
Inherited Scrutinizer Recommendations - 3 of 3 (#1808)
Browse files Browse the repository at this point in the history
* Make DefinedNames Samples Consistent With Other Samples (#1707)

All other Samples write to temporary directory. DefinedNames samples
write to main directory, which (a) means they aren't stored with others,
and (b) they aren't ignored by git so look like changed files.
The tests are also simplified by requiring Header rather than Bootstrap,
making use of Helper.

* Resolve XSS Vulnerability in the HTML Writer (#1719)

Resolve XSS Vulnerability in the HTML Writer

* Drop Travis

* Automatic GitHub releases from git tags

* Improve Coverage in src/PhpSpreadsheet

There are no changes to code. Additional tests are added,
so that the following 6 items now have 100% test coverage:
- Comment
- DefinedName
- DocumentGenerator
- IOFactory
- NamedFormula
- NamedRange

* Changes for Scrutinizer

Two changes to fix minor problems reported by Scrutinizer.

* Spelling: Tou -> You

* Fix for 1735 (Incorrect activeSheetIndex after RemoveSheetByIndex) (#1743)

This is a fix for issue #1735.
It adds tests for this situation, and similar situations involving
adding new sheets and accessing existing ones.
Coverage for Spreadsheet.php increases from 69% to 75% as a result.

* Update change log

* Fix for 3 Issues Involving ReadXlsx and NamedRange (#1742)

* Fix for 3 Issues Involving ReadXlsx and NamedRange

Issues #1686 and #1723, which provide sample spreadsheets, are probably
solved by this ticket. Issue #1730 is also probably solved, but I have
no way to verify.

There are two problems with how PhpSpreadsheet is handling things now.
Although the first problem is much less severe, and isn't really a factor
in the issues named above, it is helpful to get it out of the way first.
If you define a named range in Excel, and then delete the sheet where
the range exists, Excel saves the range as #REF!. If there is a cell which
references the range, it will similarly have the value #REF! when you open
the Excel file.
Currently, PhpSpreadsheet discards the #REF! definition, so a cell which
references the range will appear as #NAME? rather than #REF!.
This PR changes the behavior so that PhpSpreadsheet retains the #REF!
definition, and cells which reference it will appear as #REF!.

The second problem is the more severe, and is, I believe, responsible
for the 3 issues identified above.
If you define a named range and the sheet on which the range is defined
does not exist at the time, Excel will save the range as something like:

'[1]Unknown Sheet'!$A$1

If a cell references such a range, Excel will again display #REF!.
PhpSpreadsheet currently throws an Exception when it encounters
such a definition while reading the file. This PR changes
the behavior so that PhpSpreadsheet saves the definition as #REF!,
and cells which reference it will behave similarly.

For the record, I will note that Excel does not magically recalculate when a
missing sheet is subsequently added, despite the fact that the reference
might now become resolvable. PhpSpreadsheet behaves likewise.

* Remove Dead Code in Test

Identified it after push but before merge.

* Update change log

* Apply Column and Row Styles to Existing Cells (#1721)

* Apply Column and Row Styles to Existing Cells

This is a fix for issue #1712.
When a style is applied to an entire row or column, it is currently
only effective for cells which don't already contain a value.
The code needs to iterate through existing cells in the row/column
in order to apply the style to them.
This could be considered a breaking change, however, I believe that
the change makes things operate as users would expect, and that the
existing implementation is incomplete.

The change also removes protected element conditionalStyles from
the Style class. That element is an unused remnant, and can no longer be
set or retrieved - methods getConditionalStyles and setConditionalStyles
actually act on an element in the Worksheet class.

Finally, additional tests are added so that Style, and in fact the
entire Style directory, now has 100% test coverage.

* Scrutinizer Changes

Scrutinizer flagged 6 statements. 5 can be easily corrected.
One is absolutely wrong (it thinks iterating through cells in column
can return null). Let's see if we can satisfy it.

* Remove Exception For CellIterator on Empty Row/Column

For my first attempt at this change, which corrects a bug by updating styles
for non-empty cells when a style is set on a row or column, I wished to make things
more efficient by using setIterateOnlyExistingCells, something which the
existing documentation recommends. This caused an exception to be generated
when the row or column is empty. So I removed that part of the change while I
researched what was going on.

I have completed that research. The existing code does throw an exception
when the row/column is empty and iterateOnlyExistingCells is true. However,
that does not seem like a reasonable action. This situation is analagous to
iterating over an empty array, and that action is legal and does not throw.
The same should apply here. There were no tests for this situation,
and now there are.

I have added additional tests, and coverage for all of RowCellIterator,
ColumnCellIterator, and CellIterator are all now 100%. Some of my new tests
were added in new members, because the existing tests all relied on mocking,
which was not the best choice for the new tests. One of the existing tests
for RowCellIteratorTest (testSeekOutOfRange) was wrong; it issued the expected
exception, but for the wrong reason. I have added an additional test to
ensure that it fails "correctly".

The existing documentation says that the default value for
IterateOnlyExistingCells is true. In fact, the default value is false.
I have corrected the documentation.

* More Scrutinizer

I believe its analysis is incorrect, but this should silence it.

* DocBlock Correction

ColumnCellIterator DocBlock for current indicated it could return null
or Cell, but it can really return only Cell. This had caused Scrutinizer
to complain earlier.

* PHP8 Environment Appears to be Fixed

Cosmetic change to Doc member. I suspect there is a way to rerun all
the tests without another push, but I have been unable to figure out how.

* Update change log

* TextData Coverage and Minor Bug Fixes (#1744)

This had been intended to get 100% coverage for TextData functions, and it does that.
However, some minor bugs requiring source changes arose during testing.
- the Excel CHAR function restricts its argument to 1-255. PhpSpreadsheet CHARACTER
  had been allowing 0+. Also, there is no need to test if iconv exists,
  since it is part of Composer requirements.
- The DOLLAR function had been returning NUM for invalid arguments. Excel returns VALUE.
  Also, negative amounts were not being handled correctly.
- The FIXEDFORMAT function had been returning NUM for invalid arguments. Excel FIXED returns VALUE.

* Replace anti-xss with html purifier (#1751)

* Replace voku/anti-xss with ezyang/htmlpurifier. Despite anti-xss being a smaller footprint dependency, an a better license fit with our MIT license, there are issues with it's automatic it sanitisation of global variables causing side effects
* Additional unit tests for xss in html writer cell comments

* Fix bug #1626 where values of 0 were "rounded" up/down as if they were not 0 (#1627)

* Fix bug where values of 0 were "rounded" up/down as if they were not 0

* Update change log

* Fix for #1612 - SLK Long File Name (#1706)

Issue has been marked stale, but ...
Sylk read sets worksheet title to filename (minus .slk).
If that is >31 characters, PhpSpreadsheet throws Exception.
This change truncates sheet title, as Excel does, to 31 characters.

* Update change log

* worksheet: fix if cellValue does not exist (#1727)

The condition is FALSE if the cell does not exist in the flipped table,
but anyway, it is sent in to a method requiring 'string' type, causing
it to fail.

* fixes #1655 issue (#1656)

Resolve problem with incorrectly defined hyperlinks

* Add 'ps' suffix to printer settings resources IDs (#1690)

* Add 'ps' suffix to printer settings resources IDs

* Update change log

* Fix pixelsToPoints conversion (for HTML col width) (#1733)

* DocBlock Change in Styles/Conditional (#1697)

Scrutinizer reported a minor error in a test involving a module
which I was not changing.
Styles/Conditional function setConditions can take a scalar or an
array as a parameter, but DocBlock says it only expects array.
I did not wish to add the extra module to my PR, but made a note to
self to fix that after PR was installed.
That has now happened, and it makes for a good case for me
to see all the PHP8/Composer2/etc. changes that have happened recently.

* Merge pull request #1698

* Merge pull request #4 from PHPOffice/master

* Restore Omitted Read XML Test

* Fix for bug #1592 (UPDATED) (#1623)

* Fix for Xls when BIFF8 SST (FCh) has bad Shared string length

* Update change log

* Add nightly PHP 8.1 dev to github actions (#1763)

* Fix compatibility with ext-gd on php 8 (#1762)

* CSV - Guess Encoding, Handle Null-string Escape (#1717)

* CSV - Guess Encoding, Handle Null-string Escape

This is in response to issue #1647 (detect CSV character encoding).
First, my tests with mb_detect_encoding indicate that it doesn't work
well enough; regardless, users can always do that on their own
if they deem it useful.
Rolling my own is also troublesome, but I can at least:
a. Check for BOM (UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE).
b. Do some heuristic tests for each of the above encodings.
c. Fallback to a user-specified encoding (default CP1252)
  if a and b don't yield result.
I think this is probably useful enough to include, and relatively
easy to expand if other potential encodings should be considered.

Starting with PHP7.4, fgetcsv allows specification of null string as
escape character in fgetcsv. This is a much better choice than the PHP
(and PhpSpreadsheet) default of backslash in that it handles the file
in the same manner as Excel does. There is one statement in Reader/CSV
which would be adversely affected if the caller so specified (building
a regular expression under the assumption that escape character is
a single character). Fix that statement appropriately and add tests.

* Update changelog

* Update Units of Measure supported by the CONVERT() function (#1768)

Now supports all current UoM in all categories, with both 1- and 2-character multiplier prefixes, and binary multiplier prefixes, including the new Temperature scales

* Changelog for 1.16.0 release

* Fix date tests withut specified year for current year 2021 (#1774)

* Mrand of zero to any multiple should return 0 (#1773)

* Inherited Scrutinizer Recommendations - 3 of 3

I tried to sync my fork with the main project, as I have done several times before.
However, the GitHub interface to do this had changed, and it appears that I
did not make the optimal selection when I had a choice. Consequently, all
the merges that happened to base between the last time I synchronized and this time
appear to be part of any PR that I push. "Files changed" remains correct for
my new PRs, but there appear to be many more commits involved than is actually the
case. I will, at some point, delete and re-create my fork, and pay much closer
attention in future when I want to sync my fork with the main project.

Because of this set-up, Scrutinizer reports flaws in code that I haven't
actually changed in my PRs #1799 and #1800.
It still passes them, but, as long as I'm aware of the problems,
I may as well attempt to correct them. The following are not part of those PRs:
- 5 problems spread over 4 different members
- 12 problems in Calculation/Engineering
- 15 problems in Reader/XML

I shall attempt to resolve these via 3 separate PRs,
of which this is the third.

* The Usual

Fixed some Scrutinizer problems, some new ones popped up.

* More Scrutinizer

I think it's wrong in a lot of these cases. Although I am working
around all of them, I intend to file a bug report with them.

Co-authored-by: Mark Baker <mark@lange.demon.co.uk>
Co-authored-by: Adrien Crivelli <adrien.crivelli@gmail.com>
Co-authored-by: Ryan McAllen <rhyno@rhynodesigns.com>
Co-authored-by: Flinsch <220455+Flinsch@users.noreply.github.com>
Co-authored-by: Jan Sverre Riksfjord <jasverix@gmail.com>
Co-authored-by: Max Kalyabin <maksim@kalyabin.ru>
Co-authored-by: Sébastien Despont <sebastien.despont@gmail.com>
Co-authored-by: Guilliam Xavier <guilliamxavier@users.noreply.github.com>
Co-authored-by: Gianluca Giovinazzo <g.giovinazzo@gmail.com>
Co-authored-by: Alexander M. Turek <me@derrabus.de>
Co-authored-by: Martins Sipenko <martins.sipenko@gmail.com>
  • Loading branch information
12 people authored Jan 31, 2021
1 parent 70c3803 commit 308bc64
Showing 1 changed file with 30 additions and 21 deletions.
51 changes: 30 additions & 21 deletions src/PhpSpreadsheet/Reader/Xml.php
Original file line number Diff line number Diff line change
Expand Up @@ -191,7 +191,7 @@ public function listWorksheetNames($pFilename)

$xml_ss = $xml->children($namespaces['ss']);
foreach ($xml_ss->Worksheet as $worksheet) {
$worksheet_ss = $worksheet->attributes($namespaces['ss']);
$worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);
$worksheetNames[] = (string) $worksheet_ss['Name'];
}

Expand Down Expand Up @@ -221,7 +221,7 @@ public function listWorksheetInfo($pFilename)
$worksheetID = 1;
$xml_ss = $xml->children($namespaces['ss']);
foreach ($xml_ss->Worksheet as $worksheet) {
$worksheet_ss = $worksheet->attributes($namespaces['ss']);
$worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);

$tmpInfo = [];
$tmpInfo['worksheetName'] = '';
Expand Down Expand Up @@ -381,13 +381,13 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
}
if (isset($xml->CustomDocumentProperties)) {
foreach ($xml->CustomDocumentProperties[0] as $propertyName => $propertyValue) {
$propertyAttributes = $propertyValue->attributes($namespaces['dt']);
$propertyAttributes = self::getAttributes($propertyValue, $namespaces['dt']);
$propertyName = preg_replace_callback('/_x([0-9a-f]{4})_/i', ['self', 'hex2str'], $propertyName);
$propertyType = Properties::PROPERTY_TYPE_UNKNOWN;
switch ((string) $propertyAttributes) {
case 'string':
$propertyType = Properties::PROPERTY_TYPE_STRING;
$propertyValue = trim($propertyValue);
$propertyValue = trim((string) $propertyValue);

break;
case 'boolean':
Expand All @@ -407,7 +407,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
break;
case 'dateTime.tz':
$propertyType = Properties::PROPERTY_TYPE_DATE;
$propertyValue = strtotime(trim($propertyValue));
$propertyValue = strtotime(trim((string) $propertyValue));

break;
}
Expand All @@ -420,8 +420,9 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
$worksheetID = 0;
$xml_ss = $xml->children($namespaces['ss']);

foreach ($xml_ss->Worksheet as $worksheet) {
$worksheet_ss = $worksheet->attributes($namespaces['ss']);
foreach ($xml_ss->Worksheet as $worksheetx) {
$worksheet = ($worksheetx === null) ? new SimpleXMLElement('<xml></xml>') : $worksheetx;
$worksheet_ss = self::getAttributes($worksheet, $namespaces['ss']);

if (
(isset($this->loadSheetsOnly)) && (isset($worksheet_ss['Name'])) &&
Expand All @@ -444,7 +445,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
// locally scoped defined names
if (isset($worksheet->Names[0])) {
foreach ($worksheet->Names[0] as $definedName) {
$definedName_ss = $definedName->attributes($namespaces['ss']);
$definedName_ss = self::getAttributes($definedName, $namespaces['ss']);
$name = (string) $definedName_ss['Name'];
$definedValue = (string) $definedName_ss['RefersTo'];
$convertedValue = AddressHelper::convertFormulaToA1($definedValue);
Expand All @@ -458,7 +459,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
$columnID = 'A';
if (isset($worksheet->Table->Column)) {
foreach ($worksheet->Table->Column as $columnData) {
$columnData_ss = $columnData->attributes($namespaces['ss']);
$columnData_ss = self::getAttributes($columnData, $namespaces['ss']);
if (isset($columnData_ss['Index'])) {
$columnID = Coordinate::stringFromColumnIndex((int) $columnData_ss['Index']);
}
Expand All @@ -475,14 +476,14 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
$additionalMergedCells = 0;
foreach ($worksheet->Table->Row as $rowData) {
$rowHasData = false;
$row_ss = $rowData->attributes($namespaces['ss']);
$row_ss = self::getAttributes($rowData, $namespaces['ss']);
if (isset($row_ss['Index'])) {
$rowID = (int) $row_ss['Index'];
}

$columnID = 'A';
foreach ($rowData->Cell as $cell) {
$cell_ss = $cell->attributes($namespaces['ss']);
$cell_ss = self::getAttributes($cell, $namespaces['ss']);
if (isset($cell_ss['Index'])) {
$columnID = Coordinate::stringFromColumnIndex((int) $cell_ss['Index']);
}
Expand Down Expand Up @@ -524,7 +525,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
$cellData = $cell->Data;
$cellValue = (string) $cellData;
$type = DataType::TYPE_NULL;
$cellData_ss = $cellData->attributes($namespaces['ss']);
$cellData_ss = self::getAttributes($cellData, $namespaces['ss']);
if (isset($cellData_ss['Type'])) {
$cellDataType = $cellData_ss['Type'];
switch ($cellDataType) {
Expand Down Expand Up @@ -587,7 +588,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
$author = (string) $commentAttributes->Author;
}
$node = $cell->Comment->Data->asXML();
$annotation = strip_tags($node);
$annotation = strip_tags((string) $node);
$spreadsheet->getActiveSheet()->getComment($columnID . $rowID)->setAuthor($author)->setText($this->parseRichText($annotation));
}

Expand All @@ -610,7 +611,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
if ($rowHasData) {
if (isset($row_ss['Height'])) {
$rowHeight = $row_ss['Height'];
$spreadsheet->getActiveSheet()->getRowDimension($rowID)->setRowHeight($rowHeight);
$spreadsheet->getActiveSheet()->getRowDimension($rowID)->setRowHeight((float) $rowHeight);
}
}

Expand All @@ -631,7 +632,7 @@ public function loadIntoExisting($pFilename, Spreadsheet $spreadsheet)
$activeWorksheet = $spreadsheet->setActiveSheetIndex(0);
if (isset($xml->Names[0])) {
foreach ($xml->Names[0] as $definedName) {
$definedName_ss = $definedName->attributes($namespaces['ss']);
$definedName_ss = self::getAttributes($definedName, $namespaces['ss']);
$name = (string) $definedName_ss['Name'];
$definedValue = (string) $definedName_ss['RefersTo'];
$convertedValue = AddressHelper::convertFormulaToA1($definedValue);
Expand Down Expand Up @@ -662,10 +663,11 @@ private function parseStyles(SimpleXMLElement $xml, array $namespaces): void
}

foreach ($xml->Styles[0] as $style) {
$style_ss = $style->attributes($namespaces['ss']);
$style_ss = self::getAttributes($style, $namespaces['ss']);
$styleID = (string) $style_ss['ID'];
$this->styles[$styleID] = (isset($this->styles['Default'])) ? $this->styles['Default'] : [];
foreach ($style as $styleType => $styleData) {
foreach ($style as $styleType => $styleDatax) {
$styleData = $styleDatax ?? new SimpleXMLElement('<xml></xml>');
$styleAttributes = $styleData->attributes($namespaces['ss']);
switch ($styleType) {
case 'Alignment':
Expand Down Expand Up @@ -750,15 +752,16 @@ private function parseStyleBorders($styleID, SimpleXMLElement $styleData, array
$diagonalDirection = '';
$borderPosition = '';
foreach ($styleData->Border as $borderStyle) {
$borderAttributes = $borderStyle->attributes($namespaces['ss']);
$borderAttributes = self::getAttributes($borderStyle, $namespaces['ss']);
$thisBorder = [];
$style = (string) $borderAttributes->Weight;
$style .= strtolower((string) $borderAttributes->LineStyle);
$thisBorder['borderStyle'] = self::$mappings['borderStyle'][$style] ?? Border::BORDER_NONE;
foreach ($borderAttributes as $borderStyleKey => $borderStyleValue) {
foreach ($borderAttributes as $borderStyleKey => $borderStyleValuex) {
$borderStyleValue = (string) $borderStyleValuex;
switch ($borderStyleKey) {
case 'Position':
$borderStyleValue = strtolower((string) $borderStyleValue);
$borderStyleValue = strtolower($borderStyleValue);
if (in_array($borderStyleValue, self::$borderPositions)) {
$borderPosition = $borderStyleValue;
} elseif ($borderStyleValue == 'diagonalleft') {
Expand All @@ -784,6 +787,11 @@ private function parseStyleBorders($styleID, SimpleXMLElement $styleData, array
}
}

private static function getAttributes(?SimpleXMLElement $simple, string $node): SimpleXMLElement
{
return ($simple === null) ? new SimpleXMLElement('<xml></xml>') : ($simple->attributes($node) ?? new SimpleXMLElement('<xml></xml>'));
}

private static $underlineStyles = [
Font::UNDERLINE_NONE,
Font::UNDERLINE_DOUBLE,
Expand Down Expand Up @@ -854,7 +862,8 @@ private function parseStyleFont(string $styleID, SimpleXMLElement $styleAttribut
*/
private function parseStyleInterior($styleID, SimpleXMLElement $styleAttributes): void
{
foreach ($styleAttributes as $styleAttributeKey => $styleAttributeValue) {
foreach ($styleAttributes as $styleAttributeKey => $styleAttributeValuex) {
$styleAttributeValue = (string) $styleAttributeValuex;
switch ($styleAttributeKey) {
case 'Color':
$this->styles[$styleID]['fill']['endColor']['rgb'] = substr($styleAttributeValue, 1);
Expand Down

0 comments on commit 308bc64

Please sign in to comment.