Fix bug when missing workbook relationship #277

Merged: 1 commit, Mar 18, 2024

@tmiller tmiller commented Mar 17, 2024

Add a check to see if the list is empty before trying to access its
contents. If an Excel file has an overridden relationship without the word
"book" in its name, the code attempts to grab the first item of an empty
list when looking up workbook relationships.

    IndexError: list index out of range
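
A minimal sketch of the kind of guard described above. The function name `pick_workbook_relationships` and its `fallback` parameter are hypothetical and only illustrate the idea; this is not the code merged in this PR.

```python
def pick_workbook_relationships(relationship_parts, fallback=None):
    """Return the first part name containing "book", or `fallback` if none match.

    `relationship_parts` is assumed to be a list of part names taken from the
    relationship overrides in [Content_Types].xml.
    """
    candidates = [part for part in relationship_parts if "book" in part]
    # Indexing candidates[0] without this check raises IndexError when the
    # filtered list is empty.
    if not candidates:
        return fallback
    return candidates[0]
```

The caller decides what to do when nothing matches, for example by passing a conventional path as `fallback` or by skipping the relationship lookup.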

There may be a better fix for this issue; I'm not well enough versed in
the xlsx specification. The following xlsx file caused the issue.

    $ unzip -l some_file.xlsx
    Archive:  some_file.xlsx
      Length      Date    Time    Name
    ---------  ---------- -----   ----
          142  02-06-2024 13:28   xl/worksheets/_rels/sheet1.xml.rels
     65968555  02-06-2024 13:28   xl/worksheets/sheet1.xml
      2078037  02-06-2024 13:28   xl/sharedStrings.xml
         9867  02-06-2024 13:28   xl/styles.xml
          566  02-06-2024 13:28   xl/_rels/workbook.xml.rels
          388  02-06-2024 13:28   xl/workbook.xml
          297  02-06-2024 13:28   _rels/.rels
         1122  02-06-2024 13:28   [Content_Types].xml
    ---------                     -------
     68058974                     8 files

In `[Content_Types].xml`, the relationships are overridden to point at
`_rels/.rels` rather than `xl/_rels/workbook.xml.rels`. This leaves the
`workbook_relationships` list empty, which causes the error mentioned
above. The file does indeed have a workbook relationship; however, it is
being overridden.

`[Content_Types].xml`:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
      <Default Extension="png" ContentType="image/png"/>
      <Default Extension="jpeg" ContentType="image/jpeg"/>
      <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
      <Default Extension="xml" ContentType="application/xml"/>
      <Default Extension="vml" ContentType="application/vnd.openxmlformats-officedocument.vmlDrawing"/>
      <Override PartName="/xl/worksheets/sheet1.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/>
      <Override PartName="/xl/sharedStrings.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sharedStrings+xml"/>
      <Override PartName="/xl/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.styles+xml"/>
      <Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
      <Override PartName="/_rels/.rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
    </Types>
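
To make the failure concrete, here is a rough sketch that reproduces the empty list from a trimmed copy of the content types above. It uses only the standard library; `relationship_overrides` and `SAMPLE` are names made up for this example, and the "book" filter is an assumption based on the description in this PR rather than a copy of the library's code.

```python
import xml.etree.ElementTree as ET

CT_NS = "{http://schemas.openxmlformats.org/package/2006/content-types}"
RELS_TYPE = "application/vnd.openxmlformats-package.relationships+xml"

# Trimmed version of the [Content_Types].xml shown above.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
  <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
  <Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
  <Override PartName="/_rels/.rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
</Types>
"""


def relationship_overrides(content_types_xml):
    """List PartName values of Override elements typed as relationships."""
    root = ET.fromstring(content_types_xml)
    return [
        override.get("PartName")
        for override in root.iter(CT_NS + "Override")
        if override.get("ContentType") == RELS_TYPE
    ]


overrides = relationship_overrides(SAMPLE)
# Only "/_rels/.rels" is overridden, so filtering on "book" leaves nothing.
workbook_relationships = [part for part in overrides if "book" in part]
```

With this input `workbook_relationships` is empty, so taking its first element would raise the `IndexError` reported above.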

`xl/_rels/workbook.xml.rels`:

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
      <Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/worksheet" Target="worksheets/sheet1.xml"/>
      <Relationship Id="rId2" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/sharedStrings" Target="sharedStrings.xml"/>
      <Relationship Id="rId3" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/styles" Target="styles.xml"/>
    </Relationships>

@tmiller force-pushed the fix-no-workbook-relationship branch from 68c2942 to ef3ff50 on March 17, 2024 17:41
@dilshod merged commit b45eb09 into dilshod:master on Mar 18, 2024
@tmiller deleted the fix-no-workbook-relationship branch on March 19, 2024 13:28

tanji commented Mar 22, 2024

@dilshod could you please tag this fix? Thank you kindly
