Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed issue #539 Fixed error opening excel file created in encoding d… #540

Merged
merged 6 commits into from
Dec 19, 2019
Merged

Fixed issue #539 Fixed error opening excel file created in encoding d… #540

merged 6 commits into from
Dec 19, 2019

Conversation

monoflash
Copy link
Contributor

Fixed issue #539 Fixed error opening excel file created in encoding different from UTF-8, added logging of possible errors when decoding xml, if the function does not provide exit with error

…ifferent from UTF-8, added logging of possible errors when decoding xml, if the function does not provide exit with error
@codecov-io
Copy link

codecov-io commented Dec 18, 2019

Codecov Report

Merging #540 into v2 will decrease coverage by 0.91%.
The diff coverage is 76.92%.

Impacted file tree graph

@@            Coverage Diff            @@
##               v2    #540      +/-   ##
=========================================
- Coverage   96.11%   95.2%   -0.92%     
=========================================
  Files          26      26              
  Lines        5452    5543      +91     
=========================================
+ Hits         5240    5277      +37     
- Misses        119     149      +30     
- Partials       93     117      +24
Impacted Files Coverage Δ
file.go 88.05% <100%> (-0.52%) ⬇️
rows.go 83.25% <50%> (-0.69%) ⬇️
calcchain.go 87.09% <60%> (-6.01%) ⬇️
comment.go 95.81% <60%> (-1.64%) ⬇️
styles.go 98.58% <66.66%> (-0.7%) ⬇️
sheet.go 94.69% <72.22%> (-1.3%) ⬇️
chart.go 99.08% <75%> (-0.2%) ⬇️
sparkline.go 97.02% <76%> (-2.98%) ⬇️
picture.go 96.4% <82.35%> (-2.09%) ⬇️
docProps.go 91.54% <82.35%> (-8.46%) ⬇️
... and 1 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a00ba75...207c926. Read the comment docs.

Copy link
Member

@xuri xuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @monoflash, thanks for your PR. I've left some comments.

go.mod Outdated Show resolved Hide resolved
picture.go Outdated Show resolved Hide resolved
sheet_test.go Outdated Show resolved Hide resolved
sheetpr_test.go Outdated Show resolved Hide resolved
sheetview_test.go Outdated Show resolved Hide resolved
calcchain.go Show resolved Hide resolved
charset_reader.go Outdated Show resolved Hide resolved
charset_reader.go Outdated Show resolved Hide resolved
@xuri xuri added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 18, 2019
@monoflash
Copy link
Contributor Author

monoflash commented Dec 18, 2019

Oops.. I didn’t think that the commits following the pool-request would be added to the pool-request.
Returned package name changes.
Without this patch, excel files do not open and I need to take the library from somewhere. I wanted to connect from a fork, but go mod resists ...

@monoflash monoflash requested a review from xuri December 18, 2019 18:02
@monoflash
Copy link
Contributor Author

I fixed all remarks you wrote about.

Copy link
Member

@xuri xuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @monoflash, I've left some comments. If we use charset.NewReaderLabel instead of the CharsetReader function, that will simplify the code.

charset_reader.go Outdated Show resolved Hide resolved
.gitignore Show resolved Hide resolved
picture.go Outdated Show resolved Hide resolved
go.sum Outdated Show resolved Hide resolved
charset_reader_test.go Outdated Show resolved Hide resolved
Copy link
Contributor Author

@monoflash monoflash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Part of the proposed changes has been implemented.

@monoflash monoflash requested a review from xuri December 19, 2019 11:00
Copy link
Member

@xuri xuri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specified charset reader for the Rows function like is:

func (f *File) Rows(sheet string) (*Rows, error) {
    name, ok := f.sheetMap[trimSheetName(sheet)]
    if !ok {
        return nil, ErrSheetNotExist{sheet}
    }
    if f.Sheet[name] != nil {
        // flush data
        output, _ := xml.Marshal(f.Sheet[name])
        f.saveFileList(name, replaceWorkSheetsRelationshipsNameSpaceBytes(output))
    }
    var (
        err       error
        inElement string
        row       int
        rows      Rows
    )
    decoder := xml.NewDecoder(bytes.NewReader(f.readXML(name)))
    decoder.CharsetReader = charset.NewReaderLabel
    for {
        token, _ := decoder.Token()
        if token == nil {
            break
        }
        switch startElement := token.(type) {
        case xml.StartElement:
            inElement = startElement.Name.Local
            if inElement == "row" {
                for _, attr := range startElement.Attr {
                    if attr.Name.Local == "r" {
                        row, err = strconv.Atoi(attr.Value)
                        if err != nil {
                            return &rows, err
                        }
                    }
                }
                rows.totalRow = row
            }
        default:
        }
    }
    rows.f = f
    rows.sheet = name
    rows.decoder = xml.NewDecoder(bytes.NewReader(f.readXML(name)))
    rows.decoder.CharsetReader = charset.NewReaderLabel
    return &rows, nil
}

I have create a spreadsheet file, there are different encoding on over 20 worksheets:

Encoding.xlsx

Worksheet Encoding
Sheet1 utf8
Sheet2 windows-1250
Sheet3 windows-1251
Sheet4 windows-1252
Sheet5 windows-1253
Sheet6 windows-1254
Sheet7 windows-1255
Sheet8 windows-1256
Sheet9 windows-1257
Sheet10 windows-1258
Sheet11 big5
Sheet12 gbk
Sheet13 ISO-8859-3
Sheet14 ISO-8859-4
Sheet15 ISO-8859-5
Sheet16 ISO-8859-7
Sheet17 ISO-8859-9
Sheet18 ISO-8859-13
Sheet19 ISO-8859-15
Sheet20 euc-kr

Test the newly modified Row function:

package main

import (
    "fmt"

    "github.com/360EntSecGroup-Skylar/excelize"
)

func main() {
    f, err := excelize.OpenFile("./Encoding.xlsx")
    if err != nil {
        fmt.Println("Open file error:", err)
        return
    }

    for _, sheetName := range f.GetSheetMap() {
        rows, err := f.GetRows(sheetName)
        if err != nil {
            fmt.Println("GetRows error:", err)
            return
        }
        for _, row := range rows { 
            for _, colCell := range row {
                fmt.Print(colCell, "\t")
            }
            fmt.Println()
        }
    }
}

Output:

ελληνικά
Kağan
latviešu
Hello 常用國字標準字體表
nutraĵo
Résumé
Résumé
עִבְרִית
русский
ελληνικά
€1 is cheap
다음과 같은 조건을 따라야 합니다: 저작자표시
Gdańsk
Việt
Kalâdlit
Kağan
latviešu
русский
العربية
Hello 常用國字標準字體表

@monoflash
Copy link
Contributor Author

I added the function that you requested, and deleted my custom function, also fixed some skipped returned error and fix tests.

@monoflash monoflash requested a review from xuri December 19, 2019 16:10
@xuri xuri merged commit b1b3c0d into qax-os:v2 Dec 19, 2019
@xuri
Copy link
Member

xuri commented Dec 19, 2019

Thanks @monoflash. I will add error return values to functions later, and unit tests for this PR.

nullfy pushed a commit to nullfy/excelize that referenced this pull request Oct 23, 2020
…ax-os#540)

* Fixed issue qax-os#539 Fixed error opening excel file created in encoding different from UTF-8, added logging of possible errors when decoding XML if the function does not provide exit with an error

* Added test for CharsetReader

* Fixed #discussion_r359397878

Discussion: qax-os#540 (comment)

* Fixed go fmt

* go mod tidy and removed unused imports

* The code has been refactored
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants