Fixed issue #539 Fixed error opening excel file created in encoding different from UTF-8, added logging of possible errors when decoding xml, if the function does not provide exit with error #540
Codecov Report
@@ Coverage Diff @@
## v2 #540 +/- ##
=========================================
- Coverage 96.11% 95.2% -0.92%
=========================================
Files 26 26
Lines 5452 5543 +91
=========================================
+ Hits 5240 5277 +37
- Misses 119 149 +30
- Partials 93 117 +24
Hi @monoflash, thanks for your PR. I've left some comments.
Oops, I didn't realize that commits pushed after opening the pull request would be added to it.
I have addressed all of your remarks.
Hi @monoflash, I've left some comments. Using charset.NewReaderLabel instead of the CharsetReader function would simplify the code.
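For context on the hook being discussed: `xml.Decoder.CharsetReader` is called with the encoding label from the XML declaration plus the remaining input, and must return a reader that yields UTF-8. The sketch below uses only the standard library and a hand-written ISO-8859-1 converter. The names `latin1CharsetReader`, `note`, and `decodeNote` are hypothetical and not part of excelize. `charset.NewReaderLabel` from `golang.org/x/net/html/charset` has the same signature, so it could be assigned in the same place to cover many more encodings.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// latin1CharsetReader is a hypothetical, minimal CharsetReader that only
// understands ISO-8859-1. Every Latin-1 byte maps to the Unicode code point
// with the same numeric value, so the conversion is a byte-to-rune copy.
func latin1CharsetReader(label string, input io.Reader) (io.Reader, error) {
	switch strings.ToLower(label) {
	case "iso-8859-1", "latin1":
	default:
		return nil, fmt.Errorf("unsupported charset: %s", label)
	}
	raw, err := io.ReadAll(input)
	if err != nil {
		return nil, err
	}
	runes := make([]rune, len(raw))
	for i, b := range raw {
		runes[i] = rune(b)
	}
	return strings.NewReader(string(runes)), nil
}

// note captures the character data of the root element.
type note struct {
	Text string `xml:",chardata"`
}

// decodeNote decodes a document whose declaration may name a non-UTF-8 encoding.
func decodeNote(data []byte) (string, error) {
	dec := xml.NewDecoder(bytes.NewReader(data))
	dec.CharsetReader = latin1CharsetReader
	var n note
	if err := dec.Decode(&n); err != nil {
		return "", err
	}
	return n.Text, nil
}

func main() {
	// "R\xe9sum\xe9" is "Résumé" encoded as ISO-8859-1.
	doc := []byte("<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><note>R\xe9sum\xe9</note>")
	text, err := decodeNote(doc)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Println(text) // prints "Résumé"
}
```

Without a `CharsetReader`, the decoder returns an error as soon as it sees a declared encoding other than UTF-8, which matches the symptom reported in #539.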
Part of the proposed changes has been implemented.
Specify the charset reader for the Rows function like this:
    func (f *File) Rows(sheet string) (*Rows, error) {
        name, ok := f.sheetMap[trimSheetName(sheet)]
        if !ok {
            return nil, ErrSheetNotExist{sheet}
        }
        if f.Sheet[name] != nil {
            // flush data
            output, _ := xml.Marshal(f.Sheet[name])
            f.saveFileList(name, replaceWorkSheetsRelationshipsNameSpaceBytes(output))
        }
        var (
            err       error
            inElement string
            row       int
            rows      Rows
        )
        decoder := xml.NewDecoder(bytes.NewReader(f.readXML(name)))
        decoder.CharsetReader = charset.NewReaderLabel
        for {
            token, _ := decoder.Token()
            if token == nil {
                break
            }
            switch startElement := token.(type) {
            case xml.StartElement:
                inElement = startElement.Name.Local
                if inElement == "row" {
                    for _, attr := range startElement.Attr {
                        if attr.Name.Local == "r" {
                            row, err = strconv.Atoi(attr.Value)
                            if err != nil {
                                return &rows, err
                            }
                        }
                    }
                    rows.totalRow = row
                }
            default:
            }
        }
        rows.f = f
        rows.sheet = name
        rows.decoder = xml.NewDecoder(bytes.NewReader(f.readXML(name)))
        rows.decoder.CharsetReader = charset.NewReaderLabel
        return &rows, nil
    }
I have created a spreadsheet file whose 20 worksheets each use a different encoding:
Worksheet | Encoding |
---|---|
Sheet1 | utf8 |
Sheet2 | windows-1250 |
Sheet3 | windows-1251 |
Sheet4 | windows-1252 |
Sheet5 | windows-1253 |
Sheet6 | windows-1254 |
Sheet7 | windows-1255 |
Sheet8 | windows-1256 |
Sheet9 | windows-1257 |
Sheet10 | windows-1258 |
Sheet11 | big5 |
Sheet12 | gbk |
Sheet13 | ISO-8859-3 |
Sheet14 | ISO-8859-4 |
Sheet15 | ISO-8859-5 |
Sheet16 | ISO-8859-7 |
Sheet17 | ISO-8859-9 |
Sheet18 | ISO-8859-13 |
Sheet19 | ISO-8859-15 |
Sheet20 | euc-kr |
Test the newly modified Rows function:
    package main

    import (
        "fmt"

        "github.com/360EntSecGroup-Skylar/excelize"
    )

    func main() {
        f, err := excelize.OpenFile("./Encoding.xlsx")
        if err != nil {
            fmt.Println("Open file error:", err)
            return
        }
        for _, sheetName := range f.GetSheetMap() {
            rows, err := f.GetRows(sheetName)
            if err != nil {
                fmt.Println("GetRows error:", err)
                return
            }
            for _, row := range rows {
                for _, colCell := range row {
                    fmt.Print(colCell, "\t")
                }
                fmt.Println()
            }
        }
    }
Output:
ελληνικά
Kağan
latviešu
Hello 常用國字標準字體表
nutraĵo
Résumé
Résumé
עִבְרִית
русский
ελληνικά
€1 is cheap
다음과 같은 조건을 따라야 합니다: 저작자표시
Gdańsk
Việt
Kalâdlit
Kağan
latviešu
русский
العربية
Hello 常用國字標準字體表
I added the function you requested and deleted my custom function. I also fixed some ignored error returns and fixed the tests.
Thanks @monoflash. I will add error return values to the functions later, along with unit tests for this PR.
…ax-os#540)
* Fixed issue qax-os#539 Fixed error opening excel file created in encoding different from UTF-8, added logging of possible errors when decoding XML if the function does not provide exit with an error
* Added test for CharsetReader
* Fixed #discussion_r359397878 Discussion: qax-os#540 (comment)
* Fixed go fmt
* go mod tidy and removed unused imports
* The code has been refactored