Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible regression on Import-Excel skips the first row when Headers specified #712

Closed
AikenBM opened this issue Nov 4, 2019 · 6 comments

Comments

@AikenBM
Copy link

AikenBM commented Nov 4, 2019

I'm not sure if this should be reported as a new issue or if I should be commenting on the closed issue, but I'm getting the same behavior reported in #417 and #427.

If I have an Excel worksheet that looks like this:

A B C
1 a b c
2 d e f

And I execute this:

Import-Excel Book1.xlsx -HeaderName 'Q','R','S'

I get this output:

Q R S
- - -
d e f

Specifying -StartRow 1 does not change the output.

I expect to get this output:

Q R S
- - -
a b c
d e f

If I use the -NoHeader switch instead, I get this:

P1 P2 P3
-- -- --
a  b  c
d  e  f

if I change my spreadsheet to this:

A B C
1
2 a b c
3 d e f

Then I get this output:

Q R S
- - -
a b c
d e f

I'm using ImportExcel 6.5.1 on PowerShell 6.2.3 and PowerShell 5.1 on Windows 10 x64 Enterprise 1809. I have tried force uninstalling ImportExcel and then downloading and reinstalling it.

@dfinke
Copy link
Owner

dfinke commented Nov 4, 2019

Thanks for reporting. Looks like a bug.

@jhoneill
Copy link
Contributor

jhoneill commented Nov 5, 2019

Import-Excel Book1.xlsx -HeaderName 'Q','R','S'
Means "My data has headers . Replace them with Q, R & S"

-NoHeader Means "My data has no headers". Use P (for property) 1..N on import.

I can't recall why we don't allow -Noheader and -HeaderName together.

@AikenBM
Copy link
Author

AikenBM commented Nov 5, 2019

Import-Excel Book1.xlsx -HeaderName 'Q','R','S'
Means "My data has headers . Replace them with Q, R & S"

Yeah, that's not what I would expect based on how Import-Csv works. With that command, if you specify -Header, it means "These are the headers for my file." The first row is then assumed to be data. If you want to avoid that, you can do this:

Get-Content test.csv | Select-Object -Skip 1 | ConvertFrom-Csv -Header 'Q','R','S'

I can't recall why we don't allow -Noheader and -HeaderName together.

Personally, I would much rather have to specify Import-Excel -HeaderName 'Q','R','S' -StartRow 2 or Import-Excel -HeaderName 'Q','R','S' | Select-Object -Skip 1 than to find out the hard way that just -HeaderName 'Q','R','S' silently skips a row. I understand that you have backwards compatibility concerns to consider, however.

@dfinke
Copy link
Owner

dfinke commented Nov 5, 2019

I vaguely remember originally implementing this along the lines of Import-Csv. Only remembered it after @AikenBM posted. I didn't remember wanting -Noheader and -HeaderName combined.

This is what you get when you don't add inline help or create tests :-)

@ili101 also has a proposal #713 (comment)

Let's think about it and come to a consensus.

@jhoneill
Copy link
Contributor

jhoneill commented Nov 6, 2019

@dfinke If we change the behavior now to work like import-csv anyone who is using it to rewrite headers will need to add -row 2 otherwise the first row of data will be junk. i.e. it's a breaking change but the fix is fairly painless and I don't think it affects many people
if we allow both -Noheader and -HeaderName, it doesn't break it for anyone but it doesn't work like Import-csv, and 'work like existing stuff' is a good rule.
I looked at @ili101 's idea last night (got home with a dreadful headache so first impression might not be ideal), it looked both a good idea and slightly too complicated.
So I can see good and bad in all the options. Which means if a consensus emerges I can live with it :-)

@AikenBM
Copy link
Author

AikenBM commented Nov 8, 2019

I was looking at a possible compromise for a breaking change. Say, to require either -NoHeader or -StartRow if -HeaderName is specified, so that the script would error. That way you would force users to correct the ambiguity.

However, I'm finding some really bizarre behavior with -HeaderName and -StartRow.

Take the same Excel document, Book1.xlsx:

A B C
1 a b c
2 d e f

And run Import-Excel Book1.xlsx -HeaderName 'Q','R','S' -StartRow 1. I get this:

Q R S
- - -
d e f

As before my first row of data is silently skipped. It's "wrong" but I understand that that was the original design.

Now, watch what happens when I start incrementing -StartRow:

PS U:\> Import-Excel .\Book1.xlsx -HeaderName 'Q','R','S' -StartRow 2

Q R S
- - -

d e f

PS U:\> Import-Excel .\Book1.xlsx -HeaderName 'Q','R','S' -StartRow 3

Q R S
- - -


d e f

PS U:\> Import-Excel .\Book1.xlsx -HeaderName 'Q','R','S' -StartRow 4

Q R S
- - -



d e f

PS U:\> Import-Excel .\Book1.xlsx -HeaderName 'Q','R','S' -StartRow 5

Q R S
- - -




d e f

PS U:\> Import-Excel .\Book1.xlsx -HeaderName 'Q','R','S' -StartRow 5

Q R S
- - -




d e f

PS U:\>

I have no idea why it starts adding blank rows, nor why it includes data from row 2 in the output when the start row is higher than row 2.

The behavior of -StartRow by itself also seems strange to me:

PS U:\> Import-Excel .\Book1.xlsx -StartRow 1

a b c
- - -
d e f

PS U:\> Import-Excel .\Book1.xlsx -StartRow 2

d e f
- - -

d e f

PS U:\> Import-Excel .\Book1.xlsx -StartRow 3
Failed importing the Excel workbook 'U:\Book1.xlsx' with worksheet '': No column headers found on top row '3'. If column headers in the worksheet are not a requirement then please use the '-NoHeader' or '-HeaderName' parameter.
At C:\Users\XXXXX\Documents\PowerShell\Modules\ImportExcel\6.5.1\ImportExcel.psm1:452 char:17
+ ...     catch { throw "Failed importing the Excel workbook '$Path' with w ...
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : OperationStopped: (Failed importing th\u2026derName' parameter.:String) [], RuntimeException
+ FullyQualifiedErrorId : Failed importing the Excel workbook 'U:\Book1.xlsx' with worksheet '': No column headers found on top row '3'. If column headers in the worksheet are not a requirement then please use the '-NoHeader' or '-HeaderName' parameter.

-StartRow 1 and -StartRow 3 make sense to me, but the output of -StartRow 2 seems incorrect. Maybe I'm not understanding what -StartRow is supposed to mean, but this isn't the behavior that I was expecting. I would expect:

d e f
- - -

That is, one row with empty/blank values in all columns. Alternately, no output at all (i.e., no rows returned).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants