Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate PXE issue with Lenovo ThinkSystem HR330A (Ampere eMAG) #225

Closed
tstromberg opened this issue Nov 23, 2021 · 0 comments · Fixed by #228
Closed

Investigate PXE issue with Lenovo ThinkSystem HR330A (Ampere eMAG) #225

tstromberg opened this issue Nov 23, 2021 · 0 comments · Fixed by #228
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@tstromberg
Copy link
Contributor

tstromberg commented Nov 23, 2021

Apparently, #216 may have broken PXE boot on this hardware. I'm not sure what BIOS version, however.

@mmlb mentioned this on chat:

... the last bunch of dhcps ... where the filename is under ipxe/ and nothing back from the nic/uefi. I was on console and the machine went almost straight to uefi shell.

#223 may fix this, but if we can confirm the problem we should probably just roll-back #216 and bring it back without the path changes.

@tstromberg tstromberg added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Nov 23, 2021
mmlb added a commit to mmlb/tinkerbell-boots that referenced this issue Dec 1, 2021
This reverts commit ceb470a.

Something in tinkerbell#216 broke our c2.large.arm provisions. Reverting
this to unblock things that came after while I look into debugging.

Some more info in [tinkerbell#225].

[tinkerbell#225]: tinkerbell#225.

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
mmlb added a commit to mmlb/tinkerbell-boots that referenced this issue Dec 1, 2021
This reverts commit ceb470a.

Something in tinkerbell#216 broke our c2.large.arm provisions. Reverting
this to unblock things that came after while I look into debugging.

Some more info in [tinkerbell#225].

[tinkerbell#225]: tinkerbell#225

Signed-off-by: Manuel Mendez <mmendez@equinix.com>
mergify bot added a commit that referenced this issue Dec 1, 2021
## Description

This reverts commit ceb470a.

## Why is this needed

Something in #216 broke our c2.large.arm provisions. Reverting
this to unblock things that came after while I look into debugging.

Some more info in [#225].

[#225]: #225

## How Has This Been Tested?

Tested via sandbox against EM c2.large.arm
@mergify mergify bot closed this as completed in #228 Dec 9, 2021
mergify bot added a commit that referenced this issue Dec 9, 2021
## Description

This is a revert of the revert that occurred in #217 + fix that caused the original revert.
There's also removal of what is effectively useless/dead code (isPXEClient/pxeClient var/handling).
I also removed the leading `/` from `/nonexistent` because a `/` is already prefixed and `//nonexistent` has bothered me for years.

## Why is this needed

#216 was a great contribution from @rgl, unfortunately we ran into issues on some of our Ampere eMAG systems (#225).
Turns out that the eMAG boot firmware just straight up ignores filename values if the response packet is marked as `PXEClient`.
#216 changed the logic subtly by setting the response packet's class as `HTTPClient` if http was supported by the booting machine, and otherwise setting `PXEClient`. Before #216 we would set `PXEClient` only for the dhcp responses that were going out to our tftp'd build of iPXE, the initial filename: response was `PXEClient`-less.

Once I put back the only `PXEClient` for our iPXE everything worked out fine again (tested using sandbox), but then I realized that now pxeClient was likely useless because I suspected iPXE did not care/require it. I tested this too and found that iPXE is happy to tftp a file provided by a `PXEClient`-less DHCP response. So why keep code around that is effectively useless? Less code is better than any code so that caused me to write up d1878bd.

Fixes #210
Fixes #225

This debugging also opens up the way to bring back #149 as that had to be reverted due to the same "always set `PXEClient`" issue that I was not equipped to debug easily/quickly then.

## How Has This Been Tested?

Lots of testing on EM machines using github.com/tinkerbell/sandbox's terraform support (I plan to clean up and PR https://github.com/mmlb/sandbox/tree/fix-terraform-and-add-arm64-support at some point).

## How are existing users impacted? What migration steps/scripts do we need?

Back to faster/better iPXE binary downloads by utilizing TCP/HTTP instead of UDP/TFTP.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant