Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

native64: Separate board for 64-bit native #20335

Merged
merged 4 commits into from
Feb 6, 2024

Conversation

fzi-haxel
Copy link
Contributor

Contribution description

Adds a separate board for native64 instead of the NATIVE_64BIT workaround from #20315.

boards/native64 more or less just includes the files from board/native (similar to the openlabs-kw41z-mini-256kib board).
The main logic for native is still in cpu/native and boards/native.

The bulk of the changes are related to the build and tests checks for the native board. For example

Filetype Before After
C #ifdef BOARD_NATIVE #ifdef CPU_NATIVE
Makefile ifeq ($(BOARD),native) ifneq (,$(filter native native64,$(BOARD)))
Python if BOARD == "native": if BOARD in ["native", "native64"]:

Changes to the CI

The last commit makes some changes to the CI to test native64 in the same way as native.
Note: The CI docker may be missing some dependencies, such as libsdl2-dev for 64 bit.

Testing procedure

The board should behave just like the native board, except that it lacks Rust support.
The following command compiles and tests the same examples and tests as native (with the exception of the Rust-based ones).

  • ./dist/tools/compile_and_test_for_board/compile_and_test_for_board.py . native64.

@github-actions github-actions bot added Platform: native Platform: This PR/issue effects the native platform Area: doc Area: Documentation Area: tests Area: tests and testing framework Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: build system Area: Build system Area: pkg Area: External package ports Area: CI Area: Continuous Integration of RIOT components Area: tools Area: Supplementary tools Area: boards Area: Board ports Area: cpu Area: CPU/MCU ports Area: sys Area: System Area: examples Area: Example Applications Area: Kconfig Area: Kconfig integration labels Feb 2, 2024
@benpicco benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Feb 3, 2024
@riot-ci
Copy link

riot-ci commented Feb 3, 2024

Murdock results

✔️ PASSED

d1f1f8a sys/psa_crypto: Use PRIuSIZE in debug messages

Success Failures Total Runtime
10016 0 10016 12m:18s

Artifacts

@benpicco benpicco added the CI: no fast fail don't abort PR build after first error label Feb 3, 2024
@benpicco benpicco added the State: waiting for CI update State: The PR requires an Update to CI to be performed first label Feb 3, 2024
@benpicco
Copy link
Contributor

benpicco commented Feb 3, 2024

/usr/include/SDL2/SDL_config.h:4:10: fatal error: SDL2/_real_SDL_config.h: No such file or directory
    4 | #include <SDL2/_real_SDL_config.h>

looks like we also need to install the 64 bit version of that on CI

@fzi-haxel
Copy link
Contributor Author

/usr/include/SDL2/SDL_config.h:4:10: fatal error: SDL2/_real_SDL_config.h: No such file or directory
    4 | #include <SDL2/_real_SDL_config.h>

looks like we also need to install the 64 bit version of that on CI

I created RIOT-OS/riotdocker#241
Alternatively, we could blacklist them for native64.

@fzi-haxel
Copy link
Contributor Author

All tests seem to work on my machine. However, four tests fail sporadically on the CI (especially the LLVM builds).
I think most (maybe all) of them are stack overflows. native64 uses more stack than native, e.g. during printing (~1kB) or during thread switching, but since the native stacks are quite large, I never had to touch them.

I created a fixup for the problematic tests

  • tests/core/thread_priority_inversion: Bigger main thread stack -> still fails sometimes
  • tests/sys/sema_inv: Bigger test threads
  • tests/net/gnrc_netif_ieee802154: Bigger SOCKET_ZEP_MAC_STACKSIZE (this could be problematic, but it's hard to test as I can't reproduce it) -> still fails sometimes
  • tests/periph/timer_periodic: Fails sometimes with _native_popsig: real_read: Bad file descriptor, may still be the effect of a stack overflow. But again, can't test it locally. For now, I just blacklisted it for the CI.

If you have any ideas how to fix them or how to reproduce the failures locally, I'm all ears.

@benpicco
Copy link
Contributor

benpicco commented Feb 5, 2024

Since the riotdocker CI issue doesn't seem to be so easy to solve, I'd suggest you just blacklist the native64 board on the affected tests.

@fzi-haxel
Copy link
Contributor Author

Since the riotdocker CI issue doesn't seem to be so easy to solve, I'd suggest you just blacklist the native64 board on the affected tests.

I blacklisted the tests.

I also increased the default stacksize for native64 instead of adding exceptions for the LLVM tests that sometimes fail.
But SOCKET_ZEP_MAC_STACKSIZE is probably still a problem.

@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

Is having separate boards the right abstraction here? Seeing that some of the ESP32 have a single board take an extra parameter to switch the precise chip (may be different CPUs) IIUC, maybe that's the more accurate abstraction. After all, what matters to users of native (at least to me) is not so much how it is compiled, but more that the network is tun/tap backed, MTD comes from a local file and so on.

What's the motivation behind having two separate boards?

@benpicco
Copy link
Contributor

benpicco commented Feb 5, 2024

What's the motivation behind having two separate boards?

  • ease of use
  • CI coverage

@benpicco benpicco removed the State: waiting for CI update State: The PR requires an Update to CI to be performed first label Feb 5, 2024
Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's getting much nicer

makefiles/arch/native.inc.mk Outdated Show resolved Hide resolved
makefiles/arch/native.inc.mk Outdated Show resolved Hide resolved
makefiles/arch/native.inc.mk Outdated Show resolved Hide resolved
makefiles/arch/native.inc.mk Outdated Show resolved Hide resolved
@chrysn
Copy link
Member

chrysn commented Feb 5, 2024

* ease of use

I'd think ease of use would be best if the 64bit version of native would just become the default -- and for examples that have BOARD=native, setting some detail flag is just as usable as changing the board. (And neither is immediately discoverable to users who have used native for some time).

* CI coverage

That's true but should be fixable just as easily in our CI -- without introducing a board variant.

I'll leave it at that, given that I don't contribute much to this port otherwise, but will ping the matrix room in case anyone wants to present more solid points than my gut feeling "this is the wrong way to do it".

[edit after some clarifying chat]: I misremembered the ESP32 part, and there is precedent for board variants over subtleties. Any remaining concerns of mine are better spent around cleaning up board inheritance and defining boards more clearly.

Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please squash

@benpicco
Copy link
Contributor

benpicco commented Feb 5, 2024

Hmm the timer tests are failing a bit too consistently on CI for native64 - maybe something is broken there.

Copy link
Contributor

@benpicco benpicco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, that fixed it - Please squash

@fzi-haxel
Copy link
Contributor Author

Hmm the timer tests are failing a bit too consistently on CI for native64 - maybe something is broken there.

I think I found it. I increased the native isr stacksize

  • char __isr_stack[SIGSTKSZ] -> char __isr_stack[THREAD_STACKSIZE_DEFAULT]. (So same size for native and double for native64).

I try a few CI runs to see if it fails. But it's still a mystery why I can't reproduce it.
It only happens sometimes on the CI, only in threads with printf and only for LLVM builds.

Adds a separate board for native64 instead of the `NATIVE_64BIT` workaround.
The files in `boards/native64` are more or less dummy files and just include
the `boards/native` logic (similar to `openlabs-kw41z-mini-256kib`).
The main logic for native is in `makefiles/arch/native.inc.mk`, `cpu/native`
and `boards/native`.

The remaining changes concern the build system, and change native board checks
to native CPU checks to cover both boards.
- Adapted build system and test checks for the native boards to include native64
- Added `native64` to the same tests as `native`
- Test native64 like native in murdock
- Add native64 to "Platform: native" in github labeler
- Add "BUILDTEST_MCU_GROUP == x86_64" to `dist/tools/ci/build_and_test.sh`
@fzi-haxel
Copy link
Contributor Author

Yup, that fixed it - Please squash

Done

@benpicco benpicco added this pull request to the merge queue Feb 5, 2024
Merged via the queue into RIOT-OS:master with commit 4d9e8a8 Feb 6, 2024
25 checks passed
@MrKevinWeiss MrKevinWeiss added this to the Release 2024.04 milestone Apr 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: boards Area: Board ports Area: build system Area: Build system Area: CI Area: Continuous Integration of RIOT components Area: core Area: RIOT kernel. Handle PRs marked with this with care! Area: cpu Area: CPU/MCU ports Area: doc Area: Documentation Area: examples Area: Example Applications Area: Kconfig Area: Kconfig integration Area: pkg Area: External package ports Area: sys Area: System Area: tests Area: tests and testing framework Area: tools Area: Supplementary tools CI: no fast fail don't abort PR build after first error CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: native Platform: This PR/issue effects the native platform
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants