Skip to content

Conversation

@SammyVimes
Copy link
Contributor

@SammyVimes SammyVimes commented Jan 29, 2024

Hi! Noticed that there is no support for Big Endian, so decided to start working on it.
Currently I did:

  • sz_find_1char_swar
  • sz_rfind_1char_swar
  • sz_find_2char_swar
  • sz_find_3char_swar
  • sz_find_4char_swar
  • sz_find_substring_swar

(also fixed a little bug in the little endian version of sz_rfind_1char_swar).

I also took a liberty of adding googletest dependency (of course I can remove it, I just prefer using it)
and supporting linux version of qsort_r in the test.cpp.

Although I am using macOS, I was able to test on a big endian machine using QEMU and docker. So I think the next thing to do for this pull request will be adding a workflow with the similar setup (or maybe GitHub has BE machines, I don't know yet).

With all those if (IS_LITTLE_ENDIAN) code looks funky, I know. I will try to do something with it

@ashvardanian
Copy link
Owner

Hi, @SammyVimes! Thank you for the PR! Big Endian support is a great thing to have, but we need to make a few changes to proceed with the PR.

  1. The main-dev branch is miles ahead, and is being prepared for a major release. Please use it as a reference branch.
  2. It is probably wiser to use macros for big/little endian checks.
  3. Let's avoid Google Test and other third-party utilities. They are very useful in the general case, but a careful use of assert-s makes more sense here, allowing us to conditionally log more info about the scope, hence simplifying debugging.

@SammyVimes
Copy link
Contributor Author

Hi, @ashvardanian! Sure, will change the PR accordingly. I really don’t know how I managed to miss the main-dev branch 😅

@ashvardanian
Copy link
Owner

Hi @SammyVimes! Any chance you've made any progress on the new version? I'm installing Docker QEMU images now, to test on 32-bit and big-endian architectures to generalize StringZilla further. Can continue your efforts, if you have anything you can push 🤗

@SammyVimes
Copy link
Contributor Author

@ashvardanian sorry, I was completely swamped by work. If big endian support is still required, I will happily pick up where I left off

Repository owner deleted a comment from recurseml bot Sep 2, 2025
Repository owner deleted a comment from recurseml bot Sep 2, 2025
@ashvardanian ashvardanian changed the base branch from main to main-dev September 2, 2025 11:48
@ashvardanian ashvardanian changed the title Initial support for Big Endian Big Endian support for SWAR backends Sep 2, 2025
@ashvardanian
Copy link
Owner

Hi @SammyVimes! Thanks for laying all the groundwork for broader big-endian support! Assuming how much the directory structure has changed, I've ported all of your commits onto the new head and signed them with your name! Thanks for all the help and being part of the v4 #201 release 🤗

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants