Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Defork croaring #3596

Merged
merged 4 commits into from
Mar 22, 2021
Merged

Conversation

trevyn
Copy link
Contributor

@trevyn trevyn commented Mar 17, 2021

Replace mimblewimble/croaring-rs / croaring-mw fork with upstream while ensuring no new AVX/AVX2 instructions are added in release builds.

Bumps the effective version of croaring-rs from 0.4.4 to 0.4.6. (details)

Unblocks:

Motivation: We want to update croaring-rs to unblock the above issues, and at the same time, the upstream merge of RoaringBitmap/croaring-rs#62 makes our fork unnecessary. See comments in #3545 for comments on why we originally forked.

Verification

cargo test --all passes locally.

Verification of no new AVX/AVX2 instructions, using Intel XED from http://www.intel.com/software/xed:

git clone https://github.com/intelxed/xed.git
cd xed
./mfile.py examples install
cp kits/xed-install-base-2021-03-17-mac-x86-64/bin/xed /usr/local/bin/xed
$ cargo clean && cargo build --release
$ /usr/local/bin/xed -i target/release/grin > native-disasm
$ grep -c SSE2 native-disasm
22479
$ grep -c AVX native-disasm
4380
$ grep -c AVX2 native-disasm
1192

$ cargo clean && ROARING_ARCH=nehalem cargo build --release
$ /usr/local/bin/xed -i target/release/grin > nehalem-disasm
$ grep -c SSE2 nehalem-disasm 
22794
$ grep -c AVX nehalem-disasm
2505
$ grep -c AVX2 nehalem-disasm
570

$ cargo clean && ROARING_ARCH=x86-64-v2 cargo build --release
$ /usr/local/bin/xed -i target/release/grin > x86-64-v2-disasm
$ grep -c SSE2 x86-64-v2-disasm 
23235
$ grep -c AVX x86-64-v2-disasm
2505
$ grep -c AVX2 x86-64-v2-disasm
570

$ wget https://github.com/mimblewimble/grin/releases/download/v5.0.1/grin-v5.0.1-macos.tar.gz 
$ tar xzf grin-v5.0.1-macos.tar.gz
$ /usr/local/bin/xed -i grin/grin > release-disasm
$ grep -c SSE2 release-disasm
22908
$ grep -c AVX release-disasm 
2505
$ grep -c AVX2 release-disasm
570

I'm not sure where these other AVX instructions are coming from if the existing release is tested to function as we expect. It's possible they're behind runtime checks.

@trevyn
Copy link
Contributor Author

trevyn commented Mar 17, 2021

Hmm, Windows can't find libclang now:

thread 'main' panicked at 'Unable to find libclang: "the libclang shared library at C:\\msys64\\mingw64\\bin\\libclang.dll could not be opened: LoadLibraryExW failed"', C:\Users\VssAdministrator\.cargo\registry\src\github.com-1ecc6299db9ec823\bindgen-0.56.0\src/lib.rs:1922:31

I noticed clang-sys was updated in the Cargo.lock:

  name = "clang-sys"
- version = "0.28.1"
+ version = "1.1.1"

I'll look into this later.

Copy link
Member

@quentinlesceller quentinlesceller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Triggered another build. Also regarding the clang issue maybe worth adding

       env:
          LIBCLANG_PATH: ${{ runner.temp }}/llvm-${{ matrix.clang[0] }}/lib
          LLVM_CONFIG_PATH: ${{ runner.temp }}/llvm-${{ matrix.clang[0] }}/bin/llvm-config

in Github actions for test and release.

@@ -2,7 +2,10 @@ steps:
- script: 'cargo test --all'
displayName: Cargo Test All
condition: and(succeeded(), contains(variables['Build.SourceBranch'], 'refs/tags/'), eq(variables['CI_JOB'], 'release' ))
- script: 'cargo build --release'
- script: |
cargo clean
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why cargo clean ?

Copy link
Contributor Author

@trevyn trevyn Mar 17, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The env variable changes the croaring build configuration in a way that cargo isn't smart enough to pick up. This particular script might work without it, but the cargo test right above makes me nervous, and when the script or build behavior changes in the future, this adds a little bit of safety.

Of course, happy to remove it or implement a different solution if you prefer!

@trevyn
Copy link
Contributor Author

trevyn commented Mar 18, 2021

env:
  LIBCLANG_PATH: ${{ runner.temp }}/llvm-${{ matrix.clang[0] }}/lib
  LLVM_CONFIG_PATH: ${{ runner.temp }}/llvm-${{ matrix.clang[0] }}/bin/llvm-config

Thanks! I got GHA to work: https://github.com/trevyn/grin/runs/2138862178

- run: choco install -y llvm
- run: refreshenv && cargo test --all
  env:
    LIBCLANG_PATH: C:\Program Files\LLVM\lib
    LLVM_CONFIG_PATH: C:\Program Files\LLVM\bin\llvm-config

I'll push a commit in a moment and see if it works on Azure too.

And a PR against upstream, they disabled their Windows CI build instead😂

@phyro
Copy link
Member

phyro commented Mar 19, 2021

is croaring used for the bitmap mmr? what's the likelihood of this causing a possible consensus change?

@trevyn
Copy link
Contributor Author

trevyn commented Mar 19, 2021

is croaring used for the bitmap mmr? what's the likelihood of this causing a possible consensus change?

My newbie analysis (please educate if any of this is off-base for the question):

Assuming we're not newly triggering any LLVM/clang codegen bugs, the only functional effect of this change should be to bump croaring-rs from the fork point at 0.4.4 (mimblewimble/croaring-rs@9977f80) to 0.4.6 (RoaringBitmap/croaring-rs@4fc6582) (diff 0.4.4...0.4.6).

These changes in croaring-rs look innocuous to me, and also bump the underlying lemire/CRoaringUnityBuild submodule from 0.2.65 to 0.2.66 (diff). Those changes appear to be a simple tweak to memory management plus this change to roaring.hh that I do not yet feel qualified to validate.

If there is any concern that consensus change could be caused by bumping croaring-rs or its dependency RoaringBitmap/CRoaring, we should definitely have a system in place to ensure that this bumping doesn't happen accidentally, or otherwise mitigate the concern. If this is desired, I can put some thought into how to implement that.

If anyone can think of any tests that would help confirm the validity of the change, I'd also be happy to take a whack at trying to write them.

@trevyn
Copy link
Contributor Author

trevyn commented Mar 20, 2021

I found this in the clang docs: https://clang.llvm.org/docs/UsersManual.html#x86

Several micro-architecture levels as specified by the x86-64 psABI are defined. They are cumulative in the sense that features from previous levels are implicitly included in later levels.

  • -march=x86-64: CMOV, CMPXCHG8B, FPU, FXSR, MMX, FXSR, SCE, SSE, SSE2
  • -march=x86-64-v2: (close to Nehalem) CMPXCHG16B, LAHF-SAHF, POPCNT, SSE3, SSE4.1, SSE4.2, SSSE3
  • -march=x86-64-v3: (close to Haswell) AVX, AVX2, BMI1, BMI2, F16C, FMA, LZCNT, MOVBE, XSAVE
  • -march=x86-64-v4: AVX512F, AVX512BW, AVX512CD, AVX512DQ, AVX512VL

More background, apparently this is a new thing: https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/

So ROARING_ARCH=x86-64-v2 may be cleaner and closer to what we're looking for. I'll verlfy in a bit.

@trevyn trevyn marked this pull request as draft March 20, 2021 07:58
x86-64-v2 is a newer variant, so make sure in the tests that the runners support it.
@phyro
Copy link
Member

phyro commented Mar 20, 2021

oh haha, I was just trying to make people confirm it is fine from the consensus point of view - I think we should get into practice of questioning such things even for updates like this one. I'm honestly not qualified enough to answer the question which is why I asked and I'm sorry if this made you dive too deep. It was no my intention, but nevertheless, thanks for doing it!

@phyro
Copy link
Member

phyro commented Mar 21, 2021

@quentinlesceller would be great if you could make a pass when you find the time so we get this in 👍 I'd also suggest we consider archiving the croaring fork in the mimblewimble organization after we confirm everything works well in the release containing these changes

@antiochp
Copy link
Member

Testing this locally.
Changes look reasonable to me.

@antiochp
Copy link
Member

antiochp commented Mar 22, 2021

is croaring used for the bitmap mmr? what's the likelihood of this causing a possible consensus change?

We use it internally for storage and but we don't use it as part of the hashing implementation.
Hashing a bitmap chunk happens over the internal bytes directly (of which a croaring bitmap is simply an alternative representation).

This is how we "write" the bytes as part of the hashing impl -

impl Writeable for BitmapChunk {
fn write<W: Writer>(&self, writer: &mut W) -> Result<(), ser::Error> {
self.0.to_bytes().write(writer)
}
}

Copy link
Member

@antiochp antiochp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Member

@quentinlesceller quentinlesceller left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good 👍 LGTM !

@quentinlesceller quentinlesceller merged commit 431e4b9 into mimblewimble:master Mar 22, 2021
@antiochp
Copy link
Member

antiochp commented May 7, 2021

We are seeing some cases of "Illegal Instruction" for 5.1.0 and this is a potential culprit?

@antiochp
Copy link
Member

antiochp commented May 7, 2021

Is it possible the scripts are not correctly setting env vars?

https://docs.microsoft.com/en-us/azure/devops/pipelines/process/variables?view=azure-devops&tabs=yaml%2Cbatch#access-variables-through-the-environment

It looks like the recommended way is to do something along the lines of -

variables:
    ROARING_ARCH: x86-64-v2 
steps:
  - script: |
      cargo clean
      cargo build --release

Rather than what we are doing within the script itself -

steps:
  - script: |
      cargo clean
      ROARING_ARCH=x86-64-v2
      cargo build --release

There is also the option to use bash instead of script in the pipeline config which I suspect may make things more robust (or at least better understood) -

https://docs.microsoft.com/en-us/azure/devops/pipelines/scripts/cross-platform-scripting?view=azure-devops&tabs=yaml#consider-bash-or-pwsh

I'm not actually sure what the implementation is for a simple script in the pipeline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants