Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Pal/Linux-SGX] Add sgx.disable_[cpu-feature] manifest options #461

Closed
wants to merge 1 commit into from

Conversation

dimakuv
Copy link

@dimakuv dimakuv commented Mar 16, 2022

Description of the changes

This PR adds three new manifest options: sgx.disable_avx, sgx.disable_avx512, sgx.disable_amx. Setting each of these options to true disables the corresponding CPU feature inside the SGX enclave even if this CPU feature is available on the system: this may improve enclave performance because this CPU feature will not be saved and restored during enclave entry/exit.

This PR is a re-creation of #321, since it turned out to be useful for performance. Without disclosing any perf numbers, I performed a set of micro-benchmarks and they prove that disabling Intel AMX inside of the enclave (when it is available on the machine but is not needed by the app inside the enclave) gives a measurable performance boost (for the worst-case scenarios). So this change actually is beneficial.

How to test this PR?

I tested this PR manually on the AMX-enabled machine:

# with   sgx.require_amx = false,   sgx.disable_amx = false 
$ gramine-sgx helloworld
[::] debug: LibOS xsave_enabled 1, xsave_size 0x2b00(11008), xsave_features 0x600e7 

# with   sgx.require_amx = true,   sgx.disable_amx = false 
$ gramine-sgx helloworld
[::] debug: LibOS xsave_enabled 1, xsave_size 0x2b00(11008), xsave_features 0x600e7

# with   sgx.require_amx = false,   sgx.disable_amx = true
$ gramine-sgx helloworld
[::] debug: LibOS xsave_enabled 1, xsave_size 0xa80(2688), xsave_features 0xe7

# with   sgx.require_amx = true,   sgx.disable_amx = true   -> this one is not recommended
$ gramine-sgx helloworld
[::] debug: LibOS xsave_enabled 1, xsave_size 0xa80(2688), xsave_features 0xe7

This change is Reviewable

This commit adds three new manifest options: `sgx.disable_avx`,
`sgx.disable_avx512`, `sgx.disable_amx`. Setting each of these options
to `true` disables the corresponding CPU feature inside the SGX enclave
even if this CPU feature is available on the system: this may improve
enclave performance because this CPU feature will *not* be saved and
restored during enclave entry/exit.

Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
@dimakuv
Copy link
Author

dimakuv commented Mar 16, 2022

Jenkins, retest Jenkins-Debug-18.04 please (LibOS.shim.test.ltp.test_ltp.test_ltp[fcntl14_64] and LibOS.shim.test.ltp.test_ltp.test_ltp[fdatasync01] timed out, unrelated)

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv)

a discussion (no related file):
But the description of #321 says:

For example, disabling Intel AMX may improve performance of some workloads by around 5%. UPDATE: The previous statement turned out to be false. It seems that this PR doesn't provide any perf improvement. I am closing it and may resurrect in the future if this PR turns out to be useful for anything.

So, what happened in the meantime? And how can I verify this?


Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 5 files reviewed, 1 unresolved discussion, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @mkow)

a discussion (no related file):

So, what happened in the meantime?

I did more performance tests on Intel AMX. In particular, I wrote a microbenchmark; you can find this microbenchmark here: 1451bb6 (it is under CI-Examples/amxtest). The microbenchmark provides some numbers on perf overhead, but I cannot publish them.

And how can I verify this?

If you want to test the AMX disable/enable manifest options, you'll need an AMX-enabled machine. Do you have access to one?

If you don't have an AMX-enabled machine, you can check AVX and/or AVX512, in a similar way I show in the How to test this PR? section (i.e., you will see different XSAVE size and features).


Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 5 files at r1, all commit messages.
Reviewable status: 2 of 5 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv and @mkow)


Documentation/manifest-syntax.rst line 567 at r1 (raw file):

Gramine doesn't disallow this, but the feature will be disabled in such case.
For example, setting both ``sgx.require_avx = true`` and ``sgx.disable_avx =
true`` will result in the SGX enclave running with AVX disabled.

This makes no sense to me.
Why not use something like:

sgx.avx = ["required","disabled","default"]

?

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 5 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @boryspoplawski and @mkow)


Documentation/manifest-syntax.rst line 567 at r1 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

This makes no sense to me.
Why not use something like:

sgx.avx = ["required","disabled","default"]

?

The shortest review you've done yet :)

Yes, I could do that, and it is cleaner. But one question: do we want to keep the old sgx.require_avx manifest option (deprecate but keep it), or do we remove it?

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 5 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv and @mkow)


Documentation/manifest-syntax.rst line 567 at r1 (raw file):

The shortest review you've done yet :)

Well, I've skipped (not reviewed yet) most of the PR for now :)

But one question: do we want to keep the old sgx.require_avx manifest option (deprecate but keep it), or do we remove it?

Well, I guess we have to keep it for backward compatibility (for one version)?

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 5 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @boryspoplawski and @mkow)


Documentation/manifest-syntax.rst line 567 at r1 (raw file):

Previously, boryspoplawski (Borys Popławski) wrote…

The shortest review you've done yet :)

Well, I've skipped (not reviewed yet) most of the PR for now :)

But one question: do we want to keep the old sgx.require_avx manifest option (deprecate but keep it), or do we remove it?

Well, I guess we have to keep it for backward compatibility (for one version)?

Sounds good. This will require a lot of clunky code, but I guess that's life. I'll update this PR soon-ish.

@boryspoplawski Should I rebase to the latest master? Or maybe even close this PR, and create a new one, with your comment on sgx.avx and based on current master?

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 5 files reviewed, 2 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel) (waiting on @dimakuv and @mkow)


Documentation/manifest-syntax.rst line 567 at r1 (raw file):

. This will require a lot of clunky code, but I guess that's life.

Why is that? Doesn't sound so bad, just one or two ifs in python.

Should I rebase to the latest master? Or maybe even close this PR, and create a new one

I don't mind either - I did not review much so I can either check full diff here or at new PR, doesn't matter to me

@dimakuv
Copy link
Author

dimakuv commented Sep 2, 2022

Closed in favor of #877 (rebased and improved version)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants