Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PAL/Linux-SGX] Introduce sgx.cpu_features.[...] manifest options #881

Merged
merged 1 commit into from
Sep 18, 2023

Conversation

dimakuv
Copy link

@dimakuv dimakuv commented Sep 5, 2022

Description of the changes

This PR introduces sgx.cpu_features.[...] = "[unspecified|required|disabled]" instead of sgx.require_[...] = true|false.

Mapping of deprecated and new manifest options:

  • sgx.cpu_features.[...] = "unspecified" -> sgx.require_[...] = false
  • sgx.cpu_features.[...] = "required" -> sgx.require_[...] = true

sgx.cpu_features.[...] = "disabled" is new and it disables the corresponding CPU feature inside the SGX enclave even if this CPU feature is available on the system: this may improve enclave performance because this CPU feature will not be saved and restored during enclave entry/exit.

As a side effect, this PR removes unneeded attrs in get_mrenclave_and_manifest(). The only enclave attributes required for the SGX enclave measurement are the enclave size and the number of enclave threads. Other attributes such as ISV_PROD_ID, ISV_SVN, XFRM are not included in the measurement.

This PR is a re-creation of #321 and #461 and #877, since it turned out to be useful for performance. Without disclosing any perf numbers, I performed a set of micro-benchmarks and they prove that disabling Intel AMX inside of the enclave (when it is available on the machine but is not needed by the app inside the enclave) gives a measurable performance boost (for the worst-case scenarios). So this change actually is beneficial.

How to test this PR?

I tested this PR manually on the AMX-enabled machine:

# with nothing, should default to sgx.cpu_features.amx = "unspecified"
$ gramine-sgx helloworld
debug: LibOS xsave_enabled 1, xsave_size 0x2b00(11008), xsave_features 0x600e7

# with sgx.cpu_features.amx = "unspecified"
$ gramine-sgx helloworld
debug: LibOS xsave_enabled 1, xsave_size 0x2b00(11008), xsave_features 0x600e7 

# with sgx.cpu_features.amx = "required"
$ gramine-sgx helloworld
debug: LibOS xsave_enabled 1, xsave_size 0x2b00(11008), xsave_features 0x600e7

# with sgx.cpu_features.amx = "disabled"
$ gramine-sgx helloworld
debug: LibOS xsave_enabled 1, xsave_size 0xa80(2688), xsave_features 0xe7

This change is Reviewable

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)


pal/src/host/linux-sgx/sgx_arch.h line 102 at r1 (raw file):

 * the manifest option for some feature is set to "required", then the corresponding bits in the
 * XFRM mask are set. If the manifest option is set to "disabled", then the corresponding bits are
 * unset.

FYI: another way of thinking about sgx.cpu_features.[feature] = "autodetect" is like this:

  • the default XFRM mask stays;
  • for not-security-critical CPU features, the default XFRM mask bits are 0, so the untrusted Gramine loader can choose to enable the CPU feature;
  • for security-critical CPU features, the default XFRM mask bits are 1, so the untrusted Gramine loader can not enable the CPU feature, and it just stays disabled.

So there is an interesting logic for security-critical CPU features: they are initially definitely-disabled. The only way to set them is to do sgx.cpu_features.[feature] = "required", so in this case the CPU feature is definitely-enabled. Why so strict? Because we have two contradicting requirements: (a) we want to run our enclave on as many platforms as possible, (b) we must be sure that the platforms do not just enable/disable such CPU features as they wish. So security-critical CPU features are definitely-disabled by default (to allow for this enclave to run on more platforms), and they can only be definitely-enabled.

@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from 0c10c14 to d9feaae Compare September 5, 2022 13:09
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 6 files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel)

a discussion (no related file):
FYI: Based on an offline chat with @boryspoplawski, I did some changes to the second (main) commit and force-pushed, since nobody yet reviewed this. The changes were:

  • Replace security-critical with security-hardening (otherwise "security-critical CPU features can be disabled" sounds bizzare)
  • Replace autodetect with unspecified
  • Allow autodetect only for not-security-critical CPU features

donporter
donporter previously approved these changes Jan 6, 2023
Copy link
Contributor

@donporter donporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 6 files at r1, 3 of 3 files at r2, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved


Documentation/manifest-syntax.rst line 575 at r2 (raw file):

unavailable on the platform, enclave initialization will fail.

In case of doubt, it is recommended to keep the default values for these

A more common English phrasing might be: "When in doubt", but I know what you mean.

Copy link
Contributor

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @dimakuv)


-- commits line 22 at r2:
We default not-security-harening options to (from unspecified) disabled now. Does this mean we changed the default behavior? If so, should note it down somewhere.

Code quote:

  - `sgx.cpu_features.[...] = "unspecified"` -> `sgx.require_[...] = false`
  - `sgx.cpu_features.[...] = "required"` -> `sgx.require_[...] = true`

  `sgx.cpu_features.[...] = "disabled"` is new and it disables the

Documentation/manifest-syntax.rst line 556 at r2 (raw file):

security-hardening features (MPX and PKRU).

The ``"unspecified"`` syntax applies only to not-security-hardening features. It

Where do we validate this in this code?

Code quote:

The ``"unspecified"`` syntax applies only to not-security-hardening features

Documentation/manifest-syntax.rst line 999 at r2 (raw file):

    sgx.require_mpx    = [true|false]
    sgx.require_pkru   = [true|false]
    sgx.require_amx    = [true|false]

Default is not specified any more. So what the users should expect exactly?


pal/src/host/linux-sgx/sgx_arch.h line 102 at r2 (raw file):

  • "disabled": SIGSTRUCT.ATTRIBUTEMASK[feature] = 1 and
    SIGSTRUCT.ATTRIBUTES[feature] = 0.

Is the corresponding bit unset?

Code quote:

 * XFRM mask are set. If the manifest option is set to "disabled", then the corresponding bits are
 * unset.

python/graminelibos/manifest.py line 100 at r2 (raw file):

        sgx_cpu_features = sgx.setdefault('cpu_features', {})
        sgx_cpu_features.setdefault('mpx', "disabled")
        sgx_cpu_features.setdefault('pkru', "disabled")

As we removed the defaults of the legacy options, this changes the default behavior, right?

Code quote:

        sgx_cpu_features.setdefault('mpx', "disabled")
        sgx_cpu_features.setdefault('pkru', "disabled")

python/graminelibos/sgx_sign.py line 76 at r2 (raw file):

    val = 0
    for opt, bits in options_dict.items():
        if manifest_sgx.get(opt, 0) == 1:

This won't generate a KeyError anymore. I know this works for below but is this sth. we intended for?

Code quote:

get(opt, 0)

python/graminelibos/sgx_sign.py line 92 at r2 (raw file):

        elif manifest_sgx['cpu_features'].get(opt, "") == "disabled":
            val &= ~bits
            mask |= bits

Users can input things like sgx.cpu_features.amx = "abcd" (same semantics as unspecified)?


python/graminelibos/sgx_sign.py line 126 at r2 (raw file):

        'require_amx': offs.SGX_XFRM_AMX,
    }
    xfrms |= collect_bits(manifest_sgx, deprecated_xfrms_dict)

So we allow the co-existance and discrepancies between the legacy and new options?

Code quote:

xfrms |= collect_bits(manifest_sgx, deprecated_xfrms_dict)

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 6 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @donporter and @kailun-qin)


-- commits line 22 at r2:

We default not-security-harening options to (from unspecified) disabled now.

Is this a typo in your sentence? Did you mean We default security-hardening options...?

But you are right, there was a complex ambiguity in the previous syntax and the way it was described in the documentation. I added a note in the docs. ( The state of this feature was really poor :( )


Documentation/manifest-syntax.rst line 556 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Where do we validate this in this code?

Done. Good catch.


Documentation/manifest-syntax.rst line 575 at r2 (raw file):

Previously, donporter (Don Porter) wrote…

A more common English phrasing might be: "When in doubt", but I know what you mean.

Done


Documentation/manifest-syntax.rst line 999 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Default is not specified any more. So what the users should expect exactly?

Not sure I understand your question.

If users don't specify these deprecated CPU features, then the default from the corresponding new CPU feature is used by Gramine. E.g. if user wonders what is the default for MPX now, then the answer is disabled because this is the default for sgx.cpu_features.mpx .


pal/src/host/linux-sgx/sgx_arch.h line 102 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…
  • "disabled": SIGSTRUCT.ATTRIBUTEMASK[feature] = 1 and
    SIGSTRUCT.ATTRIBUTES[feature] = 0.

Is the corresponding bit unset?

Oops, you are right. Fixed the comment now.


python/graminelibos/manifest.py line 100 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

As we removed the defaults of the legacy options, this changes the default behavior, right?

No, the default behavior is the same for MPX and PKRU.

It's important to realize that security-hardening features (MPX and PKRU) and non-security-hardening features (AVX, AMX, ...) always had different behavior in Gramine. It is clear from this note:

* Notes:
* - Verified bits include: bit 0 + bit 1 (X87 + SSE, always enabled in SGX), bit 3 + bit 4
* (BNDREG + BNDCSR, enables Intel MPX), bit 9 (PKRU, enables Intel MPK), and all reserved bits.
* - Not-verified bits include: bit 2 (AVX), bit 5 + bit 6 + bit 7 (AVX-512), bit 17 + bit 18
* (AMX). These bits are considered not security-critical.
* - Two instances of the same enclave with difference in X87, SSE, MPX, MPK bits (e.g., one
* with PKRU == 0 and another with PKRU == 1) will have two different SIGSTRUCTs. However,
* difference in AVX/AVX-512/AMX bits does not lead to different SIGSTRUCTs.
* - Important consequence of the above: the same enclave (with the same SIGSTRUCT) may run on
* machines with and without AVX/AVX-512/AMX, but e.g. a PKRU-requiring enclave may run only on
* machine with PKRU.
* - CET bits are not yet reflected in Gramine, i.e., Gramine doesn't support Intel CET.
*/
#define SGX_XFRM_MASK_CONST 0xfffffffffff9ff1bUL

So the previous manifest syntax with sgx.require_xxx was totally bad... This PR attempts at fixing the syntax and to describe it more clearly.

UPDATE: I removed this explicit setting of defaults in manifest.py, because they are already covered via offs.SGX_XFRM_LEGACY and offs.SGX_XFRM_MASK_CONST.


python/graminelibos/sgx_sign.py line 76 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

This won't generate a KeyError anymore. I know this works for below but is this sth. we intended for?

Yes, this is intended. Or to be more specific, we never intended to throw a KeyError here.

This error never happened because we explicitly used sgx.setdefault() on those now-deprecated CPU feature options require_avx, etc. But now this error could happen because we don't force this explicit setting of now-deprecated options, so these keys may not exist now. That's why the change here.

UPDATE: I replaced 0 with False and 1 with True, because these are booleans. We are just lucky that Python maps 0 to False and 1 to True.


python/graminelibos/sgx_sign.py line 92 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Users can input things like sgx.cpu_features.amx = "abcd" (same semantics as unspecified)?

Done. Good point.


python/graminelibos/sgx_sign.py line 126 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

So we allow the co-existance and discrepancies between the legacy and new options?

Done. Good catch. Significantly rewrote this part.

@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from 2b44542 to 35dcd60 Compare February 24, 2023 13:47
@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from 35dcd60 to 6a817f2 Compare April 20, 2023 07:47
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 11 files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @donporter and @kailun-qin)

a discussion (no related file):
I rebased this PR to the latest master (it got really stale, 200 commits behind master and 6 months old).

As a next step, I will test this PR again on AMX-enabled machines, using the amxtest application that I add in the third commit (this commit will be dropped at final rebase, since this is just a perf test).


@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from 6a817f2 to 010f301 Compare April 20, 2023 10:32
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 11 files reviewed, 8 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @donporter and @kailun-qin)

a discussion (no related file):

As a next step, I will test this PR again on AMX-enabled machines ...

Done. Everything looks good.


Copy link
Contributor

@donporter donporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 11 files at r3, all commit messages.
Reviewable status: 6 of 11 files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv and @kailun-qin)


-- commits line 33 at r4:
I am not going to review these files, then, and leave a blocking comment here. but will review the rest.

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 11 files reviewed, 10 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @donporter and @kailun-qin)

a discussion (no related file):
I will need to make sure that other Gramine repos (gsc, examples, contrib) will also be updated to reflect this change in the manifest syntax.


a discussion (no related file):
FYI: rebased to latest master.

The first commit in this rebased PR is also extracted as a separate PR: #1550

I also kept this first commit here, for the ease of testing this PR (it won't work without the first commit).


@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from cd5b203 to fc0ff08 Compare September 13, 2023 17:06
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 2 of 11 files reviewed, 10 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @donporter and @kailun-qin)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

FYI: rebased to latest master.

The first commit in this rebased PR is also extracted as a separate PR: #1550

I also kept this first commit here, for the ease of testing this PR (it won't work without the first commit).

The first commit was extracted into #1550, which was just merged. So I rebased on top of the just-merged commit, and now there are only two commits in this PR (one of the commits is an AMX test that will be removed before final merge).


Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 11 files at r3.
Reviewable status: 2 of 11 files reviewed, 11 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv, @donporter, and @kailun-qin)


Documentation/manifest-syntax.rst line 999 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Not sure I understand your question.

If users don't specify these deprecated CPU features, then the default from the corresponding new CPU feature is used by Gramine. E.g. if user wonders what is the default for MPX now, then the answer is disabled because this is the default for sgx.cpu_features.mpx .

What Kailun means is that previously the docs said that these options are false by default, but you dropped it when moving the docs here.


Documentation/manifest-syntax.rst line 783 at r6 (raw file):

::

    sgx.require_exinfo = [true|false]

Why is this left unchanged? require_exinfo == false is still misleading - the meaning in English is "don't require exinfo", but in practice it disables exinfo even on CPUs which support it.

Copy link
Contributor

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 11 files at r3, 1 of 1 files at r4, 4 of 4 files at r5, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" found in commit messages' one-liners (waiting on @dimakuv and @mkow)


-- commits line 22 at r2:

Is this a typo in your sentence? Did you mean We default security-hardening options...?

Yeah, typo, excuse me.


Documentation/manifest-syntax.rst line 999 at r2 (raw file):

What Kailun means is that previously the docs said that these options are false by default, but you dropped it when moving the docs here.

Yes, exactly.


Documentation/manifest-syntax.rst line 808 at r6 (raw file):

security-hardening features (MPX and PKRU).

The ``"unspecified"`` syntax applies only to not-security-hardening features. It

Should we explain a bit why "unspecified" is not made for security-hardening features?

Code quote:

The ``"unspecified"`` syntax applies only to not-security-hardening features.

python/graminelibos/sgx_sign.py line 139 at r6 (raw file):

            if deprecated_key in manifest_sgx:
                raise KeyError(f'`sgx.cpu_features` cannot coexist with `sgx.{deprecated_key}`')
        xfrms, xfrms_mask = collect_cpu_feature_bits(manifest_sgx, xfrms_dict, xfrms, xfrms_mask,

It's guaranteed that we have manifest_sgx['cpu_features'] when we're here, can we input it directly (instead of inputing manifest_sgx)?

Code quote:

xfrms, xfrms_mask = collect_cpu_feature_bits(manifest_sgx, xfrms_dict, xfrms, xfrms_mask

python/graminelibos/sgx_sign.py line 140 at r6 (raw file):

                raise KeyError(f'`sgx.cpu_features` cannot coexist with `sgx.{deprecated_key}`')
        xfrms, xfrms_mask = collect_cpu_feature_bits(manifest_sgx, xfrms_dict, xfrms, xfrms_mask,
                                                     security_hardening=False)

I don't insist, but is it better to have a nested dict for xfrms_dict and secure_xfrms_dict so that we can get rid of this security_hardening=xxx?

Code quote:

security_hardening=False

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 11 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @kailun-qin and @mkow)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I will need to make sure that other Gramine repos (gsc, examples, contrib) will also be updated to reflect this change in the manifest syntax.

Done:



Documentation/manifest-syntax.rst line 999 at r2 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

What Kailun means is that previously the docs said that these options are false by default, but you dropped it when moving the docs here.

Yes, exactly.

Done. Added a new paragraph to explain this.


Documentation/manifest-syntax.rst line 783 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Why is this left unchanged? require_exinfo == false is still misleading - the meaning in English is "don't require exinfo", but in practice it disables exinfo even on CPUs which support it.

Done, in a separate commit


Documentation/manifest-syntax.rst line 808 at r6 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

Should we explain a bit why "unspecified" is not made for security-hardening features?

Done. Not sure if the new sentence brings anything new, but check if you like it.


python/graminelibos/sgx_sign.py line 139 at r6 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

It's guaranteed that we have manifest_sgx['cpu_features'] when we're here, can we input it directly (instead of inputing manifest_sgx)?

Done.


python/graminelibos/sgx_sign.py line 140 at r6 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

I don't insist, but is it better to have a nested dict for xfrms_dict and secure_xfrms_dict so that we can get rid of this security_hardening=xxx?

I like the current code with two invocations -- one for non-security-critical features (xfrms_dict) and one for security-critical features (secure_xfrms_dict). I would like to keep it that way, I think it's more readable.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 11 files at r3, 1 of 4 files at r5, 2 of 3 files at r7, all commit messages.
Reviewable status: 10 of 11 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @kailun-qin)


Documentation/manifest-syntax.rst line 789 at r8 (raw file):

signal handler in case of a page fault. Otherwise (set to ``false``), the
faulting address will always be provided as ``0``.

I'd also mention here why the default is false - some frameworks/runtimes would dump the callstack + variables/registers on exceptions, potentially leaking data.


python/graminelibos/manifest.py line 99 at r8 (raw file):

        sgx.setdefault('enable_stats', False)
        sgx.setdefault('edmm_enable', False)

No defaults for sgx.features? All other options have the defaults set up here, why making an exception for sgx.features?

@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from 9cc5c63 to c769d71 Compare September 14, 2023 14:51
Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 4 of 11 files reviewed, 7 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" and "fixup! " found in commit messages' one-liners (waiting on @kailun-qin and @mkow)


Documentation/manifest-syntax.rst line 789 at r8 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I'd also mention here why the default is false - some frameworks/runtimes would dump the callstack + variables/registers on exceptions, potentially leaking data.

Done.


python/graminelibos/manifest.py line 99 at r8 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

No defaults for sgx.features? All other options have the defaults set up here, why making an exception for sgx.features?

Done.

I actually don't like setting these defaults (I don't know/remember who started this), because it is redundant (Gramine's parsing code already has the defaults) and every time we modify some manifest options, we also need to modify here...

But I understand that we need to be consistent, so let me add the defaults here, and maybe later we'll decide to remove this defaults list.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r8, 2 of 6 files at r9, all commit messages.
Reviewable status: 7 of 11 files reviewed, 12 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" and "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @kailun-qin)


python/graminelibos/manifest.py line 99 at r8 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Done.

I actually don't like setting these defaults (I don't know/remember who started this), because it is redundant (Gramine's parsing code already has the defaults) and every time we modify some manifest options, we also need to modify here...

But I understand that we need to be consistent, so let me add the defaults here, and maybe later we'll decide to remove this defaults list.

Hmm, maybe it was for the Python code which uses some of these, so that it doesn't have to use .get() everywhere? I'm not sure...


python/graminelibos/manifest.py line 105 at r9 (raw file):

            sgx.setdefault('enclave_size', DEFAULT_ENCLAVE_SIZE_NO_EDMM)

        # TODO: below were deprecated in release v1.6, remove this check in v1.7;

Suggestion:

remove this check in v1.7: (but keep the `if` body)

python/graminelibos/sgx_sign.py line 139 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Done.

It actually is, because of the defaults in manifest.py.


python/graminelibos/sgx_sign.py line 140 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I like the current code with two invocations -- one for non-security-critical features (xfrms_dict) and one for security-critical features (secure_xfrms_dict). I would like to keep it that way, I think it's more readable.

I think I prefer the current version also.


python/graminelibos/sgx_sign.py line 80 at r9 (raw file):

    val = 0
    for opt, bits in options_dict.items():
        if manifest_sgx.get(opt) is True:

This change is not needed anymore.


python/graminelibos/sgx_sign.py line 87 at r9 (raw file):

def collect_cpu_feature_bits(manifest_cpu_features, options_dict, val, mask, security_hardening):
    for opt, bits in options_dict.items():
        if manifest_cpu_features.get(opt) is None:

More Pythonic way:

Suggestion:

if opt not in manifest_cpu_features:

python/graminelibos/sgx_sign.py line 88 at r9 (raw file):

    for opt, bits in options_dict.items():
        if manifest_cpu_features.get(opt) is None:
            continue

This is not needed, manifest.py guarantees that these options are always set. Though, we aren't consistent in this I think, sometimes we rely on this here and sometimes not?


python/graminelibos/sgx_sign.py line 96 at r9 (raw file):

            mask |= bits
        elif security_hardening or manifest_cpu_features[opt] != "unspecified":
            raise KeyError(f'Manifest option `sgx.cpu_features.{opt}` has disallowed value')

Otherwise "has disallowed" would form a past tense.
OTOH, in plural this would be ambiguous, I guess? :)

Suggestion:

has a disallowed value

python/graminelibos/sgx_sign.py line 108 at r9 (raw file):

        flags |= offs.SGX_FLAGS_MODE64BIT

    if manifest_sgx.get('require_exinfo') is not None:

ditto


python/graminelibos/sgx_sign.py line 110 at r9 (raw file):

    if manifest_sgx.get('require_exinfo') is not None:
        raise KeyError(f'Manifest option `sgx.require_exinfo` was renamed to `sgx.use_exinfo`, '
                        'please modify the manifest file accordingly')

Hmm, wasn't the old name only deprecated, not removed?

Copy link
Contributor

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r8, 6 of 6 files at r9, all commit messages.
Reviewable status: all files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "DONTMERGE" and "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)


python/graminelibos/sgx_sign.py line 140 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

I think I prefer the current version also.

OK then :)


python/graminelibos/sgx_sign.py line 96 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Otherwise "has disallowed" would form a past tense.
OTOH, in plural this would be ambiguous, I guess? :)

What about "contains disallowed value"?


python/graminelibos/sgx_sign.py line 110 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

Hmm, wasn't the old name only deprecated, not removed?

+1

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 11 files reviewed, 9 unresolved discussions, not enough approvals from maintainers (1 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "fixup! " found in commit messages' one-liners (waiting on @kailun-qin and @mkow)


python/graminelibos/manifest.py line 105 at r9 (raw file):

            sgx.setdefault('enclave_size', DEFAULT_ENCLAVE_SIZE_NO_EDMM)

        # TODO: below were deprecated in release v1.6, remove this check in v1.7;

Done.


python/graminelibos/sgx_sign.py line 139 at r6 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

It actually is, because of the defaults in manifest.py.

@mkow I don't understand your comment. What are you blocking on? What action should I do?


python/graminelibos/sgx_sign.py line 80 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This change is not needed anymore.

This change is still needed, because we can have deprecated options in the manifest-template file. E.g.:

# app.manifest.template

sgx.require_avx = true
sgx.require_mpx = false

In this case, our gramine-manifest tool detects the deprecated options and does not add the sgx.cpu_features.xxx options at all (to not have a conflict between the deprecated ones and the new ones).

Also note that the other deprecated options will not be added to the manifest file. E.g. sgx.require_avx512 will not be added (because it's one of the deprecated options, and we simply don't have the logic to add it explicitly).

So this .get(opt) is still required, because there are legitimate cases when the (deprecated) option was not set to a default and thus is missing completely from the manifest dict.

P.S. This whole thing can indeed go away when we don't support the deprecated options anymore, and in that case we'll always have the defaults in the manifest dict, and the .get(opt) trick could be replaced with a simple [opt]. But not now.


python/graminelibos/sgx_sign.py line 87 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

More Pythonic way:

Done.


python/graminelibos/sgx_sign.py line 88 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This is not needed, manifest.py guarantees that these options are always set. Though, we aren't consistent in this I think, sometimes we rely on this here and sometimes not?

No, we are not guaranteed that these options are set. See my other comment.


python/graminelibos/sgx_sign.py line 96 at r9 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

What about "contains disallowed value"?

Done. I like Michal's suggestion better, so went with it.


python/graminelibos/sgx_sign.py line 108 at r9 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

ditto

Done.


python/graminelibos/sgx_sign.py line 110 at r9 (raw file):

Previously, kailun-qin (Kailun Qin) wrote…

+1

Done.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r10, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @dimakuv)


python/graminelibos/sgx_sign.py line 139 at r6 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

@mkow I don't understand your comment. What are you blocking on? What action should I do?

Hmm, I think I accidentally replied under a wrong thread, sorry.


python/graminelibos/sgx_sign.py line 80 at r9 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

This change is still needed, because we can have deprecated options in the manifest-template file. E.g.:

# app.manifest.template

sgx.require_avx = true
sgx.require_mpx = false

In this case, our gramine-manifest tool detects the deprecated options and does not add the sgx.cpu_features.xxx options at all (to not have a conflict between the deprecated ones and the new ones).

Also note that the other deprecated options will not be added to the manifest file. E.g. sgx.require_avx512 will not be added (because it's one of the deprecated options, and we simply don't have the logic to add it explicitly).

So this .get(opt) is still required, because there are legitimate cases when the (deprecated) option was not set to a default and thus is missing completely from the manifest dict.

P.S. This whole thing can indeed go away when we don't support the deprecated options anymore, and in that case we'll always have the defaults in the manifest dict, and the .get(opt) trick could be replaced with a simple [opt]. But not now.

Ah, I see, makes sense.

kailun-qin
kailun-qin previously approved these changes Sep 16, 2023
Copy link
Contributor

@kailun-qin kailun-qin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r10, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @dimakuv)

Copy link
Contributor

@donporter donporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 11 files at r3, 1 of 4 files at r5, 4 of 6 files at r9, 3 of 3 files at r10, all commit messages.
Reviewable status: all files reviewed, 3 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @dimakuv)


CI-Examples/amxtest/README.md line 127 at r10 (raw file):

 static long sgx_ocall_sched_yield(void* args) {
     __UNUSED(args);
+#if 1

Is this #define needed in the documentation?


Documentation/manifest-syntax.rst line 787 at r10 (raw file):

If ``sgx.use_exinfo`` is set, user application can retrieve faulting address in
signal handler in case of a page fault. Otherwise (set to ``false``), the

Worth explicating that this is untrusted/non-enclave user code?

mkow
mkow previously approved these changes Sep 18, 2023
Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 2 unresolved discussions, "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @donporter)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

As a next step, I will test this PR again on AMX-enabled machines ...

Done. Everything looks good.

Blocking until the AMX test gets removed.



CI-Examples/amxtest/README.md line 127 at r10 (raw file):

Previously, donporter (Don Porter) wrote…

Is this #define needed in the documentation?

This test will be dropped before merging, there was a blocking comment somewhere, but seems that someone unlocked it accidentally, let me fix that.


Documentation/manifest-syntax.rst line 787 at r10 (raw file):

Previously, donporter (Don Porter) wrote…

Worth explicating that this is untrusted/non-enclave user code?

What do you mean? The fault address is passed to the app inside Gramine? There's no "non-enclave user code" in Gramine?

donporter
donporter previously approved these changes Sep 18, 2023
Copy link
Contributor

@donporter donporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, "fixup! " found in commit messages' one-liners (waiting on @dimakuv)


CI-Examples/amxtest/README.md line 127 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This test will be dropped before merging, there was a blocking comment somewhere, but seems that someone unlocked it accidentally, let me fix that.

Ah, right. My mistake


Documentation/manifest-syntax.rst line 787 at r10 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

What do you mean? The fault address is passed to the app inside Gramine? There's no "non-enclave user code" in Gramine?

Ah, My mistake. I was thinking this was referring the signal handler outside the enclave.

@dimakuv
Copy link
Author

dimakuv commented Sep 18, 2023

Jenkins, retest Jenkins-SGX-EDMM please (there seemed to be some problem with Intel PCCS service, but now it's gone?
)

Introduce `sgx.cpu_features.[...] = "[unspecified|disabled|required]"`
instead of `sgx.require_[...] = true|false`. Also, make more explicit
the difference between security-hardening CPU features and
not-security-hardening ones.

Mapping of deprecated and new manifest options:
- `sgx.cpu_features.[...] = "unspecified"` -> `sgx.require_[...] = false`
- `sgx.cpu_features.[...] = "required"` -> `sgx.require_[...] = true`

`sgx.cpu_features.[...] = "disabled"` is new and it disables the
corresponding CPU feature inside the SGX enclave even if this CPU
feature is available on the system: this may improve enclave performance
because this CPU feature will *not* be saved and restored during enclave
entry/exit.

This commit also renames `sgx.require_exinfo` to `sgx.use_exinfo`. The
old name was ambiguous -- `require_exinfo = false` can be read as "don't
require exinfo", but in practice it disables exinfo even on CPUs which
support it.

Signed-off-by: Dmitrii Kuvaiskii <dmitrii.kuvaiskii@intel.com>
@dimakuv dimakuv dismissed stale reviews from donporter, mkow, and kailun-qin via eed050c September 18, 2023 14:39
@dimakuv dimakuv force-pushed the dimakuv/disable-xfrm-features-final branch from cbc4684 to eed050c Compare September 18, 2023 14:39
Copy link
Contributor

@donporter donporter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r11, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion

Copy link
Author

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @mkow)

a discussion (no related file):

Previously, mkow (Michał Kowalczyk) wrote…

Blocking until the AMX test gets removed.

I extracted the AMX test in my private repo (just not to lose that test) and removed from this PR.



-- commits line 19 at r11:
FYI: previously I had a separate commit "[PAL/Linux-SGX] Rename sgx.require_exinfo to sgx.use_exinfo". But I messed up the fixup commits, so instead of untangling the commits, I melded the second commit into the first one. Sorry :(

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r11, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@dimakuv dimakuv merged commit eed050c into master Sep 18, 2023
1 check passed
@dimakuv dimakuv deleted the dimakuv/disable-xfrm-features-final branch September 18, 2023 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants