Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[LibOS,PAL] Extend extra runtime configuration to include hostname #919

Merged
merged 1 commit into from
Sep 20, 2022

Conversation

oshogbo
Copy link
Contributor

@oshogbo oshogbo commented Sep 19, 2022

Description of the changes

This is a continuation of the issue #689.

From what I can see, most applications use the uname(2) syscall to obtain a current hostname, for example, gethostame.
The /etc/hostname is read during startup, and the sethostname(2) is called. For example, this is done on Ubuntu 20.04 with /etc/init/hostname.conf script.

Because of that we decided not to create a /etc/hostname file but just to set the Gramine hostname to the host's hostname.

The Gramine hostname has to be a valid domain. This is a difference from Linux, as Linux accepts any hostname value. That said, Linux doesn't assume that the root user tries to exploit it through hostname, and Gramine should.

This change doesn't address the support of the sethostname(2). The hostname can change after initialisation, however new process (after fork inside Gramine) will inherit an initial value of hostname. We probably should separately propagate the current value of the hostname. We are also missing a mechanism for propagating the hostname across multiple processes. We might also decide just to remove sethostname(2) completely. My understanding is that we have it only to enable LTP tests.

How to test this PR?

New tests are added.


This change is Reviewable

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 20 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @oshogbo)


-- commits line 4 at r1:
expects


libos/src/libos_init.c line 490 at r1 (raw file):

    RUN_INIT(init_sync_client);

    /* XXX: this will break uname checkpointing (if we implement it). */

Where we do actually have checkpointing of g_pal_public_state->dns_host.hostname?

I thought that in the previous iteration of this PR, you had some explicit checkpointing mechanism. But I don't see it in this PR. Is it because we already have some checkpointing mechanism (at the level of PAL) that forwards g_pal_public_state to the child?


libos/test/regression/hostname.c line 20 at r1 (raw file):

    if (pid == 0) {
        f(tag, expected_name);

I see no sense in this indirection, you could just call test_gethostname() here directly. But not blocking.


pal/src/host/linux-common/etc_host_info.c line 362 at r1 (raw file):

    return 0;
}
#endif

Please add /* ifndef PARSERS_ONLY */

Copy link
Contributor Author

@oshogbo oshogbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 20 files reviewed, 4 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @dimakuv and @oshogbo)


libos/src/libos_init.c line 490 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Where we do actually have checkpointing of g_pal_public_state->dns_host.hostname?

I thought that in the previous iteration of this PR, you had some explicit checkpointing mechanism. But I don't see it in this PR. Is it because we already have some checkpointing mechanism (at the level of PAL) that forwards g_pal_public_state to the child?

We checkpoint dns_host, so we also checkpoint hostname as this is static buffer.
https://github.com/gramineproject/gramine/blob/master/libos/src/fs/etc/fs.c#L142

struct pal_dns_host_conf {
    struct pal_dns_host_conf_addr nsaddr_list[PAL_MAX_NAMESPACES];
    size_t nsaddr_list_count;

    char dn_search[PAL_MAX_DN_SEARCH][PAL_HOSTNAME_MAX];
    size_t dn_search_count;

    bool inet6;
    bool rotate;

    char hostname[PAL_HOSTNAME_MAX];
};

pal/src/host/linux-common/etc_host_info.c line 362 at r1 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Please add /* ifndef PARSERS_ONLY */

Done.

Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 19 of 20 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @dimakuv and @oshogbo)


Documentation/manifest-syntax.rst line 156 at r2 (raw file):

    (Default: false)

This option will generate following extra configuration:

Suggestion:

the following extra configuration:

Documentation/manifest-syntax.rst line 158 at r2 (raw file):

This option will generate following extra configuration:

- Set up the Gramine hostname to the host's hostname.

If you said in the previous sentence that you're listing configurations here, then the list should be made of things, not sentences. So, please either change the form of the sentence above or the entries.


Documentation/manifest-syntax.rst line 159 at r2 (raw file):

- Set up the Gramine hostname to the host's hostname.
- Generate ``/etc/resolv.conf``, with keywords

missing colon at the end (or maybe two, not sure about the precise reST rules, but the rendered version should have exactly one)


Documentation/manifest-syntax.rst line 173 at r2 (raw file):

process configuration. For security-enforcing modes (such as SGX), Gramine
additionally sanitizes the information gathered from the host. Invalid host's
configuration is reported as an error (ex. invalid hostname, or invalid IPv4

(I guess?)

Suggestion:

e.g.

libos/test/regression/hostname.c line 53 at r2 (raw file):

    test_gethostname("normal", argv[1]);
    test_fork("fork gethostname", argv[1], test_gethostname);

Suggestion:

gethostname after fork

pal/include/host/linux-common/etc_host_info.h line 12 at r2 (raw file):

int parse_resolv_conf(struct pal_dns_host_conf* conf);

int get_hostname(char* hostname, size_t size);

This may get confused with another global function: set_hostname() - it sounds like it was its counterpart, while it does something completely different. I know one is in PAL and another in LibOS, but still, I'd make the names at least a bit more specific.

Code quote (from libos/src/sys/libos_uname.c):

set_hostname

pal/src/host/linux/pal_main.c line 134 at r2 (raw file):

    if (get_hostname(g_pal_public_state.dns_host.hostname,
        sizeof(g_pal_public_state.dns_host.hostname)) < 0) {

+4 more spaces


pal/src/host/linux-common/etc_host_info.c line 9 at r2 (raw file):

 * This file contains the APIs to retrieve information from the host:
 *   - parses host file `/etc/resolv.conf` into `struct pal_dns_host_conf`
 *   - get host's hostname through uname syscall

grammar form doesn't match the previous point on the list here

@dimakuv
Copy link
Contributor

dimakuv commented Sep 20, 2022

Jenkins, retest Jenkins-20.04 please (AssertionError: Command ['gramine-direct', 'fcntl14_64'] timed out after 60 s)

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 19 of 20 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 20 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "fixup! " found in commit messages' one-liners (waiting on @mkow and @oshogbo)


-- commits line 3 at r2:
The commit message actually needs to explain a bit what it does. I suggest:

Gramine exposes hostname information to the app via `nodename` field returned
by the `uname` syscall. Note that currently Gramine does *not* create an
`/etc/hostname` file as no apps seem to read this file directly.

To expose the hostname, Gramine obtains it from the host, sanitizes it, and
stores it in the global PAL state.

The difference between ...

-- commits line 7 at r2:
I would put this last sentence in brackets: (In the case of Linux, ... malicious hostname.)


Documentation/manifest-syntax.rst line 158 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

If you said in the previous sentence that you're listing configurations here, then the list should be made of things, not sentences. So, please either change the form of the sentence above or the entries.

Yes, I would do like this:

This option will generate the following extra configuration:

- Hostname (obtained by apps via `nodename` field in `uname` syscall),
  set to the host's hostname at initialization.

- Pseudo-file ``/etc/resolv.conf``, with keywords:

      ...

libos/src/libos_init.c line 490 at r1 (raw file):

Previously, oshogbo (Mariusz Zaborski) wrote…

We checkpoint dns_host, so we also checkpoint hostname as this is static buffer.
https://github.com/gramineproject/gramine/blob/master/libos/src/fs/etc/fs.c#L142

struct pal_dns_host_conf {
    struct pal_dns_host_conf_addr nsaddr_list[PAL_MAX_NAMESPACES];
    size_t nsaddr_list_count;

    char dn_search[PAL_MAX_DN_SEARCH][PAL_HOSTNAME_MAX];
    size_t dn_search_count;

    bool inet6;
    bool rotate;

    char hostname[PAL_HOSTNAME_MAX];
};

Got it, thanks.


libos/src/sys/libos_uname.c line 24 at r2 (raw file):

static struct new_utsname g_current_uname = {
    .sysname  = "Linux",
    .nodename = "localhost",

I don't like that we have some default here which is actually never used (it is immediately rewritten by the PAL-provided value, which is by default also localhost). What if we like this, to make it very explicit:

static struct new_utsname g_current_uname = {
    .sysname  = "Linux",
    .nodename = "(none)",  /* overwritten by PAL-provided value at startup */
    ...

libos/test/regression/hostname.c line 16 at r2 (raw file):

    pid_t pid = fork();
    if (pid == -1) {
        err(1, "Unable to fork %s", tag);

tag should be the first thing printed. Also, the colon would visually help in the message. So like this:

        err(1, "%s: unable to fork", tag);

Here and in all other places, please.


libos/test/regression/hostname.c line 25 at r2 (raw file):

    if (wait(&status) == -1) {
        err(1, "Wait failed %s", tag);

ditto


libos/test/regression/hostname.c line 29 at r2 (raw file):

    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        err(1, "Test failed %s", tag);

ditto + it should be errx + I would write a better message (exit status of child is not zero)


libos/test/regression/hostname.c line 37 at r2 (raw file):

    if (gethostname(buf, sizeof(buf)) != 0) {
        err(1, "%s gethostname: failed", tag);

For uniformity, I would change the message to:

        err(1, "%s: gethostname failed", tag);

libos/test/regression/hostname.c line 42 at r2 (raw file):

    if (strcmp(buf, expected_name) != 0) {
        errx(1, "%s gethostname result doesn't match hostname (expected: %s, got: %s)",
             tag, expected_name, buf);

ditto (colon after %s)


libos/test/regression/test_libos.py line 916 at r2 (raw file):

        self.assertIn('TEST OK', stdout)

    def test_120_gethostname_localhost(self):

Maybe rename from _localhost to _default? The fact that by default Gramine uses localhost doesn't need to be reflected in the name -- maybe we'll change the default value in the future.


pal/include/host/linux-common/etc_host_info.h line 12 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

This may get confused with another global function: set_hostname() - it sounds like it was its counterpart, while it does something completely different. I know one is in PAL and another in LibOS, but still, I'd make the names at least a bit more specific.

Maybe get_hostname_from_host? It's a weird name, but it is clearly different from LibOS's set_hostname.

(I agree with Michal that current get_hostname and set_hostname look confusingly similar.)


pal/src/host/linux-common/etc_host_info.c line 9 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

grammar form doesn't match the previous point on the list here

I guess Michal means to change get to gets


pal/src/host/linux-common/etc_host_info.c line 348 at r2 (raw file):

int get_hostname(char* hostname, size_t size) {
    struct new_utsname c_uname;
    int ret;

Just declare it in the following line:

int ret = DO_SYSCALL(uname, &c_uname);

Rule of thumb: if some variable is used only once in the scope, no need to declare it separately.

Copy link
Contributor Author

@oshogbo oshogbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 12 of 20 files reviewed, 19 unresolved discussions, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ), "squash! " found in commit messages' one-liners (waiting on @dimakuv and @mkow)


-- commits line 4 at r1:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

expects

Done.


-- commits line 3 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

The commit message actually needs to explain a bit what it does. I suggest:

Gramine exposes hostname information to the app via `nodename` field returned
by the `uname` syscall. Note that currently Gramine does *not* create an
`/etc/hostname` file as no apps seem to read this file directly.

To expose the hostname, Gramine obtains it from the host, sanitizes it, and
stores it in the global PAL state.

The difference between ...

Done.


-- commits line 7 at r2:

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I would put this last sentence in brackets: (In the case of Linux, ... malicious hostname.)

Done.


Documentation/manifest-syntax.rst line 156 at r2 (raw file):

    (Default: false)

This option will generate following extra configuration:

Done.


Documentation/manifest-syntax.rst line 158 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Yes, I would do like this:

This option will generate the following extra configuration:

- Hostname (obtained by apps via `nodename` field in `uname` syscall),
  set to the host's hostname at initialization.

- Pseudo-file ``/etc/resolv.conf``, with keywords:

      ...

Done.


Documentation/manifest-syntax.rst line 159 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

missing colon at the end (or maybe two, not sure about the precise reST rules, but the rendered version should have exactly one)

Done.
One is fine.


Documentation/manifest-syntax.rst line 173 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

(I guess?)

Done.


libos/src/sys/libos_uname.c line 24 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I don't like that we have some default here which is actually never used (it is immediately rewritten by the PAL-provided value, which is by default also localhost). What if we like this, to make it very explicit:

static struct new_utsname g_current_uname = {
    .sysname  = "Linux",
    .nodename = "(none)",  /* overwritten by PAL-provided value at startup */
    ...

Done.


libos/test/regression/hostname.c line 16 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

tag should be the first thing printed. Also, the colon would visually help in the message. So like this:

        err(1, "%s: unable to fork", tag);

Here and in all other places, please.

Done.


libos/test/regression/hostname.c line 25 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto

Done.


libos/test/regression/hostname.c line 29 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto + it should be errx + I would write a better message (exit status of child is not zero)

Done.


libos/test/regression/hostname.c line 37 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

For uniformity, I would change the message to:

        err(1, "%s: gethostname failed", tag);

Done.


libos/test/regression/hostname.c line 42 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

ditto (colon after %s)

Done.


libos/test/regression/hostname.c line 53 at r2 (raw file):

    test_gethostname("normal", argv[1]);
    test_fork("fork gethostname", argv[1], test_gethostname);

Done.


libos/test/regression/test_libos.py line 916 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Maybe rename from _localhost to _default? The fact that by default Gramine uses localhost doesn't need to be reflected in the name -- maybe we'll change the default value in the future.

Done.


pal/include/host/linux-common/etc_host_info.h line 12 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Maybe get_hostname_from_host? It's a weird name, but it is clearly different from LibOS's set_hostname.

(I agree with Michal that current get_hostname and set_hostname look confusingly similar.)

Maybe get_hosts_hostname ?


pal/src/host/linux/pal_main.c line 134 at r2 (raw file):

Previously, mkow (Michał Kowalczyk) wrote…

+4 more spaces

Done.


pal/src/host/linux-common/etc_host_info.c line 9 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

I guess Michal means to change get to gets

Done.


pal/src/host/linux-common/etc_host_info.c line 348 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Just declare it in the following line:

int ret = DO_SYSCALL(uname, &c_uname);

Rule of thumb: if some variable is used only once in the scope, no need to declare it separately.

Done.

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r3, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "squash! " found in commit messages' one-liners (waiting on @mkow)


pal/include/host/linux-common/etc_host_info.h line 12 at r2 (raw file):

Previously, oshogbo (Mariusz Zaborski) wrote…

Maybe get_hosts_hostname ?

The proposed version is fine with me too.

dimakuv
dimakuv previously approved these changes Sep 20, 2022
Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "squash! " found in commit messages' one-liners (waiting on @mkow)

a discussion (no related file):
Verified this PR (and the previous one about /etc/resolv.conf) on three configurations:

  1. My work corporate machine, bare metal.
  2. My work corporate machine, in a Docker container.
  3. Azure VM.

In all cases, I checked out this PR's branch, built Gramine, and ran gramine-direct busybox sh and gramine-sgx busybox sh. Once in the Busybox session, I used ls /etc, cat /etc/resolv.conf and uname -n to verify that the files/configs are propagated inside Gramine. I also played with different combinations of manifest options (enable/disable sys.enable_extra_runtime_domain_names_conf, enable/disable /etc/ in fs.mounts).

Everything works as expected.


Copy link
Contributor Author

@oshogbo oshogbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: Intel), "squash! " found in commit messages' one-liners (waiting on @mkow)

a discussion (no related file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

Verified this PR (and the previous one about /etc/resolv.conf) on three configurations:

  1. My work corporate machine, bare metal.
  2. My work corporate machine, in a Docker container.
  3. Azure VM.

In all cases, I checked out this PR's branch, built Gramine, and ran gramine-direct busybox sh and gramine-sgx busybox sh. Once in the Busybox session, I used ls /etc, cat /etc/resolv.conf and uname -n to verify that the files/configs are propagated inside Gramine. I also played with different combinations of manifest options (enable/disable sys.enable_extra_runtime_domain_names_conf, enable/disable /etc/ in fs.mounts).

Everything works as expected.

That's amazing thank you :)



pal/include/host/linux-common/etc_host_info.h line 12 at r2 (raw file):

Previously, dimakuv (Dmitrii Kuvaiskii) wrote…

The proposed version is fine with me too.

Done.

mkow
mkow previously approved these changes Sep 20, 2022
Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 8 of 8 files at r3, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (1 more required), "squash! " found in commit messages' one-liners (waiting on @oshogbo)


-- commits line 23 at r3:
It's not clear whether this sentence describes the state before or after this commit. You can fix this while doing the final rebase, but plz give me the proposed version before for review ;)

Code quote:

Gramine exposes hostname information to the app via `nodename` field returned

@oshogbo oshogbo dismissed stale reviews from mkow and dimakuv via db07b85 September 20, 2022 19:28
@oshogbo oshogbo force-pushed the oshogbo/etc_hostname_push branch 2 times, most recently from db07b85 to f6af40c Compare September 20, 2022 19:36
Copy link
Contributor Author

@oshogbo oshogbo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion, not enough approvals from maintainers (3 more required), not enough approvals from different teams (2 more required, approved so far: ) (waiting on @mkow)


-- commits line 23 at r3:

Previously, mkow (Michał Kowalczyk) wrote…

It's not clear whether this sentence describes the state before or after this commit. You can fix this while doing the final rebase, but plz give me the proposed version before for review ;)

Done.

This commit adds the host's hostname to the information exposed via extra
runtime domain names configuration. App can access this information via
`nodename` field returned by the `uname` syscall. Note that currently,
Gramine does *not* create an `/etc/hostname` file as no apps seem to read
this file directly.

To expose the hostname, Gramine obtains it from the host, sanitizes it, and
stores it in the global PAL state.

The difference between Linux and Gramine is that Gramine expects
the hostname to be a valid domain - as we assume the host is untrusted.
(In the case of Linux, it doesn't presume that the root user tries to exploit
it through a malicious hostname.)

Signed-off-by: Mariusz Zaborski <oshogbo@invisiblethingslab.com>
Copy link
Member

@mkow mkow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (2 more required), not enough approvals from different teams (1 more required, approved so far: ITL)

Copy link
Contributor

@dimakuv dimakuv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, all discussions resolved, not enough approvals from maintainers (1 more required)

Copy link
Contributor

@boryspoplawski boryspoplawski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not review the PR, but trust other reviewers did it well (because PR is new so needs 3 approvals) ;)

Reviewable status: :shipit: complete! all files reviewed, all discussions resolved

@mkow mkow merged commit 5fd2145 into gramineproject:master Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants