Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Sanitization of /etc/ files #689

Closed
dimakuv opened this issue Jun 28, 2022 · 33 comments
Closed

RFC: Sanitization of /etc/ files #689

dimakuv opened this issue Jun 28, 2022 · 33 comments
Assignees

Comments

@dimakuv
Copy link
Contributor

dimakuv commented Jun 28, 2022

Description of the problem

Currently we put files like /etc/resolv.conf in the sgx.allowed_files list for simplicity. Example:

sgx.allowed_files = [
# Name Service Switch (NSS) files. Glibc reads these files as part of name-
# service information gathering. For more info, see 'man nsswitch.conf'.
"file:/etc/nsswitch.conf",
"file:/etc/ethers",
"file:/etc/hosts",
"file:/etc/group",
"file:/etc/passwd",
# getaddrinfo(3) configuration file. Glibc reads this file to correctly find
# network addresses. For more info, see 'man gai.conf'.
"file:/etc/gai.conf",
]

Having these files under sgx.allowed_files is not secure. They are read by e.g. Glibc which doesn't expect them to be ill-formatted or maliciously modified.

The current consensus (see #687) is: Gramine should read the set of network-related files under /etc/ (when specified in the manifest file), sanitize/verify them and expose to the user app.

  • Only a small set of network-related files under /etc/ should be sanitized like this.
  • One example of a file not to be sanitized is /etc/passwd. This file should be in the sgx.trusted_files list.

Things to be done as part of this effort

  1. Identify the set of files under /etc/ that needs sanitization.
  2. Analyze the format of each of the files and design per-file sanitization logic.
  3. User-friendliness: where in the manifest these files (like /etc/resolv.conf) should go. Do they just "appear" to the in-Gramine app? Or they need to be put in one of the lists?
@dimakuv dimakuv changed the title Sanitization of /etc/ files RFC: Sanitization of /etc/ files Jun 28, 2022
@mkow
Copy link
Member

mkow commented Jun 30, 2022

overriding the host versions

This is a wrong/misleading term, as Gramine is not overriding anything. It's a separate operating system, and the plan is to emulate these files based on whatever we gathered on the host (which may not even be Linux in theory).

User-friendliness: where in the manifest these files (like /etc/resolv.conf) should go. Do they just "appear" to the in-Gramine app? Or they need to be put in one of the lists?

It would be easier if these were from a special kernel fs, like sysfs, but since this isn't the case we can either:

  • Just silently emulate them without the user asking.
  • Create a special mount type for each of them? Or maybe for the whole emulated /etc? "etcfs" :)

@dimakuv
Copy link
Contributor Author

dimakuv commented Jun 30, 2022

This is a wrong/misleading term, as Gramine is not overriding anything. It's a separate operating system, and the plan is to emulate these files based on whatever we gathered on the host (which may not even be Linux in theory).

Ok, fixed the wording.

Create a special mount type for each of them? Or maybe for the whole emulated /etc? "etcfs" :)

I don't like this at all. This sets a bad precedence of creating random FSes for a couple files (and diverging from how Linux does it).

Just silently emulate them without the user asking.

Yeah, probably just this.

@boryspoplawski
Copy link
Contributor

Maybe passthrough_sanitized_resolv_conf manifest option? And write a log_always message saying we have such option if we detect user having /etc/resolv.conf in allowed_files?

@dimakuv
Copy link
Contributor Author

dimakuv commented Jun 30, 2022

Maybe passthrough_sanitized_resolv_conf manifest option?

Could you expand what exactly this option is supposed to do? Is it a boolean? Does it only pertain to the resolv.conf file, what about others?

@boryspoplawski
Copy link
Contributor

Is it a boolean?

Yes

Does it only pertain to the resolv.conf file, what about others?

What others? Is anything needed beside this file? I think everything works with just this one

@aneessahib
Copy link
Contributor

aneessahib commented Jun 30, 2022

/etc/hosts and /etc/host.conf too in certain cases. For ex if a container is run with --net=host to use the host's network.

@boryspoplawski
Copy link
Contributor

What these files have to do with dockers --net=host option? It only causes the network namespace not to be unshared, this explanation doesn't make sense to me...

@aneessahib
Copy link
Contributor

Yes, and this updates the IP mapping info in the /etc/hosts file (resulting in a measurement failure in Gramine)

@woju
Copy link
Member

woju commented Jun 30, 2022

  1. /etc/nsswitch.conf (https://manpages.debian.org/nsswitch.conf(5)) should be sanitized, and we should think about all references inside:

    • aliases (no?)
    • ethers (yes?)
    • group (no?)
    • hosts (yes?)
    • initgroups (no?)
    • netgroup (no?)
    • networks (no?)
    • passwd (no?)
    • protocols (no?)
    • publickey (no?)
    • rpc (no?)
    • services (no?)
    • shadow (no?)
  1. [feature request] It would be convenient if we allowed to paste files directly into the manifest, so they become trusted files. Maybe something like:
[[sgx.trusted_files]]
uri = "file:/etc/resolv.conf"
content = """\
nameserver 1.1.1.1
nameserver 8.8.8.8
search domain.example
"""

@boryspoplawski
Copy link
Contributor

Yes, and this updates the IP mapping info in the /etc/hosts file (resulting in a measurement failure in Gramine)

Apparently docker mount binds them from host then. Anyway, my point was not to use them at all, so there would be no measurement errors.

@mkow
Copy link
Member

mkow commented Jun 30, 2022

Hmm, I think I like the (a bit unrelated) idea of supporting inlining of small files into manifests, but we should probably discuss it more, there may be reasons against it.

@aneessahib
Copy link
Contributor

Can we close this in the core meeting pls? Time is tissue.

@dimakuv
Copy link
Contributor Author

dimakuv commented Jul 5, 2022

Discussion notes in the core meeting:

Woju's proposal of hard-coded sgx.trusted_files.content is rejected (may be useful for other things, but not as a solution to the current problem):

  • Because this would be basically the same as just using sgx.trusted_files.
  • Also, the file (e.g., /etc/resolv.conf) can legitimitaly change from one machine to the other.
    • Think of an SGX enclave that should be able to run on multiple clouds.
    • The SGX enclave must have the same MRENCLAVE measurement, thus it must have the exact same sgx.trusted_files.content.
    • But /etc/resolv.conf will contain different DNS names in different clouds.

Borys's proposal to require to run a DNS server alongside Gramine is also rejected:

  • In this proposal, the user can hard-code /etc/resolv.conf and other files as usual in sgx.trusted_files. This is possible because the DNS configuration will stay always the same (point to the local DNS server).
  • This proposal shifts the burden on the end user: the deployment of Gramine must also have a local DNS server correctly installed on the same machine and configured.
  • Reject: asking users to setup a DNS server is not user-friendly, since this task is non-trivial.
  • Woju doesn't like the idea (but I didn't keep the note and I forgot why exactly :)).

Borys mentions that there must be a way for some apps to:

  • completely ignore /etc/resolv.conf and similar files (e.g. the app doesn't perform any networking),
  • have hard-coded /etc/resolv.conf and similar files (e.g. the app allows to use only the Google DNS server).
  • Both these ways are currently supported by Gramine (the file is not mounted at all or the file is marked as sgx.trusted_files). We should keep these options available for users.
  • Note that these ways may be preferred by some users because they are more secure.

Agreed solution: introduce a new manifest option passthrough_sanitized_etc_files = true:

  • We mimic what Docker does with the "extra runtime files".
  • The set of extra runtime files is limited (at least currently) to these four files, similar to Docker:
    • /etc/hosts
    • /etc/resolv.conf
    • /etc/hostname
    • /etc/localtime (this one may be omitted for now)
  • We introduce a disabled-by-default passthrough_sanitized_etc_files = true|false manifest option.
  • If both this manifest option is set and sgx.trusted_files contains these files, then the passthrough_sanitized_etc_files manifest option wins (it takes precedence over anything else).
  • We will not support "partial" overwrite of /etc/ runtime files -- if the passthrough option is set, it overwrites all 4 files.
  • Gramine should probably write a warning if this new manifest option is enabled and some files are found in sgx.trusted_files.
  • All example manifests in all our Gramine repos should have this option enabled, for convenience of users.
  • Maintainers agreed that this behavior is the least surprising and more intuitive to end users.
  • When passthrough_sanitized_etc_files = true, sanitization happens similarly to the sysfs one:
    • The PAL API has new data structs that can store the contents of these files in a normalized abstract form.
    • The untrusted PAL reads the host files, parses them and puts their contents into the new data structs.
    • The LibOS reads the data structs and reconstructs the strings in /etc/ pseudo-files.
    • Gramine must not simply propagate strings from the host to the LibOS -- think of e.g. a Windows PAL that won't even have such host files in the first place.

@mkow
Copy link
Member

mkow commented Jul 5, 2022

Woju's proposal of hard-coded sgx.trusted_files.content is rejected:

Wait, it wasn't rejected overall, it's just completely unrelated to this specific issue, and it was rejected as a solution for it. But we may want to implement it someday for other purposes.

Gramine should probably write a warning if this new manifest option is enabled and some files are found in sgx.trusted_files.

Not sure about this, the option is already super explicit and it seems that this warning will spam too many users (which want to mount the whole OS image from Docker, but override these three files).

@oshogbo
Copy link
Contributor

oshogbo commented Jul 8, 2022

I can look into this.

@oshogbo
Copy link
Contributor

oshogbo commented Aug 1, 2022

The proposal:

The important note is that we will follow a guideline from RFC2181 that the entire domain name is limited to 255 octets (including separators).
Source: https://www.rfc-editor.org/rfc/rfc2181
Quote:

The DNS itself places only one restriction on the particular labels that can be used to identify resource records. That one restriction relates to the length of the label and the full name. The length of any one label is limited to between 1 and 63 octets. A full domain name is limited to 255 octets (including the separators).

PalGetHostname -> /etc/hostname

The API should be similar to gethostname():

int PalGetHostname(char *name, size_t namelen)

The difference between gethostname and the PalGetHostname will be that we guarantee a null-termination of the name on success.

struct pal_dns_configuration -> /etc/resolv.conf

We want to propose a global variable in PAL that keeps a DNS server configuration. Most applications don't support reconfiguration. Apps read /etc/resolv.conf and cache it content. There is no reason not to behave likewise.

The structure is Gramine would be similar to the libresolve(3):

#define MAXNS                     3         /* max # name servers we'll track */
#define MAXDNSRCH                 6         /* max # domains in search path */
#define MAXRESOLVSORT             10       /* number of the net to sort on */
#define PAL_MAX_HOSTNAME          256

struct pal_dns_server_conf {
        struct sockaddr_in nsaddr_list[MAXNS];   /* nameserver field */
        size_t             nsaddr_list_count;

        char               dnsrch[MAXDNSRCH+1][PAL_MAX_HOSTNAME];  /* components of domain to search; search field */
        size_t             dnsrch_count;

        struct {
                struct in_addr addr;
                uint32_t          mask;
        } sort_list[MAXRESOLVSORT];  /* sort list */
        size_t sort_list_count;

        /* Options: */
        uint8_t ndots;
        uint8_t timeout;
        uint8_t attempts;
        bool rotate;
        bool inet6;
        bool edns0;
        bool single-request;
        bool single-request-reopen;
        bool use-vc;
};

We want to ignore a couple of options supported by resolve.conf - if somebody requires these options, we will be able to add them later.
Ignored options:
debug - don't seem applicable
no-check-names - application specific. If you want it, use your own resolv.conf.
ip6-bytestring - application specific. If you want it, use your own resolv.conf.
ip6-dotint/no-ip6-dotint - application specific. If you want it, use your resolv.conf.
no-tld-query - application specific. If you want it, use your own resolv.conf.
no-reload - not supported currently
trust-ad - we don't trust the server's DNS server. This option probably should go with our DNS server.

Useful links:
https://man7.org/linux/man-pages/man5/resolv.conf.5.html
https://man7.org/linux/man-pages/man3/resolver.3.html

/etc/hosts

We don't think this is something that should be propagated. Most pairs of hostname and IP don't consider the application run in the Gramine environment. If the application requires overriding the domain names, it should be placed in the application itself - not in the server configuration.

Todo:

  • Implement PalGetHostname
  • Make sure to integrate PalGetHostname with a sethostname
  • Implement struct pal_dns_configuration
  • Implement a mini fuzzer of pal_dns_configuration

@boryspoplawski
Copy link
Contributor

PalGetHostname -> /etc/hostname

Why do we need a new function for this? Why cannot it be static information set at the startup?

Make sure to integrate PalGetHostname with a sethostname

Please no. Configuring networking from withing Gramine is not supported and I would prefer to leave it this way.

struct pal_dns_configuration

I would make all the arrays allocated separately and keep pointers to them (instead of inlining them in the struct). Is there any disadvantage of this?

Do you think it's worth implementing so many options initially? I would say most common ones (nameserver and search) would be enough for start? (disclaimer: I'm no sysadmin)

@mkow
Copy link
Member

mkow commented Aug 1, 2022

Also, on the security of this:

  • Just saying out loud what's obvious: we should implement only the part of the configs which are really used by common apps and required (to minimize the attack surface).
  • Network is already assumed to be untrusted (even by normal, non-TEE apps), so it shouldn't be a problem that the app gets DNS config from untrusted host - the host/something on the network can spoof DNSes anyways.
  • We need to ensure that the data passed in these structs is in valid format. They already don't provide a lot of space for misuse, but one thing we need to add there is checking if the domain name is in proper format (e.g. no two periods next to each other, etc.).
  • We may want to check/fuzz some popular resolv.conf parsing libraries (libresolv? anything else?) and see how they behave on weird inputs and how brittle they are.

@oshogbo
Copy link
Contributor

oshogbo commented Aug 1, 2022

PalGetHostname -> /etc/hostname

Why do we need a new function for this? Why cannot it be static information set at the startup?

The reason was that the name can be changed from Gramine. See below.

Make sure to integrate PalGetHostname with a sethostname

Please no. Configuring networking from withing Gramine is not supported and I would prefer to leave it this way.

We actually have a sethostname already in Gramine. This is why I put it here.
Or I miss read something:

long libos_syscall_sethostname(char* name, int len) {

struct pal_dns_configuration

I would make all the arrays allocated separately and keep pointers to them (instead of inlining them in the struct). Is there any disadvantage of this?

Do you think it's worth implementing so many options initially? I would say most common ones (nameserver and search) would be enough for start? (disclaimer: I'm no sysadmin)

I have chosen the ones that make the most sens for me - but I'm open for discussion.
For example it is important to know to use TCP instead of UDP as the UDP traffic might be filtered.
Maybe we can resign from sort_list as more I think about it its more system specific thing.

@boryspoplawski
Copy link
Contributor

We actually have a sethostname already in Gramine. This is why I put it here.
Or I miss read something:

But it's a dummy implementation, which does not propagate the change neither to other in-Gramine processes, nor to the host OS.

For example it is important to know to use TCP instead of UDP as the UDP traffic might be filtered.

Going this way everything is important - if something uses it, then it might be requires for correct operation. We just want to make sure that it's a common option that will be actually used.

@dimakuv
Copy link
Contributor Author

dimakuv commented Aug 3, 2022

Why do we need a new function for this? Why cannot it be static information set at the startup?

+1 to @boryspoplawski. This is not needed, let's just have a static info. The fact that we implement libos_syscall_sethostname() is not an argument here -- we probably actually want to remove this syscall alltogether (I don't remember who and why added it, it doesn't make sense anyway).

We don't think this is something that should be propagated. Most pairs of hostname and IP don't consider the application run in the Gramine environment. If the application requires overriding the domain names, it should be placed in the application itself - not in the server configuration.

I cannot parse this paragraph... Why shouldn't we propagate /etc/hosts? How else will the application map the hostname (e.g., provided as a command-line argument like ./myapp myhost) to the IP address?

@oshogbo
Copy link
Contributor

oshogbo commented Aug 3, 2022

I cannot parse this paragraph... Why shouldn't we propagate /etc/hosts? How else will the application map the hostname (e.g., provided as a command-line argument like ./myapp myhost) to the IP address?

My understanding was that with Gramine we target running a trusted process on untrusted server. So if you want to have your own myhost definition should this be a part of the application itself, instead of trusting server configuration. For example the server might override some DNS entry which should be override and we will blandly propagate them. That said I'm not sure if this is even a valid argument as I'm not sure if we can really trust DNS either. However even with untrusted configuration of DNS you might want to enforce DNSSEC which would ensure correctness.

@boryspoplawski
Copy link
Contributor

My understanding was that with Gramine we target running a trusted process on untrusted server. So if you want to have your own myhost definition should this be a part of the application itself, instead of trusting server configuration. For example the server might override some DNS entry which should be override and we will blandly propagate them. That said I'm not sure if this is even a valid argument as I'm not sure if we can really trust DNS either. However even with untrusted configuration of DNS you might want to enforce DNSSEC which would ensure correctness.

All of this doesn't really matter, host OS can e.g. change dest and src IP addresses as it wishes.

@dimakuv
Copy link
Contributor Author

dimakuv commented Aug 3, 2022

So if you want to have your own myhost definition should this be a part of the application itself, instead of trusting server configuration.

What does it mean exactly, "be a part of the application itself"? Why can't the application rely on /etc/hosts to get a mapping from the hostname to the IP address?

All of this doesn't really matter, host OS can e.g. change dest and src IP addresses as it wishes.

Yes, of course, but this can be said about any file that we want to sanitize here... The point is not to make /etc/hosts file trustworthy, but to verify that this file adheres to a certain structure that a normal non-security-hardened Glibc parser can parse correctly (to prevent exploits).

@boryspoplawski
Copy link
Contributor

boryspoplawski commented Aug 3, 2022

Yes, of course, but this can be said about any file that we want to sanitize here... The point is not to make /etc/hosts file trustworthy, but to verify that this file adheres to a certain structure that a normal non-security-hardened Glibc parser can parse correctly (to prevent exploits).

Yes, I know. I was responding to @oshogbo that we trust (or actually need other countermeasures) the server in this regard anyway.

@dimakuv
Copy link
Contributor Author

dimakuv commented Oct 31, 2022

@oshogbo completed all the required tasks, closing this meta-issue.

@dimakuv dimakuv closed this as completed Oct 31, 2022
@haraldh
Copy link

haraldh commented Jun 23, 2023

How do you solve the docker case with docker run --add-host=host.containers.internal:host-gateway where docker updates /etc/hosts? I am using:

allowed_files = [
    "file:/etc/hosts",
]

@dimakuv
Copy link
Contributor Author

dimakuv commented Jun 23, 2023

@haraldh The currently recommended way is like we have in the Python example:

  1. You create a file with hard-coded contents: https://github.com/gramineproject/gramine/tree/master/CI-Examples/python/helper-files
  2. You replace /etc/hosts with this file inside Gramine, using this manifest trick:
    { path = "/etc/hosts", uri = "file:helper-files/hosts" },
  3. You ship the helper-files/hosts file together with your Gramine app bundle.

I'm not familiar with docker run --add-host=host.containers.internal:host-gateway. How exactly does it update /etc/hosts? Does it add any important info, or the added info is actually irrelevant?

@haraldh
Copy link

haraldh commented Jun 24, 2023

--add-host=host.containers.internal:host-gateway adds the internal bridge IP address of the docker host to /etc/hosts, which you want, if you want to connect to e.g. pccs running on the hosts for all containers.

@haraldh
Copy link

haraldh commented Jan 23, 2024

--add-host=host.containers.internal:host-gateway adds the internal bridge IP address of the docker host to /etc/hosts, which you want, if you want to connect to e.g. pccs running on the hosts for all containers.

I still think this is a legitimate use case. What is your suggestion for /etc/sgx_default_qcnl.conf and a local PCCS?

@dimakuv
Copy link
Contributor Author

dimakuv commented Jan 23, 2024

--add-host=host.containers.internal:host-gateway adds the internal bridge IP address of the docker host to /etc/hosts, which you want, if you want to connect to e.g. pccs running on the hosts for all containers.

Hm, I can't come up with an elegant way right now. sgx.allowed_files doesn't seem a secure option, because the malicious host can then modify /etc/hosts in ways that will break the parser of the application, and Gramine will not notice such attack.

A workaround would be to have a immutable /etc/hosts as I explained above, and in that immutable file, you specify some hard-coded IP address. Then on Docker container startup, you spawn an additional helper tool (like socat) to redirect Gramine's connections to that hard-coded IP address to the internal bridge IP address that is specified in the real /etc/hosts. A bit ugly, and you lose a bit of performance, but this should be secure.

@haraldh
Copy link

haraldh commented Jan 23, 2024

I can imagine a synthesized sanitized /etc/hosts provided by gramine.

@haraldh
Copy link

haraldh commented Jan 23, 2024

Another option could be an env variable and a "pre-exec" which creates and provides /etc/sgx_default_qcnl.conf somehow

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants