Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Questions #12

Open
dralley opened this issue Feb 14, 2022 · 13 comments
Open

Questions #12

dralley opened this issue Feb 14, 2022 · 13 comments

Comments

@dralley
Copy link

dralley commented Feb 14, 2022

  1. Are there any plans to publish this library on crates.io?
  2. Are there any distinguishing features or goals of this library vs. https://github.com/Richterrettich/rpm-rs ?

For context, I'm (slowly) writing https://github.com/dralley/rpmrepo_rs as a side project and I need a library for parsing metadata from RPM package headers and verifying signatures.

@DemiMarie
Copy link
Collaborator

  1. Are there any plans to publish this library on crates.io?

I would very much be willing to do so, though I still recommend using the signed release tags instead.

  1. Are there any distinguishing features or goals of this library vs. https://github.com/Richterrettich/rpm-rs ?

Yes, there are. Disclaimer: I am the author of this crate and not at all familiar with https://github.com/Richterrettich/rpm-rs.

  • rpm-oxide was written in response to a series of security vulnerabilities I found in RPM in the first quarter of 2021. It is used in production by Qubes OS on untrusted input, and has the same high standards for security that the Qubes OS project itself does. Furthermore, it was written by a security researcher (me) and reviewed by another one (@marmarek). It also follows Qubes OS’s security process.

  • rpm-oxide has no crates.io dependencies whatsoever. The command-line rpmcanon tool currently relies on rustc’s private version of libc in order to get some constants, but this is technical debt that should be fixed.

  • rpm-oxide supports, and is used in production with, Rust versions as old as that shipped in Fedora 25.

  • rpm-oxide supports verifying signatures against the RPM keyring, which requires that it link against the system RPM libraries. The RPM libraries are also used for cryptography and to obtain the types of known RPM tags. To mitigate known undefined behavior in old RPM’s OpenPGP implementation, rpm-oxide validates OpenPGP signatures before passing them to RPM.

    A pure-Rust implementation of cryptography is indeed a highly desirable feature, as librpm has a buggy and thread-unsafe atexit() handler that can cause use-after-free during process exit. Suggestions for mitigating this are welcome. My current plan is to disable this handler by using a bespoke atexit() handler that just calls _exit(), and to make use of the system libraries gated by a Cargo feature.

  • rpm-oxide includes a utility library for parsing binary data, which is inspired by (but takes no code from) the untrusted crate. I plan to split this off into its own crate.

  • rpm-oxide has been tested on the entire Fedora 25 and Fedora 32 package repositories. It was able to handle all but one of the Fedora 25 packages, and all of the Fedora 32 packages.

  • rpm-oxide is careful to only buffer the package header in memory. The payload is processed in a streaming fashion. The included rpmcanon tool writes the output to an unnamed temporary file (opened with O_TMPFILE) and then atomically inserts it at its final location.

@dralley
Copy link
Author

dralley commented Feb 14, 2022

Good to know, thank you for the response.

Should I also expect it to work with packages from RHEL / CentOS 7, or are there known limitations with certain older packages?

Could you point me towards any examples of code which pulls various tags from the Header? Roughly similar to this code: [0]. The closest I could find was this code [1] but it doesn't include all of the other fields, which I need for my use case.

[0] https://github.com/Richterrettich/rpm-rs/blob/master/src/rpm/headers/header.rs#L392-L436
[1] https://github.com/QubesOS/qubes-rpm-oxide/blob/main/rpm-parser/src/header/mod.rs#L57-L69

@DemiMarie
Copy link
Collaborator

Good to know, thank you for the response.

You’re welcome.

Should I also expect it to work with packages from RHEL / CentOS 7, or are there known limitations with certain older packages?

There are a few known types of unsupported packages. Generally, rpm-oxide should accept any package that a somewhat recent (4.14+) RPM generates, and the vast majority of packages generated by older RPM versions. The cases below are the differences I am aware of:

  • Some packages (mostly from Intel, if I recall correctly) have a signature header that is not a single contiguous region. The current header-parsing code is not prepared for this and will fail this check:

    if region.count() != 16
    || region.ty() != TagType::Bin as _
    || region.offset() as usize != region_offset
    {
    bad_data!("bad region trailer location {:?}", region)
    }

  • Some old packages have header tag data entries that are not sorted. This should trip the following check:

    let tag = entry.tag();
    fail_if!(tag <= last_tag, "entries not sorted");
    but it does not, because last_tag never gets updated. This is actually a problem in practice because it means that duplicate tag data entries are not rejected, which can cause a panic later:
    assert!(payload_digest.is_none(), "duplicate tags rejected earlier");

  • v3 package parsing simply has not been implemented, so v3 packages will likely be rejected in this code:

    let region_offset = data_length as usize - 16;
    if region.count() != 16
    || region.ty() != TagType::Bin as _
    || region.offset() as usize != region_offset
    {
    bad_data!("bad region trailer location {:?}", region)
    }
    let mut last_tag = region.tag();
    if last_tag != region_tag {
    bad_data!("bad region kind {}, expected {}", last_tag, region_tag)
    };

    v3 packages are deprecated upstream, though, and will not be supported by a future version of RPM. They have also not been generated in 20 years or so, so I would not worry about them unless you need to parse some truly ancient packages.

  • Some packages have invalid UTF-8 in the immutable header, which trips this check:

    match std::str::from_utf8(r) {
    Ok(_) => r,
    Err(e) => bad_data!("String entry is not valid UTF-8: {}", e),
    }

    This was an RPM bug fixed in 4.14 if I recall correctly, and only impacted one Fedora 25 package.

  • rpm-oxide ignores the alternate payload digest, so it cannot verify packages that have already been decompressed. That this is the default behavior is intentional, since any modification should invalidate a signature, but I could provide an option to allow this instead.

  • rpm-oxide cannot verify old packages that only have a header+payload signature and no header signature. Having this be the default is again intentional, for attack surface reduction reasons, but I should be able to add an option to support verifying such packages.

If any of these differences are a problem in practice, would you mind opening an issue? Please file a separate issue for each problem.

Could you point me towards any examples of code which pulls various tags from the Header? Roughly similar to this code: [0]. The closest I could find was this code [1] but it doesn't include all of the other fields, which I need for my use case.

[0] https://github.com/Richterrettich/rpm-rs/blob/master/src/rpm/headers/header.rs#L392-L436 [1] https://github.com/QubesOS/qubes-rpm-oxide/blob/main/rpm-parser/src/header/mod.rs#L57-L69

That is indeed a limitation. You can fish the needed data out of the header yourself, but that is clumsy and error-prone. Can you file an issue for the missing tags, ideally with a list of tags you need?

@dralley
Copy link
Author

dralley commented Feb 14, 2022

Thanks again for the very thorough response.

I will file an issue, however I wouldn't want you to go too far out of your way to address those issues, as I think for my purposes it is a little difficult to justify starting out with a library without any documentation or many of the APIs I might need.

That's not to diminish your work at all! Certainly there are a lot of benefits to this library with one of them being your experience with upstream (I've see some of your conversations in the upstream repos occasionally). It's just that my project is mostly just for fun as opposed to a serious production library, and I don't want to get too sidetracked just yet, so documentation and ease of use are more important to me at this moment (although at some point I may want to circle back and re-evaluate).

@dralley
Copy link
Author

dralley commented Nov 7, 2022

@DemiMarie So, further question.

A few weeks after I wrote this issue the current owner/maintainer of https://github.com/Richterrettich/rpm-rs pretty much dropped off of the internet, he hasn't responded to emails or messages in nearly 8 months now.

In light of that, a bunch of interested developers have been having a discussion for the past several months about what to do. Ultimately the repo got forked to https://github.com/rpm-rs/rpm-rs/, but we haven't actually taken the step of creating a whole new crate yet.

Would you have any interest in trying to merge efforts into a single "rpm-oxide" library published on crates.io under that name?

@DemiMarie
Copy link
Collaborator

@dralley: I would indeed be interested in merging efforts. Some notes on rpm-oxide as it stands now:

  • rpm-oxide is used in production as part of Qubes OS, via the rpmcanon tool.
  • I do not have much time to work on rpm-oxide, though I am happy to review PRs by others.
  • rpm-oxide currently relies on librpm in a few places. Most of those should be removable, and the ones that are not (crypto stuff) can be feature-gated.

@dralley
Copy link
Author

dralley commented Nov 16, 2022

@DemiMarie That is excellent!

In practical terms, do you have strong feelings about how that should be handled? e.g would you be comfortable with the combined library living at https://github.com/rpm-rs/rpm-rs/ (where we could add you as a collaborator)

@DemiMarie
Copy link
Collaborator

@DemiMarie That is excellent!

In practical terms, do you have strong feelings about how that should be handled? e.g would you be comfortable with the combined library living at https://github.com/rpm-rs/rpm-rs/ (where we could add you as a collaborator)

I would be, yes. I do have some pretty strong feelings about how code should be maintained, notably in regards to security. I would strongly prefer to use the same approach that Qubes OS does, where every commit must be signed and the main branch should always have a signed tag. Since rpm-oxide is production code, and rpm-rs currently is not IIUC, it would be lower risk from a Qubes perspective to start with the former, and submit code from the latter via PRs.

@dralley
Copy link
Author

dralley commented Nov 16, 2022

I do have some pretty strong feelings about how code should be maintained, notably in regards to security.

Understandable

I would strongly prefer to use the same approach that Qubes OS does, where every commit must be signed and the main branch should always have a signed tag.

I think that should be doable, I'm not sure we can do that for existing git history, but new commits most likely yes.

Since rpm-oxide is production code, and rpm-rs currently is not IIUC, it would be lower risk from a Qubes perspective to start with the former, and submit code from the latter via PRs.

I believe it has at least some production usage via https://crates.io/crates/cargo-generate-rpm, and the author claimed that their workplace was using it to generate RPMs. Granted I am not sure how deep or extensive the usage has been in comparison but it is at least nonzero.

I think the most apparent sticking points are

  • rpm-rs tries to completely avoid linking against librpm
  • for crypto it is using the pgp crate which has been independently audited and used in production and is likely safe, but obviously a very different solution from this package
  • rpm-rs is currently licensed as Apache 2.0 only, and while I'm pretty sure every contributor would consent to Apache 2.0 + MIT, we can't get consent to relicense from the original author unless they start responding again. So there would need to be some firewalls in place to prevent the licensing situation from becoming messy, or else we would need to be careful with any code that is transferred.

@DemiMarie
Copy link
Collaborator

I do have some pretty strong feelings about how code should be maintained, notably in regards to security.

Understandable

Thank you

I would strongly prefer to use the same approach that Qubes OS does, where every commit must be signed and the main branch should always have a signed tag.

I think that should be doable, I'm not sure we can do that for existing git history, but new commits most likely yes.

For existing history it is not necessary, due to Git’s Merkle tree nature. Alternatively, if we import the existing rpm-rs code into this repo via PRs (and then replace the contents of the other repo with the contents this one) it would happen automatically.

Since rpm-oxide is production code, and rpm-rs currently is not IIUC, it would be lower risk from a Qubes perspective to start with the former, and submit code from the latter via PRs.

I believe it has at least some production usage via https://crates.io/crates/cargo-generate-rpm, and the author claimed that their workplace was using it to generate RPMs. Granted I am not sure how deep or extensive the usage has been in comparison but it is at least nonzero.

rpm-oxide is used by every Qubes OS user for dom0 updates and template downloads, which means that well over 10,000 people use it regularly IIUC. If it breaks, users will not be able to install the update to fix it, which is bad. Hence, I would strongly prefer to start with this repository as-is and make incremental changes to it, to minimize the risk of regressions. Additionally, previous versions of rpm-oxide has been tested on the entire Fedora 25 and Fedora 32 repositories (dom0 versions used by R4.0 and R4.1 respectively) with only one failure (a package in the Fedora 25 repo that had invalid UTF-8 in a string field).

I think the most apparent sticking points are

  • rpm-rs tries to completely avoid linking against librpm

That is fair. Especially old versions of RPM are not nice to link against. I had to work around a call to atexit() that caused multithreaded programs (such as the test suite) to segfault (see calls to grab_mutex() for details).

  • for crypto it is using the pgp crate which has been independently audited and used in production and is likely safe, but obviously a very different solution from this package

In Qubes OS linking against the system RPM libraries is preferable, as they are needed anyway and minimizing dependencies is important. That said, I would be fine with having switchable backends so that pgp could be used instead.

  • rpm-rs is currently licensed as Apache 2.0 only, and while I'm pretty sure every contributor would consent to Apache 2.0 + MIT, we can't get consent to relicense from the original author unless they start responding again. So there would need to be some firewalls in place to prevent the licensing situation from becoming messy, or else we would need to be careful with any code that is transferred.

I’ll have to think about this one. Most stuff in Qubes OS is GPL’d. @marmarek would this repo being Apache 2.0 only be a problem, or does it need to be GPLv2 compatible?

@marmarek
Copy link
Member

@marmarek would this repo being Apache 2.0 only be a problem, or does it need to be GPLv2 compatible?

For this repository, it should be fine I think. We don't (need to) link it with other parts, we use a standalone binary built from it.

@dralley
Copy link
Author

dralley commented Nov 18, 2022

FYI, we managed to get ahold of the rpm crate name (the person that reserved it wanted to use it for "redis package manager" 5 years ago, but since they never got around to it, they were willing to let us use it instead).

Probably for now we will use that, and then figure out which parts make sense to merge together or replace over time.

@dralley
Copy link
Author

dralley commented Nov 23, 2022

As expected the original author is the sticking point on the licensing issue. I'm sure he would agree if we could reach him, but, well...

rpm-rs/rpm#27

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants