-
Notifications
You must be signed in to change notification settings - Fork 106
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add draft proposal for package hardening #562
base: master
Are you sure you want to change the base?
Conversation
@samoht thanks for writing this up. I'm not sure whether I don't understand why/what "by vendoring opam metadata" means. reproducibility of a MirageOS unikernel should depend only on a) dependencies (which are embedded into it, including version number) b) ld and C compiler c) OCaml compiler. I'm also unsure what you mean by "have a way to gather licenses" -- you can do this in a (more or less) generic way using opam metadata, can't you? what I'm missing from the list:
EDIT: needed to add some more notes:
|
I like both @samoht and @hannesm's lists. Regarding "have a way to gather licenses" it's useful to be able to get the license file itself because it will often have text like
so if you want to send someone a binary unikernel you need to also include the copyright notice i.e. the copyright holders as distinct from the authors or the maintainers (sometimes they're the same but sometimes they're not) and a bunch of other stuff which isn't readily available from the |
This will do it:
See |
|
Indexing arrays maybe ? |
@dbuenzli @cfcs indexing arrays is one use case, another is string or cstruct length operations and arithmetic (e.g. slicing a DNS frame into pieces, and keeping track of offsets... this will never be > 512 (or 2 ^ 16 - 1 in the TCP case)). if you receive a 16bit (8bit/24bit) value from a wire protocol, it is very safe to use int for that (unless you allow clients to fill an int without appropriate checking (since Cstruct wraps (using e.g. |
@dbuenzli @hannesm I believe that the native array and string indexing using The wrapping of(aka silent overflowing/underflowing of) ints is exactly what we want to [EDIT: prevent], so ideally we'd have a functor that defined an integer field, and use operations on that field. @hannesm if you receive a 8-bit length field (0x00) and subtract one (because the length field was supposed to include the length of the packet) you will get I'm aware of the fact that it's sometimes a bit awkward to not use the basic types, and I still hope we can find a way to make it less awkward. (some kind of polymorphic arithmetic operations like |
Thanks all for your feedback! I think we should aim to make that list actionable in the short term -- so the discussion about switching everything to not using Thanks @dbuenzli for the license tip, that's exactly what we need. @hannesm: for the reproducible bit, we started experimenting with having more control over the dependencies which are used to build a released binary (using things similar to lock files): https://github.com/moby/vpnkit/tree/master/repo so that we have a stronger control over the opam universe used in the CI. @hannesm @cfcs: thanks for your lists. I will try to merge all of this into something consistent :-) |
tmpl/wiki/package-hardening.md
Outdated
90 | ||
``` | ||
|
||
We now have various libraries covering all aspect of the OS: from network to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...aspects...
tmpl/wiki/package-hardening.md
Outdated
|
||
We now have various libraries covering all aspect of the OS: from network to | ||
block device drivers -- at all levels of the stack: from virtual network cards | ||
to TLS or virtual block devices to virtual filesystem. These libraries also |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...filesystems.
tmpl/wiki/package-hardening.md
Outdated
We now have various libraries covering all aspect of the OS: from network to | ||
block device drivers -- at all levels of the stack: from virtual network cards | ||
to TLS or virtual block devices to virtual filesystem. These libraries also | ||
span multiple hypervisors such Xen as well as KVM (via solo5). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...multiple hypervisors including Xen and (via solo5) KVM.
tmpl/wiki/package-hardening.md
Outdated
span multiple hypervisors such Xen as well as KVM (via solo5). | ||
|
||
All of this is great: this means that as a MirageOS user, you can spend time | ||
developping your own application instead of spending time re-implementing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
developing
tmpl/wiki/package-hardening.md
Outdated
All of this is great: this means that as a MirageOS user, you can spend time | ||
developping your own application instead of spending time re-implementing | ||
something that the OS would usually do for you. There are of course still | ||
gaps in the ecosystem, but this is getting better that it used to be. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...but things are getting better!
tmpl/wiki/package-hardening.md
Outdated
these libraries. Would be great to be able to answer questions such as: | ||
Does the library has a good test coverage? Was it formally verifired? | ||
Was it used in production? In order to do this, I would like to introduce | ||
the "MirageOS hardining process". Ideally this will live in a tool (e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hardening
tmpl/wiki/package-hardening.md
Outdated
|
||
- have well-identified maintainers | ||
- use the correct set of tags in opam metadata | ||
- use proper names: same name for ocaml and ocamlfind package, and also ideally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you mean opam rather than ocaml here?
tmpl/wiki/package-hardening.md
Outdated
- have documentation | ||
- have a clear LICENSE file | ||
- use Jbuilder to build, in order to have short feedback loops when fixing bugs | ||
spawning multiple packages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spanning rather than spawning?
tmpl/wiki/package-hardening.md
Outdated
spawning multiple packages | ||
- use topkg to release, in order to ease the task of the release manager | ||
- have proper indentation (usgin ocp-indent + checked-in ocp-indent file) | ||
- use at most 80 collumns |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
columns (or combine this bullet with previous, and host a standard ocp-indent file somewhere?)
On 21/07/2017 14:37, Thomas Gazagnaire wrote:
I think we should aim to make that list actionable in the short term
-- so the discussion about switching everything to not using `int`
seems a bit unrealistic for now on (we still have libraries without
test/doc, and we just started to add fuzz testing to some of the core
libs). I agree that aiming for 32 bit support is great, especially if
we want to compile things to Javascript. Any idea how to do this?
Maybe we should have a special compiler switch set with 32bits that
we could use with TravisCI?
see also mirage/mirage#614 for further
discussion (I think that was planned before the MirageOS 3 release)...
I suspect commits such as
https://github.com/mirage/irmin/pull/458/files#diff-0b43dc6466ffe5ba4e4629f71c33631eR707
(and
https://github.com/mirage/irmin/pull/458/files#diff-0b43dc6466ffe5ba4e4629f71c33631eR776)
break 32bit support (and esp in there I don't understand why there's
need for an int type - it'll break if you switch platforms)...
|
@samoht if you make a list of what CI modes would be useful (e.g. afl, 32-bit, ...) I can add those explicitly into mirage-ci |
20fd1d7
to
6af3201
Compare
tmpl/wiki/package-hardening.md
Outdated
@@ -0,0 +1,118 @@ | |||
## Package hardening |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as mort mentioned on mirageos-devel, hardening
has security connotations, whereas this is more about 'towards high-quality packages' (and not only about security)
the current PR looks more like a laundry list - I agree with mort that pointing to some examples of "good libraries" would be good and another item would be "semantic versioning" -- although it seems there's no proper definition / understanding inside the MirageOS community (so maybe we should not mention it?) |
Yup completely agree about the laundry list, I feel the same. Any opinion on how to organise that better? Maybe we should split this into "phases" or "levels", e.g:
(yes, I quite enjoyed the capnproto protocol levels) And maybe have badges for each levels to put in the README. Also I will replace "hardening" by something related to "quality". For the examples of good library, please submit an example or two :-) I am not aware of any library today which fulfils all the basic items on that list ... |
That structure into levels LGTM. |
tmpl/wiki/package-hardening.md
Outdated
|
||
- Have unit tests (using alcotest) (ideally with coverage report) | ||
|
||
- Have fuzz tests (using crowbar) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a great idea for all projects, but the tooling really isn't ready for this to be included yet in a document people might be trying to adhere to. "Have randomized property-based testing" is more achievable and will give people the right foundation to use tools like Crowbar later, once they're more ready to be integrated in a CI pipeline.
The coding style items are all good goals, but they're at varying levels of subjectivity and apparent-ness (sorry). "avoid potential overflows" and "don't surface exceptions" are testable and concrete; items that begin with "be aware of" are less so. We should either come up with concrete ways in which a library surfaces these qualities, or note that they are goals somewhere rather than specific properties that can be possessed or not possessed. |
931d6ad
to
d7b3299
Compare
Pushed a tentative list of levels ... not really sure where this is going and indeed some of the item will be hard to verify automatically, at least in the short term. |
tmpl/wiki/package-quality.md
Outdated
|
||
- Use the correct set of tags in opam metadata | ||
|
||
- Have documentation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Crucial is to define the behaviour on edge cases (see e.g. the discussion about mirage-console being blocking on Xen vs partial write on Solo5 mirage/mirage-flow#4), and whether and how arguments are checked (e.g. values given to set*
in cstruct are not checked, see mirage/ocaml-cstruct#78)
tmpl/wiki/package-quality.md
Outdated
- Have a design document specifying the scope of the library (and esp. what is | ||
out of scope) | ||
|
||
- Use the correct set of tags in opam metadata |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this mean?
tmpl/wiki/package-quality.md
Outdated
|
||
- Depends only on libraries released in opam. | ||
|
||
- Provide pretty printers by using [fmt](http://erratique.ch/software/fmt/doc/Fmt.html). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is intertwined with the previous use alcotest
... if you use alcotest, you likely already defined pretty printers
|
||
- Avoid the use of C bindings for no good reasons. A good reason would be to | ||
improve performance by an order of magnitude, or re-use an existing C library | ||
that has not been rewritten yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe elaborate a bit with links how difficult it is to get C bindings cross compiled atm!? and indicate that MirageOS does not include a C library, but only very few functions which are required for the OCaml runtime (see https://github.com/mirage/ocaml-freestanding/tree/master/nolibc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We definitely should add a documentation. What I'm afraid with MirageOS 4 is the ability to compile C stubs for the Unix target (with the libc
according to the dune
context) and be blocked when we want to switch the whole codebase to a new target such as ocaml-freestanding
/Solo5
.
Currently, mirage-crypto
or digestif
try to compile properly C stubs with ocaml-freestanding
flags (including nolibc
) but such test (just to check that these C stubs can be cross-compiled with ocaml-freestanding
) is not done by mirage
and nothing (as far as I can say) can prevent a bad C stubs for all targets.
tmpl/wiki/package-quality.md
Outdated
|
||
- Avoid integer overflows (basically every addition and subtraction, as well | ||
as multiplication needs to be guarded unless you know that an overflow can | ||
never happen (in this case, a comment should be suggested)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-> in this case a comment is required
. and add "you can test this by passing max_int and min_int in your tests for all int which occur"
I'd like to revisit this, with the goal of making it into a concrete list of things to achieve by interested newcomers, applicable to at least three repositories, by the start of the next Mirage hack retreat on 28 November. |
@yomimono sounds great, given the release of ocamlformat, I'd suggest to jump on the train and reindent all the libraries, and deploy pre-commit hooks to never accept any code which ocamlformat disagrees with. |
@hannesm can we please have some fair warning before anything like that happens - reformatting everything and breaking everyone's local patches sounds to be me like a pretty high cost to pay to get normalized whitespace. |
I agree with @cfcs -- can we get the current guidelines in place without creating new ones? I've only just started using ocamlformat and it's a young tool with some way to go before it can be imposed as a standard. For example, object syntax isnt supported, and there are still rewrite issues with portions of the AST. The tool is definitely heading in the right direction, but will take some time to mature. |
ok, so how do we move forward here? I'd appreciate to have such a document in place (and agree that we should as well have some example libraries which are "high quality"). my experience with there are still some comments (from several people). can someone with a fresh pair of eyes review/revise this PR and we get settled to a first public draft!? |
I will be very happy to see this moving forward but I have unfortunately a growing list of TODOs that I want to complete first. If anyone else is interested, please feel free to push directly to my branch :-) |
941fc87
to
ffcf0ca
Compare
EDIT: sorry for the force-push I spend some time to revise this PR, and first wrote a paragraph that not every OCaml library works on MirageOS. I introduced tags ( I think most of our core libraries are in |
ffcf0ca
to
fb2454c
Compare
|
||
Style best practises | ||
- Have proper indentation (using ocp-indent + checked-in ocp-indent file). | ||
- Use at most 80 columns (`ocp-indent` unfortunately doesn't check this). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be add a note about ocamlformat
here.
|
||
The library uses continuous integration systems to catch issues early: | ||
- The [ocaml-ci-scripts](https://github.com/ocaml/ocaml-ci-scripts) contain | ||
Travis and AppVeyor integration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be add a note about ocaml-ci
.
I'd like to revisit that PR at one point, as @TheLortex has started to work on |
I would like to start the discussion about improving the general quality of MirageOS packages. Ideally we should have a tool to help us do that (something like
mirage lint
) and/or a dashboard as @hannesm suggested a few time already.I know some of these criteria are very opinionated but I would like to get a sense of what other people in the community is thinking, so please feel free to disagree or to suggest better criteria :-)