Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document how to develop goblint-cil in conjunction with Goblint #293

Merged
merged 1 commit into from
Jul 30, 2021

Conversation

sim642
Copy link
Member

@sim642 sim642 commented Jul 16, 2021

Issue #208.

I just tried to figure this out on my own and Opam's own documentation is so unhelpful for this. Maybe someone who is more experienced with goblint-cil development knows easier and better ways.

@sim642 sim642 added the documentation Documentation, comments label Jul 16, 2021
@sim642
Copy link
Member Author

sim642 commented Jul 16, 2021

Also, if I did something wrong or goblint-cil build is even more obscure than Goblint's, but ocamllsp for VSCode underlines everything red and cannot find anything, although dune build works just fine.

EDIT: Apparently deleting the committed .merlin in goblint-cil root fixes the issue. It's probably overriding the merlin stuff generated by dune. Not sure if it's necessary to keep that for some ocamlbuild support or could it be removed because it hinders the development process there.

@michael-schwarz
Copy link
Member

michael-schwarz commented Jul 22, 2021

Not sure if it's necessary to keep that for some ocamlbuild support or could it be removed because it hinders the development process there.

The nice thing for the ocamlbuild based build is that

  • it is easy to inspect machdep.ml as that file inside _build is correctly recognized by merlin as the place to jump to
  • dune build currently is sort of convoluted and calls the old make file in a lot of places, so it is not too easy to understand what it does.

So unless we overhaul the build system of CIL completely (get rid of make as much as we can, and use idiomatic dune things instead), I'd want to keep support for the ocamlbuild based build, because I found that easier to debug this far.

@sim642
Copy link
Member Author

sim642 commented Jul 22, 2021

Even though goblint-cil README has the Makefile build before the dune build, I thought the dune build would be the proper one, especially since the opam package is built using dune as well. So that's what opam upgrade goblint-cil etc are also using. As convoluted as this is, it's the standard way of building goblint-cil. This is confusing because the README suggests Makefile first, which I didn't even bother trying, thinking it's the outdated way of doing things.

I suspect the problem with machdep.ml has something to do with the convoluted way it's done through dune here: https://github.com/goblint/cil/blob/208d2a2f9e51a42ee0a036f4587624ac7ac23ccb/src/dune#L11-L15.
make machdep is just meant to generate it into _build/machdep.ml, but that's not identical to how dune uses _build. Instead dune keeps a copy of the source and the built files in _build/default, so Makefile generates it into _build/default/_build/machdep.ml and the last action copies it from there to _build/default/src/machdep.ml. Hardly surprising that merlin cannot find it since it's unaware of the source file being in a non-standard location and then copied as well. According to ocaml/dune#3552, (mode promote) is meant to be used to have dune copy the file nicely back into src/machdep.ml where it's nicely in the normal source tree. That might help merlin know where to find it in the dune build.
(Goblint's own generated files should probably also be promoted.)

But either way, if we keep build builds for goblint-cil, then maybe we need its dune build instructions to just say to delete .merlin for ocaml-lsp to work?

@sim642
Copy link
Member Author

sim642 commented Jul 27, 2021

promote works for development, but it breaks when you try to actually install/upgrade the package with opam and will say Machdep is unbound. This seems to be due to promoted files meant to be committed into the actual src. During package install, promotion doesn't happen, so the generated module will be missing.

The exception to that should be promote-until-clean but it's buggy until dune 3.0: ocaml/dune#4401.

@vogler
Copy link
Collaborator

vogler commented Jul 28, 2021

Related: goblint/cil#45.
I there also stumbled again over the complex autotools + dune interaction.
For compiling at least we just need to pull out this machdep.ml generation from the Makefile & co.
Maybe we should just throw this generation away and have some presets committed for the most common combinations ({linux, macos, windows?} * {x86_64, arm64, ?}). This would be beneficial anyway to be able to 'cross-analyze'.

@michael-schwarz
Copy link
Member

Maybe we should just throw this generation away and have some presets committed for the most common combinations ({linux, macos, windows?} * {x86_64, arm64, ?}). This would be beneficial anyway to be able to 'cross-analyze'.

The problem there is that the the preprocessor output that we feed to CIL and the pre-provided machdep.ml might make different assumptions about the size of certain types and some other compiler specific things. Currently we have the guarantee that they will match. (Unless one upgrades the compiler or uses a non-default version of it).

Another way to simplify the generation of machdep.ml would be dropping MSVC support from CIL. There is a lot of code related to it, and no-one has used it (or checked it still works) probably since we forked from upstream CIL.

@sim642
Copy link
Member Author

sim642 commented Jul 28, 2021

For compiling at least we just need to pull out this machdep.ml generation from the Makefile & co.

Isn't the only reason for all this C compilation (automake, Makefile, etc) the generation of machdep.ml? The rest is OCaml code, which can be compiled through dune just fine.

CIL even includes some code for parsing these Machdep things from some kind of strings here: https://github.com/goblint/cil/blob/develop/src/machdepenv.ml. No idea if this is used or documented anywhere, or has any examples.

But even if there's no default Machdep baked into CIL itself, we'd still need to generate the different variations somehow. And the way to do that would still be to use such generation by printing them from C, which needs to be compiled. So this C compilation is inevitable, the question is simply how to do it as cleanly as possible.

@michael-schwarz
Copy link
Member

Isn't the only reason for all this C compilation (automake, Makefile, etc) the generation of machdep.ml?

This, plus cilly for which the instructions are also in the Makefile.

@sim642
Copy link
Member Author

sim642 commented Jul 28, 2021

The problem there is that the the preprocessor output that we feed to CIL and the pre-provided machdep.ml might make different assumptions about the size of certain types and some other compiler specific things. Currently we have the guarantee that they will match. (Unless one upgrades the compiler or uses a non-default version of it).

Tbh, that's not a very good guarantee. It's actually harmful for SV-COMP or #54 if you want to analyze both 32bit and 64bit programs on the same machine. It's even a problem if you compile Goblint (or rather CIL) on one machine and want to distribute the binary to another.

There's no simple solution to the latter, but making the architecture choice explicit (including having to explicitly say you want to use the architecture of the machine where CIL was compiled) would still be a good thing.

@vogler
Copy link
Collaborator

vogler commented Jul 28, 2021

Yes, that's what I meant.
We currently also don't check this generated file and just assume that the machine you compiled CIL on is also what you want to analyze. At the moment you'd first have to recompile on the new target to be able to analyze its code soundly.

@michael-schwarz
Copy link
Member

Tbh, that's not a very good guarantee. It's actually harmful for SV-COMP or #54 if you want to analyze both 32bit and 64bit programs on the same machine.

At the moment you'd first have to recompile on the new target to be able to analyze its code soundly.

True. Also, if you want to analyze on a machine with a different architecture, you would need to specify the target when invoking the preprocessor to make sure that its output matches the architecture that you specified for CIL. So i think defaulting to the host machine is actually not such a bad idea (as the compiler used to compile machdep-ml.c will hopefully be very similar to the one used for preprocessing, i.e., usually be only a somewhat more up-to-date version).

@vogler
Copy link
Collaborator

vogler commented Jul 28, 2021

There's no simple solution to the latter

Why? Current generated machdep.ml attached below.
Remove the msvc stuff, gcc -> gcc_arm64_macos, generate and add the same for x86_64_linux etc.
theMachine is already a reference, just need to set it from goblint to the current machine or option if provided (and then also pass it for preprocessing as mentioned by Michael above).

(* This module was generated automatically by code in Makefile and machdep-ml.c *)
type mach = {
  version_major: int;     (* Major version number *)
  version_minor: int;     (* Minor version number *)
  version: string;        (* gcc version string *)
  underscore_name: bool;  (* If assembly names have leading underscore *)
  sizeof_short: int;      (* Size of "short" *)
  sizeof_int: int;        (* Size of "int" *)
  sizeof_bool: int;       (* Size of "_Bool" *)
  sizeof_long: int ;      (* Size of "long" *)
  sizeof_longlong: int;   (* Size of "long long" *)
  sizeof_ptr: int;        (* Size of pointers *)
  sizeof_float: int;      (* Size of "float" *)
  sizeof_double: int;     (* Size of "double" *)
  sizeof_longdouble: int; (* Size of "long double" *)
  sizeof_floatcomplex: int;      (* Size of "float _Complex" *)
  sizeof_doublecomplex: int;     (* Size of "double _Complex" *)
  sizeof_longdoublecomplex: int; (* Size of "long double _Complex" *)
  sizeof_void: int;       (* Size of "void" *)
  sizeof_fun: int;        (* Size of function *)
  size_t: string;         (* Type of "sizeof(T)" *)
  wchar_t: string;        (* Type of "wchar_t" *)
  alignof_short: int;     (* Alignment of "short" *)
  alignof_int: int;       (* Alignment of "int" *)
  alignof_bool: int;      (* Alignment of "_Bool" *)
  alignof_long: int;      (* Alignment of "long" *)
  alignof_longlong: int;  (* Alignment of "long long" *)
  alignof_ptr: int;       (* Alignment of pointers *)
  alignof_enum: int;      (* Alignment of enum types *)
  alignof_float: int;     (* Alignment of "float" *)
  alignof_double: int;    (* Alignment of "double" *)
  alignof_longdouble: int;  (* Alignment of "long double" *)
  alignof_floatcomplex: int;     (* Alignment of "float _Complex" *)
  alignof_doublecomplex: int;    (* Alignment of "double _Complex" *)
  alignof_longdoublecomplex: int;  (* Alignment of "long double _Complex" *)
  alignof_str: int;       (* Alignment of strings *)
  alignof_fun: int;       (* Alignment of function *)
  alignof_aligned: int;   (* Alignment of anything with the "aligned" attribute *)
  char_is_unsigned: bool; (* Whether "char" is unsigned *)
  const_string_literals: bool; (* Whether string literals have const chars *)
  little_endian: bool; (* whether the machine is little endian *)
  __thread_is_keyword: bool; (* whether __thread is a keyword *)
  __builtin_va_list: bool; (* whether __builtin_va_list is builtin (gccism) *)
}
let gcc = {
(* Generated by code in src/machdep-ml.c *)
	 version_major    = 11;
	 version_minor    = 1;
	 version          = "11.1.0";
	 sizeof_short               = 2;
	 sizeof_int                 = 4;
	 sizeof_bool                = 1;
	 sizeof_long                = 8;
	 sizeof_longlong            = 8;
	 sizeof_ptr                 = 8;
	 sizeof_float               = 4;
	 sizeof_double              = 8;
	 sizeof_longdouble          = 8;
	 sizeof_floatcomplex        = 8;
	 sizeof_doublecomplex       = 16;
	 sizeof_longdoublecomplex   = 16;
	 sizeof_void                = 1;
	 sizeof_fun                 = 1;
	 size_t                     = "unsigned long";
	 wchar_t                    = "int";
	 alignof_short              = 2;
	 alignof_int                = 4;
	 alignof_bool               = 1;
	 alignof_long               = 8;
	 alignof_longlong           = 8;
	 alignof_ptr                = 8;
	 alignof_enum               = 4;
	 alignof_float              = 4;
	 alignof_double             = 8;
	 alignof_longdouble         = 8;
	 alignof_floatcomplex       = 4;
	 alignof_doublecomplex      = 8;
	 alignof_longdoublecomplex  = 8;
	 alignof_str                = 1;
	 alignof_fun                = 4;
	 alignof_aligned            = 16;
	 char_is_unsigned           = false;
	 const_string_literals      = true;
	 underscore_name            = false;
	 __builtin_va_list          = true;
	 __thread_is_keyword        = true;
	 little_endian              = true;
}
let hasMSVC = false
let msvc = gcc
let theMachine : mach ref = ref gcc

@sim642
Copy link
Member Author

sim642 commented Jul 28, 2021

Why? Current generated machdep.ml attached below.
Remove the msvc stuff, gcc -> gcc_arm64_macos, generate and add the same for x86_64_linux etc.
theMachine is already a reference, just need to set it from goblint to the current machine or option if provided (and then also pass it for preprocessing as mentioned by Michael above).

That is only when the machine Goblint is run on is one of the predefined architectures (and we can detect that). If you compile CIL and Goblint on one machine and move the binary to another, where the installed GCC has a different value for __thread_is_keyword, then you have no way of knowing that unless Goblint does the Machdep generation right before analyzing the program there.

@vogler
Copy link
Collaborator

vogler commented Jul 28, 2021

That is true. Didn't go through the record, but hopefully all those things (or the ones we care about) are determined by your combination of OS+Arch+GCC which can be detected.

@sim642
Copy link
Member Author

sim642 commented Jul 30, 2021

Looks like there was no complaint about the documented instructions themselves, so I'll merge this as well. Refactoring Machdep and CIL's build process is a future task.

@sim642 sim642 merged commit 00d6117 into master Jul 30, 2021
@sim642 sim642 deleted the docs-developing-cil branch July 30, 2021 10:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Documentation, comments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants