Introduce a simplified C IR. #253

Closed
phadej opened this issue Oct 31, 2024 · 7 comments

@phadej
Collaborator

phadej commented Oct 31, 2024

While working with data declarations recently, I keep stumbling over the "complexity" of C. For example,

#include <stdio.h>

typedef struct foo_s {
	int x;
} foo_t;

int main() {
	struct foo_s y = {0};

	return 0;
}

works. There the typedef declaration is two-in-one; it is equivalent to:

struct foo_s {
  int x;
};

typedef struct foo_s foo_t;

For humans the two-in-one notation may be handy, but for code processing, having simpler blocks is... simpler.
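The claim that the compound declaration introduces one type under two names can be checked by the compiler itself. The following is a minimal sketch, assuming a C11 compiler; the macro `IS_STRUCT_FOO_S` and the helper `roundtrip_x` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>   /* static_assert (C11) */

/* Compound form: declares the tag `foo_s` and the alias `foo_t` at once. */
typedef struct foo_s {
    int x;
} foo_t;

/* Evaluates to 1 if the expression has type `struct foo_s`, 0 otherwise.
   Hypothetical helper for this sketch only. */
#define IS_STRUCT_FOO_S(expr) _Generic((expr), struct foo_s: 1, default: 0)

/* Builds a value through the alias and reads it back through the tag. */
static int roundtrip_x(int x) {
    foo_t via_alias = { x };
    struct foo_s via_tag = via_alias;  /* same type, so plain assignment works */
    return via_tag.x;
}

/* The alias and the tag name one and the same type. */
static_assert(sizeof(foo_t) == sizeof(struct foo_s), "alias and tag differ");
```

Since `foo_t` and `struct foo_s` are the same type, an IR that stores the struct and the typedef as two separate declarations loses nothing.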


So I suggest that we introduce a simplified C IR, elaborating some compound C declarations into simpler ones.


To continue an example, I'd argue that even

typedef struct {
  int x; int y;
} foo;

is easier to process as (in pseudo-C)

struct anonymous::typedef<foo> {
  int x; int y;
};

typedef anonymous::typedef<foo> foo;

and similar transformations may be applied to "flatten" nested structs and unions.

@phadej phadej self-assigned this Oct 31, 2024
@TravisCardwell
Collaborator

TravisCardwell commented Oct 31, 2024

Normalizing data declarations sounds good. For name mangling, we need to make sure that the IR has the necessary context. For example, the context for an anonymous struct for a named field currently needs to have the Haskell type constructor name of the closest named ancestor and the C field name. If providing Haskell names is problematic in the IR, we can instead provide the context required to create the Haskell name. In this example, just the C name that is mangled to the type constructor name is sufficient.

Example:

typedef struct rect {
  struct { int x; int y; } tl;
  struct { int x; int y; } br;
} rect;

IR sketch:

struct ir::anonymous<rect, tl> {
  int x;
  int y;
};

struct ir::anonymous<rect, br> {
  int x;
  int y;
};

struct ir::struct<rect> {
  struct ir::anonymous<rect, tl> tl;
  struct ir::anonymous<rect, br> br;
};

typedef ir::struct<rect> rect;
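The IR sketch uses pseudo-C names, but the same flattening can be written in valid C to confirm that lifting the anonymous structs does not change the memory layout. This is only a sketch, assuming a C11 compiler; the names `rect_tl`, `rect_br`, `rect_flat` and `br_y_via_flat` are hypothetical and not what the tool generates.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <string.h>   /* memcpy */

/* Original, nested declaration from the example. */
typedef struct rect {
    struct { int x; int y; } tl;
    struct { int x; int y; } br;
} rect;

/* Flattened form: each anonymous struct is lifted to a named top-level one.
   All three names below are hypothetical. */
struct rect_tl { int x; int y; };
struct rect_br { int x; int y; };

struct rect_flat {
    struct rect_tl tl;
    struct rect_br br;
};

/* Lifting must not change the layout; the compiler checks this here. */
static_assert(sizeof(struct rect_flat) == sizeof(rect), "size differs");
static_assert(offsetof(struct rect_flat, tl) == offsetof(rect, tl), "tl offset differs");
static_assert(offsetof(struct rect_flat, br) == offsetof(rect, br), "br offset differs");

/* Reads br.y from a nested value through the flattened view. */
static int br_y_via_flat(const rect *r) {
    struct rect_flat f;
    memcpy(&f, r, sizeof f);  /* byte copy; relies on the layout checks above */
    return f.br.y;
}
```

The static assertions fail at compile time if a platform ever laid out the two forms differently, so the copy in `br_y_via_flat` only compiles when it is sound.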
Haskell with default options:
data CRectTl = MkCRectTl {
      cRectTlX :: Int
    , cRectTlY :: Int
    }

data CRectBr = MkCRectBr {
      cRectBrX :: Int
    , cRectBrY :: Int
    }

data CRect = MkCRect {
      cRectTl :: CRectTl
    , cRectBr :: CRectBr
    }
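To illustrate why the pair (closest named ancestor, field name) is enough context, here is a small sketch of deriving a type constructor name from it. The rule used (prefix `C`, then each component with its first letter upper-cased) is an assumption inferred only from the `CRectTl` and `CRectBr` names above; it is not the project's actual mangling code, and both function names are hypothetical.

```c
#include <ctype.h>    /* toupper */
#include <stddef.h>   /* size_t */
#include <string.h>   /* strlen */

/* Appends `part` to `out` (current length `len`, capacity `cap`) with its
   first character upper-cased. Returns the new length, or 0 if it does
   not fit. Callers always pass len >= 1, so 0 is unambiguous. */
static size_t append_capitalized(char *out, size_t len, size_t cap,
                                 const char *part) {
    size_t n = strlen(part);
    if (len + n + 1 > cap) return 0;
    for (size_t i = 0; i < n; i++) {
        unsigned char c = (unsigned char)part[i];
        out[len + i] = (char)(i == 0 ? toupper(c) : c);
    }
    out[len + n] = '\0';
    return len + n;
}

/* Hypothetical mangler: ("rect", "tl") becomes "CRectTl".
   Returns 1 on success, 0 if the buffer is too small. */
static int mangle_anonymous(char *out, size_t cap,
                            const char *ancestor, const char *field) {
    if (cap < 2) return 0;
    out[0] = 'C';
    out[1] = '\0';
    size_t len = append_capitalized(out, 1, cap, ancestor);
    if (len == 0) return 0;
    return append_capitalized(out, len, cap, field) != 0;
}
```

Only C names go in, so an IR node carrying these two strings would not need to know any Haskell names. C identifiers containing underscores are not handled in this sketch.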

By the way, I am about to implement a change to name mangling that should make it easier to use due to improved organization and documentation.

@edsko
Collaborator

edsko commented Nov 6, 2024

Yes, this sounds completely reasonable to me. I wonder if we can use the AST annotations (#256) to provide some additional guarantees about which kinds of simplifications have taken place?

@phadej
Collaborator Author

phadej commented Nov 6, 2024

> some additional guarantees

Why?

@edsko
Collaborator

edsko commented Nov 6, 2024

So that subsequent passes can depend on it....?

@phadej
Collaborator Author

phadej commented Nov 6, 2024

> So that subsequent passes can depend on it....?

I assume you meant that we'd use the same types for the input C and the simplified IR. I'm against that. They are different enough that having separate structures is a lot cleaner than trying to be clever about not duplicating stuff.

@edsko
Collaborator

edsko commented Nov 6, 2024

Oh, I see.

@phadej
Collaborator Author

phadej commented Dec 18, 2024

Kind of done in #295

@phadej phadej closed this as completed Dec 18, 2024
3 participants