Implement N3369 (_Lengthof) #102836

Closed
alejandro-colomar opened this issue Aug 11, 2024 · 12 comments · Fixed by #133125
Labels
c2y clang:frontend Language frontend issues, e.g. anything involving "Sema"

Comments

@alejandro-colomar

alejandro-colomar commented Aug 11, 2024

Hi!

I've sent a patch set to GCC for adding a __lengthof__ operator:
https://inbox.sourceware.org/gcc-patches/20240728141547.302478-1-alx@kernel.org/T/#t

There's a related proposal for ISO C (wg14):
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2529.pdf
(although the proposal is old, was not authored by me, and isn't as curated as the GCC patches). I intend to refine that proposal and send a new one.

The specifications of the operator are:

The keyword __lengthof__ determines the length of an array operand,
that is, the number of elements in the array.
Its syntax is similar to sizeof.
The operand must be a complete array type or an expression of that type.
For example:

    int a[n];
    __lengthof__(a);           // returns n
    __lengthof__(int [7][3]);  // returns 7

The result of this operator is an integer constant expression,
unless the top-level array is a variable-length array.
The operand is only evaluated if the top-level array is a variable-length array.
For example:

    __lengthof__(int [7][n++]);  // integer constant expression
    __lengthof__(int [n++][7]);  // run-time value; n++ is evaluated
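
For comparison, the behaviour of plain sizeof on the same kind of operand can be checked in standard C today. The following is only a sketch, assuming a compiler with VLA support; the helper name `sizeof_row_vla` is hypothetical and not part of the proposal.

```c
#include <stddef.h>

/*
 * Hypothetical helper: applies sizeof to the type int [7][n] and bumps
 * the caller's counter inside the operand.  The element type int [n] is
 * variable-length, so the whole operand is a VLA type; sizeof therefore
 * evaluates the size expression and the increment takes effect.
 */
size_t sizeof_row_vla(int *n)
{
    return sizeof(int [7][(*n)++]);
}
```

Called with a counter holding 3, the helper returns `7 * 3 * sizeof(int)` and leaves the counter at 4. According to the description above, the proposed operator applied to the same type would instead yield 7 as an integer constant expression and leave the counter untouched.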

There are a few interesting reasons why this feature is better than just a macro around the usual sizeof division:

  • This keyword could be extended in the future to also give the length of a function parameter declared with array notation and a specified length.
  • This operator causes a compile-time error if the argument is not an array (it's a constraint violation).
  • It results in a constant expression in some cases where sizeof would evaluate the operand. For example: __lengthof__(int [7][n++]).
  • It only evaluates the operand once for VLAs, where the sizeof division would evaluate twice (one per sizeof call).
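
To make the contrast with "the usual sizeof division" concrete, here is a minimal sketch of that macro and of the pointer pitfall the list mentions. The macro name `ARRAY_SIZE` follows common practice; the two helper functions are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* The classic element-count macro: the "usual sizeof division". */
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Applied to a real array, the division yields the top-level length. */
size_t count_of_real_array(void)
{
    int a[7][3] = {{0}};

    return ARRAY_SIZE(a);
}

/*
 * Applied to a pointer, the macro still compiles (some compilers warn),
 * but the result is sizeof(int *) / sizeof(int), which is unrelated to
 * the length of whatever array the pointer refers to.
 */
size_t count_of_pointer(const int *p)
{
    return ARRAY_SIZE(p);
}
```

A dedicated operator turns the second case into a hard error instead of a silently wrong number.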

Please feel free to give any feedback for the feature in the GCC thread.

Are you interested in this feature?

@alejandro-colomar alejandro-colomar changed the title c: Add __lengthof__ operator C: Add __lengthof__ operator Aug 11, 2024
@EugeneZelenko EugeneZelenko added clang:frontend Language frontend issues, e.g. anything involving "Sema" and removed new issue labels Aug 11, 2024
@llvmbot
Member

llvmbot commented Aug 11, 2024

@llvm/issue-subscribers-clang-frontend


@AaronBallman AaronBallman added the enhancement Improving things as opposed to bug fixing, e.g. new or missing feature label Aug 12, 2024
@AaronBallman
Collaborator

AaronBallman commented Aug 12, 2024

I think this is a reasonably common need; users can use the sizeof(array) / sizeof(array[0]) trick, but having a dedicated operator to do this instead would help catch mistakes.

One edge case would be with flexible array members; should those be a constraint violation?

Clang already supports __array_extent as a type trait, but only in C++: https://godbolt.org/z/54MKdMGd7 One thing that's interesting though is that makes it much more clear as to what "length" means for multidimensional arrays. You have to ask on a per-rank basis what the length is. Have you considered a similar design?

In terms of standardization, it's worth noting that N2529 has not been seen by WG14 and so it's unclear how the committee feels about the idea. That leaves some concerns (WG14 has a habit of renaming things or altering semantics slightly), but I think they could be overcome.

@alejandro-colomar
Author

alejandro-colomar commented Aug 12, 2024

> I think this is a reasonably common need; users can use the sizeof(array) / sizeof(array[0]) trick, but having a dedicated operator to do this instead would help catch mistakes.
>
> One edge case would be with flexible array members; should those be a constraint violation?

For now they are a constraint violation (incomplete types are rejected), with a reservation of the right to extend support to them.

If there appears a way to add length information such as proposed in https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm, it would make sense to extend this operator to work with them.

We've also discussed supporting the [[gnu::counted_by()]] attribute, but the feedback was mixed, and the consensus was to not do it, at least for now. We prefer to only support array lengths that are expressed using the type system.
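
As an illustration of why flexible array members fall outside "lengths expressed using the type system", here is a minimal sketch. The `struct packet` type and `packet_new` function are hypothetical, and the allocation size is not checked for overflow.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical example type: the element count of data[] lives in the
 * len member, not in the array type, which stays incomplete.
 */
struct packet {
    size_t len;
    int    data[];   /* flexible array member */
};

/* Allocates a packet with room for len elements; NULL on failure. */
struct packet *packet_new(size_t len)
{
    struct packet *p = malloc(sizeof *p + len * sizeof p->data[0]);

    if (p != NULL)
        p->len = len;
    return p;
}
```

Because `data` has an incomplete array type, there is nothing for an array-length operator to read from the type; the count is only known through `len` at run time.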

> Clang already supports __array_extent as a type trait, but only in C++: https://godbolt.org/z/54MKdMGd7 One thing that's interesting though is that makes it much more clear as to what "length" means for multidimensional arrays. You have to ask on a per-rank basis what the length is. Have you considered a similar design?

The usual ARRAY_SIZE() or NITEMS() macros are the prior art I based the feature on. I like it because it's simple. And it's common-enough already that I expect it to be easy to explain.

I didn't consider something like __array_extent, because it's usually as easy as:

__array_extent(decltype(foo), 4) == __lengthof__(****foo)

That is, with regular language expressions you can ask for whatever length you're interested in.
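
The same idea can be sketched in standard C with the sizeof-division macro standing in for the operator. The array `foo`, its dimensions, and the function `extent_of_rank` are hypothetical; each dereference strips one dimension, so the length of rank N is taken from the array dereferenced N times.

```c
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Returns the extent of the given rank of a 2 x 3 x 5 array, or 0 when
   the rank is out of range. */
size_t extent_of_rank(unsigned rank)
{
    static int foo[2][3][5];

    switch (rank) {
    case 0:  return ARRAY_SIZE(foo);    /* int [2][3][5] */
    case 1:  return ARRAY_SIZE(*foo);   /* int [3][5]    */
    case 2:  return ARRAY_SIZE(**foo);  /* int [5]       */
    default: return 0;
    }
}
```

None of the operands is evaluated here, since no dimension is variable-length; the dereferences only select a type.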

> In terms of standardization, it's worth noting that N2529 has not been seen by WG14 and so it's unclear how the committee feels about the idea.

Several WG14 members are CCed in the GCC thread (and a few more in a discussion about the state of that paper prior to the development of the patch), around half a dozen in total. So far they haven't complained, other than suggesting the usual pedantic wording refinements (very welcome, of course). :)

> That leaves some concerns (WG14 has a habit of renaming things or altering semantics slightly), but I think they could be overcome.

That's why we've started with the keyword __lengthof__ in GCC, to make it a GNU extension, without entering into ISO C reserved words territory. I expect that the semantics won't be touched by WG14. We're prepared to accept a new name (we expect _Lengthof, and then likely lengthof).

One detail where we didn't reach consensus is whether to accept expression operands without parentheses, as sizeof does, or to require them. One WG14 member suggested that we start clean, without the mistakes of sizeof. So far, the implementation behaves like sizeof in this regard. I think they should match: if WG14 wants to drop the parenthesis-less form from lengthof, they should start by deprecating it in sizeof. We can still provide sizeof-like behavior in GCC as an extension if ISO C decides otherwise. Having sizeof and lengthof differ here would mean more duplicated code in the compiler, which I'd rather avoid.

@AaronBallman
Collaborator

I think this is a reasonably common need; users can use the sizeof(array) / sizeof(array[0]) trick, but having a dedicated operator to do this instead would help catch mistakes.
One edge case would be with flexible array members; should those be a constraint violation?

For now they are a constraint violation (incomplete types are rejected), with a reservation of the right to extend support to them.

I could live with that.

If there appears a way to add length information such as proposed in https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm, it would make sense to extend this operator to work with them.

We've also discussed about supporting the [[gnu::counted_by()]] attribute, but the feedback was mixed, and the consensus was to not do it, at least for now. We prefer to only support array lengths that are expressed using the type system.

Great, thank you!

Clang already supports __array_extent as a type trait, but only in C++: https://godbolt.org/z/54MKdMGd7 One thing that's interesting though is that makes it much more clear as to what "length" means for multidimensional arrays. You have to ask on a per-rank basis what the length is. Have you considered a similar design?

The usual ARRAY_SIZE() or NITEMS() macros are the prior art I based the feature on. I like it because it's simple. And it's common-enough already that I expect it to be easy to explain.

I didn't consider something like __array_extent, because it's usually as easy as:

__array_extent(decltype(foo), 4) == __lengthof__(****foo)

That is, with regular language expressions you can ask for whatever length you're interested in.

Given that the only reason to add this feature is to help users more clearly express their intent, I definitely am not a fan of playing "guess how declarators and operators relate to one another" for the feature. Multi-dimensional arrays are a fairly prolific feature of C and it seems to me that getting the array rank and extent is a reasonable thing for users to want to do, and that maps nicely to the C++ features (https://en.cppreference.com/w/cpp/types/rank and https://en.cppreference.com/w/cpp/types/extent) used to get the same information.

In terms of standardization, it's worth noting that N2529 has not been seen by WG14 and so it's unclear how the committee feels about the idea.

Several WG14 members are CCed in the GCC thread (and a few more in a discussion about the state of that paper prior to the development of the patch. Around half a dozen in total. So far they haven't complained, other than suggesting the usual pedantic wording refinements (very welcome, of course). :)

There are three (active) committee members on that thread, and I make four, but that's only a bit over 10% of the committee.

That leaves some concerns (WG14 has a habit of renaming things or altering semantics slightly), but I think they could be overcome.

That's why we've started with the keyword __lengthof__ in GCC, to make it a GNU extension, without entering into ISO C reserved words territory. I expect that the semantics won't be touched by WG14. We're prepared to accept a new name (we expect _Lengthof, and then likely lengthof).

If WG14 insists on a design that separates rank and extent, that would be a pretty major shift in semantics and it would be unfortunate for either GCC or Clang to have to carry the extension interface in that case. Before we went ahead with such a feature in Clang, we'd really need some sort of signal from WG14 on that design decision (this is part of our criteria for adding extensions: https://clang.llvm.org/get_involved.html#criteria).

One detail where we didn't have consensus is in accepting expressions without parentheses like sizeof, or requiring them. One WG14 member suggested that we start clean without the mistakes of sizeof. But so far, the implementation is like sizeof in this regard. I think they should match, and if WG14 wants to remove the parentheses from lengthof, they should start by deprecating it from sizeof. But we can still provide sizeof-like behavior in GCC as an extension if ISO C decides to disagree. Having sizeof and lengthof differ here would mean more duplication of code in the compiler, which I'd avoid.

We already broke from tradition in that regard with typeof (it accepts either a type or an expression same as sizeof but it requires the parentheses) and alignof (it only accepts a parenthesized type name in ISO C, but both Clang and GCC allow an expression operand and no parens as an extension), but my preference would be to follow sizeof if we kept this interface and require parens if we went with a rank/extent pair of operators. We could technically leave off the parens for rank when given a type operand but then it would be inconsistent between rank and extent, which doesn't seem like good design. (I also think we should probably write a paper to allow alignof unary-expression same as sizeof given that it's a commonly supported extension, but that's neither here nor there for your proposal.)
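
For readers less familiar with the operand forms being compared, here is a small sketch in standard C11. The function names are hypothetical; it only shows the three spellings sizeof accepts and the single spelling ISO C gives `_Alignof`.

```c
#include <stddef.h>

/* Returns 1 when the three spellings of sizeof agree for int [4]. */
int sizeof_spellings_agree(void)
{
    int a[4] = {0};

    size_t expr_no_parens = sizeof a;         /* unary-expression operand */
    size_t expr_parens    = sizeof(a);        /* parenthesized expression */
    size_t type_name      = sizeof(int [4]);  /* parenthesized type name  */

    return expr_no_parens == expr_parens && expr_parens == type_name;
}

/* In ISO C, _Alignof takes only a parenthesized type name. */
size_t int_alignment(void)
{
    return _Alignof(int);
}
```

Writing `_Alignof a` or `_Alignof(a)` with an expression operand is the extension mentioned above, so it is left out of the sketch.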

@alejandro-colomar
Author

If there appears a way to add length information such as proposed in https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3188.htm, it would make sense to extend this operator to work with them.
We've also discussed about supporting the [[gnu::counted_by()]] attribute, but the feedback was mixed, and the consensus was to not do it, at least for now. We prefer to only support array lengths that are expressed using the type system.

Great, thank you!

:-)

Clang already supports __array_extent as a type trait, but only in C++: https://godbolt.org/z/54MKdMGd7 One thing that's interesting though is that makes it much more clear as to what "length" means for multidimensional arrays. You have to ask on a per-rank basis what the length is. Have you considered a similar design?

The usual ARRAY_SIZE() or NITEMS() macros are the prior art I based the feature on. I like it because it's simple. And it's common-enough already that I expect it to be easy to explain.
I didn't consider something like __array_extent, because it's usually as easy as:

__array_extent(decltype(foo), 4) == __lengthof__(****foo)

That is, with regular language expressions you can ask for whatever length you're interested in.

Given that the only reason to add this feature is to help users more clearly express their intent,

Not the only one. To me, the main reason is the "future directions" note that foresees adding support for function parameters declared with array notation. I'm paving the way for it. If this keyword were to stay as just a standard ARRAY_SIZE() macro, I wouldn't be as interested. But we have to start somewhere. :)

I definitely am not a fan of playing "guess how declarators and operators relate to one another" for the feature. Multi-dimensional arrays are a fairly prolific feature of C

Yup, a very nice feature of C, indeed.

and it seems to me that getting the array rank and extent is a reasonable thing for users to want to do, and that maps nicely to the C++ features (https://en.cppreference.com/w/cpp/types/rank and https://en.cppreference.com/w/cpp/types/extent) used to get the same information.

Hmm.

In terms of standardization, it's worth noting that N2529 has not been seen by WG14 and so it's unclear how the committee feels about the idea.

Several WG14 members are CCed in the GCC thread (and a few more in a discussion about the state of that paper prior to the development of the patch. Around half a dozen in total. So far they haven't complained, other than suggesting the usual pedantic wording refinements (very welcome, of course). :)

There's three (active) committee members on that thread, and I make four, but that's only a bit over 10% of the committee.

Yup, plus two others whom I had asked about the state of N2529 before that thread.

That leaves some concerns (WG14 has a habit of renaming things or altering semantics slightly), but I think they could be overcome.

That's why we've started with the keyword __lengthof__ in GCC, to make it a GNU extension, without entering into ISO C reserved words territory. I expect that the semantics won't be touched by WG14. We're prepared to accept a new name (we expect _Lengthof, and then likely lengthof).

If WG14 insists on a design that separates rank and extent, that would be a pretty major shift in semantics and it would be unfortunate for either GCC or Clang to have to carry the extension interface in that case. Before we went ahead with such a feature in Clang, we'd really need some sort of signal from WG14 on that design decision (this is part of our criteria for adding extensions: https://clang.llvm.org/get_involved.html#criteria).

Makes sense. I've started to develop a paper for WG14, as you may have seen in your mailbox. :)

One detail where we didn't have consensus is in accepting expressions without parentheses like sizeof, or requiring them. One WG14 member suggested that we start clean without the mistakes of sizeof. But so far, the implementation is like sizeof in this regard. I think they should match, and if WG14 wants to remove the parentheses from lengthof, they should start by deprecating it from sizeof. But we can still provide sizeof-like behavior in GCC as an extension if ISO C decides to disagree. Having sizeof and lengthof differ here would mean more duplication of code in the compiler, which I'd avoid.

We already broke from tradition in that regard with typeof (it accepts either a type or an expression same as sizeof but it requires the parentheses) and alignof (it only accepts a parenthesized type name in ISO C, but both Clang and GCC allow an expression operand and no parens as an extension), but my preference would be to follow sizeof if we kept this interface and require parens if we went with a rank/extent pair of operators. We could technically leave off the parens for rank when given a type operand but then it would be inconsistent between rank and extent, which doesn't seem like good design. (I also think we should probably write a paper to allow alignof unary-expression same as sizeof given that it's a commonly supported extension, but that's neither here nor there for your proposal.)

Agree.

@alejandro-colomar
Author

Here's a link to the already submitted proposal to WG14:
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3313.pdf

@alejandro-colomar
Author

This has been merged as _Lengthof into C2y today.

The paper that was merged was https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3369.pdf with some trivial/editorial wording cosmetic changes on top of it.

@cor3ntin
Copy link
Contributor

cor3ntin commented Oct 3, 2024

@alejandro-colomar the fact that it was standardized means it can be implemented in clang without further discussions on whether we want it.
Are you interested in submitting a pull request? Otherwise someone will get to it as part of our conformance work

@cor3ntin cor3ntin added c2y and removed enhancement Improving things as opposed to bug fixing, e.g. new or missing feature labels Oct 3, 2024
@cor3ntin cor3ntin changed the title C: Add __lengthof__ operator Implement N3369 (_Lengthof) Oct 3, 2024
@alejandro-colomar
Author

alejandro-colomar commented Oct 3, 2024

> @alejandro-colomar the fact that it was standardized means it can be implemented in clang without further discussions on whether we want it. Are you interested in submitting a pull request? Otherwise someone will get to it as part of our conformance work

I would want to attempt it. :-)

I've never written any patches for Clang/LLVM, AFAIR, so I would appreciate some help on where I should look in the code. I expect it to be similar to GCC, but of course different, so some pointers would help.

If I find myself unable, I'll let you know. Thanks!

@alejandro-colomar
Author

alejandro-colomar commented Oct 4, 2024

> If I find myself unable, I'll let you know. Thanks!

@cor3ntin

I've been looking at the code, and there's too much C++ for my taste; I give up. Would someone else mind implementing it? :-)

@AaronBallman
Collaborator

No worries, we'll get around to it at some point, thanks for looking!

@alejandro-colomar
Author

alejandro-colomar commented Feb 25, 2025

> This has been merged as _Lengthof into C2y today.
>
> The paper that was merged was https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3369.pdf with some trivial/editorial wording cosmetic changes on top of it.

Hi @AaronBallman ,

Could you implement it as __builtin_countof? (or whatever variation of underscores is more appropriate; maybe with trailing __).

The survey by JeanHeyd (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3469.htm) confirms that it would be stressful to users if this were called lengthof, the name ISO C has merged.

So, to clarify, could you please implement what ISO C2y already has merged, with the same semantics, but with a vendor name derived from countof? That would force ISO to reconsider the name of the operator.

GCC has a proposed patch (by myself) for adding this operator with a name derived from countof:
https://inbox.sourceware.org/gcc-patches/cover.1731233627.git.alx@kernel.org/T/#u

Cheers,
Alex

AaronBallman added a commit to AaronBallman/llvm-project that referenced this issue Mar 26, 2025

C2y adds the _Countof operator which returns the number of elements in
an array. As with sizeof, _Countof either accepts a parenthesized type
name or an expression. Its operand must be (of) an array type. When
passed a constant-size array operand, the operator is a constant
expression which is valid for use as an integer constant expression.

Fixes llvm#102836
AaronBallman added a commit that referenced this issue Mar 27, 2025
C2y adds the `_Countof` operator which returns the number of elements in
an array. As with `sizeof`, `_Countof` either accepts a parenthesized
type name or an expression. Its operand must be (of) an array type. When
passed a constant-size array operand, the operator is a constant
expression which is valid for use as an integer constant expression.

This is being exposed as an extension in earlier C language modes, but
not in C++. C++ already has `std::extent` and `std::size` to cover these
needs, so the operator doesn't seem to get the user enough benefit to
warrant carrying this as an extension.
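
The "constant expression" part of the commit message can be sketched without C2y support by using a sizeof-division macro as a stand-in. `COUNT_OF`, `color_names`, `color_hits`, and `record_hit` are hypothetical names; with a compiler that implements the operator, `_Countof(color_names)` would be expected to fill the same role.

```c
#include <stddef.h>

/* Hypothetical stand-in for _Countof, built from the sizeof division so
   the sketch compiles without C2y support. */
#define COUNT_OF(a) (sizeof(a) / sizeof((a)[0]))

static const char *const color_names[] = { "red", "green", "blue" };

/* A constant-size array gives an integer constant expression, so the
   count can define an enumerator, feed a static assertion, or size
   another array. */
enum { COLOR_COUNT = COUNT_OF(color_names) };
_Static_assert(COLOR_COUNT == 3, "three colors expected");

static int color_hits[COUNT_OF(color_names)];

/* Records one hit for a valid color index and returns the length of
   the hit table. */
size_t record_hit(size_t color)
{
    if (color < COUNT_OF(color_hits))
        color_hits[color]++;
    return COUNT_OF(color_hits);
}
```

Adding a fourth name to `color_names` resizes `color_hits` automatically and trips the static assertion until it is updated.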

Fixes #102836
alejandro-colomar added a commit to alejandro-colomar/llvm-project that referenced this issue Mar 27, 2025
Link: <llvm#102836>
Link: <llvm#133125>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
alejandro-colomar added a commit to alejandro-colomar/llvm-project that referenced this issue Mar 28, 2025
Link: <llvm#102836>
Link: <llvm#133125>
Cc: Aaron Ballman <aaron@aaronballman.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
alejandro-colomar added a commit to alejandro-colomar/llvm-project that referenced this issue Mar 28, 2025
Link: <llvm#102836>
Link: <llvm#133125>
Cc: Aaron Ballman <aaron@aaronballman.com>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
5 participants