Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lex] Reorganize contents to follow grammar and phases of translation #7193

Closed
wants to merge 0 commits into from

Conversation

AlisdairM
Copy link
Contributor

@AlisdairM AlisdairM commented Jul 30, 2024

This PR purely moves existing words around, and does not create any new content. It would be the precursor to a larger change set that might integrate [cpp] into lex, or move it adjactent, and similarly move [modules] adjactent.

This may be more significant than we want to land in C++26, but I offer it while time is still available, and as inspiration for the C++29 reorg otherwise.

Note in particular that it does a much better job at retaining the floating tables within the relevant text, rather than floating onto the next page and invading a different subclause.

The proposed subclause ordering is now:

  • 5 Lexical convensions
    • 5.1 Separate translation
    • 5.2 Phases of translation
    • 5.3 Characters
      • 5.3.1 Character sets
      • 5.3.2 Universal character names
    • 5.4 Comments
    • 5.5 Preprocessing tokens
    • 5.6 Header names
    • 5.7 Preprocessing numbers
    • 5.8 Operators and punctuators
    • 5.9 Alternative tokens
    • 5.10 Tokens
    • 5.11 Identifiers
    • 5.12 Keywords
    • 5.13 Literals
      • 5.13.1 Kinds of literals
      • 5.13.2 ...

A bigger change, similar to introducing 5.3 Characters, would be to introduce two new 2nd level sections to group "Preprocessor tokenization" and "Translation unit tokenization":

  • 5 Lexical convensions
    • 5.1 Separate translation
    • 5.2 Phases of translation
    • 5.3 Characters
      • 5.3.1 Character sets
      • 5.3.2 Universal character names
    • 5.4 Comments
    • 5.5 Preprocessor tokenization
      • 5.5.1 Preprocessing tokens
      • 5.5.2 Header names
      • 5.5.3 Preprocessing numbers
      • 5.5.4 Operators and punctuators
      • 5.5.5 Alternative tokens
    • 5.6 Translation unit tokenization
      • 5.6.1 Tokens
      • 5.6.2 Identifiers
      • 5.6.3 Keywords
      • 5.6.4 Literals
        • 5.6.4.1 Kinds of literals
        • 5.6.4.2 ...

This structure would allow for C++29 to tackle some of the more awkward dual-purpose parts of the grammar, such as string and character literals being preprocessor tokens, suggesting they would be extracted into 5.5, and recognizing that just as pp-tokens and tokens are different, we might want to introduce pp-identifier for the preprocessor to distinguish from identifier in the rest of the language.

I have deferred pushing the version with the extra levels unless there is strong interest for C++26, as at a quick glance it appears to be a more radical restructuring than it really is.

@AlisdairM
Copy link
Contributor Author

Is this kind of internal reorganization of a clause still in scope for C++26?

If not, please flag with the C++29 milestone.
If it is in scope, I will work with editors to get it into shape to land.

If only parts are in scope (I am thinking specifically of moving [lex.comment]) I can break them out into separate PRs.

@AlisdairM
Copy link
Contributor Author

Also, needs a rebase!

@wg21bot wg21bot added the needs rebase The pull request needs a git rebase to resolve merge conflicts. label Oct 17, 2024
@jensmaurer
Copy link
Member

jensmaurer commented Oct 17, 2024

I think shuffling around stuff inside [lex] is still possible for C++26 if we want to do that. Moving [lex.comment] earlier seems like a no-brainer; we can do that right away if we wanted to. For the rest: @AlisdairM , could you please show the new section order in a comment here, e.g. as a screenshot of the table of contents or so? It's hard to reconstruct that from the github diff view.

And I think I want some CWG input (or at least awareness) for the new order.

@tkoeppe
Copy link
Contributor

tkoeppe commented Oct 17, 2024

I think shuffling around stuff inside [lex] is still possible for C++26 if we want to do that. Moving [lex.comment] earlier seems like a no-brainer; we can do that right away if we wanted to.

Let's land that first then and appraise the remainder of the proposal afterwards.

@AlisdairM
Copy link
Contributor Author

See #7315 for moving [lex.comment], I will remove that part before rebasing and pushing this PR again, and presenting the new clause ordering more clearly. per Jens comment above.

@AlisdairM
Copy link
Contributor Author

This branch has now been rebased so can lose the "needs rebase" tag.

@AlisdairM
Copy link
Contributor Author

I have updated the initial comment with the intended section outline, and an alternative that I prefer that simple adds another level of grouping, with no additional text.

@jensmaurer jensmaurer removed the needs rebase The pull request needs a git rebase to resolve merge conflicts. label Oct 17, 2024
Copy link
Member

@jensmaurer jensmaurer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, trusting the "nothing but text reshuffling" presentation, I think this is an improvement.

The additional subclause nesting suggested in the intro text (but not actually realized in the pull request) seems questionable without also splitting identifier into pp-identifier.

@AlisdairM
Copy link
Contributor Author

Trust, but verify!
I tried to repeat the re-organization to verify that there were no intervening commits to the cut and pasted text, and discovered that I had also moved a few paragraphs around. My attempt to push the simple fix accidentally force-pushed a copy of main (as I forgot to commit changes) and that closed this PR. I am going to open a new PR that really is just a cut and paste for the section ordering and nothing more.

@AlisdairM
Copy link
Contributor Author

The clean version of this branch, without discussion of the optional deeper nesting, is rebased and ready for review/landing at #7316

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants