Clarification that text-valued variables and attributes can be Unicode strings or UTF-8 char arrays #556

Merged 8 commits on Nov 23, 2024

JonathanGregory
Contributor

See issue #141 for discussion of these changes.

Release checklist

  • [NA] Authors updated in cf-conventions.adoc? Add in two places: on line 3 and under .Additional Authors in About the authors.
  • [Y] history.adoc up to date?
  • [Y] Conformance document up to date?

@ChrisBarker-NOAA
Contributor

See my comment in #141

I understand why you think it's not necessary to specify the encoding for var-len strings, but I don't see why you think it's a bad idea to do so.

The rules are not any different for strings and char arrays, why make the language look like they are?

@JonathanGregory JonathanGregory added this to the 1.12 milestone Oct 23, 2024
@JonathanGregory JonathanGregory linked an issue Oct 23, 2024 that may be closed by this pull request
The other strings, such as "May", should be padded with trailing NULL or space characters so that every array element is filled.
If the atomic string option is chosen, each element of the variable can be assigned a string with a different length.
A text string can be stored either in a variable-length **`string`** or in a fixed-length **`char`** array.
In both cases, text strings must be represented in Unicode Normalization Form C (NFC, link:$$https://www.unicode.org/versions/Unicode16.0.0/UnicodeStandard-16.0.pdf$$[section 3.11] and link:$$https://unicode.org/reports/tr15$$[Annex 15] of the Unicode standard) and encoded according to UTF-8.
Contributor

Do we need both links? though I suppose more is better.

Looks good!

Contributor Author

More is better, I think! I liked the annex because of its explanations, but the main text gives context.
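
For readers following the quoted text, here is a minimal sketch of the rule under discussion: normalize to NFC, encode as UTF-8, and, for the fixed-length `char` option, pad shorter strings with trailing NULL bytes. It uses only the Python standard library; the helper names `to_nfc_utf8` and `pad_char_rows` are hypothetical and not part of CF or any netCDF library, and NULL padding is just one of the two padding choices the text allows (the other being spaces).

```python
import unicodedata


def to_nfc_utf8(text: str) -> bytes:
    """Normalize text to NFC, then encode it as UTF-8 (hypothetical helper)."""
    return unicodedata.normalize("NFC", text).encode("utf-8")


def pad_char_rows(strings, pad=b"\x00"):
    """Encode each string and pad it with trailing bytes to a common width.

    The width is the longest encoded string measured in bytes, which can
    exceed its length in characters for non-ASCII text.
    """
    encoded = [to_nfc_utf8(s) for s in strings]
    width = max((len(e) for e in encoded), default=0)
    return [e.ljust(width, pad) for e in encoded]


# "é" written here as "e" followed by a combining acute accent (U+0301).
decomposed = "Fe\u0301vrier"
rows = pad_char_rows(["May", decomposed, "December"])

# Reading a row back: strip the padding, then decode as UTF-8.
first = rows[0].rstrip(b"\x00").decode("utf-8")
```

With the atomic `string` option no padding is needed, but the same normalization and encoding step would still apply, which is the point made earlier in the thread that the rules do not differ between the two storage types.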

history.adoc Outdated
@@ -7,6 +7,7 @@

=== Working version (most recent first)

* {issues}141[Issue #141]: Clarification that text-valued variables and attributes can be Unicode vlen strings or UTF-8 char arrays.
Contributor

Maybe:

Clarification that text-valued variables and attributes can be vlen strings or char arrays, encoded as UTF-8.

though this is the history, so precision isn't critical.

Contributor

Looks great, thanks for all the work!

Contributor Author

Changed to "Clarification that text may be stored in variables and attributes as either vlen strings or char arrays, and must be represented in Unicode Normalization Form C and encoded according to UTF-8."

@JonathanGregory JonathanGregory merged commit 4268011 into cf-convention:main Nov 23, 2024
2 checks passed
@ChrisBarker-NOAA
Contributor

Whoo Hoo! thanks all for getting this through!

Successfully merging this pull request may close these issues.

Add support for attributes of type string
2 participants