[SUGGESTION] Literal Templates #316

Closed
msadeqhe opened this issue Apr 3, 2023 · 9 comments

@msadeqhe

msadeqhe commented Apr 3, 2023

Preface

Both variables and literals have types in C++1, so we may explicitly specify the type of variables, and the type of literals with suffixes. For example:

// Here we explicitly specify the type of a variable.
unsigned long long x = 2;

// Here we explicitly specify the type of a literal.
auto y = 2ull;

But the name that we specify for the type of a variable is different from the name that we specify for the type of a literal. For example the type of a literal with suffix ull is unsigned long long integer. What if we specify the type of both variables and literals with the same name?

Note that C++2 already treats literals and variables uniformly in some contexts. For example, in a function call:

x0 := /*smth*/.f();

// /*smth*/ may be a literal.
// Because of UFCS, f() calls a free function.
x1 := 2.f();

// /*smth*/ may be a variable.
// Because of UFCS, f() calls a free function or a member function.
x2 := x.f();

The same can be done when specifying the type of both variables and literals in C++2, if we specify the type of a literal with a template argument. I call this a Literal Template:

x0 := /*smth*/<t>;

// /*smth*/ may be a literal.
// Because of UTVS, it's a literal template.
x1 := 2<t>;

// /*smth*/ may be a variable.
// Because of UTVS, it's a variable template.
x2 := x<t>;

UTVS means Uniform Typed Value Syntax. I explain it under "Why do I suggest this change?".

Suggestion Detail

C++1 already has the following templates:

  • Class Templates, e.g. Type<float>
  • Function Templates, e.g. function<float>()
  • Variable Templates (since C++14), e.g. variable<float>

I suggest also having Literal Templates, e.g. 2.0<float>.

Literal templates are similar to variable templates, except that they have exactly one template parameter, which accepts only one of the predefined types for that kind of literal. The predefined types for each kind of literal are listed below:

// T can be only one of the following types:
// i8, ..., i64, u8, ..., u64, int, size, long, ulong, ...
// or any other integer type.
x1 := 2<T>;
y1 := 2<u8>;

// T can be only one of the following types:
// f16, ..., f128, float, double, ...
// or any other floating-point type.
x2 := 2.0<T>;
y2 := 2.0<f16>;

// T can be only one of the following types:
// c8, ..., c32, char, ...
// or any other character type.
x3 := 'x'<T>;
y3 := 'x'<c16>;

// T can be only one of the following types:
// c8, ..., c32, char, ...
// or any other character type.
x4 := "text"<T>;
y4 := "text"<c8>;

To explain the names above, as suggested in this issue:

  • f16, ..., f128 are type aliases to corresponding floatN_t types.
  • c8, ..., c32 are type aliases to corresponding charN_t types.

The template parameter of a string literal specifies its underlying character type, so if the template argument is c8 then the string literal is UTF-8.

// := u8"This text is UTF-8.";
x0 := "This text is UTF-8."<c8>;

Type Aliases are good enough to make literal templates convenient:

ull: type == ulonglong;

// OK. ull is an integer type.
x0 := 2<ull>;

fll: type == longdouble;

// ERROR! fll is not an integer type.
/**
  x1 := 2<fll>;
 */

2<fll> is not valid, because 2 only accepts a template argument of integer types.
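For comparison, the constraint described above can be roughly approximated in today's C++ with a variable template. This is only a sketch: the names int_lit and int_lit_value are hypothetical helpers invented for illustration, and the range check is an extra assumption that the proposal does not spell out.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical helper standing in for the proposed V<T> syntax on integer
// literals. Neither int_lit nor int_lit_value exists in C++ or Cpp2.
template <class T, unsigned long long V>
constexpr T int_lit_value() {
    // Mirrors the rule "2 only accepts a template argument of integer types".
    static_assert(std::is_integral_v<T>,
                  "an integer literal needs an integer type");
    // Extra assumption (not part of the proposal): reject values that do not fit.
    static_assert(V <= static_cast<unsigned long long>(std::numeric_limits<T>::max()),
                  "value does not fit in T");
    return static_cast<T>(V);
}

template <class T, unsigned long long V>
inline constexpr T int_lit = int_lit_value<T, V>();

using ull = unsigned long long;  // same alias as in the text

// Rough equivalents of 2<ull> and 2<u8>.
static_assert(int_lit<ull, 2> == 2ull);
static_assert(std::is_same_v<decltype(int_lit<std::uint8_t, 2>), const std::uint8_t>);

// int_lit<long double, 2> would fail to compile, like 2<fll> in the proposal.
```

As with the proposed syntax, a type alias such as ull is accepted wherever the underlying integer type is.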

Although it's technically possible to have user-defined literal templates, I don't recommend it, because it may be misused.

Edit 6: User-defined Literal Templates will complement the concept of Literal Templates (see this comment for explanation and the cause of it, or continue reading the following comments). For example:

operator""kilo: <T>(value: T) -> T
requires std::is_integral_v<T> || std::is_floating_point_v<T>
= value * 1'000;

main: () = {
    sum: double = 1.5kilo + 2<ull>kilo;
}

In the above example, if {N} is an integer or a floating-point literal, then both {N}kilo and {N}<T>kilo work.

Why do I suggest this change?

Because this change gives C++2 a more general language feature and makes it simpler to learn in the following ways:

1. Unification of how we use templates

Consider how we write 2.func() and x.func() to call func() regardless of whether they are literal or variable. The same thing will be possible for 3.14<float> and PI<float> to get the value in type float (without any type conversion) regardless of whether they are literal or variable. Types, functions, variables and literals will be familiar when working with types:

x0: TypeName<int> = ();
x1 := function_name<int>();
x2 := variable_name<int>;
x3 := 10<int>;

2. UTVS in generic programming

Literal templates allow literal types (also known as literal suffix or prefix) to be a template parameter (e.g. 10<T>), therefore with both variable templates and literal templates, C++2 will have Uniform Typed Value Syntax (UTVS) in generic programming. It means when only the type and the value of something is important, it doesn't matter if it was from a variable or a literal in generic programming. For example:

X: <Value: VariableOrLiteral, T: type>
type = {
    member := Value<T>;
}

x1 := X<PI, f16>;
x2 := X<3.14, f16>;

Value can be either a variable or a literal; either way, the implementation of X will work.

I need to explain that we cannot write PI as T for the variable template PI<T>, therefore as cannot be used for UTVS. Also, Value<T> for literals is different from Value as T in generic programming: Value as T tries to safely convert Value to type T, whereas Value<T> doesn't change the type of a literal, it only attaches type T to Value. For example, 10<T> doesn't convert the integer literal 10 to a floating-point type, because its template parameter is constrained to accept only integer types.

For more explanation see this comment.

3. Reducing Concept Count

Literal templates eliminate the concept of prefixes and suffixes for all types of literals, and make them look uniform, as their behaviour is naturally similar to that of variable templates.

// Here we use <f16> template argument instead of f16 suffix.
// := 3.14f16 + 1.0;
x1 := 3.14<f16> + 1.0;
y1 := PI<f16> + 1.0; // PI is a variable template.

// Here we use <c8> template argument instead of u8 prefix.
// := u8"text";
x2 := "text"<c8>;
y2 := "text"<c8>sv; // y2 is std::basic_string_view<c8>.

4. UDL Improvement

C++1 has mixed the concept of User-defined Literal suffixes (UDL suffixes) with literal types, so UDL suffixes cannot be used together with literal types (except for character and string literals, whose type is written as a prefix instead of a suffix):

// This calls operator""ms(long double) in C++2.
// This is equal to 2.0ms in C++1.
x1 := 2.0<longdouble>ms;

// This calls operator""ms(f16) in C++2.
// NO! This isn't possible in C++1.
x2 := 2.0<f16>ms;

// This calls operator""ms(double) in C++2.
// This calls operator""ms(long double) in C++1.
x3 := 2.0ms;
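The C++1 side of this comparison can be checked directly. The sketch below uses a hypothetical suffix _ms (user code must start suffixes with an underscore) whose operators simply return their argument, so the result type reveals which type the literal operator received.

```cpp
#include <type_traits>

// Hypothetical suffix for illustration; it returns its argument unchanged.
constexpr long double operator""_ms(long double v) { return v; }
constexpr unsigned long long operator""_ms(unsigned long long v) { return v; }

// A floating-point literal always reaches the operator as long double,
// and an integer literal as unsigned long long.
static_assert(std::is_same_v<decltype(2.0_ms), long double>);
static_assert(std::is_same_v<decltype(2_ms), unsigned long long>);

// Writing 2.0f_ms does not select float: the whole "f_ms" is read as a single
// user-defined suffix, so a built-in suffix cannot be combined with a UDL.
```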

I should point again that User-defined Literal Templates will complement the concept of Literal Templates (see this comment for explanation and the cause of it, or continue reading the following comments).

5. Consistent Literal Types

Built-in literal suffixes behave inconsistently for integer and floating-point literals when the value of a literal exceeds the range of the type specified by the suffix. Also, user-defined literal suffixes can replace built-in literal suffixes and prefixes (see this comment for explanation and the cause of it, or continue reading the following comments).

TL;DR. Literal template consistency in a nutshell:

ull: type == ulonglong;
c8: type == char8_t;

a: ull = 2<ull>;
b: c8 = 'x'<c8>;
c: float = 2.0<float>;
d: std::basic_string_view<c8> = "text"<c8>sv;

In the following paragraphs I'll explain in detail:

Firstly, the name we use to specify the type of variables (e.g. ulong) is different from the name we use to specify the type of literals (e.g. ul). For example:

// Here we explicitly specify the type of a variable with ulong.
x0: ulong = 2;

// Here we explicitly specify the type of a literal with ul.
x1 := 2ul;

On the other hand with literal templates, we use the same name to specify the type of variables and literals:

// Here we explicitly specify the type of a variable with ulong.
x0: ulong = 2;

// Here we explicitly specify the type of a literal with ulong.
x1 := 2<ulong>;

Secondly, to specify the type of literals in C++1, integer and floating-point literals have suffixes, but character and string literals have prefixes. For example:

// Suffix f specifies a literal with float type.
x1 := 2.0f;

// Prefix u8 specifies a literal with char8_t type.
x2 := u8'x';

On the other hand with literal templates, all of them use the existing concept of templates to achieve the same thing:

// Template argument float specifies a literal with float type.
x1 := 2.0<float>;

// Template argument c8 specifies a literal with c8 type.
x2 := 'x'<c8>;
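The suffix and prefix behaviour of C++1 described above can be verified with a few compile-time checks. This is a small sketch; has_type is a hypothetical helper written for this example, and the character checks use the u and U prefixes because the type of u8'x' changed between C++17 (char) and C++20 (char8_t).

```cpp
#include <type_traits>

// Hypothetical helper for this sketch: reports whether a value has type Expected.
template <class Expected, class Actual>
constexpr bool has_type(Actual) {
    return std::is_same_v<Expected, Actual>;
}

// Integer and floating-point literals: the type is chosen by a suffix.
static_assert(has_type<float>(2.0f));
static_assert(has_type<unsigned long>(2ul));

// Character literals: the type is chosen by a prefix.
static_assert(has_type<char16_t>(u'x'));
static_assert(has_type<char32_t>(U'x'));
```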

6. All Possible Types

Suffixes in C++1 don't support all possible types, e.g. int64_t. Fixed-width integer types are not supported by integer suffixes in C++1, as described in this question on Stack Overflow, but with literal templates all integer types are supported:

// It's not possible in C++1.
x0 := 2<i64>;

Even type aliases can be used:

il: type == long;

// OK. il is an integer type.
x0 := 2<il>;

x1 := 2<long>;

2<il> and 2<long> are exactly the same.

7. Unlocking More Versatile Syntax

By eliminating prefixes from character and string literals (also suggested in this issue), C++2 can additionally offer more versatile syntax where desired, such as function'...' or function"...". For an example, see Tagged Templates in JavaScript.

Literal template syntax is not verbose, and it's readable.

It's not verbose with the help of type aliases, because only two characters < and > are added to the syntax:

ill: type == longlong;

// x: longlong = -10;
x: ill = -10;

// y := -10ll;
y := -10<ill>;

x: ill = -10; is less verbose than x: longlong = -10;.

Also -10ll may be misread as -1011, but -10<ill> is more expressive and easier to read.

Edits

  1. I've added another case under "Why do I suggest this change?".
  2. I've explained why Value<T> is different from Value as T in generic programming.
  3. I've explained why literal template syntax isn't verbose.
  4. I've arranged "Why do I suggest this change?".
  5. I've fixed grammar and spelling.
  6. I've corrected a wrong statement about User-defined Literal Templates (see this comment).
  7. I've added links to comments for more explanations.
@AbhinavK00

I think this change would be great. With type names such as u16 and i64, literal suffixes such as 16l and 18ll don't make much sense. This would be more intuitive.

@msadeqhe
Author

msadeqhe commented Apr 3, 2023

Thanks for your feedback. Also type aliases help to have smaller names for other types. For example:

/**
i8, ..., i64 for fixed-size integer types: intN_t
 */
i0: type == int;
i_: type == int;
ii: type == int;
il: type == long;
ill: type == longlong;

/**
u8, ..., u64 for fixed-size unsigned integer types: uintN_t
 */
u0: type == uint;
u_: type == uint;
uu: type == uint;
ul: type == ulong;
ull: type == ulonglong;

/**
f16, ..., f128 for fixed-size floating-point types: floatN_t
 */
f0: type == float;
f_: type == float;
ff: type == float; // single
fl: type == double; // long float
fll: type == longdouble; // long long float

/**
c8, ..., c32 for fixed-size character types: charN_t
 */
c0: type == char;
c_: type == char;
cc: type == char;

a := 2<ill>; // 2ll
b := 2<ull>; // 2ull
c := 2.0<fll>; // 2.0l

msadeqhe referenced this issue Apr 9, 2023
A base type subobject is declared the same as any member object, but with the name `this`. See test case example in this commit, pasted below for convenience

Note that `:` continues to be pronounced "is a"... e.g., `f: () -> int` is pronounced as "f is a function returning int," `v: vector<int>` as "v is a vector<int>", `this: Shape` as "this object is a Shape."

This is consistent because in Cpp2 I'm pursuing the experiment of making inheritance be always `public` and therefore inheritance should always model IS-A substitutability.
- Why not protected inheritance? Because the only theoretical use of that I know of is to enable IS-A substitutability only for code within the same class hierarchy, and I know of no actual use of that feature.
- Why not private inheritance? Because it is anti-recommended, nearly always that should be a private data member instead, and the main reason to use a private base is to make that data member outlive a base class object... which is already covered in Cpp2 because all subobjects (bases and members) can be declared in any order. In this initial checkin, if there are any non-base subobjects (ordinary data members) that are declared before base subobjects (base classes), as an implementation detail those non-bases are emitted as private base classes via a helper wrapper that prevents their interface (functions) from leaking into the type.

If a type has base classes, I don't generate assignment from construction. This is because polymorphic types usually don't want (nonvirtual) assignment, and so base classes generally don't provide assignment so a generated memberwise assignment wouldn't work anyway. However, as of this commit, I don't currently ban a user explicitly writing their own assignment operator if they want to, which will be fine as long as `Base::operator=` memberwise calls are available.

Abstract virtual functions are simply virtual functions that have no initializer. (All other functions in Cpp2 must have an initializer.)

Also corrected the namespace alias representation to use id-expression

### Test case (pasted for convenience)

```
Human: type = {
    operator=: (out this) = {}
    speak:     (virtual this);
}

N: namespace = {
    Machine: type = {
        operator=: (out this, id: std::string) = {}
        work:      (virtual this);
    }
}

Cyborg: type = {
    name: std::string;
    this: Human = ();
    this: N::Machine;

    operator=: (out this, n: std::string) = {
        name = n;
        N::Machine = "Acme Corp. engineer tech";
        std::cout << "(name)$ checks in for the day's shift\n";
    }

    speak: (override this) =
        std::cout << "(name)$ cracks a few jokes with a coworker\n";

    work: (override this) =
        std::cout << "(name)$ carries some half-tonne crates of Fe2O3 to cold storage\n";

    operator=: (move this) =
        std::cout << "Tired but satisfied after another successful day, (name)$ checks out and goes home to their family\n";
}

make_speak: ( h: Human ) = {
    std::cout << "-> [vcall: make_speak] ";
    h.speak();
}

do_work: ( m: N::Machine ) = {
    std::cout << "-> [vcall: do_work] ";
    m.work();
}

main: () = {
    c: Cyborg = "Parsnip";
    c.make_speak();
    c.do_work();
}
```

Output:

```
Parsnip checks in for the day's shift
-> [vcall: make_speak] Parsnip cracks a few jokes with a coworker
-> [vcall: do_work] Parsnip carries some half-tonne crates of Fe2O3 to cold storage
Tired but satisfied after another successful day, Parsnip checks out and goes home to their family
```
@msadeqhe
Author

msadeqhe commented Apr 10, 2023

4. UDL Improvement

C++1 has mixed the concept of User-defined Literal suffixes (UDL suffixes) with literal types, so UDL suffixes cannot be used together with literal types (except for character and string literals, whose type is written as a prefix instead of a suffix): ...

I think that's the reason why User-defined Literal Templates are not allowed in C++1, as described in this question on Stack Overflow. Therefore C++2 can additionally support User-defined Literal Templates (with a template parameter that is restricted to integer, floating-point and character types, just like Literal Templates):

weight: <T>type = {
    grams: T = 0;
    operator=: (out this, value: T) = {
        grams = value;
    }
}

operator""kg: <T>(value: T) -> weight<T> = {
    return weight<T>(value * 1'000);
}

main: () = {
    a: = 4kg;
    // a is weight<int>
    // a.grams == 4'000

    b: = 2.0kg;
    // b is weight<double>
    // b.grams == 2'000.0

    c: = 10<ull>kg;
    // c is weight<ull>
    // c.grams == 10'000<ull>

    d: = 1.5<float>kg;
    // d is weight<float>
    // d.grams == 1'500.0<float>
}

It seems that Literal Templates simplify the C++2 language specification by reusing an existing powerful concept (templates) and eliminating additional rules such as literal suffixes and prefixes, and the restrictions on UDL suffixes (e.g. the above example).

I now think that Literal Templates and User-defined Literal Templates complement each other. Therefore I should correct my previous wrong statement, which was:

Although it's technically possible to have user-defined literal templates, I don't recommend it, because it may be misused.

User-defined Literal Templates will complement the concept of Literal Templates. I'll edit my original suggestion to correct that wrong statement.

Another example but simpler:

operator""kilo: <T>(value: T) -> T
requires std::is_integral_v<T> || std::is_floating_point_v<T>
= value * 1'000;

main: () = {
    a: = 4kilo;
    // a is int
    // a == 4'000

    b: = 2.0kilo;
    // b is double
    // b == 2'000.0

    c: = 10<ull>kilo;
    // c is ull
    // c == 10'000<ull>

    d: = 1.5<float>kilo;
    // d is float
    // d == 1'500.0<float>
}

In the above example, kilo simply means * 1000 regardless of the type.
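C++1 cannot express kilo as a literal operator that keeps the literal's own type, but the type-preserving behaviour can be sketched with an ordinary constrained function template. The function name kilo is borrowed from the example; calling it as kilo(x) instead of as a suffix is the assumption made here.

```cpp
#include <type_traits>

// A function-call stand-in for the proposed operator""kilo: accepts any
// integer or floating-point type and returns the same type.
template <class T,
          std::enable_if_t<std::is_integral_v<T> || std::is_floating_point_v<T>, int> = 0>
constexpr T kilo(T value) {
    return static_cast<T>(value * 1000);
}

// The result type follows the argument type, as in the comments above.
static_assert(std::is_same_v<decltype(kilo(4)), int>);
static_assert(std::is_same_v<decltype(kilo(2.0)), double>);
static_assert(std::is_same_v<decltype(kilo(10ull)), unsigned long long>);
static_assert(std::is_same_v<decltype(kilo(1.5f)), float>);
```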

@msadeqhe
Author

msadeqhe commented Apr 10, 2023

User-defined literal suffixes can replace built-in literal suffixes and prefixes.

Yes, that's it. We can write user-defined literal suffixes with the same name as built-in suffixes. Therefore we don't lose any feature if C++2 supports literal templates:

operator""ull: (value: ulonglong) -> ulonglong = value;
operator""u8: (value: std::u8string) -> std::u8string = value;

main: () = {
    // This calls operator""ull.
    x: = 1'000ull;

    // This calls operator""u8.
    // It's similar to u8"text".
    a: = "text"u8;

    // This calls operator""ull too,
    // because only one overload is available for operator""ull.
    y: = 10<int>ull;
}

I wouldn't want to do this myself (as in the above example); I just wanted to show that literal templates are a more general language feature than literal suffixes and prefixes.

These are differences between those user-defined literal suffixes and built-in suffixes:

  • for built-in integer suffixes: the type of the integer literal is the first type in which the value fits, e.g. 5'000'000'000u asks for unsigned int (suffix u), but its value (5'000'000'000) does not fit in a 32-bit unsigned int, therefore its type will be unsigned long int (on platforms where long is 64 bits wide).
  • for built-in floating-point suffixes: the type of the floating-point literal is the type specified by the suffix, so the value must not exceed the range of that type, e.g. 4.1234567e38f is a float (suffix f), but its value (4.1234567e38) only fits in double, therefore it's a compiler error (MSVC) or the value will be inf (GCC and Clang).
  • for user-defined suffixes: the type of the literal is the type declared for the parameter of operator"".

Therefore I think built-in suffixes are inconsistent for integer and floating-point literals when the constant of literal exceeds the type (which is specified with the suffix).
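The integer half of this inconsistency can be observed at compile time. The sketch below assumes nothing beyond the standard library; narrow_uint is a name introduced here so that the check still holds on a platform where unsigned int is wider than 32 bits. The out-of-range float literal is deliberately left out, since its behaviour differs between compilers as noted above.

```cpp
#include <limits>
#include <type_traits>

// True when unsigned int is too narrow to hold 5'000'000'000 (needs 33 bits).
constexpr bool narrow_uint = std::numeric_limits<unsigned int>::digits < 33;

// A small value with suffix u really is unsigned int...
static_assert(std::is_same_v<decltype(2u), unsigned int>);

// ...but a value that does not fit silently gets a wider unsigned type.
static_assert(!narrow_uint ||
              !std::is_same_v<decltype(5'000'000'000u), unsigned int>);

// With suffix f the type is float whenever the literal is accepted at all.
static_assert(std::is_same_v<decltype(3.0e38f), float>);
```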

@JohelEGP
Contributor

JohelEGP commented Apr 10, 2023

operator""u8: (value: std::u8string) -> std::u8string = value;

main: () = {
    // This calls operator""u8.
    // It's similar to u8"text".
    a: = "text"u8;
}

It seems like all C++1's character types in the language could've been a library feature.

@msadeqhe
Author

msadeqhe commented Apr 10, 2023

Yes, you're right, and likewise the built-in literal prefixes and suffixes could've been a library feature. This reduces the concept count, because the parameter type of operator"" is no longer restricted to certain types: literal types (which are literal templates in my suggestion) are not mixed with user-defined literal suffixes.

@msadeqhe
Author

msadeqhe commented Apr 11, 2023

7. Unlocking More Versatile Syntax

By eliminating prefixes from character and string literals (also suggested in this issue), C++2 can additionally offer more versatile syntax where desired, such as function'...' or function"...". For an example, see Tagged Templates in JavaScript.

To support this use case, I suggest changing the syntax of user-defined literal suffixes (if C++2 supports them someday) from operator""suffix to something else, such as operator_suffix or operator suffix or ...suffix (or whatever syntax you think works better). With this change, operator overloading for the two operators '' and "" becomes available under the corresponding names operator'' and operator"". For example:

StringJoiner: type = {
    inner_text: std::string = "";

    // This is just an example.
    operator"": (inout this, value: std::string) -> StringJoiner = {
        if !inner_text.empty() {
            inner_text += "\"";
        }
        inner_text += value;
        return this;
    }

    operator(): (this) -> std::string = inner_text;
}

main: () = {
    joiner: StringJoiner = ();

    // This respectively calls:
    // joiner = joiner.operator""("text");
    // joiner()
    x0: = joiner"text"();
    // x0 == "text"

    // This respectively calls:
    // joiner = joiner.operator""("I remember ");
    // joiner = joiner.operator""("my name");
    // joiner = joiner.operator""(".");
    // joiner()
    x1: = joiner"I remember ""my name""."();
    // x1 == "I remember \"my name\"."
}

In the above example, StringJoiner overloads operator"". It joins chained "" operators and inserts " between them, so that it feels as if two double quotes "" are transpiled to one double quote ". It also returns inner_text with the help of operator(). Operators '' and "" are like operators () and []: their return type may be anything, but they have only one parameter.

I'm not suggesting that C++2 should have operators '' and "", and the above example isn't important (it's just one possible usage). But I do suggest changing the syntax of user-defined literal suffixes from operator""suffix to something else, because someday C++2 may support operator overloading for '' and "". Unfortunately, operator""suffix resembles a string literal, and it ignores the fact that integer and floating-point literals can be user-defined too.

@msadeqhe
Author

msadeqhe commented Apr 13, 2023

2. UTVS in generic programming

Literal templates allow literal types (also known as literal suffix or prefix) to be a template parameter (e.g. 10<T>), therefore with both variable templates and literal templates, C++2 will have Uniform Typed Value Syntax (UTVS) in generic programming. It means when only the type and the value of something is important, it doesn't matter if it was from a variable or a literal in generic programming. ...

I think I should explain this further and mention which solutions are available now. Unfortunately, C++1 doesn't support Variable Template Template Parameters, as described in this question and this other question on Stack Overflow. Hopefully this proposal and this other proposal (aka Universal Template Parameters) to the C++1 standard will add support for it.

If C++2 supports Variable Template Template Parameters as described in those proposals, literal templates can be supported in the same way with it too. I don't know what's the syntax of template template parameters in C++2, therefore the following example is just my thought on how it can look like in C++2 based on that <Value: _> already means a template value in C++2, and both variable templates and literal templates are values:

// Value is a variable template.
// T is a template template parameter.
X: <Value: <T: type> _> type = {
    mem: = Value<T>;
}

// PI is a variable template.
PI: <T: type> T = 3.1415926535897932385<longdouble> as T;

So no new syntax is required in C++2 to support literal templates in generic programming, because literal templates are syntactically similar to variable templates, and we can use the X type in the following ways:

// X is instantiated from a variable template with template argument float
x0: = X<PI<float>>;
// x0.mem is float

// X is instantiated from a literal template with template argument float
x1: = X<3.14<float>>;
// x1.mem is float

The other point is that it'll be possible to extract constant 3.14 from its literal template 3.14<float> (but we cannot do this with literal suffixes such as 3.14f in C++1) in template programming.
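Until something like Variable Template Template Parameters exists, C++1 code usually works around the limitation by wrapping the variable template in a class template. The following is a minimal sketch of that workaround, not of the proposed feature; pi_holder is a hypothetical wrapper name, and X takes the wrapper plus a type instead of a variable template.

```cpp
#include <type_traits>

// A variable template, like PI in the text.
template <class T>
inline constexpr T pi = static_cast<T>(3.1415926535897932385L);

// Hypothetical wrapper: a class template can be passed as a template template
// argument, while a variable template cannot.
template <class T>
struct pi_holder {
    static constexpr T value = pi<T>;
};

// X receives the wrapper and the type separately and combines them itself.
template <template <class> class Value, class T>
struct X {
    static constexpr T mem = Value<T>::value;
};

// X<pi_holder, float>::mem is a float holding pi<float>.
static_assert(std::is_same_v<decltype(X<pi_holder, float>::mem), const float>);
static_assert(X<pi_holder, float>::mem == pi<float>);
```

A literal would need its own wrapper type in this scheme, which is the kind of boilerplate that the uniform Value<T> syntax is meant to avoid.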

@msadeqhe
Author

Thanks... well, I have to close it, because I don't think @hsutter would accept it.

This suggestion is a generalized simpler alternative to it.
