spec: types
xushiwei committed Jul 24, 2024
1 parent 1b71ff6 commit 1a8a169
Showing 2 changed files with 302 additions and 12 deletions.
137 changes: 125 additions & 12 deletions doc/mini-spec.md
@@ -5,16 +5,16 @@ Go+ STEM Education Minimum Specification

Comments serve as program documentation. There are three forms:

* _Line comments_ start with the character sequence `//` and stop at the end of the line.
* _Line comments_ start with the character sequence `#` and stop at the end of the line.
* _General comments_ start with the character sequence `/*` and stop with the first subsequent character sequence `*/`.

A _general comment_ containing no newlines acts like a space. Any other comment acts like a newline.

```
# this is a line comment
// this is another line comment
/* this is a general comment */
```

## Literals
@@ -87,7 +87,7 @@ For readability, an underscore character _ may appear after a base prefix or bet

### Imaginary literals

An imaginary literal represents the imaginary part of a [complex constant](). It consists of an [integer](#integer-literals) or [floating-point](#floating-point-literals) literal followed by the lowercase letter _i_. The value of an imaginary literal is the value of the respective integer or floating-point literal multiplied by the imaginary unit _i_.

For backward compatibility, an imaginary literal's integer part consisting entirely of decimal digits (and possibly underscores) is considered a decimal integer, even if it starts with a leading 0.

@@ -110,6 +110,19 @@ For backward compatibility, an imaginary literal's integer part consisting entir

TODO

```sh
1r # bigint 1
2/3r # bigrat 2/3
```

### Boolean literals

TODO

```go
true
false
```

### Rune literals

@@ -194,18 +207,118 @@ These examples all represent the same string:

If the source code represents a character as two code points, such as a combining form involving an accent and a letter, the result will be an error if placed in a rune literal (it is not a single code point), and will appear as two code points if placed in a string literal.

## Types

### Boolean types

A _boolean type_ represents the set of Boolean truth values denoted by the predeclared constants `true` and `false`. The predeclared boolean type is `bool`; it is a defined type.

```go
bool
```

### Numeric types

An _integer_, _floating-point_, _complex_, or _rational_ type represents the set of integer, floating-point, complex, or rational values, respectively. They are collectively called _numeric types_. The predeclared architecture-independent numeric types are:

```go
uint8 // the set of all unsigned 8-bit integers (0 to 255)
uint16 // the set of all unsigned 16-bit integers (0 to 65535)
uint32 // the set of all unsigned 32-bit integers (0 to 4294967295)
uint64 // the set of all unsigned 64-bit integers (0 to 18446744073709551615)

int8 // the set of all signed 8-bit integers (-128 to 127)
int16 // the set of all signed 16-bit integers (-32768 to 32767)
int32 // the set of all signed 32-bit integers (-2147483648 to 2147483647)
int64 // the set of all signed 64-bit integers (-9223372036854775808 to 9223372036854775807)

float32 // the set of all IEEE-754 32-bit floating-point numbers
float64 // the set of all IEEE-754 64-bit floating-point numbers

complex64 // the set of all complex numbers with float32 real and imaginary parts
complex128 // the set of all complex numbers with float64 real and imaginary parts

byte // alias for uint8
rune // alias for int32
```

The value of an _n_-bit integer is n bits wide and represented using [two's complement arithmetic](https://en.wikipedia.org/wiki/Two's_complement).
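
A minimal sketch of what two's complement storage implies at run time, written as plain Go on the assumption that Go+ accepts it unchanged. The helper names `wrapInt8` and `reinterpret` are made up for this illustration.

```go
package main

import "fmt"

// wrapInt8 adds one to an int8. At 127 the result wraps around to -128,
// because the value occupies exactly 8 bits in two's complement.
func wrapInt8(x int8) int8 {
	return x + 1
}

// reinterpret converts a uint8 to int8. The 8 bits are kept as they are;
// only their interpretation changes.
func reinterpret(b uint8) int8 {
	return int8(b)
}

func main() {
	fmt.Println(wrapInt8(127))     // -128
	fmt.Println(reinterpret(0xFF)) // -1
	fmt.Println(reinterpret(0x80)) // -128
}
```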

There is also a set of predeclared integer types with implementation-specific sizes:

```go
uint // either 32 or 64 bits
int // same size as uint
uintptr // an unsigned integer large enough to store the uninterpreted bits of a pointer value
```

To avoid portability issues all numeric types are defined types and thus distinct except _byte_, which is an [alias]() for _uint8_, and _rune_, which is an alias for _int32_. Explicit conversions are required when different numeric types are mixed in an expression or assignment. For instance, _int32_ and _int_ are not the same type even though they may have the same size on a particular architecture.
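
The conversion rule can be sketched as follows, in plain Go that Go+ is assumed to accept. The functions `sum` and `firstByte` are hypothetical names used only here.

```go
package main

import "fmt"

// sum adds an int32 to an int. The explicit conversion is required:
// writing a + b directly is rejected at compile time as a type mismatch.
func sum(a int32, b int) int {
	return int(a) + b
}

// firstByte returns a uint8 element as a byte with no conversion,
// because byte is an alias for uint8 and not a distinct type.
func firstByte(bs []uint8) byte {
	return bs[0]
}

func main() {
	fmt.Println(sum(40, 2))               // 42
	fmt.Println(firstByte([]byte("Go+"))) // 71, the byte value of 'G'
}
```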

TODO:

```go
bigint // TODO
bigrat // TODO
```

### String types

A _string type_ represents the set of string values. A string value is a (possibly empty) sequence of bytes. The number of bytes is called the length of the string and is never negative. Strings are immutable: once created, it is impossible to change the contents of a string. The predeclared string type is `string`; it is a defined type.

```go
string
```

The length of a string `s` can be discovered using the built-in function [len](). The length is a compile-time constant if the string is a constant. A string's bytes can be accessed by integer [indices]() `0` through `len(s)-1`. It is illegal to take the address of such an element; if `s[i]` is the i'th byte of a string, `&s[i]` is invalid.
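
As an illustration of these rules, the sketch below (plain Go, assumed valid in Go+) measures a constant string, indexes it by byte, and produces a modified copy instead of changing the string in place. The names `greeting`, `byteAt`, and `upperFirst` are invented for the example.

```go
package main

import "fmt"

// greeting holds 5 characters but 6 bytes: U+00E9 takes two bytes in UTF-8.
const greeting = "h\u00e9llo"

// byteAt returns the i'th byte of s. The expression s[i] is a value;
// &s[i] would not compile.
func byteAt(s string, i int) byte {
	return s[i]
}

// upperFirst returns a new string with the first ASCII letter upper-cased.
// It works on a byte copy because the original string cannot be modified.
func upperFirst(s string) string {
	if len(s) == 0 || s[0] < 'a' || s[0] > 'z' {
		return s
	}
	b := []byte(s) // copies the bytes of s
	b[0] -= 'a' - 'A'
	return string(b)
}

func main() {
	const n = len(greeting) // a constant, because greeting is a constant
	fmt.Println(n)                    // 6
	fmt.Println(byteAt(greeting, 0))  // 104, the byte value of 'h'
	fmt.Println(upperFirst("gopher")) // Gopher
}
```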


### Array types

An array is a numbered sequence of elements of a single type, called the element type. The number of elements is called the length of the array and is never negative.

```go
[N]T
```

The length is part of the array's type; it must evaluate to a non-negative [constant]() representable by a value of type int. The length of array `a` can be discovered using the built-in function [len](). The elements can be addressed by integer [indices]() `0` through `len(a)-1`. Array types are always one-dimensional but may be composed to form multi-dimensional types.

```go
[32]byte
[1000]*float64
[3][5]int
[2][2][2]float64 // same as [2]([2]([2]float64))
```
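
A short sketch of how the length takes part in the type, in plain Go that Go+ is assumed to accept. The function `sumGrid` is a hypothetical helper.

```go
package main

import "fmt"

// sumGrid accepts exactly a [3][5]int. Passing a [5][3]int or a [3][4]int
// would not compile, because the lengths are part of the array type.
func sumGrid(g [3][5]int) int {
	total := 0
	for i := 0; i < len(g); i++ { // len(g) is 3
		for j := 0; j < len(g[i]); j++ { // len(g[i]) is 5
			total += g[i][j]
		}
	}
	return total
}

func main() {
	var g [3][5]int // all elements start at zero
	g[0][0] = 1
	g[2][4] = 7
	fmt.Println(len(g), len(g[0]), sumGrid(g)) // 3 5 8
}
```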

### Slice types

A _slice_ is a descriptor for a contiguous segment of an underlying array and provides access to a numbered sequence of elements from that array. A slice type denotes the set of all slices of arrays of its element type. The number of elements is called the length of the slice and is never negative. The value of an uninitialized slice is `nil`.

```go
[]T
```

The length of a slice `s` can be discovered by the built-in function [len](); unlike with arrays it may change during execution. The elements can be addressed by integer [indices]() `0` through `len(s)-1`. The slice index of a given element may be less than the index of the same element in the underlying array.

A slice, once initialized, is always associated with an underlying array that holds its elements. A slice therefore shares storage with its array and with other slices of the same array; by contrast, distinct arrays always represent distinct storage.

The array underlying a slice may extend past the end of the slice. The capacity is a measure of that extent: it is the sum of the length of the slice and the length of the array beyond the slice; a slice of length up to that capacity can be created by [slicing]() a new one from the original slice. The capacity of a slice `a` can be discovered using the built-in function `cap(a)`.

A new, initialized slice value for a given element type `T` may be made using the built-in function [make](), which takes a slice type and parameters specifying the length and optionally the capacity. A slice created with make always allocates a new, hidden array to which the returned slice value refers. That is, executing

```go
make([]T, length, capacity)
```

produces the same slice as allocating an array and [slicing]() it, so these two expressions are equivalent:

```go
make([]int, 50, 100)
new([100]int)[0:50]
```
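
To check the claimed equivalence, the following sketch (plain Go, assumed valid in Go+) builds a slice both ways and compares length and capacity. The function names `viaMake` and `viaNew` are chosen for this example only.

```go
package main

import "fmt"

// viaMake lets make allocate the hidden array.
func viaMake() []int { return make([]int, 50, 100) }

// viaNew allocates the array explicitly and slices the pointer to it.
func viaNew() []int { return new([100]int)[0:50] }

func main() {
	a, b := viaMake(), viaNew()
	fmt.Println(len(a), cap(a)) // 50 100
	fmt.Println(len(b), cap(b)) // 50 100
}
```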

Like arrays, slices are always one-dimensional but may be composed to construct higher-dimensional objects. With arrays of arrays, the inner arrays are, by construction, always the same length; however with slices of slices (or arrays of slices), the inner lengths may vary dynamically. Moreover, the inner slices must be initialized individually.

### Pointer types

TODO
177 changes: 177 additions & 0 deletions doc/spec.md
@@ -0,0 +1,177 @@
Go+ Specification
=====

## Comments

See [Comments](mini-spec.md#comments).


## Literals

See [Literals](mini-spec.md#literals).

### String literals

#### C style string literals

TODO

```go
c"Hello, world!\n"
```

#### Python string literals

TODO

```go
py"Hello, world!\n"
```

## Types

### Boolean types

See [Boolean types](mini-spec.md#boolean-types).

### Numeric types

See [Numeric types](mini-spec.md#numeric-types).

### String types

See [String types](mini-spec.md#string-types).

#### C style string types

```go
import "c"

*c.Char // alias for *int8
```

#### Python string types

```go
import "py"

*py.Object // TODO: *py.String?
```

### Array types

See [Array types](mini-spec.md#array-types).

An array type T may not have an element of type T, or of a type containing T as a component, directly or indirectly, if those containing types are only array or struct types.

```go
// invalid array types
type (
T1 [10]T1 // element type of T1 is T1
T2 [10]struct{ f T2 } // T2 contains T2 as component of a struct
T3 [10]T4 // T3 contains T3 as component of a struct in T4
T4 struct{ f T3 } // T4 contains T4 as component of array T3 in a struct
)

// valid array types
type (
T5 [10]*T5 // T5 contains T5 as component of a pointer
T6 [10]func() T6 // T6 contains T6 as component of a function type
T7 [10]struct{ f []T7 } // T7 contains T7 as component of a slice in a struct
)
```

### Slice types

See [Slice types](mini-spec.md#slice-types).

### Struct types

A _struct_ is a sequence of named elements, called fields, each of which has a name and a type. Field names may be specified explicitly (IdentifierList) or implicitly (EmbeddedField). Within a struct, non-[blank]() field names must be [unique]().

```go
// An empty struct.
struct {}

// A struct with 6 fields.
struct {
x, y int
u float32
_ float32 // padding
A *[]int
F func()
}
```

A field declared with a type but no explicit field name is called an _embedded field_. An embedded field must be specified as a type name `T` or as a pointer to a non-interface type name `*T`, and `T` itself may not be a pointer type. The unqualified type name acts as the field name.

```go
// A struct with four embedded fields of types T1, *T2, P.T3 and *P.T4
struct {
T1 // field name is T1
*T2 // field name is T2
P.T3 // field name is T3
*P.T4 // field name is T4
x, y int // field names are x and y
}
```

The following declaration is illegal because field names must be unique in a struct type:

```go
struct {
T // conflicts with embedded field *T and *P.T
*T // conflicts with embedded field T and *P.T
*P.T // conflicts with embedded field T and *T
}
```

A field or [method]() f of an embedded field in a struct x is called _promoted_ if x.f is a legal [selector]() that denotes that field or method f.

Promoted fields act like ordinary fields of a struct except that they cannot be used as field names in [composite literals]() of the struct.
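
A small sketch of promotion, in plain Go that Go+ is assumed to accept. The types `Point` and `Circle` and the function `newCircle` are invented for this illustration.

```go
package main

import "fmt"

type Point struct{ X, Y int }

func (p Point) Sum() int { return p.X + p.Y }

type Circle struct {
	Point  // embedded field; its field name is Point
	Radius int
}

// newCircle has to name the embedded field in the composite literal.
// Circle{X: x, Y: y, Radius: r} would not compile, because the promoted
// fields X and Y cannot be used as field names there.
func newCircle(x, y, r int) Circle {
	return Circle{Point: Point{X: x, Y: y}, Radius: r}
}

func main() {
	c := newCircle(1, 2, 3)
	// c.X is the promoted field c.Point.X; c.Sum is the promoted method.
	fmt.Println(c.X, c.Point.X, c.Sum()) // 1 1 3
}
```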

Given a struct type S and a [named type]() T, promoted methods are included in the method set of the struct as follows:

* If `S` contains an embedded field `T`, the [method sets]() of `S` and `*S` both include promoted methods with receiver `T`. The method set of `*S` also includes promoted methods with receiver `*T`.
* If `S` contains an embedded field `*T`, the method sets of `S` and `*S` both include promoted methods with receiver `T` or `*T`.
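
The two rules above can be checked with interface assignments, which succeed only when the method set is large enough. This is a sketch in plain Go (assumed valid in Go+); every type name in it, and the function `demo`, is hypothetical.

```go
package main

import "fmt"

type T struct{ n int }

func (t T) Get() int   { return t.n } // receiver T
func (t *T) Set(v int) { t.n = v }    // receiver *T

type ByValue struct{ T }    // embeds T
type ByPointer struct{ *T } // embeds *T

type Getter interface{ Get() int }
type GetSetter interface {
	Get() int
	Set(int)
}

var (
	_ Getter    = ByValue{}   // method set of ByValue includes Get
	_ GetSetter = &ByValue{}  // method set of *ByValue includes Get and Set
	_ GetSetter = ByPointer{} // method set of ByPointer includes Get and Set
	// _ GetSetter = ByValue{} would not compile: Set has receiver *T.
)

// demo calls the promoted methods through both kinds of embedding.
func demo() (int, int) {
	v := ByValue{}
	v.Set(5) // allowed because v is addressable
	p := ByPointer{T: &T{}}
	p.Set(7)
	return v.Get(), p.Get()
}

func main() {
	fmt.Println(demo()) // 5 7
}
```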

A field declaration may be followed by an optional string literal tag, which becomes an attribute for all the fields in the corresponding field declaration. An empty tag string is equivalent to an absent tag. The tags are made visible through a [reflection interface]() and take part in [type identity]() for structs but are otherwise ignored.

```go
struct {
x, y float64 "" // an empty tag string is like an absent tag
name string "any string is permitted as a tag"
_ [4]byte "ceci n'est pas un champ de structure"
}

// A struct corresponding to a TimeStamp protocol buffer.
// The tag strings define the protocol buffer field numbers;
// they follow the convention outlined by the reflect package.
struct {
microsec uint64 `protobuf:"1"`
serverIP6 uint64 `protobuf:"2"`
}
```
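
As a sketch of how tags become visible, the example below reads them with Go's `reflect` package. It assumes that Go+ exposes the same package; the type name `TimeStamp` and the helpers `tagOf` and `tagsAffectIdentity` are made up for the illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// TimeStamp mirrors the protocol buffer struct above, given a name
// so that it can be passed around.
type TimeStamp struct {
	microsec  uint64 `protobuf:"1"`
	serverIP6 uint64 `protobuf:"2"`
}

// tagOf returns the value stored under key in the tag of the named field.
// It returns "" if the field does not exist. v must be a struct value;
// reflect panics for other kinds.
func tagOf(v interface{}, field, key string) string {
	f, ok := reflect.TypeOf(v).FieldByName(field)
	if !ok {
		return ""
	}
	return f.Tag.Get(key)
}

// tagsAffectIdentity reports whether two struct types that differ only
// in a tag are distinct types.
func tagsAffectIdentity() bool {
	a := reflect.TypeOf(struct{ x int "a" }{})
	b := reflect.TypeOf(struct{ x int "b" }{})
	return a != b
}

func main() {
	fmt.Println(tagOf(TimeStamp{}, "microsec", "protobuf"))  // 1
	fmt.Println(tagOf(TimeStamp{}, "serverIP6", "protobuf")) // 2
	fmt.Println(tagsAffectIdentity())                        // true
}
```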

A struct type T may not contain a field of type T, or of a type containing T as a component, directly or indirectly, if those containing types are only array or struct types.

```go
// invalid struct types
type (
T1 struct{ T1 } // T1 contains a field of T1
T2 struct{ f [10]T2 } // T2 contains T2 as component of an array
T3 struct{ T4 } // T3 contains T3 as component of an array in struct T4
T4 struct{ f [10]T3 } // T4 contains T4 as component of struct T3 in an array
)

// valid struct types
type (
T5 struct{ f *T5 } // T5 contains T5 as component of a pointer
T6 struct{ f func() T6 } // T6 contains T6 as component of a function type
T7 struct{ f [10][]T7 } // T7 contains T7 as component of a slice in an array
)
```
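
The pointer case is what makes ordinary recursive data structures possible. A minimal linked-list sketch in plain Go (assumed valid in Go+), with `Node` and `length` as hypothetical names:

```go
package main

import "fmt"

// Node refers to itself only through a pointer, so the type is valid:
// the size of Node does not depend on the size of Node.
type Node struct {
	Value int
	Next  *Node
}

// length counts the nodes reachable from n; a nil pointer ends the list.
func length(n *Node) int {
	count := 0
	for ; n != nil; n = n.Next {
		count++
	}
	return count
}

func main() {
	list := &Node{Value: 1, Next: &Node{Value: 2, Next: &Node{Value: 3}}}
	fmt.Println(length(list)) // 3
}
```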

### Pointer types

See [Pointer types](mini-spec.md#pointer-types).
