Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

First crack at new List API, intended to replace "array-as-vec" operations on arrays #13030

Closed
363 changes: 363 additions & 0 deletions modules/standard/Lists.chpl
Original file line number Diff line number Diff line change
@@ -0,0 +1,363 @@
/*
* Copyright 2004-2019 Cray Inc.
* Other additional copyright holders may be indicated within.
*
* The entirety of this work is licensed under the Apache License,
* Version 2.0 (the "License"); you may not use this file except
* in compliance with the License.
*
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/*
(Word smithing module text below).

---

This module contains the implementation of a list datatype, intended to
replace the use of arrays to perform "vector-like" operations.

A List is a lightweight container intended to mimic Python's list datatype,
with fast O(log n)/O(1) (whatever we end up on) indexing and an API close
in spirit to its Python counterpart.

The highly parallel nature of Chapel means that great care should be taken
when performing operations which may trigger a move of List elements.
Inserts and removals into the middle of a List are example operations which
may trigger a move, thus invalidating all references to elements of the
List held by tasks before the move. Appending an element to the end of a
List will never cause a move.

The following operations may trigger a move:
- insert
- remove
- reverse
- sort
- pop
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

---

(End word smithing module text).

I know that @mppf in particular is very interested in limiting the number of
operations that move/displace List elements.

This pertains to the general issue of invalidating references to (and also
iterators over) List elements.

There is also the separate, more general issue of parallel safety. I'd
happily appreciate any/all recommendations for the "Chapeltastic" way of
making this List type parallel safe.

---

I believe it is the intention of this new List datatype to replace all
"array-as-vec" operations on arrays. That means the following procedures
would no longer be supported (deprecated?, or removed?) on arrays:

- proc push_back(in val: this.eltType)
- proc push_back(vals: [])
- proc pop_back()
- proc push_front(in val: this.eltType)
- proc push_front(vals: [])
- proc pop_front()
- proc insert(pos: this.idxType, in val: this.eltType)
- proc insert(pos: this.idxType, vals: [])
- proc remove(pos: this.idxType)
- proc remove(pos: this.idxType, count: this.idxType)
- proc remove(pos: range(this.idxType, stridable = false))
- proc clear()

All of the methods given above change an array's domain (exactly what we
are trying to avoid, and the impetus for the List type).

*/
module Lists {

// Building blocks used to store "chunks" of List elements.
pragma "no doc"
class ListBlock {
type _etype;
var data: _ddata(_etype) = nil;
}

/*
A container intended to replace the use of arrays to perform "vector-like"
operations.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

Is not currently parallel safe.

The following operations may trigger a move:
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
- insert
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to link to the definitions of these functions elsewhere in the documentation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if I would repeat the information in the module documentation, but I think it is appropriate here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be sure, the module documentation is in no way final at this point, and it contains a lot of the ramblings and musings that I put in the OP. I hope to have it sorted out after we get the API and semantics of List finalized.

But yes, that is a really good reminder, thanks!

- remove
- reverse
- sort
- pop

A move will invalidate any references to List elements held by tasks, as
well as iterators to List elements produced by tasks before the move
occurred.
*/
class List {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It must be a record if you want to overload = or supply a copy initializer


pragma "no doc"
var _size = 0;
pragma "no doc"
var _etype;

//
// Michael suggested that we move from a LL of blocks to an array, which
// would give us constant speed indexing at the cost of more complicated
// indexing logic.
//
// For now, I can start with Vass's logic and then switch over to a O(1)
// indexing scheme when everything else is working.
//
pragma "no doc"
var _head, _tail: unmanaged ListBlock(_etype) = nil;

/*
Initializes a new List containing elements of the given type.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

:arg etype: The type of the elements of this List.
*/
proc init(type etype) {
_etype = etype;
}

/*
Initializes a new List as a copy of another List object.

dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
:arg other: The List object to initialize from.
*/
proc init=(other: List) {
this._etype = other._etype;
}

proc deinit() {}
lydia-duncan marked this conversation as resolved.
Show resolved Hide resolved

/*
Add an element to the end of this List.

.. note::

Appending an element to a List will never invalidate references to
elements in the List.

:arg x: An element to append.
*/
proc append(x: this._etype) {}
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

/*
Extend this List by appending all the elements from another List.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

:arg other: A List, the elements of which are appended to this List.
*/
proc extend(other: List(this._etype)) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to get away with using that method wrapper here as well. If not, we could try using the generic version of List and having checks in the body of extend to ensure the appropriate behavior is maintained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is true, might that be the more appropriate solution here?

Copy link
Member

@lydia-duncan lydia-duncan May 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, I'd want to look at the generated documentation with the method call before making a decision

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on that, waiting on the filesystem...


/*
Extend this List by appending all the elements from an array.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

:arg other: An array, the elements of which are appended to this List.
*/
proc extend(other: this._etype[?d]) {}

/*
Insert an element at a given position in this List, shifting all elements
following that index one to the right. Note that `a.insert(a.size, x)`
is equivalent to `a.append(x)`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Append or extend?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to link to the functions you are mentioning, both here and later

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!


dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
:arg i: The index of the element at which to insert.
:arg x: The element to insert.

:throws IllegalArgumentError: If the given index is out of bounds.
*/
proc insert(i: int, x: this._etype) throws {}
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

/*
Remove the first item from this List whose value is equal to _x_.

:arg x: The element to remove.

:throws IllegalArgumentError: If there is no such item.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably use a different Error

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a ..note about it invalidating references

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Python API throws a ValueError, which is consistent with our use of IllegalArgumentError. Do you have an idea for a different error type?

*/
proc remove(x: this._etype) throws {}

/*
Remove the item at the given position in this List, and return it. If no
index is specified, remove and return the last item in this List.

dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
:arg i: The index of the element to remove. Defaults to the last item
in this List, or `a.remove(a.size - 1)`.

:return: The item removed.

:throws IllegalArgumentError: If the given index is out of bounds.
*/
proc pop(i: int=this._size - 1): this._etype throws {
return nil;
}

/*
Clear the contents of this List.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
*/
proc clear() {}

/*
Return a zero-based index into this List of the first item whose value
is equal to _x_.

:arg x: An element to search for.

:return: The index of the element to search for.

:throws IllegalArgumentError: If the given element cannot be found.
*/
proc indexOf(x: _etype, start: int=0, end: int=_size): int throws {
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
return 0;
}

/*
Return the number of times _x_ appears in this List.

:arg x: An element to count.

:return: The number of times a given element is found in this List.
:rtype: `int`
*/
proc count(x: this._etype): int {
return 0;
}

/*
Sort the items of this List in place. The parameters of this method are
holdovers from Python. This method could more closely integrate with
Chapel sorting APIs.

:arg key: Unused in current API.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
:arg reverse: True if this List should be sorted in reverse order.

*/
proc sort(key: this._etype=nil, reverse: bool=false) {}

/*
Reverse the elements of this List in place.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
*/
proc reverse() {}

/*
Index this List via subscript. Returns a reference to the element at a
given index in this List.

:arg i: The index of the element to access.

:return: An element from this List.

:throws IllegalArgumentError: If the given index is out of bounds.
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we documented this/these/writeThis functions in other places?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like builtins would be documented in an obvious part of the Chapel Docs (I know we do here https://chapel-lang.org/docs/primers/specialMethods.html, but haven't checked the manual itself).

proc this(i: int) ref throws {
return nil;
}

/*
Produce a serial iterator over the elements of this List.

:yields: A reference to one of the elements contained in this List.
*/
iter these() ref {
yield _size;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have parallel iterators eventually too, right? Note that parallel iteration can still work even if the list doesn't use a lock.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the plan, but I have to learn more about parallel iterators first.


/*
Write the contents of this List to a given channel.

:arg ch: A channel to write to.
*/
proc writeThis(ch: channel) {}

/*
Convert the contents of this List to string form. Could be replaced by
an overload of the cast operation?

:return: The contents of this List, in a form suitable for printing.
:rtype: `string`
*/
proc toString(): string {
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
return "";
}

/*
Return the number of elements in this List.

:return: The number of elements in this List.
:rtype: `int`
*/
proc size {
return _size;
}

/*
Returns `true` if this List contains zero elements.

:return: `true` if this List is empty.
:rtype: `bool`
*/
proc isEmpty(): bool {
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
return true;
}

/*
Returns a new DefaultRectangular array containing the elements of this
List.

:return: A new DefaultRectangular array.
*/
proc toArray(): this._etype[] {
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved
return nil;
}
}

proc =(a: List(?t), b: List(t)) {}

/*
Returns `true` if the contents of two Lists are the same.

:arg a: A List to compare.
:arg b: A List to compare.

:return: `true` if the contents of two Lists are the same.
:rtype: `bool`
*/
proc ==(a: List(?t), b: List(t)): bool {
return true;
}

/*
An important question here: do we want the semantics of addition to mirror
Chapel arrays, or Python lists?

Given Chapel semantics, `+` would be a "vectorlike" operation, performing
scalar addition against all elements of this List, returning a new List
as a result.

Given Python semantics, `+` would be a shorthand for concatenating two
Lists together, returning a new List as a result.
dlongnecke-cray marked this conversation as resolved.
Show resolved Hide resolved

:arg a: A List to add.
:arg b: A List to add.

:return: A new List whose elements are the concatenation of two Lists.
*/
proc +(a: List(?t), b: List(t)): List(t) { return nil; }


} // End module Lists!