-
-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Array iterator classes private to hide implementation from client #193
base: main
Are you sure you want to change the base?
Conversation
If this is discussed at a sync that I'm not present at before I have a chance to comment more fully, I would like it noted that I am not in favor of this change. I want to note this note as I probably won't have time to put together a full write up within the next few days. |
Apart from the naming suggestion I gave above, I am in favor of this change 👍 |
A bit of context for this RFC. I'm presently learning Pony, reading all documentation and particularly the one of the stdlib. Looking at the
I couldn't find the logic but the fact that some results concern interfaces and some concrete classes. When designing classes, exposing internals is often considered bad design. But it could be related to some Pony constraints that I don't know about yet. So I decided to ask the question on Zulip regarding the class A few days passed and I was still thinking about that API decision, and because I think that there must be some rational in any decision, I thought that it could be because of a technical constraint imposed by the language. So I decided to test it by myself to see if there was something in Pony that prevented it. I did the changes in the Here is the link to my branch of Pony compiler that implement the changes: https://github.com/pmetras/ponyc/tree/rfc-array Following Sean T Allen's recommendation that the discussion should continue with a RFC, that's what I did. One of Sean's argument is that naming things is important: "I personally find naming the thing that implements the interface to be a good thing. I have a thing I can refer to. A name that can be reused." Thinking about this argument, from a type or object perspectives is not important, in my opinion, when client code call All my attempts to find a rational failed. I must say I don't understand Sean's argument. I agree that naming thing for public types are important, but I don't see why you need to name things that clients of your class don't need to know (private implementation classes). That's against information hiding/encapsulation. I don't think that Pony syntax or stdlib promotes not enforcing this design principle. |
And I forgot to say that |
I don't have a strong opinion on this RFC, but if My only concern about this RFC is adding the overhead of virtual calls for these iterators. I assume iterating over arrays is very common, so these specific classes are used a lot, even if users don't name them. |
@ergl - I had the same concern about virtual calls, though I think in practice most iterator-using use cases are direct enough (e.g. passing straight into a This is a hypothesis we should probably test prior to accepting the RFC. I can do some testing of that if it looks like the RFC has enough support to be accepted. |
@SeanTAllen mentioned in the sync call that he'd like to see one of two things happen with this RFC:
He wants to hone in on one of those two directions so that he can better focus his energy in stating his objections, either to the general case (:one:) or to the I think this a reasonable request, and agree. |
My initial concern was only to get an answer to the question in the title of this RFP that I could understand. It could have been "It's that way because I decided that it is", without any later discussion... But I was intrigued by the fact that the design decision is against the Sanctus encapsulation/information hiding principle of computer science. And I decided to open this RFC to get a definite answer. Now, if I had to select between direction 1️⃣ or 2️⃣, I think that applying a design principle only on part of a library would be a bad decision by creating an exception (the less exceptions, the better the language). The remaining paths are either correct everything ( 2️⃣) or do nothing (cancel and close the RFC). Personally, I think that 2️⃣ is the best move, but I don't know yet what could be the impact on existing code or other consequences of such change. But I'm ready to investigate the impacts and update the text of the RFC in that direction if others think it's worth my effort. |
@pmetras - I'm confused by your use of numbering - can you please check the numbers in the above comment I gave to make sure they match the use of your numbers - I think you meant 1️⃣ when you said 2️⃣ - can you confirm? We discussed this on today's sync call, and Sean suggested you go ahead and try to update the RFC to make your arguments in favor of the approach for either 1️⃣ or 2️⃣ - that's not a guarantee the RFC will be accepted but it's to ensure he (and others) fully understand the points being argued so that he (and others) can voice any relevant objections clearly. |
Effectively, I mixed the numbers. I meant [one], that is applying the design change to all stdlib. |
This is very hard to read in GitHub due to lack of line breaks. Can you add hard line breaks around column 80 or so to make it easier to read in the GitHub web UI? |
I think this RFC doesn't consider the cost of breaking user code enough. Even if I was to be convinced of the value of this change, I think we need to give more weight to considering that this will break people's code and a good justification beyond "design principles" needs to be given for breaking code. Regardless of how I feel about this specific RFC, I would make the same point for any RFC. I would like to see the breaking change taking more seriously and consideration given to justifying that breaking change. RFCs set precedent for future changes and the breaking change in here feels somewhat cavalierly presented. |
text/0000-array-private-classes
Outdated
Note: To remain consistent with `Array` behaviour, functions `keys` and `pairs` should return a `RewindableIterator` too but we limited the API change to minimum as we did not understood why it was not already the case. | ||
|
||
```pony | ||
interface RewindableIterator[A] is Iterator[A] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no need for is Iterator[A]
as far as I can tell here. Is this an attempt to make sure in the future that RewindableIterator doesn't diverge at all from Iterator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true for interface
but I just kept with the coding style of the stdlib where numerous classes (i.e. Array
, String
, List
, Range
, Reverse
, HashSet
, etc.) explicitly name the interface they implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in the case of those, it made sure that if the Iterator interface was changed that all those concrete classes would fail compilation, is that the intention here? If yes, I think there is a larger discussion about the idea of duplicating the methods from Iterator in RewindableIterator and then using is
to enforce the Iterator interface.
It feels odd. It might be worth it to avoid having an interface that is only rewind
and the returning the two interfaces, but, it does feel odd and I think the reasoning is worth discussion.
I know you had some conversation around this with @jemc. I think you should discuss with him some supporting justification for this. Is this a one off? Is this establishing a pattern that we want to use in the standard library? Is it a pattern that is already being used in the standard library?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My suggestion would be to keep the is Iterator[A]
part and remove the explicit next
and has_next
definitions.
For an interface
, that roughly reads as:
- "To implement
RewindableIterator[A]
, you need to implement everythingIterator[A]
requires (which doesn't need to be repeated here), but you additionally need to implement thisrewind
method."
Or from the perspective of the interface user:
- "If you get a
RewindableIterator[A]
, it can do everythingIterator[A]
can do (which doesn't need to be repeated here), but you additionally can use it torewind
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As noted below by me, I prefer the alternative design of not having RewindableIterator[A]
overlap with Iterator[A]
and instead have Iterator[A] & Rewindable[A]
define RewindableIterator[A]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand what is finally desired here. What would be the RewindableIterator
(or CollectionIterator
) definition so that I can include it in the RFC text?
@pmetras thank you for the linebreaks. its much easier to read now. |
text/0000-array-private-classes
Outdated
- The interface `RewindableIterator` is added to create rewindable iterators | ||
(can be re-start from first value). | ||
|
||
This change remains compatible with the existing code base but for client code | ||
that is directly using the classes `*Keys`, `*Values` and `*Pairs`. A search on | ||
Github shows that the impact is very limited. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem to be motivation and should be worked into design as these are notes and statements about a some specific design points.
@pmetras This part:
would not work for iterators that define a streaming sequence. You could, potentially, define an iterator over an infinite stream of events that can't possibly be rewinded, unless you kept all previous elements in memory. We don't want to constraint future uses of the |
@ergl , perhaps I did not express correctly my concern. The important word is for collection only. I have previously suggested naming it If interface Iterator[A}
fun has_next(): Bool
fun next(): A ?
fun rewind() ? => error
"""By default, iterators can't be reset""" I don't think that such interface is the best solution, but at least it prevents having two types with one that is difficult to limit to collections usage. |
@pmetras This RFC should not modify Iterator[A]. No one is suggesting anything that would result in a change to |
I've added the rendered RFC at https://pmetras.github.io/rfcs/ and updated the text. Now, I've just to find some time to see how to implement it and measure the impact. |
@rhagenson , please explain how to have it rendered and how to get a rendered URL. I wasn't able to find it automatically rendered as I wanted to be able to read it and I would have save a few hours today. So I used pandoc and followed Github documentation... |
@pmetras Thank you! Will you update your first comment with the link to the GitHub rendered Markdown? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have left a large number of comments. My overall impression is still that I am in favor of the simpler return types, but there are issues that need to be addressed with respect to performance impact and ensuring the motivation for these changes are not setting a precedent for hiding all implementation details based on a notion of strict information hiding.
* `Vec` | ||
* persistent `Vec` | ||
* `Set` | ||
* `Itertools` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
itertools
seems out of place to me here. I see no internals of itertools
being exposed. Iter
is the only public type in itertools
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed this below. There are references to ArrayValues
within itertools
which would need to change. Having itertools
in this list is still out of place though as this list is for "other collection classes that expose internals too" which itertools
is neither "[an]other collection class" nor does it "expose internals too".
To quote Antoine de Saint-Exupéry: | ||
|
||
> Perfection is achieved, not when there is nothing more to add, but when there | ||
> is nothing left to take away. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To quote from the Pony Philosophy:
Performance
Runtime speed is more important than everything except correctness. If performance must be sacrificed for correctness, try to come up with a new way to do things. The faster the program can get stuff done, the better. This is more important than anything except a correct result.
Simplicity
Simplicity can be sacrificed for performance. It is more important for the interface to be simple than the implementation. The faster the programmer can get stuff done, the better. It’s ok to make things a bit harder on the programmer to improve performance, but it’s more important to make things easier on the programmer than it is to make things easier on the language/runtime.
If making this change to abstract types has a performance impact than it is not inline with our driving philosophy which places performance at a higher priority than simplicity.
Currently the API makes a guarantee that the return value of each of these functions | ||
is a specific concrete class with a specific public name. That particular guarantee | ||
does not actually serve any practical needs of users (except, arguably, the virtual | ||
calls caveat that was mentioned on [Github discussion](https://github.com/ponylang/rfcs/pull/193#issuecomment-1009733064)). If it is not helpful to users in practice, | ||
it can be removed. And if it can be removed, then it should be removed, in the | ||
name of taking another step toward stabilizing the Pony standard library's API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This linked issue comment is in line with my previous review comment concerning performance taking priority over simplicity in the Pony Philosophy. I would like to see some evidence that the compiler as it is today devirtualize the calls and as such we get both performance (no worse than today) and simplicity (improved by this RFC).
It'll always be easier to get more specific with these return types than more | ||
general. So we should default to offering the most general return type that is | ||
still useful to users, and ratchet down to more specific subtypes as needed (as | ||
tangible use cases for more specific return types are discovered). Ratcheting down | ||
will always be possible without a breaking change, but ratcheting up will be a | ||
breaking change each time. It's best to make such "ratchet up" breaking changes | ||
like this RFC before we reach Pony 1.0.0 as part of a general effort to make Pony | ||
and its standard library stable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not agree with this as motivation. It is always easier to start with many concrete implementations/returns which are then used to derive a generic implementation/return and trying to be generic before concrete presents more room for instability.
|
||
# Detailed design | ||
|
||
Iterating functions in collections `keys`, `values` and `pairs` are changed to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In your example they return a CollectionIterator
, not an Iterator
.
|
||
# Alternatives | ||
|
||
1. Stay as is. Continue the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we continue the conversation of keeping this as is? If no change is needed then no conversation is needed? Are you suggesting other changes which should be pulled into this RFC from Zulip?
| a. Simpler stdlib API | --- | --- | + | +++ | +++ | | ||
| b. Simpler function signatures | --- | --- | + | +++ | +++ | | ||
| c. Stay compatible with existing stdlib | +++ | +++ | + | +++ | + | | ||
| d. Evolutive | --- | + | +++ | +++ | + | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes for the sake of changes is not evolution -- there must be positive or negative selection which lead microevolution into macroevolution. Put another way, if all the suggested changes have negative effects than it is still "evolution" to do nothing as we do not bring those negative effects forward with us.
Side note: I am a career "full stack biologist" who has done work ranging from cancer to conservation to COVID-19 -- Pony is the project that I volunteer my time with when not working on the "evolving nature" of biological problems.
| c. Stay compatible with existing stdlib | +++ | +++ | + | +++ | + | | ||
| d. Evolutive | --- | + | +++ | +++ | + | | ||
| e. No impact on compiler | +++ | +++ | - | +++ | - | | ||
| f. No impact on performance | +++ | +++ | +++ | +++ | +++ | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you benchmarked these changes to ensure no performance impact?
| e. No impact on compiler | +++ | +++ | - | +++ | - | | ||
| f. No impact on performance | +++ | +++ | +++ | +++ | +++ | | ||
| g. Limit stdlib pollution with interfaces | +++ | +++ | + | + | +++ | | ||
| g. Limit stdlib pollution with classes | --- | --- | +++ | ++ | +++ | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is "g" repeated?
# Unresolved questions | ||
|
||
- Must analyze how `Range` and `Reverse` would be impacted if defined as | ||
`CollectionIterator`. Particularly, impact on existing client code. | ||
- Possible candidates to be analyzed: `StringBytes`, `StringRunes`, `ByteSeqIter` | ||
and probably others will be found in stdlib code. | ||
- Understand why the intersection type proposal is not considered as it seems to | ||
provide a better flexibility and is easier to understand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these are questions.
Following information hiding design principle, the builtin classes of collection | ||
data structures must not be made visible through iterator functions like `ArrayKeys`, | ||
`ArrayValues` and `ArrayPairs`. These classes can be made private as they are | ||
only used as return types for `Array` functions `keys`, `values` and `pairs`. | ||
The return values for these functions are changed to the more general interface | ||
`Iterator`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without "Following information hiding design principle"
being removed. I can't support this RFC. There's no "design principle" that I want to codify into Pony via RFC as a precedent for making changes.
This design principle is applied to the other collection classes that expose | ||
internals too like: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, no, I am not in favor of anything in here that I am in favor of based on an abstract design principle. I can't support this language being included.
I've repeated numerous times that I am against setting precedent based on a design principle. I didn't read the entire new version of this RFC. I read up to the 3rd mention of this being done because of a design principle and stopped. I'm against this RFC and at this point, I'm not going to invest more time in it. I can't support the motivation given, do not support it being used to set precedent going forward and have stated that more than once. I have many other Pony related things to do and so I am voicing my opposition to this RFC. I would welcome an RFC that meets the reasoning stated here: #193 (comment). As is, it seems clear to me this RFC is not going to be written taking that motivation into account. |
I won't update this text anymore. It seems my English vocabulary and wording are not adequate to describe the objective of the RFC. Anybody who wants to support this proposal can take over and update this text. |
@pmetras Thank you for the work you have put into this RFC. I am in support of the core of the RFC so I am still hoping to see it through to the end. However, I am already working on another RFC (#200) which I expect will draw most of my attention. Is there anybody else in the Pony community with the bandwidth needed to take over this RFC? |
Proposal to make
Array
iterator classesArrayKeys
,ArrayValues
andArrayPairs
private. InsteadArray
functionkeys
,values
andpairs
will return generalIterator
interface like other stdlib classes that implement theses functions. Hiding the implementation of these iterators follow good design principles.A new interface
Rewindable
has been added for when an iterator can be rewinded.These changes have been implemented in my repository of the Pony compiler and all tests pass.
EDIT: Rendered RFC: https://github.com/pmetras/rfcs/blob/rfc-array/text/0000-array-private-classes.md