Remove switches in Geometry type dispatch #399
Comments
A thought/idea:
Benefits:
Drawbacks:
I don't have a thorough understanding of the codebase, so there may be other drawbacks/benefits that I haven't considered. Would be interested in people's thoughts on this proposal. |
Warning: big wall of text coming 😅 Thanks for the detailed suggestion! I agree with most of the benefits that you outlined. In particular, your proposal solves the following nicely:
This is a really good point, and I think it goes further than elegance. I've seen people make the mistake (and I've made it myself) where This is a nice advantage too:
This simplifies the library API, and anything that simplifies the API (or makes it easier to use) is a definite win. The other advantages are nice too, although I think they're really secondary advantages. Things that have to do with making internal parts of the library nicer are less important compared to making the API itself better. E.g. things like eliminating the switch statements and the panics. For the drawbacks you outlined:
Agreed, this isn't ideal, but I don't think it's a show stopper.
This is a pretty cool idea actually... It's not something that I'd considered before. It solves the problem where the type name (such as "MultiPoint") can differ between
This is true, but I don't think that should stop us. On the plus side, most of the changes would be relatively mechanical, so would be unlikely to be tricky. There are a few other major disadvantages to using a There are some comments in that PR that describe why we went with that way of doing things, but I'll recap here since I don't think I did a great job of outlining the reasons back then. I'll order the reasons from what I consider to be the most important to the least important:
I'm fairly hesitant to move forward with the proposal due to the disadvantages outlined above, even though I do agree that some of the advantages that the proposal would bring are quite valuable. In particular, I really like your idea about the |
Thanks for the warning ... and ditto 😅 The responses that follow are fairly abstract and on reflection, hard to understand, so in an effort to help explain what I mean, I've thrown together a quick runnable proof of concept to hopefully "illustrate" my responses: https://play.golang.org/p/bnbR5Bk0T-D
IIUC, the problem we're facing with converting However, must we cast to concrete types? Could we instead cast to a I'm mindful returning an interface breaks the "accept interfaces, return structs" principle, but I would ask the question "why". The principle only applies in cases where we're returning a single concrete type; in our case, it could return many implementations (as before "For example, if your function actually can return multiple types then obviously return an interface."
Are there legitimate reasons to allow for this? If not, we can ensure that
This is an excellent counter-argument to using an interface that I wasn't aware was a requirement. It's a pretty cool feature actually, to dynamically unmarshal a generic geometry JSON field to its concrete underlying geometry. I think we could solve this by having a concrete
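A minimal sketch of how a concrete wrapper type could keep dynamic JSON unmarshalling working while the geometries sit behind an interface. Every name here (`Geometry`, `AnyGeometry`, `Feature`, `decodeFeature`) is hypothetical and chosen for illustration only; none of it is taken from the library or from the proposal's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Geometry is a hypothetical interface over all geometry types.
type Geometry interface {
	Type() string
}

// Point and LineString are simplified stand-ins for concrete geometries.
type Point struct{ Coords [2]float64 }
type LineString struct{ Coords [][2]float64 }

func (Point) Type() string      { return "Point" }
func (LineString) Type() string { return "LineString" }

// AnyGeometry is a hypothetical concrete wrapper. encoding/json cannot
// unmarshal into a bare interface with methods, but it can call
// UnmarshalJSON on a concrete struct that holds the interface.
type AnyGeometry struct {
	Geometry
}

// UnmarshalJSON inspects the GeoJSON "type" field and decodes the
// coordinates into the matching concrete type.
func (a *AnyGeometry) UnmarshalJSON(data []byte) error {
	var raw struct {
		Type        string          `json:"type"`
		Coordinates json.RawMessage `json:"coordinates"`
	}
	if err := json.Unmarshal(data, &raw); err != nil {
		return err
	}
	switch raw.Type {
	case "Point":
		var p Point
		if err := json.Unmarshal(raw.Coordinates, &p.Coords); err != nil {
			return err
		}
		a.Geometry = p
	case "LineString":
		var ls LineString
		if err := json.Unmarshal(raw.Coordinates, &ls.Coords); err != nil {
			return err
		}
		a.Geometry = ls
	default:
		return fmt.Errorf("unsupported geometry type %q", raw.Type)
	}
	return nil
}

// Feature shows the wrapper used as a struct field.
type Feature struct {
	Name string      `json:"name"`
	Geom AnyGeometry `json:"geom"`
}

// decodeFeature is a small helper for the usage example.
func decodeFeature(s string) (Feature, error) {
	var f Feature
	err := json.Unmarshal([]byte(s), &f)
	return f, err
}

func main() {
	f, err := decodeFeature(`{"name":"home","geom":{"type":"Point","coordinates":[1,2]}}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %s %v\n", f.Name, f.Geom.Type(), f.Geom.Geometry)
}
```

The wrapper still needs one switch on the type name, but it lives in a single place, and callers can type-assert `f.Geom.Geometry` when they need the concrete geometry.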
Yes, I agree with this, and why the In an effort to understand the reason behind this proverb, I went to the source itself :) https://www.youtube.com/watch?v=PAAkCSZUG1c&t=317s My take-away from Rob Pike's talk is that the smaller the interface is, the more useful and re-usable it becomes. A question in my mind is, is this a requirement for us? Do we want the An extension to this is that a strong abstraction, to me, means all concrete types implementing that interface have a reason to implement it, and are not just forced to return a sentinel or zero value because it's not relevant. For example, if the From a cursory look at the methods defined in Having said that, I could be wrong and perhaps it makes sense to break these methods out into more specialised sub-interfaces or to remove methods altogether... I don't have enough domain knowledge right now to understand if this is necessary or not. |
Yes, your understanding is correct.
That's a good point. For the most part, geometric algorithms operate on concrete geometry types via their exposed methods and don't access their internals (even though the algorithms are in the same package as the concrete type definitions themselves). So it would be feasible from a technical perspective to do everything via interfaces corresponding to each concrete implementation.
To clarify, simplefeatures itself as a library would never return anything other than
There's no legitimate reason to allow it, and it's a good idea to prevent it. Sealed interfaces are a good solution to that problem.
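For readers unfamiliar with the term, here is a minimal sketch of a sealed interface in Go. The names (`Geometry`, `sealed`, `describe`) are hypothetical and not taken from the library. The idea is that an unexported method in the interface's method set stops types declared in other packages from implementing it directly.

```go
package main

import "fmt"

// Geometry is sealed: the unexported sealed() method cannot be declared
// by types in other packages, so they cannot implement Geometry directly.
type Geometry interface {
	IsEmpty() bool
	sealed()
}

// Point and MultiPoint are simplified stand-ins for concrete geometries.
type Point struct{ X, Y float64 }
type MultiPoint struct{ Pts []Point }

func (Point) IsEmpty() bool        { return false }
func (Point) sealed()              {}
func (m MultiPoint) IsEmpty() bool { return len(m.Pts) == 0 }
func (MultiPoint) sealed()         {}

// describe can rely on the set of implementations being closed, so the
// default branch is only a safety net.
func describe(g Geometry) string {
	switch g.(type) {
	case Point:
		return "Point"
	case MultiPoint:
		return "MultiPoint"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(Point{1, 2}), describe(MultiPoint{}))
}
```

One caveat: a type in another package can still satisfy the interface by embedding a `Geometry` value in a struct, because the embedded value's methods are promoted. The seal discourages foreign implementations rather than making them strictly impossible.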
Ahh, yes that's very similar to the original solution we had here. The concrete geometry type used for unmarshalling was called
I think I agree with you on that point. We don't want a generic geometry interface to be reused by any implementations other than our own, so the size of the interface doesn't matter much. We don't care that it would be hard for others to implement, since we don't want them to anyway.
Yes, agreed. |
I gave my proposal a shot, and in summary, I believe it's not feasible. The code changes (excuse the mess, the goal was just to get tests to pass) can be found in https://github.com/albertteoh/simplefeatures/tree/399-use-interface, and the relevant piece of code discussed starts around here. I got to a stage where all unit tests pass; however, there were some significant performance regressions, particularly around
Zooming in on this particular benchmark to understand why there was such a large regression, the difference that stood out most was the additional memory allocations. Running the Before Benchmark
After Benchmark
Initially, I didn't quite understand why since I haven't introduced any new structs or attributes to existing geometries.
Digging a bit further into the problem, we can see a difference that stands out between the... Before Memory Profile and... After Memory Profile ... in that there's an additional bit of memory allocated to the call to
This additional allocation has quite an impact as we see in the (please ignore the colours, they convey no meaning)... Before Flame and ... After Flame
In the "After Flame", More importantly, of that 20%, about two-thirds of that time is spent on memory allocations, both from the
So it looks like memory allocations are expensive (not to mention the GC later on). Next question is, what exactly is causing the additional memory allocations? Ah, we see that it's on line
... and we can confirm that these variables indeed escaped to the heap because we returned a pointer reference to those instances (along with the original slice memory alloc):
Why return a pointer and not just a copy so that it stays on the stack avoiding an expensive allocation? It's because the existing API exposes a Because the primary goal of this proposal was to abstract the geometry types behind an interface, as long as the Given the significant change involved in this particular proposal and little benefit gained in performance or usability, I think it wouldn't warrant further effort to explore optimisations. |
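The escape behaviour described above can be reproduced in isolation. This is a minimal sketch with hypothetical names (`newPointPtr`, `newPointVal`, `allocsPtr`, `allocsVal`), not the code from the branch. It uses `testing.AllocsPerRun` to compare a constructor that returns a pointer with one that returns a copy.

```go
package main

import (
	"fmt"
	"testing"
)

// Point is a simplified stand-in for a geometry struct.
type Point struct{ X, Y float64 }

// Package-level sinks keep the compiler from discarding the results.
var (
	ptrSink *Point
	valSink Point
)

// newPointPtr returns a pointer, so the Point outlives the call and
// escape analysis moves it to the heap.
//
//go:noinline
func newPointPtr(x, y float64) *Point { return &Point{x, y} }

// newPointVal returns a copy, which needs no heap allocation.
//
//go:noinline
func newPointVal(x, y float64) Point { return Point{x, y} }

// allocsPtr reports the average heap allocations per pointer-returning call.
func allocsPtr() float64 {
	return testing.AllocsPerRun(1000, func() { ptrSink = newPointPtr(1, 2) })
}

// allocsVal reports the average heap allocations per value-returning call.
func allocsVal() float64 {
	return testing.AllocsPerRun(1000, func() { valSink = newPointVal(1, 2) })
}

func main() {
	fmt.Println("pointer return, allocs per call:", allocsPtr())
	fmt.Println("value return, allocs per call:  ", allocsVal())
}
```

Building with `go build -gcflags=-m` prints the compiler's escape-analysis decisions, which is one way to confirm which values are moved to the heap.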
Thanks for the detailed writeup! Ahh, I didn't realise that the impact of all of the heap allocations would be that severe. Btw, I've found in the past that even if pointers aren't used you will get a heap allocation anyway as soon as you store a value in an interface. So even with |
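A minimal sketch of that boxing effect, using hypothetical names (`Geometry`, `Point`, `allocsBoxed`, `allocsUnboxed`). A multi-word struct stored in an interface value that escapes is generally copied to the heap, even though no pointer appears in the source.

```go
package main

import (
	"fmt"
	"testing"
)

// Geometry is a hypothetical interface over geometry types.
type Geometry interface{ IsEmpty() bool }

// Point is a two-word struct, too large to fit in an interface's data word.
type Point struct{ X, Y float64 }

func (Point) IsEmpty() bool { return false }

var (
	ifaceSink Geometry // storing here forces the boxed value to escape
	valSink   Point
	counter   float64 // runtime-varying input so the value is not a constant
)

// allocsBoxed measures allocations when a Point value is stored in an
// interface variable.
func allocsBoxed() float64 {
	return testing.AllocsPerRun(1000, func() {
		counter++
		ifaceSink = Point{X: counter, Y: counter}
	})
}

// allocsUnboxed measures the same assignment to a concrete variable.
func allocsUnboxed() float64 {
	return testing.AllocsPerRun(1000, func() {
		counter++
		valSink = Point{X: counter, Y: counter}
	})
}

func main() {
	fmt.Println("stored in interface, allocs per call:", allocsBoxed())
	fmt.Println("stored as value, allocs per call:    ", allocsUnboxed())
}
```

The compiler can avoid the allocation in some cases, such as constants, very small values, or interface values that provably do not escape, so results vary with how the value is used.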
I really like this idea, and I think it's worth pursuing. I'll create a new ticket for it, since it's a bit different from the issue here. |
In type_geometry.go, there are lots of places where we perform a switch statement on the geometry type, grab out the concrete geometry, and then perform some action on it. Usually, the action is the same for each of the types. For example:
Having this switch statement everywhere isn't ideal:
I'm not too sure what the ideal solution should be... Here's one idea though: We can create a helper that tries to wrap a single interface around all geometry types. The helper then does the switch for us. While the switch statement is still there, it would be shared across all instances of the switch pattern (so we'd only have a single switch, instead of over a dozen). The downside of this approach is that it's much less explicit, and a little bit more magical. It would look something like:
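A minimal sketch of the helper idea, under assumptions: the `Geometry` layout, the `common` interface, and the `asCommon` helper below are all hypothetical and simplified for illustration, and are not the library's actual definitions.

```go
package main

import "fmt"

// Point and LineString are simplified stand-ins for concrete geometries.
type Point struct{ X, Y float64 }
type LineString struct{ Pts []Point }

func (Point) IsEmpty() bool        { return false }
func (l LineString) IsEmpty() bool { return len(l.Pts) == 0 }

// GeometryType tags which concrete geometry a Geometry holds.
type GeometryType int

const (
	TypePoint GeometryType = iota
	TypeLineString
)

// Geometry is a simplified tagged union: typ says which field is valid.
type Geometry struct {
	typ GeometryType
	pt  Point
	ls  LineString
}

// common is the single interface wrapped around all concrete types.
type common interface {
	IsEmpty() bool
}

// asCommon contains the only switch on the geometry type.
func (g Geometry) asCommon() common {
	switch g.typ {
	case TypePoint:
		return g.pt
	case TypeLineString:
		return g.ls
	default:
		panic(fmt.Sprintf("unknown geometry type: %d", g.typ))
	}
}

// IsEmpty delegates through the helper instead of repeating the switch.
func (g Geometry) IsEmpty() bool { return g.asCommon().IsEmpty() }

func main() {
	empty := Geometry{typ: TypeLineString}
	pt := Geometry{typ: TypePoint, pt: Point{1, 2}}
	fmt.Println(empty.IsEmpty(), pt.IsEmpty())
}
```

Each forwarding method shrinks to a single line, at the cost of the indirection mentioned above: a reader has to follow `asCommon` to see which concrete method runs.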