Should we use datatypes with more appropriate semantics at the cost of worse runtime performance and some code bloat? #2669

Closed
@sjakobi

In many functions, such as Stack.Package.findCandidates, we currently use lists for parameters that should semantically be sets:

  • We don't care about the order of the elements.
  • Duplicate elements don't change the return value.

I think this practice is quite common and not without benefits: lists have nice notation and can often be fused away during compilation, leading to fast code.

I also find that this practice has some downsides:

  • Reading the code, I'm often unsure if a parameter has list semantics or set semantics. What happens when there are duplicates? Is the order important?
  • In the particular case of the findCandidate function, a set parameter could have prevented a performance bug where duplicate elements caused a lot of unnecessary parsing and IO.
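To make the second point concrete, here is a minimal sketch of the difference. It is not the actual Stack code: `candidatesFromList` and `candidatesFromSet` are hypothetical names, and building a path string stands in for the expensive parsing and IO step.

```haskell
import Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical list-based version: every element is processed,
-- so a duplicate directory causes duplicate work.
candidatesFromList :: [FilePath] -> String -> [FilePath]
candidatesFromList dirs name =
  [dir ++ "/" ++ name ++ ".hs" | dir <- dirs]

-- Hypothetical set-based version: the type itself rules out duplicates
-- and tells the reader that order is irrelevant.
candidatesFromSet :: Set FilePath -> String -> [FilePath]
candidatesFromSet dirs name =
  [dir ++ "/" ++ name ++ ".hs" | dir <- Set.toAscList dirs]
```

With the input `["src", "src", "app"]`, the list version produces three candidates, while `Set.fromList` on the same input leaves two directories to process. The cost is that callers have to build the set, and the list is no longer available for fusion.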

A similar argument can probably be made for other types too.

I currently think that we should put correctness and readability above performance considerations and use the most "correct" datatypes possible.

How does everyone else feel about this tradeoff?
