
File Input Patterns Design Discussion

Brandon Bosman edited this page May 29, 2024 · 11 revisions

Purpose

It has become clear that there are issues with our current patterns for reading input parameters from a file. This page analyzes the patterns we currently have and proposes improvements.

What We Have

Currently, the Choices construct has two options for inputStructure, which defines how input is stored: Bundled and Unbundled.

Unbundled Analysis

Current:

  • There is no input object holding all the input parameters; instead each input parameter has its own variable within the main function.
  • Instead of having a constructor that returns an object, InputParameters has a static method that reads in the data and returns the values in a fixed order.
    • In Java, it returns an array of objects, which then has to be destructured.
    • In Python and Swift, it returns a tuple, which can be destructured using syntactic sugar.
    • In C++ and C#, it mutates arguments that are passed to the method.
  • I don't really like this, as it requires users of the API to know the arbitrary ordering of the returned values. It's somewhat OK in Python and Swift, as tuples are designed for this sort of thing and their syntactic sugar makes it clearer, but the other targets feel a bit hacky.
  • Examples: Double Pendulum, PD Controller, some versions of Projectile, and Solar Water Heater
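
As a rough illustration of the Unbundled pattern in Python, the sketch below assumes a hypothetical input file with one numeric value per line. The names v_launch, theta, and p_target are illustrative stand-ins, not taken from the generated code.

```python
from typing import Tuple


class InputParameters:
    """Unbundled style: the class is never instantiated."""

    @staticmethod
    def get_input(filename: str) -> Tuple[float, float, float]:
        # Assumed file format: one numeric value per line, in a fixed order.
        with open(filename) as infile:
            v_launch = float(infile.readline())
            theta = float(infile.readline())
            p_target = float(infile.readline())
        # The caller has to know this ordering to unpack the result correctly.
        return v_launch, theta, p_target
```

A caller would unpack the result with `v_launch, theta, p_target = InputParameters.get_input("input.txt")`. Swapping two names on the left-hand side is silently accepted, which is the ordering problem described above.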

Potential Changes:

Design 1: Leave it
  • It is functional, and adding procedural languages wouldn't cause problems.
Design 2: Use hashmaps
  • Instead of using a linear data structure to return the values, use a key/value pair data structure.
  • Pros:
    • Structure of returned object is clearer.
  • Cons:
    • Less type safety than tuples.
    • No enforced structure of what keys will be returned.
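
A minimal Python sketch of the hashmap idea, assuming a hypothetical file with one numeric value per line and illustrative key names:

```python
from typing import Dict


def get_input(filename: str) -> Dict[str, float]:
    # Assumed file format: one numeric value per line, in a fixed order.
    with open(filename) as infile:
        values = [float(line) for line in infile if line.strip()]
    # Keys document what each value means, but nothing enforces
    # which keys are present or what type each value has.
    return {"v_launch": values[0], "theta": values[1], "p_target": values[2]}
```

Callers read `inputs["theta"]` instead of relying on position, but a misspelled key such as `inputs["v_lanch"]` is only caught at run time, as a KeyError.
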
Design 3: Use structs
  • Similar to Design 2, but clearly state the structure and type of the returned data.
  • Pros:
    • Type safety and structural safety.
  • Cons:
    • Requires that a new struct be defined.
    • A struct needs to be created, then used exactly once.
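
One possible shape for the struct approach in Python uses a frozen dataclass as the struct. The Inputs type, its field names, and the one-value-per-line file format are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inputs:
    # The struct states the structure and type of the returned data.
    v_launch: float
    theta: float
    p_target: float


def get_input(filename: str) -> Inputs:
    # Assumed file format: one numeric value per line, in a fixed order.
    with open(filename) as infile:
        v_launch = float(infile.readline())
        theta = float(infile.readline())
        p_target = float(infile.readline())
    return Inputs(v_launch=v_launch, theta=theta, p_target=p_target)
```

The Inputs definition exists only to carry the result of this one call, which is the cost listed under Cons.
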
Design 4: Get rid of Unbundled
  • Honestly, I can't think of that many cases where it would actually be useful to use unbundled.
  • I guess if there are only a few variables, it would be nice to have an unqualified name, but that's hardly the end of the world.
  • Also, the code's being generated, so a qualified name won't take longer to write, and arguably makes it easier to understand.

Bundled Analysis

Current:

  • There's a special input object to hold the input parameters.
  • Its constructor doesn't set its members directly, but calls two of its methods:
    • get_input, which sets its members by reading and parsing a file.
    • input_constraints, which validates its members based on given constraints.
  • Honestly, I don't hate the design. It's modular and keeps the constructor simple.
  • It has two issues though, both of which stem from the fact that it calls methods to define its members:
    • Swift wants all members to be defined in the constructor, and requires defaults even if the constructor calls methods to set these values.
    • For procedural languages where we want to use structs instead of classes, we have no instance to pass to methods, which means that we would need to do some serious reworking of the design to make it work.
  • Examples: some versions of Projectile.
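
For reference, the current Bundled structure looks roughly like the following Python sketch. The member names, the one-value-per-line file format, and the specific constraints are assumptions for illustration, not the generated code.

```python
import math


class InputParameters:
    def __init__(self, filename: str) -> None:
        # The constructor defines no members itself; it delegates to methods.
        self.get_input(filename)
        self.input_constraints()

    def get_input(self, filename: str) -> None:
        # Assumed file format: one numeric value per line, in a fixed order.
        with open(filename) as infile:
            self.v_launch = float(infile.readline())
            self.theta = float(infile.readline())
            self.p_target = float(infile.readline())

    def input_constraints(self) -> None:
        # Illustrative constraints only.
        if self.v_launch <= 0:
            raise ValueError("v_launch must be positive")
        if not 0 < self.theta < math.pi / 2:
            raise ValueError("theta must be between 0 and pi/2")
        if self.p_target <= 0:
            raise ValueError("p_target must be positive")
```

Python accepts this, but both issues above are visible: the members first appear inside get_input rather than in the constructor, and get_input needs an existing instance to write to.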
Special Case: GlassBR
  • GlassBR is bundled, but it has some weird behaviour:
    • Its constructor is empty; its members are undefined at initialization.
    • The main program passes a reference to it to another class's method, which defines the object's members.
  • This isn't great, as we want to avoid having undefined members outside the constructor, even for just a second.
  • We shouldn't be allowing behaviour like this. No matter what, we want the class's members to be defined by the time the constructor is finished.
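
The GlassBR behaviour can be sketched in Python as follows. The InputFormat class name, the members a and b, and the file format are hypothetical and only meant to show the window in which the object has no members.

```python
class InputParameters:
    def __init__(self) -> None:
        # Empty constructor: no members are defined here.
        pass


class InputFormat:
    @staticmethod
    def get_input(filename: str, in_params: InputParameters) -> None:
        # A different class defines the members of in_params.
        # Assumed file format: one numeric value per line.
        with open(filename) as infile:
            in_params.a = float(infile.readline())
            in_params.b = float(infile.readline())


def main(filename: str) -> InputParameters:
    in_params = InputParameters()
    # Reading in_params.a at this point would raise AttributeError.
    InputFormat.get_input(filename, in_params)
    return in_params
```
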

Potential Changes

Foreword:

From my research and thinking, there are 3 main questions we need to ask:

  1. Do we allow calls to the object's own methods (calls on this) in the constructor before all class members are defined?
    • E.g. calling a file parser method which defines the class's members.
    • Probably not:
      • Swift requires defaults if we do that.
      • For procedural languages we would need a ton of extra information to get around the fact that we don't have an instance to pass.
  2. Do we allow calls to the object's own methods (calls on this) in the constructor after all class members have been defined?
    • E.g. input verifiers.
    • This doesn't seem to have any negative effects, but to make them work in procedural languages we would need to add a field to the constructor function making it explicit that the calls happen after all variables have been defined. This shouldn't be a monumental change, though.
  3. Do we want the constructor to do the 'rough work' of finding values for its members, or should we have a function that finds the values, then calls the constructor on them?
    • E.g. do we want the initializer to parse the input and then set its variables, or do we want a function that parses the input to get the variables, then passes them to the constructor?
    • If the initializer does all the work, then we need to either call a method or do all the work inside the constructor. Calling methods to set members is not ideal, as discussed, and putting all the code inside the constructor can lead to large constructors.
    • Putting all the hard work in another function increases the complexity of the program, but it simplifies the individual pieces. It also decouples the class from the input method, which would make alternate input methods easier to implement.
Design 1: Bundled with full freedom
  • Idea: give GOOL enough information about calls to the object's own methods in the constructor that it can translate methods that define class members into functions that return class members.
  • We would need a new type of method, called subConstructor or something.
    • It would need to clearly state which class members it uses, and which class members it defines.
    • All calls to the object's own methods within constructors must be to subConstructors, and subConstructors can likely only be used by constructors.
  • Pros:
    • Gives full freedom of current Bundled implementation.
  • Cons:
    • The current Bundled implementation is already too free - Swift doesn't like it.
    • Requires a lot of extra functionality be added to GOOL.
    • subConstructors would likely not be reusable by the rest of the class (or to make them reusable would require even more work).
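
To make the translation idea concrete, here is a speculative Python sketch of what a procedural rendering might look like, with a NamedTuple standing in for a plain struct. The function names, the declared "uses/defines" comments, the file format, and the constraints are all assumptions; nothing here reflects an existing GOOL feature.

```python
import math
from typing import NamedTuple, Tuple


class InputParameters(NamedTuple):
    # Stands in for a plain struct in a procedural target.
    v_launch: float
    theta: float
    p_target: float


def get_input(filename: str) -> Tuple[float, float, float]:
    # Hypothetical subConstructor. Uses: nothing. Defines: all three members.
    # Because it declares what it defines, it can return values
    # instead of mutating an instance that does not exist yet.
    with open(filename) as infile:
        return (float(infile.readline()),
                float(infile.readline()),
                float(infile.readline()))


def input_constraints(v_launch: float, theta: float, p_target: float) -> None:
    # Hypothetical subConstructor. Uses: all three members. Defines: nothing.
    if v_launch <= 0 or p_target <= 0 or not 0 < theta < math.pi / 2:
        raise ValueError("input constraint violated")


def new_input_parameters(filename: str) -> InputParameters:
    # The "constructor": the struct is built only once every value is known.
    v_launch, theta, p_target = get_input(filename)
    input_constraints(v_launch, theta, p_target)
    return InputParameters(v_launch, theta, p_target)
```
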
Design 2: Bundled with external parsing and internal validation
  • Only allow calls to the object's own methods after all class members have been defined.
  • Values for class members can be calculated by an external function that would then call the constructor (as in Design 3). We could also calculate the values within the constructor, but I like the modularity of this way.
  • We would need to add a field to the constructor function in GOOL denoting code that occurs after all class members have been defined, so that GOOL knows it can safely instantiate the struct before that code in procedural languages.
  • Pros:
    • Using an external function for parsing decouples the class from the input method.
    • Keeping validation internal to the class makes that code reusable (e.g. if we had multiple functions that call the constructor).
    • The constructor is simple.
  • Cons:
    • Requires an external function to parse the input and call the constructor.
    • Requires a bit of extra work to tell GOOL which code happens after all class members have been defined.
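
A minimal Python sketch of Design 2, under the same assumptions as before (illustrative member names and constraints, one value per line in the file). The two external functions, inputs_from_file and inputs_from_args, are hypothetical names.

```python
import math
from typing import Sequence


class InputParameters:
    def __init__(self, v_launch: float, theta: float, p_target: float) -> None:
        # Phase 1: every member is defined directly from the arguments.
        self.v_launch = v_launch
        self.theta = theta
        self.p_target = p_target
        # Phase 2: method calls on self are allowed only from here on.
        self.input_constraints()

    def input_constraints(self) -> None:
        # Illustrative constraints only.
        if self.v_launch <= 0 or self.p_target <= 0:
            raise ValueError("v_launch and p_target must be positive")
        if not 0 < self.theta < math.pi / 2:
            raise ValueError("theta must be between 0 and pi/2")


def inputs_from_file(filename: str) -> InputParameters:
    # External parsing: the class knows nothing about the file format.
    with open(filename) as infile:
        v_launch = float(infile.readline())
        theta = float(infile.readline())
        p_target = float(infile.readline())
    return InputParameters(v_launch, theta, p_target)


def inputs_from_args(args: Sequence[str]) -> InputParameters:
    # A second input method reuses the same constructor and validation.
    v_launch, theta, p_target = (float(arg) for arg in args[:3])
    return InputParameters(v_launch, theta, p_target)
```

Both entry points go through the same validation because it lives in the class, while neither input format leaks into it.
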
Design 3: Bundled with external parsing and validation
  • Do not allow any calls to the object's own methods within constructors. This will encourage constructors to be as simple as possible.
  • Instead of the constructor reading and parsing the file, have another function that parses the file to get the values for the class members, then calls the constructor on those values. Validation would need to occur either within the external function, or be a method of the class that is called after the constructor is finished.
  • Pros:
    • Keeps the constructor simple, since the code for 'deriving' class member definitions is in another function. This is also the only solution where the constructor's sole responsibility is to create the object.
    • No extra functionality required for GOOL; we no longer need to worry about calls to the object's own methods inside constructors.
    • Decouples the class from its input format.
  • Cons:
    • Less flexible constructors - e.g. validation has to be performed externally or after the constructor is finished.
    • Requires an extra function for parsing and calling the initializer.
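
Design 3 might look like the following in Python, using the variant where validation happens in the external function. The read_inputs name, the member names, the file format, and the constraints are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class InputParameters:
    # The generated constructor only assigns its arguments to members.
    v_launch: float
    theta: float
    p_target: float


def input_constraints(v_launch: float, theta: float, p_target: float) -> None:
    # Illustrative constraints only, checked outside the class.
    if v_launch <= 0 or p_target <= 0:
        raise ValueError("v_launch and p_target must be positive")
    if not 0 < theta < math.pi / 2:
        raise ValueError("theta must be between 0 and pi/2")


def read_inputs(filename: str) -> InputParameters:
    # Assumed file format: one numeric value per line, in a fixed order.
    with open(filename) as infile:
        v_launch = float(infile.readline())
        theta = float(infile.readline())
        p_target = float(infile.readline())
    input_constraints(v_launch, theta, p_target)
    return InputParameters(v_launch, theta, p_target)
```

Note that calling `InputParameters(-1.0, 0.0, 0.0)` directly succeeds without any check, which is the flexibility cost listed under Cons.
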
Design 4: No Modularity
  • Just put the code for get_input and input_constraints inside the constructor.
  • That is, calls to the object's own methods are not allowed, and we choose not to have external functions that return new instances of the class.
  • Pros:
    • Makes sure that all of the class's members are defined in its constructor.
    • Should be easy to translate to procedural languages.
  • Cons:
    • Loses out on modularity. It would lead to bloated constructors and a tight coupling between the class and the class's input method.

Summary

I like Design 2 the best.

  • While input parsing could be part of the class, it definitely doesn't need to be, and decoupling it from the class opens the door to using the same class with different input methods (e.g. file vs command-line).
  • I also think that constraint validation should be a part of the class, since the constraints are not coupled to the input method.
    • There might be some validation that needs to be different (e.g. sanitizing strings based on how trustworthy their source is), but the constraints on the variables themselves (e.g. nonnegativity) will be the same.

PS

I've been working on some case studies for these designs as I wrote this. They're almost ready, so I'll provide a link when they're on the repo.
