Skip to content

CSE Machine

CZX edited this page Apr 1, 2024 · 10 revisions

Overview

This tool generates drawings of the control, stash and environment models of a program at a particular point in its execution. The model aims to follow as closely as possible the structure defined in the official module textbook.

To generate the models, select a breakpoint in the gutter on the left, then run the program. Click the "CSE Machine" button to switch to the CSE Machine tab and view the models.

Data structures such as arrays and pairs are visualised using the box-and-pointer-notation. Each compound object is shown as a pointer to a box. The box for a pair has two parts, the left containing the head and the right containing the tail. The box for an array comprises multiple parts, each representing the corresponding index in the array. Arrows linking different data structures can be hovered over to see clearly where they point to.

We followed certain heuristics when creating the diagram: TODO: update this

  • Frames assigned an (x, y) grid coordinate with the y-coordinate being its depth from the global frame, and the x-coordinate representing the order of creation of each frame, where x-coordinate of newer framesx-coordinate of older frames.
  • Function objects are placed to the right of its enclosing environment frame.
    • Function objects can be hovered over to view the tooltips containing its parameters and function body.
    • In print mode, all tooltips are expanded with the parameters and only the first few characters of the function body displayed.
  • Arrays are placed in ArrayLevels between FrameLevels, with its x and y position being close to the mean position of arrays / bindings pointing to it.
    • Each Array is assigned to an ArrayLevel based on its y position, and it is placed at the highest available array sublevel within the ArrayLevel at that x position.
    • In general, arrows point to arrays from its left/right, and exit from array units from its top/bottom, travelling horizontally/vertically to the target objects.
    • For arrays within the same ArrayLevel but at different array sublevels, arrows take shortest path from the array unit to the target array
    • For arrows from an array to itself, the arrow exits array unit from top and points back down.
  • Arrows moving between different levels and frames move along ArrowLanes, which are just various x positions (vertical lanes to the left of Frames) and y positions (horizontal lanes between FrameLevels and ArrayLevels) which might intersect with arrays, but not Frames or function objects. Each ArrowLane consists of multiple "sublanes" (same x or y position), with sublanes being evenly spaced from one another. Arrows pointing to the same object will be assigned to the same sublane, and arrows within the same ArrowLane pointing to different objects might be assigned to the same sublane if all sublanes are filled.

Technical Details

The visualiser works closely with JS Slang. Whenever the interpreter meets a breakpoint, the current program context is sent to both the debugger and the visualizer. The visualizer receives the context in the form of an EnvTree, which represents all of the frames created in the program's execution. Each node in the tree represents a frame, which in turn is a table of bindings (as defined in the textbook). The parent pointer of each node is hence a pointer to its enclosing environment. The tree is rooted at the node representing the global function bindings.

The visualizer depends on KonvaJS and its corresponding React bindings from react-konva. The feature resides under src/features/envVisualizer and is exposed via EnvVisualizer.tsx in that directory.

We make use of Typescript and try to follow OOP concepts with JS ES6 classes. Drawing logic and dimension calculations etc. pertaining to a certain type of value (array, for example) are encapsulated within their own class files. Overall, this has led to better correctness, extensibility, maintainability, and collaboration (its predecessor was a single file with ~3K LoC).

For a rough outline, we have a Layout class that encapsulates the canvas and calculations. The Layout contains a single Grid, which contains an array of Levels alternating between FrameLevels and ArrayLevels. Each Frame within the grid is assigned an (x, y) coordinate, with Frames aligned horizontally by its x-coordinate, and vertically by its y-coordinate. The y-coordinate is determined by the Frame’s depth from the global frame, and the x-coordinate is determined by the order of creation of the frame.FrameLevels each contain an array of Frames with the same y-coordinate. A Frame has Bindings, which consists of a key (string) and the associated data. The data can be of any type: functions, arrays, pairs, etc. Hence, we have wrapper classes for each of these types, encapsulating the logic for dimension calculations and drawing for that specific type.

The x-position of frames (including function objects) and the relative position of bindings within each frame is first calculated for the entire diagram, and each Frame is added to the correct FrameLevel. Then, the ideal positions for ArrayValues are calculated and they are added to the correct ArrayLevel. The y-position of each Level is updated based on the height of each Level.

After the objects are initialised and the dimensions/coordinates are calculated, Layout.draw() is invoked, which in turns triggers a cascade of draw() invocations in its children and their children etc. The arrows from any object or frame would be created and handled within the object’s draw method. The frames and layers are drawn from right to left and bottom to top to ensure that function tooltips will be above frames to its right.

Exporting: The last JS Slang Context is stored, and toggling printable mode will reuse the current Context and redraw the entire diagram with different colours and the truncated tooltip being used for function objects. Clicking on save will obtain the image data url of the entire konva stage, and save it as multiple images of width ≤ 25000px

Limitations

  • Testing: While there is snapshot testing implemented currently, it may not be very comprehensive and complete. It may be possible for wrong/broken layouts to pass the tests. By nature, testing of this tool may ultimately require visual inspection. It is not clear how to programmatically verify the visual 'correctness'.
  • Arrows: Arrows moving horizontally within an ArrayLevel between array sublevels currently overlap.
  • Garbage Collected Functions: Over the course of the program's execution, many functions may be created. However, it may so happen that they are no longer referenced in later parts of the program, following which they will be garbage collected. Our tool will hence be unable to visualize such functions.
  • Large Canvas Scrolling: Our current implementation is slow at handling large canvas such as the environment model of a metacircular evaluator. A possible way to optimise this is to emulate the scrollbars using canvas, to render only the visible portion of the environment model at a time. To get started, you may want to refer to Large Canvas Scrolling Demo using Konva.

Testing

Snapshot testing is implemented here. It executes some previously problematic or tricky code samples and calculates the layout. It then checks that the general structure of the layout and data in the frames is correct. However, it may be worth noting that this may not be a fully comprehensive test. It aims to prevent glaring errors but subtly wrong visualizations may still pass. Some automated tests conducted for each sample program includes: checking function object is next to enclosing frame, ensuring svg path of arrows from function object to enclosing frame, and from array unit to array within the same level behaves correctly. More tests can be added to reduce the amount of manual testing needed.

For a more comprehensive test, a manual visual inspection may still be required. We have amassed a collection of code samples to test against.

Animations

The CSE Machine additionally supports animations whenever the control and stash are enabled. The animations aim to help users observe dynamic data flow by offering a visual representation of how data is processed and transferred, resulting in a better comprehension of complex code execution. Currently, animations are played upon pressing the next button (moving forward by 1 step).

The CSE Machine currently supports animations for the following instructions:

  • Array access
  • Array assignment
  • Array creation
  • Assignment (to binding)
  • Binary operation
  • Block evaluation (splitting of control item into more items)
  • Branch (conditionals)
  • Environment change
  • Frame creation
  • Function application
  • Literals
  • Lookup (identifier)
  • Pop
  • Unary operation

Technical Details

The CSE Machine uses the konva and react-konva libraries for its animations. It makes use of KonvaNodeComponents from the react-konva library, most commonly Rect and Text. We represent general animatable components under the classes Animatable and AnimatableTo, with the main difference being that AnimatableTo supports simultaneous animations on the same component. We mostly encapsulate the unique animation of each instruction into its own file as an Animatable, which makes use of more basic animation components such as AnimatedTextbox (which is an AnimatableTo). The precise logic for each animation is defined in the animate() function within each Animatable.

For a rough outline, we store information from the previous step i in previousControlComponent and previousStashComponent. Note that this is necessary as the animation for step i is played when the 'next' button to step i+1 is pressed, which means that we need to read information from step i after processing the runtime context of step i+1 to determine what animation to play. We initialise the animations in CseMachineAnimation after processing the runtime context from JS Slang in CseMachineLayout, reading the instruction type of the top control item from the previous step. After the visualisation state has been updated in the CSE Machine, we play the created animations.

Future Improvements

  • Solve the limitation of not being able to show all historically created frames as mentioned above (will need changes to js-slang) done by SA2122
  • Visualisation of arrays done by SA2122
  • Representation of functions belonging to the global frame (e.g. const x = pair) should be implemented done by SA2122
  • Improve text formatting in data structures (eliminate the problem of long text being cut off) done by SA2122
  • Allow arrows to be hovered over individually instead of all at once (this involves coding each arrow as its own object with its own properties) done by SA2122
  • Downloading visualizations
  • Large Canvas Scrolling (see above)
  • Toggle between 'show all frames created' and 'show current frames only'
  • Store previous visualizations and add buttons to go back and forth
  • Show array indexes done by SA2324
  • Animations (see above, work in progress)
  • Arrows (see above)
  • Testing (see above)
  • Garbage Collected Functions (see above) - this should be toggle-able, as no GC may mean performance issues for programs that heavily use TCO for example (fixed in https://github.com/source-academy/frontend/issues/1973, can't reproduce performance issues)