In Search Of Performance

James Dunkerley edited this page Jan 24, 2018 · 2 revisions

In Search of Performance ...

As I work towards the release of version 1.3 of the Alteryx Omnibus add ins, I've started experimenting with different ways to implement some very basic functions, to understand the performance implications of each approach. It then descended into a train (or tram) ride obsession. Anybody who has fallen for the Alteryx platform will generally have discovered how efficient and fast it is. One of my goals when writing add ins for it is to make them as performant as possible.

Functions in the Alteryx formula system can be written in three ways:

  • All the core functions written by Alteryx themselves are C++ code built into the AlteryxBasePluginsEngine.dll file.
  • XML add ins
  • C style DLL functions (still need to write these up...)

For the purpose of this test, my goal is to create a simple YEAR function. It must:

  • Return the year of dates or datetimes
  • Return the value as an integer
  • Return null for invalid inputs

The Test Rig

Alteryx formula functions take only string or numeric values. Dates are stored as 10-character strings in yyyy-mm-dd format. DateTimes are stored as 19-character strings in yyyy-mm-dd HH:mm:ss format.
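To pin down the behaviour being tested, here is a minimal reference sketch in Python of what a YEAR function meeting the three requirements should return for the two storage formats described above. It is not Alteryx code and is not one of the implementations being compared. The function name `year`, the strict digit-by-digit pattern, and the use of Python's `datetime` (years 1 to 9999) to decide what counts as a valid date are all assumptions made for illustration.

```python
import re
from datetime import datetime
from typing import Optional

# Strict layout check: yyyy-mm-dd, optionally followed by " HH:mm:ss".
# A match is therefore always exactly 10 or 19 characters long.
_DATE_PATTERN = re.compile(
    r"([0-9]{4})-([0-9]{2})-([0-9]{2})"
    r"(?: ([0-9]{2}):([0-9]{2}):([0-9]{2}))?"
)


def year(value: Optional[str]) -> Optional[int]:
    """Return the year as an integer, or None (null) for invalid input."""
    if value is None:
        return None
    match = _DATE_PATTERN.fullmatch(value)
    if match is None:
        return None  # wrong length or wrong layout
    y, mo, d = (int(match.group(i)) for i in (1, 2, 3))
    # The time part is absent for plain dates; treat it as midnight.
    h, mi, s = (int(g) if g is not None else 0 for g in match.group(4, 5, 6))
    try:
        # Constructing the datetime rejects impossible values such as
        # month 13, 30 February, or hour 25.
        datetime(y, mo, d, h, mi, s)
    except ValueError:
        return None
    return y
```

For example, `year("2018-01-24")` and `year("2018-01-24 13:45:00")` both give `2018`, while `year("2018-02-30")` and `year("2018-01")` give `None`.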

For the purpose of this experiment, I generated 50,000 dates and 50,000 datetimes as well as a few invalid strings (some of incorrect length, some invalid dates, and some invalid datetimes). These were saved into a yxdb file as this is the format Alteryx can read quickest (plus it's impressively small - half the size of the same data stored in an Excel xlsx file).
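A data set of this shape could be generated along the following lines. This is only a sketch in Python, not the method actually used for the test: the function name `generate_test_values`, the 1900 to 2100 date range, the fixed seed, and the specific invalid strings are all assumptions, and the values are returned as a list rather than written to an Alteryx file.

```python
import random
from datetime import datetime, timedelta


def generate_test_values(n_each: int = 50_000, seed: int = 42) -> list:
    """Return n_each date strings, n_each datetime strings, then a few invalid strings."""
    rng = random.Random(seed)  # fixed seed so every run sees the same data
    start = datetime(1900, 1, 1)
    span_seconds = int((datetime(2100, 1, 1) - start).total_seconds())

    def random_moment() -> datetime:
        # Uniformly random second within the assumed date range.
        return start + timedelta(seconds=rng.randrange(span_seconds))

    dates = [random_moment().strftime("%Y-%m-%d") for _ in range(n_each)]
    datetimes = [random_moment().strftime("%Y-%m-%d %H:%M:%S") for _ in range(n_each)]

    invalid = [
        "2018-01",              # incorrect length
        "2018-01-24 12:00",     # incorrect length
        "2018-02-30",           # invalid date (no 30 February)
        "2018-13-01",           # invalid date (month 13)
        "2018-01-24 25:00:00",  # invalid datetime (hour 25)
        "2018-01-24 12:61:00",  # invalid datetime (minute 61)
    ]
    return dates + datetimes + invalid
```

Using a seeded generator keeps the input identical between runs, so timing differences come from the function under test rather than from the data.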

The engine version is likely to make a pretty massive difference here, as I expect the hardware will too, so I test on both 10.6 and 11.0. I'm running on a Surface Pro 3 with an i7 (4th gen chip) and 8 GB of RAM. As this is all about engine performance, I will be running the tests in the command line runner.
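One way to collect repeatable timings from a command line run is a small wrapper like the one below. It is a sketch only, written in Python: the helper name `time_command` and the repeat count are assumptions, and it measures wall-clock time for the whole process (including start-up), which is not necessarily how the figures in this article were taken.

```python
import subprocess
import time


def time_command(command: list, repeats: int = 5) -> list:
    """Run a command several times and return the wall-clock seconds of each run."""
    timings = []
    for _ in range(repeats):
        started = time.perf_counter()
        # check=True raises CalledProcessError if the run exits with an error,
        # so a broken workflow is never mistaken for a fast one.
        subprocess.run(
            command,
            check=True,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        timings.append(time.perf_counter() - started)
    return timings
```

It would be called with the path to the command line runner and a workflow file, for example `time_command([r"C:\Program Files\Alteryx\bin\AlteryxEngineCmd.exe", "year_test.yxmd"])`, where both the install path and the workflow name are hypothetical. Taking the minimum or median of several runs helps to smooth out noise from other processes on the machine.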