OpenFL Haxe Benchmark

This project began as a direct port of Grant Skinner's ActionScript project, PerformanceTest. Migration to Haxe and OpenFL was effortless; there was little to adapt, even given the dynamic nature of the original project.


This project is intended to serve as a comparison and benchmark of code targeting different platforms. Not all code demands performance-critical optimization.

After more than two years of practical use, it has become clearer which aspects of this project need refinement and enhancement.

Business logic

Logic to execute tests resides in two locations: the core Benchmark class, and individual test classes.

Benchmark class

Within the core package, a Benchmark class manages a queue of test suites. A test suite is a cohesive collection of tests that are meaningfully combined for comparison within a scenario. Responsibility for executing test suites is maintained here. There are two methods of execution:
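The queueing responsibility described above could be sketched as follows; the class and method names here are assumptions for illustration, not the project's actual API:

```haxe
// Hypothetical sketch of the core Benchmark queue; the real
// Benchmark and BenchmarkSuite classes may differ.
class BenchmarkSuite {
    public var name:String;
    public function new(name:String) {
        this.name = name;
    }
    public function run():Void {
        // execute the tests contained in this suite
    }
}

class Benchmark {
    var suites:Array<BenchmarkSuite> = [];
    public function new() {}
    public function queue(suite:BenchmarkSuite):Void {
        suites.push(suite);
    }
    public function run():Void {
        for (suite in suites) suite.run();
    }
}
```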

Enhancements currently under consideration include:

Test classes

Execution of tests is managed by the test classes. This separation of concerns exists because the benchmark engine is unaware of what is being tested. Test classes inherit from AbstractTest, an abstract base class that should be inherited from rather than instantiated directly.
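A minimal sketch of that relationship, using hypothetical field names (the actual AbstractTest may differ):

```haxe
// Hypothetical sketch: an abstract base class that subclasses inherit
// from; instantiating it directly and calling run() raises an error.
class AbstractTest {
    public var loops:Int = 10000;
    public function new() {}
    public function run():Float {
        throw "AbstractTest.run() must be overridden in a subclass";
    }
}
```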

Currently there are two types of test classes:

To execute a test, call the run() function.

In its simplest form, the MethodTest class could be removed from this benchmark project and implemented within a library to enable discrete profiling and benchmarking inside projects. For example, from your project you could measure the execution time of a method by calling:

    var time:Float = new MethodTest(myFunction, [args]).run();

However, this benchmark project stands alone to isolate small segments of code.

While placing business logic here empowers test classes to implement any test imaginable, it also forces placement of logic inside the test, and potentially presentation (or interpretation) of test results. This is further compounded by boilerplate loop operations in each test. See more on this topic under the 'Iterations and loops' section.

Instead of an abstract base class, it would be ideal to leverage an ITest interface. Execution would be simple, given the standard run() method; however, there are complications with data and configuration. Ideally, this class would be a simple context for execution by the Benchmark core.
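For illustration, such an interface might look like this (the name and signature are assumptions, not the project's actual code):

```haxe
// Hypothetical test interface; run() would return elapsed time,
// leaving data and configuration to a separate execution context.
interface ITest {
    function run():Float;
}
```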

Iterations and loops

Iterations are individual passes of executing a test case. For example, you might have 4 to 10 trials executing an individual test.

Loops are tightly scoped around code being tested. For example, you might have 10,000 to 10,000,000 loops in a trial to obtain an accurate measurement of time spent per single loop.
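The arithmetic behind loops can be sketched as follows; haxe.Timer.stamp() is the standard Haxe timer, but the surrounding structure is illustrative only:

```haxe
// Sketch: one trial of 1,000,000 loops. Dividing elapsed time by the
// loop count yields time per single pass, amortizing timer overhead.
class LoopTimingSketch {
    static function main():Void {
        var loops:Int = 1000000;
        var start:Float = haxe.Timer.stamp();
        for (i in 0 ... loops) {
            var n:Int = i << 1; // code under test
        }
        var elapsed:Float = haxe.Timer.stamp() - start;
        trace('seconds per loop: ${elapsed / loops}');
    }
}
```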

Pseudocode interpreting 4 iterations at 10,000 loops each:

    for 4 iterations {
        test 10,000 loops on this iteration
    }
Iterations and loops start with good intentions, enabling fine tuning of parameters. Unfortunately, they become complicated quickly, and they required the most critical analysis when translating from the original project.

For starters, the terminology is confusing, since all of the terms describe repeated operations: passes, trials, iterations, loops. These parameters also convolute configuration and constructors, where systems of defaults cascade. For example, iterations and loops defined in a test suite are applied to a test unless the test overrides those values.
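The cascade of defaults might be modeled like this; the class and field names are hypothetical, not the project's actual configuration code:

```haxe
// Hypothetical sketch of cascading defaults: a test inherits its
// suite's iterations and loops unless it overrides them.
class SuiteDefaults {
    public var iterations:Int = 4;
    public var loops:Int = 10000;
    public function new() {}
}

class TestConfig {
    public var iterations:Null<Int>; // null means "inherit from suite"
    public var loops:Null<Int>;
    public function new() {}
    public function resolvedLoops(suite:SuiteDefaults):Int {
        return loops != null ? loops : suite.loops;
    }
    public function resolvedIterations(suite:SuiteDefaults):Int {
        return iterations != null ? iterations : suite.iterations;
    }
}
```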

Fine tuning is the next obstacle: configuration for a specific case becomes moot when you compare targets such as Neko versus native C++. Even if you obtain desirable thresholds for one platform, they will not apply to others.

Ideally, these terms could be eliminated entirely by auto-tuning parameters. Currently under prototype is a tuning mechanism that targets 30 milliseconds of execution, so the core engine can auto-calculate appropriate values. Early prototypes are proving successful, pushing some C++ operations over 100,000,000 loops.
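One way such auto-tuning could work is to grow the loop count until a trial crosses the 30 millisecond target; this is a hypothetical sketch, not the prototype's actual code:

```haxe
// Hypothetical auto-tuning sketch: multiply the loop count until one
// trial takes at least ~30 ms, then use that count for measurement.
class AutoTuneSketch {
    static function tuneLoops(test:Void->Void):Int {
        var loops:Int = 1;
        while (true) {
            var start:Float = haxe.Timer.stamp();
            for (i in 0 ... loops) test();
            var ms:Float = (haxe.Timer.stamp() - start) * 1000;
            if (ms >= 30) return loops;
            loops *= 10;
        }
    }
    static function main():Void {
        trace(tuneLoops(function() { Math.sqrt(2); }));
    }
}
```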

The next impact is on the accuracy of results: when loops are tight around the functionality, results are more consistent.

Ideally, tests would isolate small, discrete functionality without concern for the benchmarking implementation. For example:

    public function absoluteTest():Void {
        Math.abs(-1); // single operation under test; no loop boilerplate (illustrative body)
    }

This would also enable defining anonymous functions in method tests, as in:

    new MethodTest(function() { });

With this model, having the benchmark core execute each loop has some advantages: certain optimizations that would skew results are eliminated by calling into the test method. However, there is greater deviation in results, which does not fairly represent the functionality being tested.

Though it is tedious to include boilerplate, and it mandates awareness of how test cases are assembled, there are other advantages to specifying loops in each test function. Besides the loop itself, it provides a secondary opportunity to initialize variables or local functions.

    public function shiftLeftTest():Void {
        var n:Int = 0;

        for (i in 0 ... loops) {
            n = i << 1;
        }
    }
    public function referenceInstanceFunctionTest():Void {
        var fn:Dynamic = instanceFunction;
        for (i in 0 ... loops) {
            fn(); // call through the local reference (illustrative body)
        }
    }
Finally, passing parameters to a test function would require reflection in the benchmark core engine, making for very heavy and slow test cycles. Using this model, the benchmark core engine calls the test's run() function, which in turn executes the test method.

Without parameters, this would be acceptable:

    override public function run():Void {
        method(); // direct call into the test method
    }
However, with parameters, it introduces too much weight and variance:

    override public function run():Void {
        Reflect.callMethod(this, method, params);
    }

User interface

Currently this project presents results through logs; however, a user interface for selecting and executing individual tests, with results displayed graphically, is underway.