Graphick – Simple(r) graphing

For one of my courses, I’ve been drawing a load of graphs of a program’s performance; something I also had to do in a course last year. To say the least, it’s a bit of a nightmare.

What stood out to me was that most of the time, the process I was following was almost mechanical — run an application a bunch of times with different inputs, save a portion of the output somewhere, and then either write a script to parse the CSV and generate a graph, or throw it into Excel and generate one by hand. Sometimes I’d throw together a quick shell script to generate the initial data too, but either way it was a lot of context switching between different languages and if I wanted to regenerate the data after a change I also had to mess around to make sure the graph was redrawn.

As we know, though, everything is improved with large configuration files! When in history has a project started with a configuration file, and gradually became more and more complicated? Never, of course!

As a result, I thought it would be fun and helpful to develop a reasonably general utility for creating line graphs to analyse program data – whether it be temporal data for performance analysis, or just plotting the output with varying inputs.

Motivating spoiler: A graph which I generated for my coursework using Graphick

I set out with a few goals, and picked up a few more along the way:

  • It should be easy to write the syntax; preferably nicer than GNUPlot
  • It should be able to handle series of data — multiple lines
  • Any output data or input variable should be able to be on the X axis, the Y axis, or part of the series
  • The data series should be able to be varied by more than one variable – i.e. you might have, as depicted in the picture above, four lines which represent varying two different variables.
  • It should be extensible, so it can support new ways of data processing and rendering easily.
  • Data should be cached, so if the same graph is drawn it can be redrawn without re-running the program
    • Ideally, cache per-data-point, but the current implementation of Graphick just caches per-graph based on the variables and program. This can definitely be implemented in future though.

After a week of hacking, Graphick is the result. Graphick parses a simple file and generates graph(s) based on it.

Here’s a simple Graphick file:

command echo "6 * $num" | bc
title Multiples of six

varying envvar num sequence 1 to 10
data output

When you run Graphick with a file like this, it will proceed to generate the data to graph (or load it from a cache if it has previously generated it) by running the program for every combination of inputs.

Each line of a Graphick file, besides blank lines and comments (beginning with a %), represent some kind of directive to the Graphick engine. The most important directive is the command directive. This begins a new graph based on the command following it.

The text after command is what is executed. In this case, it’s actually a short shell script which pipes a short mathematical expression into bc, which is just a built-in calculator program on Unix. Most of the time, you’ll probably write something more like command ./myApplication $variable.

There are a number of ‘aesthetic’ directives – title, x_label, y_label, series_label. The only complicated one is series_label, which I’ll go into later. For the rest, the text following the directive is simply put where you’d expect on the graph.

The varying and data directives are the most important. varying allows you to specify which variables to run the program for. If you have two variables, which each have six values, then the program will be run with every combination of them — thirty six times. Right now, only environment variables are supported. You write varying envvar <name> <values>. Values can either be a numeric sequence (as in the above example) or a set of values. For example, sequence 1 to 5 or vals 1 2 4 8 15.

Data is the other important one. Only output is supported, currently, which corresponds to lines of stdout. You can also filter for columns, by adding a selection after the directive – for example, data output column 2 separator ,. This would get the second comma-separated column.

Another type of directive, which isn’t featured in this example, is filtering. If you have a program which outputs lots of lines, and you only care about a certain subset of them, you can filter them. There is more detail on this in the repository README, but suffice to say you can filter for columns of output data to be either in or not in a set of data, which can be defined either as a sequence or a set of values. The columns you filter on need not be selected as data, which means you can filter on data which isn’t presented on the graph.

Graphick files can contain multiple graphs by just adding more command directives. Currently, there is no way to share directives between them, so properties like title need to be set for each graph. Here’s an example of two graphs in a single file:

command echo "6 * $num" | bc
title Multiples of six
output six.svg

varying envvar num sequence 1 to 10
data output

command echo "12 * $num" | bc
title Multiples of twelve
output twelve.svg

varying envvar num sequence 1 to 10
data output

As you can see, there’s no need to do anything except add a new command directive – Graphick automatically blocks lines to the most recent command.

As an example of generating more complicated graphs, the graph I featured at the start of this post, which was for my coursework, was generated as follows:

command bin/time_fourier_transform "hpce.ad5615.fast_fourier_transform_combined"
title Comparison of varying both types of parallelism in FFT combined
output results/fast_fourier_recursion_versus_iteration.svg
x_label n
y_label Time (s)
series_label Loop K %d, Recursion K %d

varying series envvar HPCE_FFT_LOOP_K vals 1 16
varying series envvar HPCE_FFT_RECURSION_K vals 1 16

data output column 4 separator ,
data output column 6 separator ,

As you can see, adding the series modifier allows you to turn the variable into data which is used to plot lines, rather than as part of the X/Y axis. There must always be two non-series data sources (where a data source is either a data directive or a varying directive), and the first one always represents the X axis (the second the Y axis). You can have any number of series data sources, which combine in all combinations to create lines. In this graph, both variables take the values one and sixteen, to create four lines in total. The series_label directive takes a format string. The n-th formatting template (both %d in the string) indicates to put the value used for the n-th series variable at that position in the label.

Finally, there is one more directive which is useful: postprocessing. Postprocessing directives allow you to run arbitrary Ruby code to process the resultant data before it is rendered on the graph. Currently, only postprocessing the y axis is supported, but it would be straightforward to add support for postprocessing the x axis and series data. The postprocessing attributes are fed three variables – x, y, and s. x and y are the corresponding values for each data point, and s is an array of all series values at that point, ordered by the definition in the file. For example, if you wanted to normalise the y axis by a certain value, you might do this:

postprocess_y y / 2

Or, you might want to divide it by x and add a constant:

postprocess_y y / x + 5

Imagine the postprocess_y directive to be y = and this syntax should be reasonably intuitive.

So, in summary, Graphick is a somewhat powerful tool for generating of program graphs. You can plot multiple columns of the output, or run the program multiple times to generate multiple outputs — or maybe even a combination of both! Graphick should handle what you throw at it reasonably how you’d expect.

If you come across this, and have feature requests, drop a GitHub issue on the repository, or a comment on this post, and I’ll definitely consider implementing it – especially if it seems like something which is widely useful.

vTables and runtime method resolution

My first second year university group project recently came to an end, in which we had to implement a fully functional compiler that generated machine code from a relatively simplistic procedural language. It had functions, nested expressions, and a strong dynamic typing system. In addition to typical stack-stored primitive types such as int, char, and bool, it also had two dynamically allocated types; arrays and pairs.

For the final milestone of the compiler, we had two weeks to implement our choice of an extension. We chose, fairly ambitiously, to implement a number of extensions, one of which was a fully working Object Oriented system which I took most of the responsibility for implementing.

In the end, our OO system ended up supporting three main types of user-defined types: structs, interfaces, and classes. Structs were relatively simplistic – little more than a fixed-size pieces of memory on the heap. Interfaces allowed you to declare methods which a class implementing the interface would have to implement, and classes allowed you to define methods and fields together.

Our classes supported extending a single other class, and implementing any number of interfaces. Semantically, this worked nearly identical to in Java – in fact, our implementation ended up creating a language that was essentially a subset of Java. The implementation had what are often referred to as the three pillars of object oriented programming – inheritance, encapsulation (public, private and protected, where private was class-only and protected was class-and-subclass-only), and polymorphism (run-time method dispatch).

This post is primarily about how I structured objects on the heap to allow them to support the run-time method dispatch – this was something we found difficult to research, as most resources about vTables and the like tend to be about C++ and rarely are about the low-level implementation. We found that C++ compilers would often optimise away the vTables, even with optimisations turned off, making it very difficult to analyse the ones it generated. As a result of these, I decided to write a summary of how I went about implementing it, in the hope that it is useful to others.

The system I ended up settling with results in a tiny amount of extra overhead for method calls on classes, but adds a significant overhead on calls to interfaces. This is something that I am sure is not done optimally, and as such I certainly do not wholeheartedly recommend this as a perfect way of laying out objects on the heap.

First, to consider why this is not a trivial problem, consider a basic example:

class A { void printAString() { print "A string" } }
class B extends A { void printAString() { print "Another string" } void anotherFunction() { print "Hello, World!" } }
void main() {
    A a = new B();
    a.printAString();
}

A naïve implementation of this would print out something that may be unexpected if you come from a background that leads you to expect the kind of behaviour that run-time dispatch allows, which is “A string”, rather than “Another string”. This comes about as the call to “a”  seems, to the compiler, to be acting on an A, not a B. This could theoretically be avoided by intelligently (as far as a compiler is concerned) noticing that it’s instantiated as a B, but this causes some problems – what if you have a function accepting an A as a parameter, or you instantiate the A in two mutually exclusive branches differently (for example an “if” that sets it to a B, but an “else” that sets it to an A)? These problems, so far as I am aware, make it nigh-on impossible to implement it with compile-time code analysis (Though, theoretically, you could generate a huge file that accounts for every branching possibility and expands them cleverly, this would certainly not be viable for large files).

So the way I solved this, and indeed, I believe, the way it is typically solved, is to create a table of methods for each object at compile-time, which is included in the assembly code generated (and hence the runtime, in some form). I implemented it in a fashion such that every object would have a table for itself, and an additional one for each object it “is” – that is, that it extends or implements. In the example above, the following might be generated:

A:
 o_A_printAString
B:
 o_B_printAString
 o_B_anotherFunction

Function labels here are named as “o_ImplementingType_functionName”. As you can see, the superclass function is in the same location in both tables. This means that, were a function calling a method on an object typed as A to use this as a layer of indirect to access A’s method, then it would be “tricked” into calling B’s version instead.

We then stored objects on the heap similarly to the instance of B that follows:

B's type identification number
B's vTable address
A's fields
B's fields
Interface information about B's implemented interfaces

To call a method on an instance of a class, you would have to navigate a word down from the object pointer, load the vTable pointer stored, use this to load the vTable, then add the offset of the method (this is known at compile time – for example, “printAString” will always be the zeroth method in the vTable). Then, load the value stored here into the program counter to jump to the method.

This is trickier for interfaces – since an object can implement many, we can’t just have a fixed position in the vTable for each method. There is undoubtedly a better way to do this, but I chose to put a few pieces of data about them at the end of the object. For each interface, the following was stored:

The interface's type identification number
The address of a custom version of this interface's vTable with correctly overloaded methods

Additionally, at the end of an object, a zero is stored, which denotes the end of the object. Interface vTables are looked up by a simple, but fairly significant overhead-adding algorithm: look to the interface section with the interface offset information stored at the beginning of the object, and then repeatedly jump two words until either the interface identification number you want is found, or you reach a zero (The zero indicating an explicit cast has failed – it should never occur on an implicit cast). Once found, jump a single word to get the vTable of the interface, then look down a fixed offset to find the method you are interested in calling. The overhead added by this is somewhere in the line of 6 instructions, plus potentially up to four times the total number of interfaces (depending on how far down the one you are interested in is in the list). This is clearly suboptimal, and an approach I considered is, when an object is cast to an interface, moving it’s pointer down to the start of the relevant interface, but this would have been considerably more difficult to integrate into the codebase, as all implicit cast locations would need to be considered.

There is undoubtedly a better way to handle interfaces – I am very interested in finding this out; feel free to contact me if you know of a more optimal way or are wondering about anything I have explained above.

Ruby, a few months on

Sidenote: I’ve pretty much dumped my Thing of the Month plans, because they proved to be too difficult to balance with university work and general life, where I’m trying to branch out more and also trying to be more active in game development. That said, I’m still always trying to learn new things in the software engineering field, as I always have; but just in a less forced and artificial way, which I think does not work as well for me. I’m looking into Kotlin right now, and may put up a post about it sometime soon.

Since I posted about learning Ruby, I think I’m getting rather good at it. My most recent project was a hundred or so line integration test runner for a university compiler project written in Java. It executes the project with different test files, checking the output is as expected, all the while producing a nice output, which overwrites a line in the terminal to give an updating appearance without spamming it. It then proceeds to allow manual test verification, where you can see source code, and the error the compiler produced, to manually verify if the error produced looks sensible and understandable.

Soon, we realised that running such a complicated Java program over 250 times was slow, so I looked into multithreading the script. I was pleasantly surprised by just how easy it was to integrate concurrency into my test runner, and it essentially consisted of two additions; wrapping the main logic in Thread.new‘s block, and then storing that in a list, making up to 8 (Though this is variable by a command line parameter) threads, and waiting for the oldest one to finish before making a new one.

I’ve also started coding a little game akin to how I remember The Hobbit, a lovely game I used to play on a ZX Spectum emulator when I was pretty young. I emphasise how I remember it, because I recently watched a playthrough and my memories weren’t very reliable, but the main thing that my younger self found attractive was the method of input – you would type something, like “light the fire,” or “kick Gandalf,” and it was like magic – it seemed like it always had a well-written reaction to anything you could imagine, and I can remember being really interested in knowing how it worked. I think I’ve got a rather sensible approach to mimicking it, but I don’t think I could ever hope achieve the same kind of magic I felt playing that game. Wikipedia is fairly complimentary to it’s approach:

The parser was very advanced for the time and used a subset of English called Inglish.[5][6] When it was released most adventure games used simple verb-noun parsers (allowing for simple phrases like ‘get lamp’), but Inglish allowed one to type advanced sentences such as “ask Gandalf about the curious map then take sword and kill troll with it”. The parser was complex and intuitive, introducing pronouns, adverbs (“viciously attack the goblin”), punctuation and prepositions and allowing the player to interact with the game world in ways not previously possible.

https://en.wikipedia.org/wiki/The_Hobbit_(1982_video_game)

Anyone who’s interested in retro games, I’d heavily recommend it. It’s truly something I wouldn’t have thought would have existed at the time if I didn’t know about it.

All this is to say that, despite my initial doubts about how long it would last, I still really do love the language, and it’ll definitely be one of my first choices for future projects. I’d probably lean towards C#, C++ or Java for any game that requires better performance, but for most other things I think Ruby is going to indefinitely be one of my favourite choices.