Graphick – Simple(r) graphing

For one of my courses, I’ve been drawing a load of graphs of a program’s performance; something I also had to do in a course last year. To say the least, it’s a bit of a nightmare.

What stood out to me was that most of the time, the process I was following was almost mechanical — run an application a bunch of times with different inputs, save a portion of the output somewhere, and then either write a script to parse the CSV and generate a graph, or throw it into Excel and generate one by hand. Sometimes I’d throw together a quick shell script to generate the initial data too, but either way it was a lot of context switching between different languages and if I wanted to regenerate the data after a change I also had to mess around to make sure the graph was redrawn.

As we know, though, everything is improved with large configuration files! When in history has a project started with a configuration file, and gradually become more and more complicated? Never, of course!

As a result, I thought it would be fun and helpful to develop a reasonably general utility for creating line graphs to analyse program data – whether it be temporal data for performance analysis, or just plotting the output with varying inputs.

Motivating spoiler: A graph which I generated for my coursework using Graphick

I set out with a few goals, and picked up a few more along the way:

  • It should be easy to write the syntax; preferably nicer than GNUPlot
  • It should be able to handle series of data — multiple lines
  • Any output data or input variable should be able to be on the X axis, the Y axis, or part of the series
  • The data series should be able to be varied by more than one variable – i.e. you might have, as depicted in the picture above, four lines which represent varying two different variables.
  • It should be extensible, so it can support new ways of data processing and rendering easily.
  • Data should be cached, so if the same graph is drawn it can be redrawn without re-running the program
    • Ideally, cache per-data-point, but the current implementation of Graphick just caches per-graph based on the variables and program. This can definitely be implemented in future though.
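As a rough illustration of the per-graph caching idea, here is a minimal Ruby sketch. It assumes the cache key is a digest of the command and its variables; the names `cache_key` and `cached_data` and the `.graphick_cache` directory are hypothetical and not taken from Graphick itself.

```ruby
require "digest"
require "json"
require "fileutils"

# Hypothetical cache location; Graphick's real layout may differ.
CACHE_DIR = ".graphick_cache"

# Derive a stable key from the command and the variables it is run with.
# Sorting the variables makes the key independent of declaration order.
def cache_key(command, variables)
  Digest::SHA256.hexdigest(JSON.generate([command, variables.sort.to_h]))
end

# Return the cached data for this graph if present; otherwise run the
# block, store its result as JSON, and return it.
def cached_data(command, variables, dir: CACHE_DIR)
  path = File.join(dir, "#{cache_key(command, variables)}.json")
  return JSON.parse(File.read(path)) if File.exist?(path)

  data = yield
  FileUtils.mkdir_p(dir)
  File.write(path, JSON.generate(data))
  data
end
```

Caching per data point would follow the same pattern, with one key per combination of variable values rather than one per graph.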

After a week of hacking, Graphick is the result. Graphick parses a simple file and generates graph(s) based on it.

Here’s a simple Graphick file:

command echo "6 * $num" | bc
title Multiples of six

varying envvar num sequence 1 to 10
data output

When you run Graphick with a file like this, it will proceed to generate the data to graph (or load it from a cache if it has previously generated it) by running the program for every combination of inputs.

Each line of a Graphick file, besides blank lines and comments (beginning with a %), represents some kind of directive to the Graphick engine. The most important directive is the command directive. This begins a new graph based on the command following it.

The text after command is what is executed. In this case, it’s actually a short shell script which pipes a short mathematical expression into bc, which is just a built-in calculator program on Unix. Most of the time, you’ll probably write something more like command ./myApplication $variable.

There are a number of ‘aesthetic’ directives – title, x_label, y_label, series_label. The only complicated one is series_label, which I’ll go into later. For the rest, the text following the directive is simply put where you’d expect on the graph.

The varying and data directives are the most important. varying allows you to specify which variables to run the program for. If you have two variables, which each have six values, then the program will be run with every combination of them, thirty-six times in total. Right now, only environment variables are supported. You write varying envvar <name> <values>. Values can either be a numeric sequence (as in the above example) or a set of values. For example, sequence 1 to 5 or vals 1 2 4 8 15.
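To make the "every combination" behaviour concrete, here is a small Ruby sketch. It is not Graphick's actual code: the helper names `expand_values`, `combinations`, and `run_all` are made up for illustration, and it assumes that `sequence a to b` is an inclusive integer range.

```ruby
# Expand a values spec such as "sequence 1 to 5" or "vals 1 2 4 8 15"
# into a list of strings (environment variables are always strings).
def expand_values(spec)
  words = spec.split
  case words.first
  when "sequence"
    # Assumption: "sequence <first> to <last>" includes both endpoints.
    (Integer(words[1])..Integer(words[3])).map(&:to_s)
  when "vals"
    words.drop(1)
  else
    raise ArgumentError, "unknown values spec: #{spec}"
  end
end

# Build one environment hash per combination of variable values.
def combinations(variables)
  names = variables.keys
  first, *rest = variables.values.map { |spec| expand_values(spec) }
  return [{}] if first.nil?

  first.product(*rest).map { |combo| names.zip(combo).to_h }
end

# Run the command once per combination, passing the values through the
# environment, and collect each run's stdout alongside its inputs.
def run_all(command, variables)
  combinations(variables).map { |env| [env, IO.popen(env, command, &:read)] }
end
```

With two variables of six values each, `combinations({ "a" => "sequence 1 to 6", "b" => "vals 1 2 4 8 16 32" })` yields thirty-six environment hashes, one per run.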

Data is the other important one. Only output is supported, currently, which corresponds to lines of stdout. You can also filter for columns, by adding a selection after the directive – for example, data output column 2 separator ,. This would get the second comma-separated column.
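Column selection of this kind amounts to splitting each line of stdout and picking one field. The following Ruby sketch shows the idea; `select_column` is a hypothetical helper, and the whitespace default for the separator is an assumption rather than documented Graphick behaviour.

```ruby
# Pick the given 1-based column from each line of output, or return the
# whole line when no column is requested.
def select_column(output, column: nil, separator: " ")
  output.each_line.map do |line|
    line = line.chomp
    column ? line.split(separator)[column - 1] : line
  end
end

select_column("1,10,0.5\n2,20,0.7\n", column: 2, separator: ",")
# => ["10", "20"]
```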

Another type of directive, which isn’t featured in this example, is filtering. If you have a program which outputs lots of lines, and you only care about a certain subset of them, you can filter them. There is more detail on this in the repository README, but suffice to say you can filter for columns of output data to be either in or not in a set of data, which can be defined either as a sequence or a set of values. The columns you filter on need not be selected as data, which means you can filter on data which isn’t presented on the graph.
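The filtering described here boils down to keeping rows whose column value is, or is not, in a given set. A minimal Ruby sketch of that idea follows; `filter_rows` and its parameters are hypothetical, and the actual directive syntax is the one documented in the repository README, not anything shown here.

```ruby
require "set"

# Keep only the rows whose given 1-based column is in `values`, or, with
# negate: true, only the rows whose column is NOT in `values`.
def filter_rows(rows, column:, values:, negate: false)
  allowed = values.map(&:to_s).to_set
  rows.select { |row| allowed.include?(row[column - 1].to_s) != negate }
end

rows = [%w[fast 1 0.2], %w[slow 1 0.9], %w[fast 2 0.4]]
filter_rows(rows, column: 1, values: ["fast"])
# => [["fast", "1", "0.2"], ["fast", "2", "0.4"]]
```

Because the filter looks at the raw row, the column it tests does not have to be one of the columns that ends up on the graph.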

Graphick files can contain multiple graphs by just adding more command directives. Currently, there is no way to share directives between them, so properties like title need to be set for each graph. Here’s an example of two graphs in a single file:

command echo "6 * $num" | bc
title Multiples of six
output six.svg

varying envvar num sequence 1 to 10
data output

command echo "12 * $num" | bc
title Multiples of twelve
output twelve.svg

varying envvar num sequence 1 to 10
data output

As you can see, there’s no need to do anything except add a new command directive – Graphick automatically groups each line with the most recent command above it.
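Grouping lines under the most recent command can be sketched in a few lines of Ruby. This is an illustration of the idea rather than Graphick's real parser; `group_by_command` and the hash layout it returns are invented for the example.

```ruby
# Group the directives of a Graphick-style file into one entry per
# `command` directive, in the order they appear.
def group_by_command(source)
  graphs = []
  source.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("%") # skip blanks and comments

    directive, argument = line.split(" ", 2)
    if directive == "command"
      graphs << { command: argument, directives: [] }
    elsif graphs.empty?
      raise ArgumentError, "directive before any command: #{line}"
    else
      graphs.last[:directives] << [directive, argument]
    end
  end
  graphs
end
```

Run on the two-graph file above, this returns two entries, each holding its own title, output, varying and data directives.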

As an example of generating more complicated graphs, the graph I featured at the start of this post, which was for my coursework, was generated as follows:

command bin/time_fourier_transform "hpce.ad5615.fast_fourier_transform_combined"
title Comparison of varying both types of parallelism in FFT combined
output results/fast_fourier_recursion_versus_iteration.svg
x_label n
y_label Time (s)
series_label Loop K %d, Recursion K %d

varying series envvar HPCE_FFT_LOOP_K vals 1 16
varying series envvar HPCE_FFT_RECURSION_K vals 1 16

data output column 4 separator ,
data output column 6 separator ,

As you can see, adding the series modifier turns the variable into data which is used to plot separate lines, rather than being part of the X or Y axis. There must always be exactly two non-series data sources (where a data source is either a data directive or a varying directive); the first one always represents the X axis and the second the Y axis. You can have any number of series data sources, and every combination of their values creates one line. In this graph, both variables take the values one and sixteen, to create four lines in total. The series_label directive takes a format string. The n-th format specifier (each %d in the string) is replaced with the value used for the n-th series variable in that line’s label.
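The way series values combine into lines and labels can be shown with Ruby's own `format`. This is only a sketch; `series_labels` is a hypothetical helper, not part of Graphick.

```ruby
# Build one label per combination of series values by filling the format
# string with the values in the order the series were declared.
def series_labels(label_format, series_values)
  first, *rest = series_values
  first.product(*rest).map { |combo| format(label_format, *combo) }
end

series_labels("Loop K %d, Recursion K %d", [[1, 16], [1, 16]])
# => ["Loop K 1, Recursion K 1", "Loop K 1, Recursion K 16",
#     "Loop K 16, Recursion K 1", "Loop K 16, Recursion K 16"]
```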

Finally, there is one more directive which is useful: postprocessing. Postprocessing directives allow you to run arbitrary Ruby code to process the resultant data before it is rendered on the graph. Currently, only postprocessing the y axis is supported, but it would be straightforward to add support for postprocessing the x axis and series data. The postprocessing expressions are given three variables – x, y, and s. x and y are the corresponding values for each data point, and s is an array of all series values at that point, ordered by the definition in the file. For example, if you wanted to normalise the y axis by a certain value, you might do this:

postprocess_y y / 2

Or, you might want to divide it by x and add a constant:

postprocess_y y / x + 5

Imagine the postprocess_y directive to be y = and this syntax should be reasonably intuitive.
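One plausible way to evaluate such an expression in Ruby is to `eval` it in a scope where x, y and s are local variables. The sketch below assumes that approach; it is not confirmed to be how Graphick does it, and the method name simply mirrors the directive.

```ruby
# Evaluate a postprocessing expression with x, y and s available as local
# variables, returning the new y value. This uses eval, so it should only
# ever be run on files you trust.
def postprocess_y(expression, x, y, s)
  binding.eval(expression)
end

postprocess_y("y / 2", 4.0, 10.0, [])     # => 5.0
postprocess_y("y / x + 5", 2.0, 10.0, []) # => 10.0
```

Note that with integer inputs Ruby's `/` performs integer division, so data parsed from program output would usually be converted to floats first.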

So, in summary, Graphick is a somewhat powerful tool for generating graphs of program data. You can plot multiple columns of the output, or run the program multiple times to generate multiple outputs – or maybe even a combination of both! Graphick should handle what you throw at it reasonably how you’d expect.

If you come across this, and have feature requests, drop a GitHub issue on the repository, or a comment on this post, and I’ll definitely consider implementing it – especially if it seems like something which is widely useful.

Ruby, a few months on

Sidenote: I’ve pretty much dumped my Thing of the Month plans, because they proved to be too difficult to balance with university work and general life, where I’m trying to branch out more and also trying to be more active in game development. That said, I’m still always trying to learn new things in the software engineering field, as I always have; but just in a less forced and artificial way, which I think does not work as well for me. I’m looking into Kotlin right now, and may put up a post about it sometime soon.

Since I posted about learning Ruby, I think I’m getting rather good at it. My most recent project was a hundred or so line integration test runner for a university compiler project written in Java. It executes the project with different test files, checking the output is as expected, all the while producing a nice output, which overwrites a line in the terminal to give an updating appearance without spamming it. It then proceeds to allow manual test verification, where you can see source code, and the error the compiler produced, to manually verify if the error produced looks sensible and understandable.

Soon, we realised that running such a complicated Java program over 250 times was slow, so I looked into multithreading the script. I was pleasantly surprised by just how easy it was to integrate concurrency into my test runner. It essentially consisted of two additions: wrapping the main logic in a Thread.new block and storing the resulting thread in a list, then making up to 8 threads (though this is variable by a command line parameter) and waiting for the oldest one to finish before making a new one.
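The pattern described above can be sketched roughly as follows. This is a reconstruction from the description, not the test runner's actual code, and `run_limited` is a made-up name.

```ruby
# Run `work` for every item using at most `max_threads` threads at once.
# When the limit is reached, wait for the oldest thread to finish before
# starting another. Results come back in the same order as the items.
def run_limited(items, max_threads: 8, &work)
  running = []
  results = []
  items.each do |item|
    # Thread#value joins the thread and returns the block's result.
    results << running.shift.value if running.size >= max_threads
    running << Thread.new(item, &work)
  end
  results.concat(running.map(&:value))
end

run_limited(%w[a.test b.test c.test], max_threads: 2) { |file| file.upcase }
# => ["A.TEST", "B.TEST", "C.TEST"]
```

Waiting on the oldest thread is simpler than a full worker pool, at the cost of occasionally idling while a slow early test holds up the queue.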

I’ve also started coding a little game akin to how I remember The Hobbit, a lovely game I used to play on a ZX Spectrum emulator when I was pretty young. I emphasise how I remember it, because I recently watched a playthrough and my memories weren’t very reliable, but the main thing that my younger self found attractive was the method of input – you would type something, like “light the fire,” or “kick Gandalf,” and it was like magic – it seemed like it always had a well-written reaction to anything you could imagine, and I can remember being really interested in knowing how it worked. I think I’ve got a rather sensible approach to mimicking it, but I don’t think I could ever hope to achieve the same kind of magic I felt playing that game. Wikipedia is fairly complimentary about its approach:

The parser was very advanced for the time and used a subset of English called Inglish.[5][6] When it was released most adventure games used simple verb-noun parsers (allowing for simple phrases like ‘get lamp’), but Inglish allowed one to type advanced sentences such as “ask Gandalf about the curious map then take sword and kill troll with it”. The parser was complex and intuitive, introducing pronouns, adverbs (“viciously attack the goblin”), punctuation and prepositions and allowing the player to interact with the game world in ways not previously possible.

https://en.wikipedia.org/wiki/The_Hobbit_(1982_video_game)

To anyone who’s interested in retro games, I’d heartily recommend it. It’s truly something I wouldn’t have thought would have existed at the time if I didn’t know about it.

All this is to say that, despite my initial doubts about how long it would last, I still really do love the language, and it’ll definitely be one of my first choices for future projects. I’d probably lean towards C#, C++ or Java for any game that requires better performance, but for most other things I think Ruby is going to indefinitely be one of my favourite choices.

Ruby, and why it quickly became my favourite language

I’d be taken by surprise if I were told by someone that their favourite programming language is one that they’d never written more than a few lines of code in, but that’s my situation right now. Due to a number of unrelated circumstances, I’ve been unable to install and use Ruby, but I’ve been reading a book I obtained recently – Eloquent Ruby, by Russ Olsen – which I’ve found to be a fantastic read. I’m currently about halfway through, and am almost certainly going to pick up some other books in the series – Design Patterns in Ruby, also by Russ, Practical Object-Oriented Design in Ruby, and (though this isn’t in the same series of books), when it’s released, Agile Web Development with Rails 5.

So far, I’ve learnt that Ruby seems to be exactly how I want a programming language to be – very consistent, intuitive, expressive, and clean. As a short history, I began programming in Lua. At the time, I was pretty young – either eight or nine – and didn’t quite grasp the fundamentals of how a programming language is written. I could write code, but it was a short while before I realised that essentially everything was an expression, which could be nested and used in funky ways – meaning I could write lines like (Not that I’m advocating this style, of course)

tab[index + 3] = get_variable(get_function()({ ["a"] = 5 }));

Or to realise that the functions provided to me by my environment (At the time, Roblox), such as their event system, where you’d subscribe a listener in a manner similar to:

object.event:connect(function() … code … end);

Were often something I could manufacture myself, by making a table (Vaguely similar to a hash and an array butchered and stitched together), with a function called “connect” that accepts a function as its parameter. These kinds of complex nested expressions and the use of closures and anonymous functions hadn’t really made much of an impression on me, and the higher-level constructs I was using merely felt like a black magic that just worked. Once I realised this, I gradually drifted to feeling like Lua wasn’t ideal for many things – both in terms of speed, and a limited syntax (allowing for some incredible OO systems such as MiddleClass, but still falling short of true OO languages).

I then transitioned to writing object oriented code with C# and Java, languages that, of course, have methods coupled with data, so I could write code that kept functionality with its associated data; something that felt sensible and correct. I still wasn’t completely satisfied, though. While “everything” (for the most part) was an object, there were still things that I felt I should be able to do that I couldn’t. Primitives, for example, are essentially special cases, and although autoboxing is nice, it’s a bit clunky. While C# hides the detail better than Java (Array<int>, anyone?), it’s still got its own problems.

Another two languages I’ve used widely are PHP (boo!) and Python. With these, I loved how they were object oriented, but you could flexibly pass objects around. I still prefer static typing, and I do think it’s often more optimal for larger projects (if only so your editor can be a lot more intelligent; there are sometimes type annotations, but they always seemed like a poor man’s static typing to me), but I think dynamic typing can, when used well, be a great convenience.

I had a placement this summer at Netcraft, an internet services company in Bath. It introduced me to Perl, a language which I’d heard about but never really been interested in – my first year of university was my first serious venture into the Unix world, and I’d spent most of that trying to hide from calculus and trigonometry, while trying to improve my ability with Haskell, a language we used in our first term.

At first, I really didn’t like Perl. I’m still not overly fussed, but it managed to persuade me and I ended up writing a few scripts in it at home. I find a few things about it rather annoying – it’s inconsistent, too many things have unexpected side effects or set special variables, there are too many ways of expressing the same idea, and the object system is not just unintuitive, but feels completely hacked on (though, in fairness, Moose fixes this; I disagree with the principle that you should have to use a library for something like this). I find it ridiculous that it took decades to add subroutine signatures, and even now they’re considered experimental! I don’t like how I have to think for ages before I can even begin to get the length of an array in a data structure, and when I do, I end up producing code that looks a bit like this:

scalar @{$structure->[{[@@{$}->{a}->$@{ } ]]] ) }->{key}->[3]->{3}}

I jest, but this is certainly how it feels. Even if the speed of thinking comes with practice, it’s still a bit gross how many extra symbols I need to access children of arrayrefs and hashrefs, compared to other languages where you can just nest these kinds of structures effortlessly without thought. Even some of the dedicated Perl community seems to agree here – Perl 6 eliminates a lot of the variation in the symbols to make them at the very least more consistent.

But there are also a lot of things I love about it, and wish more other languages I use had: statement modifiers are a big one. For context, a statement modifier lets you suffix any statement with a little expression like:

print "hello" if should_print_hello;

For some reason, this seems infinitely more elegant than a faux statement modifier in C#, which would be:

if (should_print_hello) print "hello";

Realistically, they’re similar, but the latter feels more bulky, doesn’t read as well, and I’m not particularly keen on if statements with omitted curly braces.

All that said, whenever I have a basic task to do, my first thought is “Hey, I could write a 10 line Perl script to do this!”. The Perl community isn’t lying when it says it makes the “easy things easy” – it really does. This is something I’ve heard almost unanimously from all of my intern colleagues; most of us seem to harbour some level of disdain for Perl, but still want to use it a lot, because it’s just that damn easy. It’s like an infection that grows on you, presumably eventually turning you into a fully fledged Perl monk before you go to live in a monastery and dedicate your life to answering questions on http://perlmonks.org/.

Ruby is a language I’ve wanted to learn ever since I got pretty good at Lua and decided to move to greener pastures. It was for a completely superficial reason: I thought its website was really well designed. Looking at http://lua.org/ and comparing it to http://ruby-lang.org/, you can probably see why I thought this.

For some reason, I slowly came under the impression that I didn’t like the look of Ruby, despite never taking a decent look, and avoided it – until I decided to learn a new thing, and made it Ruby, after seeing a colleague writing some Rails code.

Soon after looking into it, I realised something: Ruby seemed to be just as good a language for writing quick scripts to solve problems as Perl, a trivially superior one for web development thanks to Rails and Sinatra, and seemed to take all the nice features, like statement modifiers, but wrap them in a very consistent object oriented approach, where literally everything behaves like an object. "hello".upcase? Well, "HELLO", of course – no syntax errors to be found here!

I love the loop structures, and how enumerating is handled so elegantly. I love how there’s a culture of writing DSLs (Though the term is used very loosely) to do all sorts of things, from testing, to make tools. Everything I’ve read about the language makes me itch to rewrite all of my code-bases in it, but I think I’ll just settle for using it for personal systems administration and future website development.

No doubt this is just some kind of initial language infatuation, and it may pass, but right now, Ruby is my favourite language, and I’ve yet to even use it properly.

Thing of the Month 1: Ruby on Rails

I’m trying to learn a new thing (language or framework) every month. Each time, I’d like to begin by answering What, Why, Prior Experience, What, Why, and Compromises. Respectively, those are: What language and why, previous experience I have that I think is relevant, what do I hope to build in the process and why, and any compromises I expect I may have to make to succeed. I’m open to varying what I plan to build through the month if I decide what I chose was too optimistic (or even not optimistic enough), or if I have a particularly busy month and don’t find enough time to learn my Thing of the Month.


Month 1: September 2016

What?

I’m going to try to learn two things: Ruby and Rails. I’m cheating a little, because it’s technically still August, but I think I can forgive myself.

Why?

I’ve seen a lot of Ruby and always wanted to give it a try, and I think that it’s important for me to vary my server-side technologies more, as I’ve not used ASP.NET for a long time, and so am mostly limited to PHP, which is something I would like to change going forward.

Prior Experience?

I’ve written MVC code on top of ASP.NET and CakePHP before; in fact, my current main web project, Gamer-Island, is written on top of CakePHP. This should make learning the Rails aspect much simpler. A strong background in scripting languages should assist in learning Ruby. Overall, I think prior experience will make it easier, but certainly not easy, to gain a degree of fluency in Ruby on Rails.

What?

I’d like to remake an old project of mine, which was a Minecraft server administration panel. Servers would get their own subdomain, and it’d monitor users, chat, and logs, which could then be accessed by staff of the server. With this, they could then, for example, issue time bans on users and associate the bans with chat messages, meaning server owners and admins can keep track of bans to ensure they are all fair.

Why?

I used to run a Minecraft server (running the Tekkit modpack). I stopped (due to a change in the EULA not allowing donations in exchange for in-game items on servers, which previously made the server self-sustaining), but I still think there is potential in this idea. I found, during my time running it, that it was difficult to find ‘staff’ who could be trusted to be cool-headed and fair in all situations. Initially, the panel was to ensure my own server’s staff had to provide evidence with their actions, but soon I realised other servers would likely be suffering the same issues. Additionally, Minecraft server plugins all tend to log in their own funky ways. If you don’t capture their messages at run-time, and parse them into a standard form, then the information is dumped in a log file full of a jumble of all different logging formats.

Additionally, I think this provides an ample challenge, as it will require well configured routing, lots of AJAX while maintaining a secure front against CSRF attacks, and configurable levels of access.

Compromises?

I suspect I will have to compromise on the core of the application: I do not believe I will have time to write a Java plugin to hook into servers and securely communicate with the admin panel, uploading user chat, logged information, and others. Instead, I will focus on writing the Ruby end, which would be the web front to the data, and the API for uploading data.

Then, in a future Thing of the Month, I have the option of writing a Minecraft server plugin in Java to upload this data, and create a fully-functioning product.