Graphick – Simple(r) graphing

For one of my courses, I’ve been drawing a load of graphs of a program’s performance; something I also had to do in a course last year. To say the least, it’s a bit of a nightmare.

What stood out to me was that most of the time, the process I was following was almost mechanical — run an application a bunch of times with different inputs, save a portion of the output somewhere, and then either write a script to parse the CSV and generate a graph, or throw it into Excel and generate one by hand. Sometimes I’d throw together a quick shell script to generate the initial data too, but either way it was a lot of context switching between different languages and if I wanted to regenerate the data after a change I also had to mess around to make sure the graph was redrawn.

As we know, though, everything is improved with large configuration files! When in history has a project started with a configuration file and gradually become more and more complicated? Never, of course!

As a result, I thought it would be fun and helpful to develop a reasonably general utility for creating line graphs to analyse program data – whether it be temporal data for performance analysis, or just plotting the output with varying inputs.

Motivating spoiler: A graph which I generated for my coursework using Graphick

I set out with a few goals, and picked up a few more along the way:

  • It should be easy to write the syntax; preferably nicer than GNUPlot
  • It should be able to handle series of data — multiple lines
  • Any output data or input variable should be able to be on the X axis, the Y axis, or part of the series
  • The data series should be able to be varied by more than one variable – i.e. you might have, as depicted in the picture above, four lines which represent varying two different variables.
  • It should be extensible, so it can support new ways of data processing and rendering easily.
  • Data should be cached, so if the same graph is drawn it can be redrawn without re-running the program
    • Ideally, cache per-data-point, but the current implementation of Graphick just caches per-graph based on the variables and program. This can definitely be implemented in future though.

After a week of hacking, Graphick is the result. Graphick parses a simple file and generates graph(s) based on it.

Here’s a simple Graphick file:

command echo "6 * $num" | bc
title Multiples of six

varying envvar num sequence 1 to 10
data output

When you run Graphick with a file like this, it will proceed to generate the data to graph (or load it from a cache if it has previously generated it) by running the program for every combination of inputs.
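To make the caching idea concrete, here is a rough Python sketch of per-graph caching keyed on the command and its variables. It is an illustration under assumptions, not Graphick's actual code: the names cache_key and load_or_generate, the JSON file format and the cache directory layout are all hypothetical.

```python
import hashlib
import json
import os


def cache_key(command, variables):
    """Derive a stable key from the command and the values of its variables."""
    payload = json.dumps({"command": command, "variables": variables}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def load_or_generate(command, variables, generate, cache_dir):
    """Return cached data for this graph, calling generate() only on a cache miss."""
    path = os.path.join(cache_dir, cache_key(command, variables) + ".json")
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    data = generate()  # hypothetical callback that runs the program and collects points
    with open(path, "w") as f:
        json.dump(data, f)
    return data
```

Changing either the command or any variable value changes the key, so an edited graph definition is regenerated while an unchanged one is simply reloaded.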

Each line of a Graphick file, besides blank lines and comments (beginning with a %), represents some kind of directive to the Graphick engine. The most important directive is the command directive. This begins a new graph based on the command following it.

The text after command is what is executed. In this case, it’s actually a short shell script which pipes a short mathematical expression into bc, which is just a built-in calculator program on Unix. Most of the time, you’ll probably write something more like command ./myApplication $variable.

There are a number of ‘aesthetic’ directives – title, x_label, y_label, series_label. The only complicated one is series_label, which I’ll go into later. For the rest, the text following the directive is simply put where you’d expect on the graph.

The varying and data directives are the most important. varying allows you to specify which variables to run the program for. If you have two variables, which each have six values, then the program will be run with every combination of them: thirty-six times. Right now, only environment variables are supported. You write varying envvar <name> <values>. Values can either be a numeric sequence (as in the above example) or a set of values. For example, sequence 1 to 5 or vals 1 2 4 8 15.
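The "every combination" behaviour is a Cartesian product of the value lists. The Python sketch below shows the idea; parse_values and environments are hypothetical names, and it assumes that values are integers and that the end of a sequence is inclusive, neither of which is guaranteed to match Graphick itself.

```python
import itertools


def parse_values(spec):
    """Turn 'sequence 1 to 5' or 'vals 1 2 4 8 15' into a list of integers."""
    words = spec.split()
    if words[0] == "sequence":
        start, end = int(words[1]), int(words[3])
        return list(range(start, end + 1))  # assumption: the end value is included
    if words[0] == "vals":
        return [int(w) for w in words[1:]]
    raise ValueError("unknown value specification: " + spec)


def environments(varying):
    """Yield one environment-variable mapping per combination of values."""
    names = list(varying)
    for combo in itertools.product(*(varying[name] for name in names)):
        yield {name: str(value) for name, value in zip(names, combo)}
```

Each mapping produced by environments would be merged into the process environment before running the command once, so two variables with six values each lead to thirty-six runs.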

Data is the other important one. Only output is supported, currently, which corresponds to lines of stdout. You can also filter for columns, by adding a selection after the directive – for example, data output column 2 separator ,. This would get the second comma-separated column.
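Column selection amounts to splitting each line of stdout and picking one field. A minimal Python sketch follows, assuming columns are counted from one (as "column 2" meaning the second column suggests) and that omitting the separator means splitting on whitespace; select_column is a hypothetical name.

```python
def select_column(stdout, column, separator=None):
    """Return the given 1-based column of every non-empty output line."""
    values = []
    for line in stdout.splitlines():
        if not line.strip():
            continue  # ignore blank lines in the program's output
        fields = line.split(separator)  # None splits on runs of whitespace
        values.append(fields[column - 1].strip())
    return values
```

For instance, select_column(text, 2, ",") mirrors data output column 2 separator ,.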

Another type of directive, which isn’t featured in this example, is filtering. If you have a program which outputs lots of lines, and you only care about a certain subset of them, you can filter them. There is more detail on this in the repository README, but suffice to say you can filter for columns of output data to be either in or not in a set of data, which can be defined either as a sequence or a set of values. The columns you filter on need not be selected as data, which means you can filter on data which isn’t presented on the graph.

Graphick files can contain multiple graphs by just adding more command directives. Currently, there is no way to share directives between them, so properties like title need to be set for each graph. Here’s an example of two graphs in a single file:

command echo "6 * $num" | bc
title Multiples of six
output six.svg

varying envvar num sequence 1 to 10
data output

command echo "12 * $num" | bc
title Multiples of twelve
output twelve.svg

varying envvar num sequence 1 to 10
data output

As you can see, there’s no need to do anything except add a new command directive – Graphick automatically groups each line with the most recent command.

As an example of generating more complicated graphs, the graph I featured at the start of this post, which was for my coursework, was generated as follows:

command bin/time_fourier_transform "hpce.ad5615.fast_fourier_transform_combined"
title Comparison of varying both types of parallelism in FFT combined
output results/fast_fourier_recursion_versus_iteration.svg
x_label n
y_label Time (s)
series_label Loop K %d, Recursion K %d

varying series envvar HPCE_FFT_LOOP_K vals 1 16
varying series envvar HPCE_FFT_RECURSION_K vals 1 16

data output column 4 separator ,
data output column 6 separator ,

As you can see, adding the series modifier allows you to turn the variable into data which is used to plot lines, rather than as part of the X/Y axis. There must always be two non-series data sources (where a data source is either a data directive or a varying directive), and the first one always represents the X axis (the second the Y axis). You can have any number of series data sources, which combine in all combinations to create lines. In this graph, both variables take the values one and sixteen, to create four lines in total. The series_label directive takes a format string. The n-th formatting template (both %d in the string) indicates to put the value used for the n-th series variable at that position in the label.
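The way series variables combine into lines, and how the format string turns each combination into a label, can be sketched in a few lines of Python. line_labels is a hypothetical helper used only for illustration; the format string and values are the ones from the example above.

```python
import itertools


def line_labels(label_format, values_per_series):
    """Return one label per combination of series values, in definition order."""
    return [label_format % combo for combo in itertools.product(*values_per_series)]


labels = line_labels(
    "Loop K %d, Recursion K %d",
    [[1, 16], [1, 16]],  # HPCE_FFT_LOOP_K, then HPCE_FFT_RECURSION_K
)
for label in labels:
    print(label)
# Loop K 1, Recursion K 1
# Loop K 1, Recursion K 16
# Loop K 16, Recursion K 1
# Loop K 16, Recursion K 16
```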

Finally, there is one more directive which is useful: postprocessing. Postprocessing directives allow you to run arbitrary Ruby code to process the resultant data before it is rendered on the graph. Currently, only postprocessing the y axis is supported, but it would be straightforward to add support for postprocessing the x axis and series data. The postprocessing attributes are fed three variables – x, y, and s. x and y are the corresponding values for each data point, and s is an array of all series values at that point, ordered by the definition in the file. For example, if you wanted to normalise the y axis by a certain value, you might do this:

postprocess_y y / 2

Or, you might want to divide it by x and add a constant:

postprocess_y y / x + 5

Imagine the postprocess_y directive to be y = and this syntax should be reasonably intuitive.
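Graphick evaluates these expressions as Ruby, but the idea is easy to mimic in Python for illustration: evaluate the text after the directive with x, y and s in scope, and use the result as the new y. The function below is a hypothetical sketch, and evaluating arbitrary expressions like this is only sensible for files you trust.

```python
def postprocess_y(expression, x, y, s):
    """Evaluate a postprocessing expression and return the new y value."""
    names = {"x": x, "y": y, "s": s}
    # Builtins are disabled so the expression only sees x, y and s.
    return eval(expression, {"__builtins__": {}}, names)


print(postprocess_y("y / 2", 3, 10, []))      # 5.0
print(postprocess_y("y / x + 5", 2, 10, []))  # 10.0
```

One difference to keep in mind: in Python 3, / always produces a float, whereas Ruby's / on two integers performs integer division.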

So, in summary, Graphick is a somewhat powerful tool for generating graphs of program output. You can plot multiple columns of the output, or run the program multiple times to generate multiple outputs, or maybe even a combination of both! Graphick should handle what you throw at it reasonably how you’d expect.

If you come across this, and have feature requests, drop a GitHub issue on the repository, or a comment on this post, and I’ll definitely consider implementing it – especially if it seems like something which is widely useful.

Array length in high-level languages

While pondering writing a standard library for a language I’ve written for my RPG engine, I’ve been stuck considering a question I find pretty interesting – how can one allow a standard library of a language, and ideally only the standard library, to do special operations, like direct memory accesses? Specifically, in my instance, array length accesses, but I was hoping for a more general philosophy.

First, let’s consider the problem of getting the length of an array. In my language, as in many, arrays are a first-class construct, and their allocated length is written in the first word of the array in memory. As a result, getting the length of the array involves reading the word of memory pointed at by the array’s reference.

For example, suppose we had an array of five integers. This would occupy six words of memory: the first set to the length (5), and then the five elements, all initialised to zero.

This is where the problem arises for my standard library: the language allows users to index the array in a fairly typical manner – x[0] through x[4] – but these do not correspond directly to the memory accesses (in fact, the memory accesses are all shifted upwards by one to skip the length). As a result, it is not possible to access the array’s length directly through the language.
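The layout is easy to picture with a small simulation. The Python sketch below models memory as a flat list of words; the Memory class and its method names are hypothetical and stand in for whatever the real runtime does.

```python
class Memory:
    """A flat, word-addressed memory with a trivial bump allocator."""

    def __init__(self):
        self.words = []

    def new_array(self, length):
        address = len(self.words)
        self.words.append(length)        # word 0: the allocated length
        self.words.extend([0] * length)  # words 1..length: zeroed elements
        return address

    def load_element(self, array, index):
        return self.words[array + 1 + index]  # +1 skips the length word

    def store_element(self, array, index, value):
        self.words[array + 1 + index] = value

    def length(self, array):
        return self.words[array]  # the read that user-level indexing cannot express
```

User code only ever reaches load_element and store_element, so no valid index maps onto the length word, which is exactly why the length needs some other route.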

A trivial solution would be to implement some special new syntax – length x – which generates a direct memory read to the array’s address, hence returning the length. But that’s no fun – it involves making the parser more complicated, adding a special case to code generation, and causes what I would call a “break in immersion” when coding – it’s one less thing that is intuitive and natural to users, when they can do array.sort() but not array.length(). Taking this vein of thought further, we could instead parse it as normal, and hijack it during code generation – if we’re generating code for a method call on an array, then we don’t try generating a classic method call, but instead directly output memory access code.

This approach has many benefits, in that it’s trivial to implement, and doesn’t add any special cases to parsing (just code generation), or increase the mental load for users too much. Essentially, to end users, this is a fairly seamless approach, but it still leaves something to be desired – now some array logic, such as sorting, is encoded in higher level code, but some is just hard-coded into the compiler.
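As a sketch of what hijacking the call during code generation might look like, consider the Python function below. The instruction mnemonics, register names and label scheme are all made up for illustration; only the shape of the special case matters.

```python
def generate_method_call(receiver_type, receiver_register, method_name):
    """Return pseudo-assembly for calling method_name on a value."""
    if receiver_type.endswith("[]") and method_name == "length":
        # Special case: no call at all, just read the word the reference points at.
        return ["LOAD r0, [%s]" % receiver_register]
    # General case: pass the receiver and call the method's label.
    return [
        "PUSH %s" % receiver_register,
        "CALL %s_%s" % (receiver_type.replace("[]", "Array"), method_name),
    ]
```

The parser sees array.length() and array.sort() identically; only this one branch in the code generator knows that length is not a real method.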

Maybe that’s an acceptable loss, but I was still interested in how other languages had solved this problem, so I looked into how Java and C# solved this issue.

Java

Java seems to solve this by having a dedicated JVM instruction called arraylength – this is along the lines of what I was saying above, where the compiler hijacks what syntactically looks like a field access. Syntactically, it is next to identical to your average field, but you can use reflection to prove that it’s not actually a field.

C#

C# seems to take a very similar approach to Java (unsurprisingly, given the similarity between the two), with a CIL instruction ldlen (this article http://mattwarren.org/2017/05/08/Arrays-and-the-CLR-a-Very-Special-Relationship/ is a goldmine for related information).

Summary

I really intended to look into quite a lot more languages – specifically Python, Ruby and Lua, but didn’t really have time. Digging through the Python compiler to find the answer was taking me quite a long time. If anyone stumbles upon this and happens to know how they handle it, I’d love a comment.

It does seem like the mainstream approach is just a special case in code generation, though. Personally, I was expecting an approach where verified library code would be able to hold lower-level code in it (like inline assembly in C) to avoid this, but this seems like quite an overkill feature in retrospect.

Film review – Hereditary

This post contains fairly mild spoilers for the film Hereditary. If you hate spoilers, look away!

As a pretty avid horror viewer, I’ve been pretty excited about the release of Hereditary for quite a while. Besides A Quiet Place, which I thought was fantastic, I haven’t seen a good horror film in the cinema for quite a long time, and the reviews I’d seen leading up to the release were all very positive.

That in mind, I went in with fairly high expectations, and coming out I can see pretty well why the reviews are so torn – most viewers seem fairly disappointed, and most critics are raving. The film was certainly great from an artistic perspective, and it told a story without spoon-feeding you what was going on, which is something the horror genre sorely lacks in my view – though I feel it’s quite a borderline horror film and does coast more towards being a mystery drama, which is what Google reports as its genre. Unfortunately, and I’m not sure if this hurt it for all viewers or just me, it was one I feel you have to put serious concentration into, but which doesn’t necessarily invite that throughout.

For example, there were scenes where very serious issues were brought up, but in which the dialog was less than stellar and delivered slightly ineffectually. In fairness, the most memorable of these occurred in a dream, so I do wonder if this was intentional on some level, however it caused the audience to laugh, which broke the tension the scene had been otherwise building.

Actually, audience reaction was a problem throughout. Perhaps it’s my fault for heading over at 7PM on a weekday night, but the cinema had groups of people who laughed during tense moments, made audible jokes, and imitated the characteristic tongue click of the daughter. This was pretty distracting and caused me to lose focus on moments which, in retrospect, were important to pick up on to get the full experience and piece together what was going on during the film. I’ve read a few reviews that mentioned similar problems, and think that in part this is enabled by the film – the aforementioned clicking is repeated just enough to make it memorable (intentionally, of course), and there were plenty of slow, dramatic pans of the camera enabling people to take full advantage of loudly copying this in the cinema.

I’ve focused on this so much because I don’t think other suspenseful films suffer from this problem nearly as badly. In A Quiet Place, for example, the tense scenes really are edge-of-your-seat moments, with a constant fear that any second could be the last for the characters — the precedent of serious danger for this is set up immediately at the beginning of the film, which ensures the audience has no misconception as to how dangerous and tense the setting is: there’s real danger from the start. In contrast, there were many tense scenes in Hereditary which didn’t have a very strong payoff. I don’t think this, in and of itself, is a bad thing, but it can definitely lead to some of the tension being taken less seriously.


Small spoilers follow

Putting that aside, the story was pretty great, but I think the characters could’ve done with some work. The father was the cliché supernatural skeptic who, despite having evidence placed directly in his lap, refuses to believe anything out of the ordinary is going on. The mother was the cliché main character who alternates between seeming fairly put together and completely fallen apart. I think her character was one of the more believable and realistic, but with scenes leading me to not know for sure whether she was a protagonist or an antagonist, it was hard to empathise with her. Altogether, the cast had very few empathetic characters – the son and daughter both lacked a strong personality and seemed detached, the mother and father felt too predictable, and there weren’t enough recurring cast members to cast them against, besides a few students who aren’t fleshed out beyond the fact that they take drugs. That said, Joan, one of the supporting characters, who appears about halfway through the film, is played brilliantly and was one of the better characters I felt – it would’ve been great if she’d had more screentime.

To be clear, I don’t think these characters on their own are written badly or lazily – these clichés are such because they’re, I imagine, fairly realistic reactions to what the characters are going through – but it made too many of the suspenseful scenes too predictable and hence detracted from a lot of the film for me.


More major spoilers follow

Focusing on the story, there were a few issues which really stuck out to me. The first was that around mid-way through, the son, apparently high on marijuana, accidentally kills his sister while frantically driving her to the hospital, by swerving past a post when she has her head out the window. When they move onto the police investigation of this, the believability was ruined for me – because there… wasn’t one? They have a scene for the funeral, so time has definitely progressed enough that there should’ve been one, but apparently this film exists in a universe where manslaughter and reckless & intoxicated driving aren’t crimes worth investigating.

Another scene that bothered me, albeit to nowhere near the same degree, is when the mother’s sleeve lit on fire and not only did she only realise about ten to twenty seconds after the fire started, but she then attempted to put it out by ineffectually brushing part of the sleeve that wasn’t visibly on fire. Sure, it turns out that the fire was supernatural in some way and probably hence not extinguishable directly, but she didn’t know this at the time and I feel she reacted very unnaturally given that.

I feel like a film like this works best when it’s got you concentrating on the edge of your seat, and ruining immersion even for a second can throw the whole atmosphere out the window. Despite this, I think the film really did redeem itself with a fantastically presented story – it’s been a while since I’ve seen a horror which had a story you really had to pay attention for, and I really think it deserves props for that. I’ve certainly focused more on the negatives in this review, but I think that putting these aside it’s definitely worth watching, and one of the better films I’ve seen recently.

All things considered, if I had to rate it, I’d give it around a 4/5, but I would definitely recommend viewing it in a quieter showing or, in future, in your own home. I think that in a better setting, the film would be a lot more suspenseful and enjoyable, but that without a respectful audience, its tension is severely detracted from and a lot of the power of the film is lost.

RPG engine development blog 1 – Inventory

I thought I’d start posting updates on my RPG engine’s development, as it’s nearing a point where a simple game could actually be built on it, and it’s nice to discuss it somewhere. While many features have already been implemented – dialog, maps and movement, NPCs, among others – there’s still a lot to do; dialog is still not as dynamic as I’d like, the scripting language isn’t super great so far, combat doesn’t exist, items barely exist, there’s no way to change area, etc. If you’re interested, more detailed documentation is available in its README on GitHub, or on my projects page.


My first goal was to get some way of visualising the inventory. I’d made a GUI system for the dialog, so I leveraged that existing system and improved it where necessary. There wasn’t a way of having a GUI element render at a fixed position, so I added a container that just offsets its children by a fixed amount, which is used to get the 30 item slots aligning nicely.

Initial stages of inventory development – moving items

After this, I hooked it up to the player’s inventory, and started displaying items from the inventory in the slots. After that, some trivial changes made it so that you can click an item to pick it up and then click elsewhere to drop it. This is visible in the image above.

Next, I added some inventory slots for nearby items on the ground, modified the map and hooked it up to allow the player to drop items where they are standing and pick up close-by items.

Some entity mis-rendering is visible in this video – you can see them clipping on top of the tree and over each other. Currently, entity rendering is done after the entire map is rendered, rather than being interlaced, which leads to this issue. However, dropping items and collecting them works correctly as one may expect.

Next, I made some minor polish to hide the mouse cursor and render held items in the correct place when the player is moving an item.


I’ll be updating this post as I progress through the inventory system, hopefully completing it sometime in the coming weeks.

IC Hack 17 – My first hackathon!

I recently attended IC Hack 17, Imperial College’s Department of Computing Society’s annual hackathon, where I and three others grouped together to create a game.

IC Hack Projection on Queen’s Tower – Picture from @ICHackUK (By Paul Balaji @thepaulbalaji)

The general atmosphere at the event was great – the team had done a fantastic job of making Imperial’s campus feel like an event centre, and gratuitous amounts of plugs (for laptops!), posters, signs, and even a gigantic IC Hack logo projected onto the resident Queen’s Tower gave the venue a strong feeling of organisation and, for the briefest of times, I managed to suspend the belief that I was just sitting in a cafeteria.

The catering deserves a second, more thorough mention – whenever snacks were even remotely near running out, a brand new bunch would appear almost like magic (but perhaps some credit should also be given to the volunteers); not to mention the Domino’s, the dinners provided, and the breakfast of sausages in buns. (Hotdogs?)

In terms of my hack, to show for our 30+ sleepless hours, my team and I created a zombie shooting game which we rather imaginatively called “Zombie Hack,” in Unity with C#. It’s a classic wave-based zombie game, where the waves progressively get bigger and tougher, but with a twist – after you save up enough money, you can buy walls and towers.

The walls and towers would completely block the zombies, but they’re not outsmarted that easily – with some help from Unity (and our resident Unity expert, Marek Beseda), we added pretty awesome path-finding so they’d find their way to you; a bit like a tower defence game!

Unfortunately, as we suspected, playtesting showed a bit of a flaw in our strategy: players would build up towers and create an impenetrable square defence, meaning zombies just walk up to your defences and hang around until they get killed by them.

Fighting a small wave

We initially decided to solve this by just making zombies damage structures, but a combination of zombies dying before they get close, and some difficulties in making the pathfinding engine happy to walk into walls, made this a difficult path. Instead, we created a new variant of zombies to complement the existing 7 (Stupid zombies who have low stats in general, Slow zombies which just walk a bit slowly, Normal zombies that are relatively balanced, Fast zombies who move faster than usual, Fighter zombies who move a tiny bit faster but do a lot more damage, Tank zombies with tens of times more health, and Boss zombies with loads of health, speed and damage), which got colloquially (and semi-officially) named kamikaze zombies. These zombies would spawn with other zombies in their wave, but had a special quirk: unlike other zombies, who only chased the player, these zombies would raycast towards the player when they spawned.

If this raycast hit the player, the zombies act normally, with the exception of blowing up when they get close, immediately killing both themselves and the player. But, and the real quirk is here, if the raycast hit a wall or tower, the zombies go into charge mode – they target the specific building the ray hit, move at 5x their speed, and charge at it. If they succeed (which we found to be around 60-70% of the time), they spawn 2 more of this variant of zombie. This means that even with structures made entirely of towers (which we thought were slightly overpowered), eventually you’d get a few unlucky combinations of tank and kamikaze zombies in a row, and before you know it there are explosions on all sides of your base, and the towers quickly get overwhelmed.
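The decision logic of the kamikaze zombies boils down to a small state choice made at spawn time. The game itself was written in C# for Unity; the Python below is only a sketch of the behaviour described, with hypothetical names (Kamikaze, on_spawn, on_explode) and a plain string standing in for whatever object the ray hit.

```python
from dataclasses import dataclass


@dataclass
class Kamikaze:
    speed: float
    mode: str = "chase"
    target: str = "player"


def on_spawn(zombie, raycast_hit):
    """Pick a behaviour from what the spawn-time raycast towards the player hit."""
    if raycast_hit == "player":
        zombie.mode = "chase"   # behaves normally, explodes when close
        zombie.target = "player"
    else:
        zombie.mode = "charge"  # a wall or tower was in the way
        zombie.target = raycast_hit
        zombie.speed *= 5
    return zombie


def on_explode(zombie):
    """Return how many new kamikaze zombies to spawn after a successful explosion."""
    return 2 if zombie.mode == "charge" else 0
```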

A much bigger horde!

We thought after this that the game-play was fairly exciting and balanced, and we were fairly excited to demo it to other participants and the judges. Whilst we didn’t win anything, it was great to see everyone’s reactions and even better to see how long we managed to keep some of the volunteers occupied!

On the topic of winning, I think it’s definitely worth mentioning the winning team in our category (games) – Karma, a horror game. You can see it here https://devpost.com/software/karma-lsyi81. The polish they managed to produce in just a weekend was incredible.

Another great submission I particularly liked, which unfortunately hasn’t got a video or photographs on DevPost, was Emotional Rollercoaster, a Kinect-based game which would show you an expression name (such as disgust) and a photograph of someone (usually Trump or Clinton) making that expression, which you would then have to try to make. If you managed to convince the Microsoft API they then sent their data to, then you’d get some points – which were displayed in a pretty awesome fashion. They’d created a small roller-coaster in wood, with a car that drove forward a bit when you made the correct expression (varying based on how well) and went back if you didn’t. While their balancing seemed a little off – they had to hold the coaster car back to demo it sufficiently – that’s a pretty small issue and easy to fix, and it still looked pretty fun to play – I’m sure there’s a lot of untapped potential in analysing the expressions of players in a video game.

If you’d like to give my team’s game a go, you can clone it here https://github.com/MarekBeseda/ICHack. It was built for Unity, so you’ll need that too. I’ve built it for Windows users here: http://davies.me.uk/ZombieHackWin64.zip Let me know if you beat my high score of 32 waves! 🙂

High Scores:

Alberto Spina: 110 waves [I do not recommend attempting to beat this score if you want to achieve anything productive in your life]

vTables and runtime method resolution

My first second year university group project recently came to an end, in which we had to implement a fully functional compiler that generated machine code from a relatively simplistic procedural language. It had functions, nested expressions, and a strong static typing system. In addition to typical stack-stored primitive types such as int, char, and bool, it also had two dynamically allocated types: arrays and pairs.

For the final milestone of the compiler, we had two weeks to implement our choice of an extension. We chose, fairly ambitiously, to implement a number of extensions, one of which was a fully working Object Oriented system which I took most of the responsibility for implementing.

In the end, our OO system ended up supporting three main kinds of user-defined types: structs, interfaces, and classes. Structs were relatively simplistic – little more than fixed-size pieces of memory on the heap. Interfaces allowed you to declare methods which a class implementing the interface would have to implement, and classes allowed you to define methods and fields together.

Our classes supported extending a single other class, and implementing any number of interfaces. Semantically, this worked nearly identically to Java – in fact, our implementation ended up creating a language that was essentially a subset of Java. The implementation had what are often referred to as the three pillars of object oriented programming – inheritance, encapsulation (public, private and protected, where private was class-only and protected was class-and-subclass-only), and polymorphism (run-time method dispatch).

This post is primarily about how I structured objects on the heap to allow them to support the run-time method dispatch – this was something we found difficult to research, as most resources about vTables and the like tend to be about C++ and rarely are about the low-level implementation. We found that C++ compilers would often optimise away the vTables, even with optimisations turned off, making it very difficult to analyse the ones it generated. As a result of this, I decided to write a summary of how I went about implementing it, in the hope that it is useful to others.

The system I ended up settling with results in a tiny amount of extra overhead for method calls on classes, but adds a significant overhead on calls to interfaces. This is something that I am sure is not done optimally, and as such I certainly do not wholeheartedly recommend this as a perfect way of laying out objects on the heap.

First, to consider why this is not a trivial problem, consider a basic example:

class A { void printAString() { print "A string" } }
class B extends A { void printAString() { print "Another string" } void anotherFunction() { print "Hello, World!" } }
void main() {
    A a = new B();
    a.printAString();
}

A naïve implementation of this would print “A string” rather than “Another string”, which may be unexpected if you come from a background that leads you to expect the kind of behaviour that run-time dispatch allows. This comes about because the call on “a” seems, to the compiler, to be acting on an A, not a B. This could theoretically be avoided by intelligently (as far as a compiler is concerned) noticing that it’s instantiated as a B, but this causes some problems – what if you have a function accepting an A as a parameter, or you instantiate the A in two mutually exclusive branches differently (for example an “if” that sets it to a B, but an “else” that sets it to an A)? These problems, so far as I am aware, make it nigh-on impossible to implement with compile-time code analysis (though, theoretically, you could generate a huge file that accounts for every branching possibility and expands them cleverly, this would certainly not be viable for large files).

So the way I solved this, and indeed, I believe, the way it is typically solved, is to create a table of methods for each class at compile-time, which is included in the assembly code generated (and hence the runtime, in some form). I implemented it in a fashion such that every class would have a table for itself, and an additional one for each type it “is” – that is, that it extends or implements. In the example above, the following might be generated:

A:
 o_A_printAString
B:
 o_B_printAString
 o_B_anotherFunction

Function labels here are named as “o_ImplementingType_functionName”. As you can see, the superclass function is in the same location in both tables. This means that, were a function calling a method on an object typed as A to use this as a layer of indirection to access A’s method, then it would be “tricked” into calling B’s version instead.
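To illustrate how such tables can be built so that overridden methods keep their slot, here is a small Python sketch. It is a model of the idea rather than the compiler's real code: build_vtable is a hypothetical helper, and it tracks the method name for each slot alongside the labels.

```python
def build_vtable(class_name, methods, parent=None):
    """Return (slot_names, labels) for a class, extending its parent's table."""
    names = list(parent[0]) if parent else []
    labels = list(parent[1]) if parent else []
    for method in methods:
        label = "o_%s_%s" % (class_name, method)
        if method in names:
            labels[names.index(method)] = label  # override keeps the parent's slot
        else:
            names.append(method)                 # new method takes the next slot
            labels.append(label)
    return names, labels


table_a = build_vtable("A", ["printAString"])
table_b = build_vtable("B", ["printAString", "anotherFunction"], parent=table_a)
print(table_a[1])  # ['o_A_printAString']
print(table_b[1])  # ['o_B_printAString', 'o_B_anotherFunction']
```

Because a subclass starts from a copy of its parent's table, any method it does not override keeps the parent's label in the same slot.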

We then stored objects on the heap similarly to the instance of B that follows:

B's type identification number
B's vTable address
A's fields
B's fields
Interface information about B's implemented interfaces

To call a method on an instance of a class, you would have to navigate a word down from the object pointer, load the vTable pointer stored, use this to load the vTable, then add the offset of the method (this is known at compile time – for example, “printAString” will always be the zeroth method in the vTable). Then, load the value stored here into the program counter to jump to the method.
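The call sequence can be simulated in Python with a list standing in for the heap. This is a sketch under simplifying assumptions: fields and interface information are left out of the object, labels are looked up in a dictionary instead of being real addresses, and all names are hypothetical.

```python
output = []

# "Code memory": label -> callable, standing in for addresses of compiled methods.
code = {
    "o_A_printAString": lambda: output.append("A string"),
    "o_B_printAString": lambda: output.append("Another string"),
    "o_B_anotherFunction": lambda: output.append("Hello, World!"),
}

heap = []


def store(words):
    """Append words to the heap and return the address of the first one."""
    address = len(heap)
    heap.extend(words)
    return address


vtable_a = store(["o_A_printAString"])
vtable_b = store(["o_B_printAString", "o_B_anotherFunction"])

TYPE_ID_A, TYPE_ID_B = 1, 2
PRINT_A_STRING = 0  # slot offset, known at compile time


def new_object(type_id, vtable_address):
    # Word 0: type id, word 1: vTable address (fields omitted in this sketch).
    return store([type_id, vtable_address])


def call_method(obj, slot):
    vtable_address = heap[obj + 1]        # one word down from the object pointer
    target = heap[vtable_address + slot]  # add the method's fixed offset
    code[target]()                        # "load into the program counter"


a = new_object(TYPE_ID_B, vtable_b)  # A a = new B();
call_method(a, PRINT_A_STRING)       # a.printAString();
print(output)                        # ['Another string']
```

The caller only knows the slot number for printAString; which function runs is decided by the vTable address stored in the object itself.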

This is trickier for interfaces – since an object can implement many, we can’t just have a fixed position in the vTable for each method. There is undoubtedly a better way to do this, but I chose to put a few pieces of data about them at the end of the object. For each interface, the following was stored:

The interface's type identification number
The address of a custom version of this interface's vTable with correctly overridden methods

Additionally, at the end of an object, a zero is stored, which denotes the end of the object. Interface vTables are looked up by a simple algorithm that adds a fairly significant overhead: look to the interface section using the interface offset information stored at the beginning of the object, and then repeatedly jump two words until either the interface identification number you want is found, or you reach a zero (the zero indicates that an explicit cast has failed – it should never occur on an implicit cast). Once found, jump a single word to get the vTable of the interface, then look down a fixed offset to find the method you are interested in calling. The overhead added by this is somewhere in the region of 6 instructions, plus potentially up to four times the total number of interfaces (depending on how far down the list the one you are interested in is). This is clearly suboptimal. An approach I considered was, when an object is cast to an interface, moving its pointer down to the start of the relevant interface, but this would have been considerably more difficult to integrate into the codebase, as all implicit cast locations would need to be considered.
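As a rough illustration of that search, here is a small Ruby model of it. The object is a flat array of words. The exact layout is an assumption made for the sketch (in particular, keeping the interface section offset in the third header word), and the interface names, ids and helper methods are all hypothetical.

```ruby
# Assumed layout, for illustration only:
# [type id, class vTable, offset of interface section, fields...,
#  (interface id, interface vTable) pairs..., 0]
PRINTABLE_ID = 7 # hypothetical interface ids
SAVEABLE_ID  = 9

PRINTABLE_VTABLE = [-> { "printed" }].freeze
SAVEABLE_VTABLE  = [-> { "saved" }].freeze

OBJECT_B = [
  2, :class_vtable, 5, # header; word 2 points at the interface section
  42, 43,              # fields
  PRINTABLE_ID, PRINTABLE_VTABLE,
  SAVEABLE_ID, SAVEABLE_VTABLE,
  0                    # terminator
].freeze

# Walk the interface section two words at a time until the wanted id
# or the terminating zero is found. Returns nil when the cast fails.
def find_interface_vtable(object, interface_id)
  position = object[2]
  until object[position] == 0
    return object[position + 1] if object[position] == interface_id
    position += 2
  end
  nil
end

# Look up the interface's vTable, then call the method at a fixed slot.
def call_interface_method(object, interface_id, slot)
  vtable = find_interface_vtable(object, interface_id)
  raise TypeError, "interface #{interface_id} not implemented" if vtable.nil?
  vtable[slot].call
end

puts call_interface_method(OBJECT_B, SAVEABLE_ID, 0) # saved
```

The linear walk is where the per-interface overhead comes from: the further down the list the wanted interface sits, the more pairs have to be skipped.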

There is undoubtedly a better way to handle interfaces – I am very interested in finding this out; feel free to contact me if you know of a more optimal way or are wondering about anything I have explained above.

Ruby, a few months on

Sidenote: I’ve pretty much dumped my Thing of the Month plans, because they proved to be too difficult to balance with university work and general life, where I’m trying to branch out more and also trying to be more active in game development. That said, I’m still always trying to learn new things in the software engineering field, as I always have, just in a less forced and artificial way, since the forced approach does not work as well for me. I’m looking into Kotlin right now, and may put up a post about it sometime soon.

Since I posted about learning Ruby, I think I’m getting rather good at it. My most recent project was a hundred or so line integration test runner for a university compiler project written in Java. It executes the project with different test files, checking the output is as expected, all the while producing a nice output, which overwrites a line in the terminal to give an updating appearance without spamming it. It then proceeds to allow manual test verification, where you can see source code, and the error the compiler produced, to manually verify if the error produced looks sensible and understandable.

Soon, we realised that running such a complicated Java program over 250 times was slow, so I looked into multithreading the script. I was pleasantly surprised by just how easy it was to integrate concurrency into my test runner. It essentially consisted of two additions: wrapping the main logic in Thread.new’s block, and then storing each thread in a list, making up to 8 threads (though this is variable by a command line parameter) and waiting for the oldest one to finish before making a new one.
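A minimal sketch of that pattern follows. It is not the actual test runner: `run_test` is a hypothetical stand-in for invoking the compiler on one file, and `run_all` with its `max_threads` keyword is a name made up for the example.

```ruby
# Hypothetical stand-in for running the compiler on one test file.
def run_test(test_file)
  "#{test_file}: ok"
end

# Run every test in its own thread, keeping at most max_threads alive.
def run_all(test_files, max_threads: 8)
  threads = []
  results = []
  test_files.each do |test_file|
    # Wait for the oldest thread before starting another one.
    results << threads.shift.value if threads.size >= max_threads
    threads << Thread.new { run_test(test_file) }
  end
  # Collect whatever is still running.
  threads.each { |thread| results << thread.value }
  results
end

puts run_all(%w[a.test b.test c.test], max_threads: 2)
```

`Thread#value` waits for the thread to finish and returns the result of its block, so the results come back in the order the tests were started.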

I’ve also started coding a little game akin to how I remember The Hobbit, a lovely game I used to play on a ZX Spectrum emulator when I was pretty young. I emphasise how I remember it, because I recently watched a playthrough and my memories weren’t very reliable, but the main thing that my younger self found attractive was the method of input – you would type something, like “light the fire,” or “kick Gandalf,” and it was like magic – it seemed like it always had a well-written reaction to anything you could imagine, and I can remember being really interested in knowing how it worked. I think I’ve got a rather sensible approach to mimicking it, but I don’t think I could ever hope to achieve the same kind of magic I felt playing that game. Wikipedia is fairly complimentary about its approach:

The parser was very advanced for the time and used a subset of English called Inglish.[5][6] When it was released most adventure games used simple verb-noun parsers (allowing for simple phrases like ‘get lamp’), but Inglish allowed one to type advanced sentences such as “ask Gandalf about the curious map then take sword and kill troll with it”. The parser was complex and intuitive, introducing pronouns, adverbs (“viciously attack the goblin”), punctuation and prepositions and allowing the player to interact with the game world in ways not previously possible.

https://en.wikipedia.org/wiki/The_Hobbit_(1982_video_game)

Anyone who’s interested in retro games, I’d heartily recommend it. It’s truly something I wouldn’t have thought could have existed at the time if I didn’t already know about it.

All this is to say that, despite my initial doubts about how long it would last, I still really do love the language, and it’ll definitely be one of my first choices for future projects. I’d probably lean towards C#, C++ or Java for any game that requires better performance, but for most other things I think Ruby is going to indefinitely be one of my favourite choices.

Ruby, and why it quickly became my favourite language

I’d be taken by surprise if I were told by someone that their favourite programming language is one that they’d never written more than a few lines of code in, but that’s my situation right now. Due to a number of unrelated circumstances, I’ve been unable to install and use Ruby, but I’ve been reading a book I obtained recently – Eloquent Ruby, by Russ Olsen – which I’ve found to be a fantastic read. I’m currently about halfway through, and am almost certainly going to pick up some other books in the series – Design Patterns in Ruby, also by Russ, Practical Object-Oriented Design in Ruby, and (though this isn’t in the same series of books), when it’s released, Agile Web Development with Rails 5.

So far, I’ve learnt that Ruby seems to be exactly how I want a programming language to be – very consistent, intuitive, expressive, and clean. As a short history, I began programming in Lua. At the time, I was pretty young – either eight or nine – and didn’t quite grasp the fundamentals of how a programming language is written. I could write code, but it was a short while before I realised that essentially everything was an expression, which could be nested and used in funky ways – meaning I could write lines like the following (not that I’m advocating this style, of course):

tab[index + 3] = get_variable(get_function()({ ["a"] = 5}));

Or to realise that the functions provided to me by my environment (At the time, Roblox), such as their event system, where you’d subscribe a listener in a manner similar to:

object.event:connect(function() … code … end);

Were often something I could manufacture myself, by making a table (vaguely similar to a hash and an array butchered and stitched together), with a function called “connect” that accepts a function as its parameter. These kinds of complex nested expressions and the use of closures and anonymous functions hadn’t really made much of an impression on me, and the higher-level constructs I was using merely felt like a black magic that just worked. Once I realised this, I gradually drifted to feeling like Lua wasn’t ideal for many things – both in terms of speed, and a limited syntax (allowing for some incredible OO systems such as MiddleClass, but still falling short of true OO languages).
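For comparison, the same trick looks roughly like this in Ruby: an object that remembers its listeners, with a “connect” method that accepts an anonymous function as a block. This is only an analogue of the idea and not Roblox’s actual event system. The `Event` class, its `fire` method and the `touched` example are invented for the sketch.

```ruby
# A hand-rolled event: just an object that remembers its listeners.
class Event
  def initialize
    @listeners = []
  end

  # Subscribe a listener, passed as a block (an anonymous function).
  # Returns self so that calls can be chained.
  def connect(&listener)
    @listeners << listener
    self
  end

  # Call every listener, in subscription order, with the given arguments.
  def fire(*args)
    @listeners.each { |listener| listener.call(*args) }
  end
end

touched = Event.new
touched.connect { |who| puts "#{who} touched the object" }
touched.fire("Player1") # Player1 touched the object
```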

I then transitioned to writing object oriented code with C# and Java, languages that, of course, have methods coupled with data, so I could write code that kept functionality with its associated data; something that felt sensible and correct. I still wasn’t completely satisfied, though. While “everything” (for the most part) was an object, there were still things that I felt I should be able to do that I couldn’t. Primitives, for example, are essentially special cases, and although autoboxing is nice, it’s a bit clunky. While C# hides the detail better than Java (Array<int>, anyone?), it’s still got its own problems.

Another two languages I’ve used widely are PHP (boo!) and Python. With these, I loved how they were object oriented, but you could flexibly pass objects around. I still prefer static typing, and I do think it’s often more optimal for larger projects (if only so your editor can be a lot more intelligent; there are sometimes type annotations, but they always seemed like a poor man’s static typing to me), but I think dynamic typing can, when used well, be a great convenience.

I had a placement this summer at Netcraft, an internet services company in Bath. It introduced me to Perl, a language which I’d heard about but never really been interested in – my first year of university was my first serious venture into the Unix world, and I’d spent most of that trying to hide from calculus and trigonometry, while trying to improve my ability with Haskell, a language we used in our first term.

At first, I really didn’t like Perl. I’m still not overly fussed, but it managed to persuade me and I ended up writing a few scripts in it at home. I find a few things about it rather annoying – it’s inconsistent, too many things have unexpected side effects or set special variables, there are too many ways of expressing the same idea, and the object system is not just unintuitive, but feels completely hacked on (though, in fairness, Moose fixes this; I disagree with the principle that you should have to use a library for something like this). I find it ridiculous that it took decades to add method signatures, and even now they’re considered experimental! I don’t like how I have to think for ages before I can even begin to get the length of an array in a data structure, and when I do, I end up producing code that looks a bit like this:

scalar @{$structure->[{[@@{$}->{a}->$@{ } ]]] ) }->{key}->[3]->{3}}

I jest, but this is certainly how it feels. Even if the speed of thinking comes with practice, it’s still a bit gross how many extra symbols I need to access children of arrayrefs and hashrefs, compared to other languages where you can just nest these kinds of structures effortlessly without thought. Even some of the dedicated Perl community seems to agree here – Perl 6 eliminates a lot of the variation in the symbols to make them at the very least more consistent.

But there are also a lot of things I love about it, and wish more of the other languages I use had: statement modifiers are a big one. For context, a statement modifier lets you suffix any statement with a little expression like:

print "hello" if should_print_hello;

For some reason, this seems infinitely more elegant than a faux statement modifier in C#, which would be:

if (should_print_hello) print "hello";

Realistically, they’re similar, but the latter feels more bulky, doesn’t read as well, and I’m not particularly keen on if statements with omitted curly braces.

All that said, whenever I have a basic task to do, my first thought is “Hey, I could write a 10 line Perl script to do this!”. The Perl community isn’t lying when it says it makes the “easy things easy” – it really does. This is something I’ve heard almost unanimously from all of my intern colleagues; most of us seem to harbour some level of disdain for Perl, but still want to use it a lot, because it’s just that damn easy. It’s like an infection that grows on you, presumably eventually turning you into a fully fledged Perl monk before you go to live in a monastery and dedicate your life to answering questions on http://perlmonks.org/.

Ruby is a language I’ve wanted to learn ever since I got pretty good at Lua and decided to move to greener pastures. It was for a completely superficial reason: I thought its website was really well designed. Looking at http://lua.org/ and comparing it to http://ruby-lang.org/, you can probably see why I thought this.

For some reason, I slowly came under the impression that I didn’t like the look of Ruby, despite never taking a decent look, and avoided it. That lasted until I decided to learn a new thing, and made it Ruby, after seeing a colleague writing some Rails code.

Soon after looking into it, I realised something: Ruby seemed to be just as good a language for writing quick scripts to solve problems as Perl, a trivially superior one for web development thanks to Rails and Sinatra, and seemed to take all the nice features, like statement modifiers, but wrap them in a very consistent object oriented approach, where literally everything behaves like an object. “hello”.upcase? Well, “HELLO”, of course – no syntax errors to be found here!
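As a small taste of both points, here is a sketch of Ruby’s statement modifiers alongside a few method calls made directly on literals. The variable name is borrowed from the Perl example earlier; everything else is only illustrative.

```ruby
should_print_hello = true

# Statement modifiers, much like the Perl one.
puts "hello" if should_print_hello
puts "never shown" unless should_print_hello

# Literals are objects too, so methods can be called on them directly.
puts "hello".upcase                    # HELLO
puts 3.times.map { |i| i * 2 }.inspect # [0, 2, 4]
puts nil.to_a.inspect                  # []
```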

I love the loop structures, and how enumerating is handled so elegantly. I love how there’s a culture of writing DSLs (Though the term is used very loosely) to do all sorts of things, from testing, to make tools. Everything I’ve read about the language makes me itch to rewrite all of my code-bases in it, but I think I’ll just settle for using it for personal systems administration and future website development.

No doubt this is just some kind of initial language infatuation, and it may pass, but right now, Ruby is my favourite language, and I’ve yet to even use it properly.

Thing of the Month 1: Ruby on Rails

I’m trying to learn a new thing (language or framework) every month. Each time, I’d like to begin by answering What, Why, Prior Experience, What, Why, and Compromises. Respectively, those are: What language and why, previous experience I have that I think is relevant, what do I hope to build in the process and why, and any compromises I expect I may have to make to succeed. I’m open to varying what I plan to build through the month if I decide what I chose was too optimistic (or even not optimistic enough), or if I have a particularly busy month and don’t find enough time to learn my Thing of the Month.


Month 1: September 2016

What?

I’m going to try to learn two things: Ruby and Rails. I’m cheating a little, because it’s technically still August, but I think I can forgive myself.

Why?

I’ve seen a lot of Ruby and always wanted to give it a try, and I think that it’s important for me to vary my server-side technologies more, as I’ve not used ASP.NET for a long time, and so am mostly limited to PHP, which is something I would like to change going forward.

Prior Experience?

I’ve written MVC code on top of ASP.NET and CakePHP before; in fact, my current main web project, Gamer-Island, is written on top of CakePHP. This should make learning the Rails aspect much simpler. A strong background in scripting languages should assist in learning Ruby. Overall, I think prior experience will make it easier, but certainly not easy, to gain a degree of fluency in Ruby on Rails.

What?

I’d like to remake an old project of mine, which was a Minecraft server administration panel. Servers would get their own subdomain, and it’d monitor users, chat, and logs, which could then be accessed by staff of the server. With this, they could then, for example, issue time bans on users and associate the bans with chat messages, meaning server owners and admins can keep track of bans to ensure they are all fair.

Why?

I used to run a Minecraft server (running the Tekkit modpack). I stopped (due to a change in the EULA not allowing donations in exchange for in-game items on servers, which previously made the server self-sustaining), but I still think there is potential in this idea. I found, during my time running it, that it was difficult to find ‘staff’ who could be trusted to be cool-headed and fair in all situations. Initially, the panel was to ensure my own server’s staff had to provide evidence with their actions, but soon I realised other servers would likely be suffering the same issues. Additionally, Minecraft server plugins all tend to log in their own funky ways. If you don’t capture their messages at run-time, and parse them into a standard form, then the information is dumped in a log file full of a jumble of all different logging formats.

Additionally, I think this provides an ample challenge, as it will require well configured routing, lots of AJAX while maintaining a secure front against CSRF attacks, and configurable levels of access.

Compromises?

I suspect I will have to compromise on the core of the application: I do not believe I will have time to write a Java plugin to hook into servers and securely communicate with the admin panel, uploading user chat, logged information, and other data. Instead, I will focus on writing the Ruby end, which would be the web front to the data, and the API for uploading data.

Then, in a future Thing of the Month, I have the option of writing a Minecraft server plugin in Java to upload this data, and create a fully-functioning product.