Use The Force(-Directed Layout)

I have recently been playing around with Haskell, and after writing some small programs I thought I’d try something more medium sized. The idea of combining Haskell with trains was irresistible, so I created a project called tracks, which is a simulation that runs a number of trains on a rail network in real time.

I have got as far as loading up the rail network and service information from some files and then using Concurrent Haskell to run the trains on their own threads, with the signalling shared using STM. At the moment this just spits out the train events to stdout, but the plan is to do something a bit more pretty: to display a map of the network, with the trains shown running along the lines.

The network data doesn’t include any coordinates for the stations, only their logical connections. I wanted to be able to quickly put together networks, particularly real world networks. Geographic coordinates don’t tend to make very nice schematic layouts, and to reuse layouts from the official maps would probably run foul of copyrights. Besides, having to go though the fuss of finding out and entering all the coordinates would be a bit dull. It would be much nicer for the layout to be generated automatically. To that end, I wrote an implementation of a force-directed graph layout algorithm, and what follows is what I discovered.

The Algorithm

The basic principle of force-directed layout is to set up a physical simulation where the vertices in the graph repel each other, but the edges between them act like springs that pull them together. After a suitable amount of simulated time, the forces should have arranged the graph so that everything is nicely spaced out and the vertices that are logically close to each other are also physically close. The overall result should be somewhat aesthetically pleasing and clearly show what is going on.

The particular algorithm that I decided to use was described in the paper Graph Drawing by force-directed Placement1, the outline of which is:

  • Every vertex is initially given a random position within the layout area
  • Each pair of vertices repels each other with a force of k²/d
  • Each adjacent pair of vertices attract each other with a force of d²/k
  • A “temperature” limits the movement at each step
  • “Cooling” reduces the temperature over time as the system converges towards a solution

In the forces, d is the distance between the vertices and k is given by C × √(area/num_vertices). C is “found experimentally”.

One thing to note is that while this is inspired by a physical system, it is not actually simulating one. The “forces” don’t accelerate the vertices as F = ma would dictate, but are simply used as a displacement at each step. I did play with this idea later on, however.

The Implementation

The pseudo-code in the paper was fairly straight forward to translate to Haskell; the highlights are in the snippet below. The main difference is that rather than keeping track of the displacements for each vertex as they are repelled and then attracted, all the forces at each vertex are collected together in vertexForces and then applyForces just needs to use this as it maps over the current positions to generate the new ones.

It turns out that the temperature is quite important to the algorithm working at all. I initially left this out as it didn’t seem that important, but even very simple graphs would degrade into nonsense very quickly without it.

The paper does not directly cover the details of the cooling, so I decided to use a simple exponential decay by multiplying the temperature at each step by some value in (0, 1). The cooling would stop once the temperature reached some fraction of it’s original value. This is described as quenching and simmering.

This code misses out some of the details, you can see the full and final version on GitHub.

Once I had implemented the algorithm as outlined in the paper, simple graphs would converge to reasonable layouts.

Can you tell what this is supposed to be?

Can you tell what this is supposed to be?

A slightly more complicated network. This didn’t come out too bad.

Adjustments

My intention for this implementation was to layout relatively complex networks, in particular the London Underground. This proved to be something of a challenge and it took a while to adjust the constants and get a feel for how tweaking the algorithm would affect the speed of convergence and the quality of the solution at the end.

One of the main problems that algorithms of this type can suffer from is converging on poor local minima; if the graph gets stuck in a bad configuration, no amount of iterating will get it out. I had initially thought that keeping the temperature higher for longer would give the vertices more energy to jump out of local minima and into better configurations, but all this actually did was greatly increase the time it took to converge. Once the graph started to take shape from the initial random layout, the vertices would begin a cycle of jumping between two positions as they were pulled back and forth by their neighbours. With higher temperatures, this jumping was more extreme, with most of the energy going into the jittering and the graph never becoming stable enough to converge properly.

Using a cooling value of 0.99 reduced the temperature fairly quickly which allowed the graph to converge to something sensible in a reasonable amount of time. Keeping the graph simmering at 10% of the staring temperature allowed it enough energy to explore better solutions.

In an effort reduce the jittering further, I introduced damping by scaling the displacement by 0.2.

Just introducing damping gave this much nicer result in about half as many iterations

Just introducing damping gave this much nicer result in about half as many iterations.

The mysterious value of C is never elaborated on in the paper. The constant k which it scales is the “ideal” distance between adjacent vertices (the repulsive force added to the attractive force is 0 when d = k). Setting this to 0.3 worked nicely for very small networks, but for the Tube I ended up using 0.07. The only problem with this is that the small networks end up being tiny within the space for the layout. It seems that there could be a better formula for calculating k given the input graph.

Getting Physical

As mentioned above, this is not really a physical simulation, since that doesn’t give us the results we want. But I was curious to see what would happen if the forces were actually used affect the velocity of the vertices rather than as displacements. To achieve this I made some changes a bit like this:

The result of doing this was predictably not very useful. When the vertices start out in random positions they begin flying around under the influence of a strange sort of orbital mechanics, never getting anywhere close to a sensible arrangement. Less charitably, it’s just noise.

This. Forever.

This. Forever.

I was undaunted though, and wondered what would happen if the graph was somewhere closer to a nice layout before the physical simulation started. I updated applyForces again so that it used the original function initially and then switched to the new one after a number of iterations.

This looked a little better. It had the effect of loosening up the graph, which is what I was hoping for, though after a while the outer parts would start to curl up and flap around and the whole thing would start to become a chaotic mess again.

Switching back to the original function after this would straighten things out again fairly quickly. I ended up switching between the two functions every 1000 iterations. If nothing else, the physical simulation phase adds some noise into the graph which seems to help the regular phase get to a better solution. Smaller graphs converge before this point anyway, but larger and more complex graphs like the Tube benefited from this.

Results


This video shows the algorithm running on the Tube network. You can see the change in behaviour as it switches between the two methods every 1000 iterations. It seems to reach the best layout at around 4300 iterations; after that it continues to go through slight adjustments, but no overall improvement in the layout.

How will tourists get on with this?

How well will tourists get on with this? I’m not sure Harry Beck would be impressed.

You can also download the full map.

I’m quite pleased with the result, particularly the fact that it is actually possible to make out what is going on even in the busiest parts of the network. It’s also interesting that the overall shape isn’t too far off real life, although the residents of West Ruislip might take a while to get used to living so far from their neighbours in the rest of Ruislip.

Further Work

Now that I’ve got this working for correctly, I’d like to take a look at reducing the running time. The Map.mapWithKey part of applyForces should be fairly easy to parallelise to make good use of available cores. There are also a lot of Map lookups to retrieve vertex positions and adjacency which could probably be reduced by rearranging the code. After that, it might be possible to make the lookups more efficient by using a hash map, which would mean moving parts into the ST monad.

An improvement in the paper which I did not try was to ignore the repulsive forces from vertices far away from each other. This reduces the complexity of this part from O(n²) to something more like O(n). I’d be interested to see how much difference these changes could make from my initial implementation.

After that, on to laying out and simulating more rail networks!

  1. T. Fruchterman and E. Reingold, “Graph drawing by force-directed placement”, Software—Practice and Experience, 21 (11), 1129–1164, 1991. doi:10.1002/spe.4380211102 PDF

Leave a Reply

Your email address will not be published. Required fields are marked *