Sunday, January 20, 2013

Traveling Santa

Kaggle.com hosted a fun coding competition called Traveling Santa.

The goal is to solve a given traveling salesman (TSP) instance with 150,000 cities and some additional rules:

  • The solution is a path rather than a cycle (like it’s more common for TSP).
  • The problem is to find not just one, but two paths and they must have no edge in common. The measure to minimize is the length of the longest of the two paths.

The cities are provided as coordinates in the 2D plane.

SantaGeese

When plotted, the 150,000 points display a drawing of Santa (originally by Thomas Nast).

Results

Here is the first path from my final submission, which scored 6,547,910 and placed 5th in the competition:

path1

You may notice the path crossing itself. This would be a non-optimal TSP solution, but in this problem having to accommodate two paths makes everything a bit more complex and even optimal (pairs of) paths can contain crossings.

The winning team posted a solution of length 6,526,972. They also solved a relaxed problem that should give a lower bound for the optimal solution and got a value of 6,525,446. This means my solution is from 0.32% to 0.34% worse than optimal. With a margin of just 0.02%, it’s amazing how close the winning team got to the optimum!

First implementation

As the fun of solving a puzzle and coding are more important for me than trying to get a good placement, I reserved the first couple of days to just brainstorming and developing some intuitive ideas without looking at the literature. I had studied the linear programming formulation of TSP at the university, but decided to try a different approach, especially considering the Euclidean nature of this instance.

2-opt

The first two obvious ideas were to divide the area in smaller squares and use nearest neighbor to build the paths in each of them. I then removed crossings in the paths (where the uncrossing move was not blocked by the other path) and “cut corners”, i.e. cutting a city out of a corner in the path and rather moving it to split another closer nearby edge. This gave solutions with a score around 7,000,000 and getting two paths of comparable lengths required much tweaking.

Corner opt

Unless the paths have very long edges, these optimizations can only involve cities that are close to each other, so I used a pre-computed table listing for each city the 200 (later increased to 1200) closest cities and their distances. To track which edges were not available because they belonged to the other path, I just used a hash table.

I then did a bit of online reading, finding I had implemented 2-opt and 2.5-opt.I found a very good TSP tutorial by David S. Johnson and Lyle A. McGeoch at http://www2.research.att.com/~dsj/papers/TSPchapter.pdf. It has clear and complete explanations and a wealth of experimental results. The paper claims that the state of the art is the Lin-Kernighan algorithm (LK) and variants. This paper also discouraged me from trying a simulated annealing approach (slower than LK in their experiments) and showed a further 2-opt code optimization that allowed my implementation to complete in a fraction of a second.

Final implementation

To generate the initial paths I replaced nearest neighbor with a greedy algorithm where the two paths alternate in choosing the next shortest available edge. Some randomness was also added to the choice.

It may seem that always choosing the shortest edge should lead to a good solution, but the algorithm proceeds creating many disjoint paths and most short edges need to be discarded as they would create forks or cycles… at some point the multiple segments of the same path grow blocking each other, plus many good choices are blocked by the other path and the algorithm ends up adding some very long edges.

The greedy initial paths have comparable lengths of about 7,645,000 and are nice starting points for local optimizations. Alternating 2-opt on the two paths until no further move was possible improved the solution to a score of about 6,890,0000. This animation shows one of the two paths created by the greedy algorithm before and after 2-opt:

Greedy

Up to this point, the total execution time is 16 seconds (single core, but excluding the long time it may take to pre-compute the  table of neighboring cities), 6 seconds for the greedy algorithm and a total of 10 seconds for multiple alternating 2-opt passes.

Adding 2.5-opt increased the run time by 11 minutes but only improved the solution by about 50,000. This pattern continued the following days: each additional optimization required more and more computation time while providing diminishing returns.

To improve my result, I tried random perturbations of the city coordinates, using 2 and 2.5 opt to change the paths, then restoring the original coordinates and using the same optimizations again, in the hope that this would help untangle the two paths. This improved the solution, but was too slow.

I then considered looking closer at areas where 2-opt moves were blocked by the other path or implementing 3-opt, but in the end I decided for a simpler and more general mechanism: given a set of cities, remove all edges between them and exhaustively try all possible ways of rewiring them.

In the next post, I’ll describe this rewiring mechanism.