Tackling the travelling salesman problem: introduction

This is the first part in my series on the “travelling salesman problem” (TSP). An outline of what I plan to cover can be seen in the prologue.

To kick things off here’s a quick quote:

The traveling salesman problem (TSP) asks for the shortest route to visit a collection of cities and return to the starting point. Despite an intensive study by mathematicians, computer scientists, operations researchers, and others, over the past 50 years, it remains an open question whether or not an efficient general solution method exists. [1]

The TSP is an NP-Hard Problem. That does not necessarily mean any one instance of the problem will be hard to solve, it just means that we do not currently have an algorithm that can give us the guaranteed best solution for all problems in “polynomial time”. We can’t make predictions about how long it might take to find the best solution to the TSP from just looking at the data. We have no way of knowing how long a problem that is twice as large as one that took 2 minutes to solve will take.

Although we might not be able to make predications about finding the best solution, we often only want a good solution to the TSP. We aren’t always so worried if we find a route amongst 1000 cities that is only a few miles longer than the best solution – particularly if it would take an inordinate amount of computing time to get from the good solution we already have to the best solution.

stochastic optimisation

As we often just want a good solution fairly quickly we can turn to stochastic optimisation methods. We take randomly generated routes through the cities and incrementally improve/change them in some fashion to search for a better route. How these changes are made depends on the algorithm being used, but there are a couple of simple approaches we can take, that I will outline here:

  • swapping the position of two cities on a route
  • reversing the order of an entire section of a route

These simple operators – so called as they “operate” on a route to create a new one – will form the initial basis of my code. There are more complex operators, but for simplicity I shall not talk about them here.

finally on with some code….

I will be using standard Python lists to represent a route (or tour as I refer to it in my code – a name borrowed from graph theory) through a collection of cities. Each city will simply be assigned a number from 0 to N-1 (where N is the number of cities) and therefore our list of cities will be a list of unique numbers between 0 and N-1.

We also need to specify a “distance matrix” that we can use to find out the distance between two cities. To generate a distance matrix for a set of x,y co-ordinates the following will do nicely:

def cartesian_matrix(coords):
    '''create a distance matrix for the city coords
      that uses straight line distance'''
    matrix={}
    for i,(x1,y1) in enumerate(coords):
        for j,(x2,y2) in enumerate(coords):
            dx,dy=x1-x2,y1-y2
            dist=sqrt(dx*dx + dy*dy)
            matrix[i,j]=dist
    return matrix


cartesian_matrix() takes a Python list of (x,y) tuples and outputs a Python dictionary that contains the distance between the distances between any two cities:

>>> matrix=cartesian_matrix([(0,0),(1,0),(1,1)])
>>> print matrix
{(0, 1): 1.0, (1, 2): 1.0, (0, 0): 0.0, (2, 1): 1.0, 
(1, 1): 0.0, (2, 0): 1.4142135623730951, (2, 2): 0.0, 
(1, 0): 1.0, (0, 2): 1.4142135623730951}
>>> print matrix[1,2]
1.0


Where matrix[1,2] gives the distance between city number 1 and city number 2. In our case this is the same as matrix[2,1], but for some TSP’s it may not be (for example if there is a one way street between locations/cities that means we have to take a long way round when going in one direction).

In addition to generating the distance matrix we will probably also want to read the city co-ordinates from a text file (one x,y per line):

def read_coords(coord_file):
    coords=[]
    for line in coord_file:
        x,y=line.strip().split(',')
        coords.append((float(x),float(y)))
    return coords

That should be sufficient for generating distance matrices for now. On real world problems generating a distance matrix may be more involved – you might need to take map data and calculate what the actual distance by road between any two cities is. This process can be done offline, before we start optimising our routes and is a subject for another time.

Ok, so now we can read in a list of cities from a file and generate our distance matrix. What next? Well it would be good if we knew how long a route was:

def tour_length(matrix,tour):
    total=0
    num_cities=len(tour)
    for i in range(num_cities):
        j=(i+1)%num_cities
        city_i=tour[i]
        city_j=tour[j]
        total+=matrix[city_i,city_j]
    return total


Where matrix is a distance matrix and tour is a list of cities (as integers).

implementing the operators

I am going to implement the two operators as generator functions, that will return (in a random order) all of the possible versions of a route that can be made in one step of the operator. By using a generator function we can assess each different possibility and perhaps decide to not generate any more variations – which saves us the overhead of generating all of the combinations in one go. Both operators rely on the following generator function:

def all_pairs(size,shuffle=random.shuffle):
    r1=range(size)
    r2=range(size)
    if shuffle:
        shuffle(r1)
        shuffle(r2)
    for i in r1:
        for j in r2:
            yield (i,j)


Which will generate all pairings of the numbers from 0 to size as (i,j) tuples in a random order (needs the random module imported to work).

So each operator then looks like:

def swapped_cities(tour):
    '''generator to create all possible variations
      where two cities have been swapped'''
    for i,j in all_pairs(len(tour)):
        if i < j:
            copy=tour[:]
            copy[i],copy[j]=tour[j],tour[i]
            yield copy

and

def reversed_sections(tour):
    '''generator to return all possible variations 
      where the section between two cities are swapped'''
    for i,j in all_pairs(len(tour)):
        if i != j:
            copy=tour[:]
            if i < j:
                copy[i:j+1]=reversed(tour[i:j+1])
            else:
                copy[i+1:]=reversed(tour[:j])
                copy[:j]=reversed(tour[i+1:])
            if copy != tour: # no point returning the same tour
                yield copy

Note I'm using copy=tour[:] to duplicate the route, rather than copy=list(tour) (or similar). Although I am currently using a Python list to represent a tour I may in the future want to use something else that is merely “list-like”, in which case I could just override __getitem__ to return an object of the appropriate type.

Using both operators looks like:

>>> for tour in swapped_cities([1,2,3]):
...     print tour
... 
[3, 2, 1]
[2, 1, 3]
[1, 3, 2]
>>> for tour in reversed_sections([1,2,3]):
...     print tour
... 
[3, 2, 1]
[2, 1, 3]
[2, 3, 1]
[3, 1, 2]
[1, 3, 2]


As you can see reversed_sections gives us a lot more possibilities - even with only 3 cities. I'll compare how these two operators stack up in the next part of the series.

additional code

The last piece of utility code for today generates a .png of a route overlaid on top of the city co-ordinates (using PIL):

def write_tour_to_img(coords,tour,img_file):
    padding=20
    # shift all coords in a bit
    coords=[(x+padding,y+padding) for (x,y) in coords]
    maxx,maxy=0,0
    for x,y in coords:
        maxx=max(x,maxx)
        maxy=max(y,maxy)
    maxx+=padding
    maxy+=padding
    img=Image.new("RGB",(int(maxx),int(maxy)),color=(255,255,255))
    
    font=ImageFont.load_default()
    d=ImageDraw.Draw(img);
    num_cities=len(tour)
    for i in range(num_cities):
        j=(i+1)%num_cities
        city_i=tour[i]
        city_j=tour[j]
        x1,y1=coords[city_i]
        x2,y2=coords[city_j]
        d.line((int(x1),int(y1),int(x2),int(y2)),fill=(0,0,0))
        d.text((int(x1)+7,int(y1)-5),str(i),font=font,fill=(32,32,32))
    
    
    for x,y in coords:
        x,y=int(x),int(y)
        d.ellipse((x-5,y-5,x+5,y+5),outline=(0,0,0),fill=(196,196,196))
    del d
    img.save(img_file, "PNG")


Which produces output along these lines:


test.png

conclusion

Well that's it for part one. At this point you can create a distance matrix, see how long a route is, create a few different variations of a route and create an image of a route. The missing part is being able to generate a good route. I'll save that for the next part, where I'll discuss the simplest stochastic optimisation method: Hill-climbing.

source-code

Full source-code is available here as a .tar.gz file. I've included some unit tests which can be run using nosetests.

references

  1. http://www.tsp.gatech.edu/
  2. http://mathworld.wolfram.com/NP-HardProblem.html
  3. http://en.wikipedia.org/wiki/Stochastic_optimization

13 thoughts on “Tackling the travelling salesman problem: introduction

  1. hello…now ım looking for a hill climbing codes for my graduation project….
    can u help me ?
    ım studying with c++ …
    if you help me ı ll be so thank to you…..

  2. Hi Sedat,

    You’re more than welcome to study the code and articles I’ve got on my blog. I hope you find them useful. A lot of this should still be relevant even when coding in C++.

    If you have any particular questions please let me know.

    John

  3. I’ve recently worked on a solution to a simplified version of TSP, which only requires that each city be visited once (i.e. no round-trip.) For which I think I found a solution. (I only needed it to create an optimal playlist, given acoustic distance scores between music files.)

    What I did was: connect the two cities with the lowest distance, then connect the two with the next lowest distance *unless* either of them is already in the middle of another chain. When two chains have a common end/begin city, join them together at that point. Continue until all cities are connected.

    The code for this can be found here:

    http://code.google.com/p/autoqueue/source/browse/trunk/mirage_miximize.py

    I’ll be reading the upcoming entries with interest, thanks!

  4. I’m having trouble generating the image file. I’m new to python and studying your code. Is there something I should be entering into the code to correctly write the image? I’ve tried on linux and windows with no results. Everything runs but nothing appears to the file path I assign.

  5. Hi Ray,

    I know there’s a bug in later versions of my code, as I don’t open the output file in binary mode, but that only leads to image corruption on Windows. A file does show up.

    What does your code look like? Do you have PIL installed ok? Maybe you’re writing the file to the wrong place and just not seeing it? Do you get any other errors?

    cheers,

    John

  6. hi,
    in the all_pairs you have this statament:
    if shuffle:
    I wonder what is the purpose of it… if I take it out, it does the same…
    please share your idea when implementing that…

  7. Hi Jose,

    it’s so that you could pass in shuffle=None to disable the shuffling. In later versions of the code I actually removed the feature, as it was never used in this code.

    cheers,

    John

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>