Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

optimise knight-tour LISP

I am new to LISP and I encounter this problem with the below code.

(defun knights-tour-brute (x y m n) 
  (setq height m)     
  (setq width n)      
  (setq totalmoves (* height width))  
  (setq steps 1)      
  (setq visited-list (list (list x y))) 
  (tour-brute (list (list x y))))
(defun tour-brute (L)
  (cond
   ((null L) NIL)   
   ((= steps totalmoves) L)   
   (t
    (let ((nextmove (generate L)))
      (cond ((null nextmove) (backtrack (car (last L)))
                             (tour-brute (reverse (cdr (reverse L)))))
            (t (setq visited-list (append visited-list (list nextmove)))
               (tour-brute (append L (list nextmove))))))))) 

(defun generate (L)
  (let ((x (caar (last L))) 
        (y (cadar (last L)))) 
    (setq steps (+ 1 steps))  
    (cond
     ((correct-state(+ x 2) (+ y 1) L) (list (+ x 2) (+ y 1)))
     ((correct-state (+ x 2) (- y 1) L) (list (+ x 2) (- y 1)))
     ((correct-state (- x 1) (+ y 2) L) (list (- x 1) (+ y 2)))
     ((correct-state (+ x 1) (+ y 2) L) (list (+ x 1) (+ y 2)))
     ((correct-state (+ x 1) (- y 2) L) (list (+ x 1) (- y 2)))
     ((correct-state (- x 1) (- y 2) L) (list (- x 1) (- y 2)))
     ((correct-state (- x 2) (+ y 1) L) (list (- x 2) (+ y 1)))
     ((correct-state (- x 2) (- y 1) L) (list (- x 2) (- y 1)))
     (t (setq steps (- steps 2)) NIL))))

(defun correct-state (x y L)
  (if (and (<= 1 x)
           (<= x height)
           (<= 1 y)
           (<= y width)
           (not (visited (list x y) L))
           (not (visited (list x y) 
                (tail (car (last L)) visited-list)))) (list (list x y)) NIL))

(defun tail (L stateslist)
  (cond
   ((equal L (car stateslist)) (cdr stateslist))
   (t (tail L (cdr stateslist)))))

(defun visited (L stateslist)
  (cond
   ((null stateslist) NIL)   
   ((equal L (car stateslist)) t) 
   (t (visited L (cdr stateslist)))))

(defun backtrack (sublist)
  (cond
   ((null visited-list) t)
   ((equal sublist (car (last visited-list))) t)
   (t (setq visited-list (reverse (cdr (reverse visited-list)))) 
      (backtrack sublist))))

It returns me an error *** - Program stack overflow. RESET. When I was googling around, I realise that this is the result of recursion. However I am not sure how should I optimise this code to resolve this issue. Any help is deeply appreciated.

Hi, above is the updated code. This is the test code. (knights-tour-brute 5 5 1 1)

like image 527
Hero1134 Avatar asked Dec 06 '25 10:12

Hero1134


2 Answers

As I mentioned in the comments, the problem is lacking Tail Call Optimisation (TCO). You might be able to enable that with

(declaim (optimize (speed 3)))

But it depends on your implementation. I'm not sure about CLISP.

Edit: The other answers have more efficient ways for solving the problem, but it's still worth reading this answer for ways to write the original solution better

Anyway, I optimised the code a bit. You will still need to have TCO in order to run it. That's an inherent problem of using recursion like this. It should run well under SBCL at least. Just save it into a file, and do

(load (compile-file "file.lisp"))

It should run must faster than your original code, and do much less memory allocation. The relevant numbers for (time (knights-tour-brute 1 1 6 6)) with your code:

4,848,466,907 processor cycles
572,170,672 bytes consed

And my code:

1,155,406,109 processor cycles
17,137,776 bytes consed

For most part I left your code as is. The changes I made are mostly:

  1. I actually declared the global variables and cleaned up some bits of the code.
  2. In your version you build visited-list in order. That might seem intuitive when you don't understand how the singly linked lists in Lisp work, but it's very inefficient (those (reverse (cdr (reverse list))) were really eating performance). You should read some Lisp book regarding Lists. I keep it in reverse order, and then finally reverse it with nreverse at the end.
  3. You used lists for the coordinates. I use a struct instead. Performance is very greatly increased.
  4. I added type declarations for everything. It improves performance a little.

However, it is still the same brute force algorithm, so it will be very slow for larger boards. You should look into smarter algorithms for those.

(declaim (optimize (speed 3) (space 0) (safety 0) (debug 0)))

(declaim (type fixnum *height* *width* *total-moves* *steps*))
(declaim (type list *visited-list*))

(declaim (ftype (function (fixnum fixnum fixnum fixnum) list)
                knights-tour-brute))
(declaim (ftype (function (list) list)
                tour-brute))
(declaim (ftype (function (list) (or pos null))
                generate))
(declaim (ftype (function (fixnum fixnum list) (or t null))
                correct-state))
(declaim (ftype (function (fixnum fixnum list) (or t null))
                visited))
(declaim (ftype (function (pos) t)
                backtrack))
(declaim (ftype (function (fixnum fixnum pos) (or t null))
                vis-2))
(declaim (ftype (function (pos pos) (or t null))
                pos=))
(declaim (ftype (function (pos fixnum fixnum) (or t null))
                pos=*))

(defstruct pos
  (x 0 :type fixnum)
  (y 0 :type fixnum))

(defmethod print-object ((pos pos) stream)
  (format stream "(~d ~d)" (pos-x pos) (pos-y pos)))

(defparameter *height* 0)
(defparameter *width* 0)
(defparameter *total-moves* 0)
(defparameter *steps* 0)
(defparameter *visited-list* '())

(defun knights-tour-brute (x y m n)
  (let ((*height* m)
        (*width* n)
        (*total-moves* (* m n))
        (*steps* 1) 
        (*visited-list* (list (make-pos :x x :y y))))
    (nreverse (tour-brute (list (make-pos :x x :y y))))))

(defun tour-brute (l)
  (cond
    ((null l) nil)
    ((= *steps* *total-moves*) l)   
    (t (let ((nextmove (generate l)))
         (cond
           ((null nextmove)
            (backtrack (first l))
            (tour-brute (rest l)))
           (t (push nextmove *visited-list*)
              (tour-brute (cons nextmove l)))))))) 

(defun generate (l)
  (let ((x (pos-x (first l)))
        (y (pos-y (first l))))
    (declare (type fixnum x y))
    (incf *steps*)
    (cond
      ((correct-state (+ x 2) (+ y 1) l) (make-pos :x (+ x 2) :y (+ y 1)))
      ((correct-state (+ x 2) (- y 1) l) (make-pos :x (+ x 2) :y (- y 1)))
      ((correct-state (- x 1) (+ y 2) l) (make-pos :x (- x 1) :y (+ y 2)))
      ((correct-state (+ x 1) (+ y 2) l) (make-pos :x (+ x 1) :y (+ y 2)))
      ((correct-state (+ x 1) (- y 2) l) (make-pos :x (+ x 1) :y (- y 2)))
      ((correct-state (- x 1) (- y 2) l) (make-pos :x (- x 1) :y (- y 2)))
      ((correct-state (- x 2) (+ y 1) l) (make-pos :x (- x 2) :y (+ y 1)))
      ((correct-state (- x 2) (- y 1) l) (make-pos :x (- x 2) :y (- y 1)))
      (t (decf *steps* 2)
         nil))))

(defun correct-state (x y l)
  (and (<= 1 x *height*)
       (<= 1 y *width*)
       (not (visited x y l))
       (vis-2 x y (first l))))

(defun visited (x y stateslist)
  (loop
     for state in stateslist
     when (pos=* state x y) do (return t)))

;;---TODO: rename this
(defun vis-2 (x y l-first)
  (loop
     for state in *visited-list*
     when (pos= l-first state) do (return t)
     when (pos=* state x y) do (return nil)))

(defun backtrack (sublist)
  (loop
     for state in *visited-list*
     while (not (pos= sublist state))
     do (pop *visited-list*)))

(defun pos= (pos1 pos2)
  (and (= (pos-x pos1)
          (pos-x pos2))
       (= (pos-y pos1)
          (pos-y pos2))))
(defun pos=* (pos1 x y)
  (and (= (pos-x pos1) x)
       (= (pos-y pos1) y)))

Edit: I improved correct-state so as to not look through the same list twice. Reduces consing significantly.

Edit2: I switched to using a struct for positions instead of using cons-cells. That improves performance dramatically.

It could probably be optimised more, but it should be sufficiently fast for boards up 6x6. If you need better performance, I think switching to a different algorithm would be more productive than trying to optimize a brute force solution. If someone does want to optimize this anyway, here are some results from profiling.

Results from sb-sprof show that majority of time is spent in checking equality. I don't think there's much to be done about that. visited also takes quite a bit of time. Maybe storing the visited positions in an array would speed it up, but I haven't tried it.

           Self        Total        Cumul
  Nr  Count     %  Count     %  Count     %    Calls  Function
------------------------------------------------------------------------
   1   1631  40.8   3021  75.5   1631  40.8        -  VISITED
   2   1453  36.3   1453  36.3   3084  77.1        -  POS=*
   3    337   8.4   3370  84.3   3421  85.5        -  CORRECT-STATE
   4    203   5.1   3778  94.5   3624  90.6        -  GENERATE
   5    101   2.5    191   4.8   3725  93.1        -  VIS-2
   6     95   2.4     95   2.4   3820  95.5        -  POS=
   7     88   2.2   3990  99.8   3908  97.7        -  TOUR-BRUTE
   8     44   1.1     74   1.9   3952  98.8        -  BACKTRACK
   9     41   1.0     41   1.0   3993  99.8        -  MAKE-POS

:ALLOC mode doesn't give much usefull information:

           Self        Total        Cumul
  Nr  Count     %  Count     %  Count     %    Calls  Function
------------------------------------------------------------------------
   1   1998  50.0   3998  99.9   1998  50.0        -  TOUR-BRUTE
   2   1996  49.9   1996  49.9   3994  99.9        -  MAKE-POS

sb-profile shows that generate does most of the consing, while visited takes most of the time (note that the seconds of course are way off due to the instumentation):

  seconds  |     gc     |   consed   |    calls   |  sec/call  |  name  
-------------------------------------------------------------
     8.219 |      0.000 |    524,048 |  1,914,861 |   0.000004 | VISITED
     0.414 |      0.000 |     32,752 |    663,273 |   0.000001 | VIS-2
     0.213 |      0.000 |     32,768 |    266,832 |   0.000001 | BACKTRACK
     0.072 |      0.000 |          0 |  1,505,532 |   0.000000 | POS=
     0.000 |      0.000 |          0 |          1 |   0.000000 | TOUR-BRUTE
     0.000 |      0.024 | 17,134,048 |    533,699 |   0.000000 | GENERATE
     0.000 |      0.000 |     32,768 |  3,241,569 |   0.000000 | CORRECT-STATE
     0.000 |      0.000 |     32,752 | 30,952,107 |   0.000000 | POS=*
     0.000 |      0.000 |          0 |          1 |   0.000000 | KNIGHTS-TOUR-BRUTE
-------------------------------------------------------------
     8.918 |      0.024 | 17,789,136 | 39,077,875 |            | Total
like image 122
jkiiski Avatar answered Dec 08 '25 05:12

jkiiski


The list-based answer from @jkiiski takes the same approach as OP and greatly optimizes it. Here the goal is different: I try to use another way to represent the problem (but still brute force) and we can see that with vectors and matrices, we can solve harder problems better, faster and stronger1.

I also applied the same heuristics as in the other answer, which significantly reduces the effort required to find solutions.

Data-structures

(defpackage :knight (:use :cl))
(in-package :knight)

(declaim (optimize (speed 3) (debug 0) (safety 0)))

(deftype board () '(simple-array bit *))
(deftype delta () '(integer -2 2))

;; when we add -2, -1, 1 or 2 to a board index, we assume the
;; result can still fit into a fixnum, which is not always true in
;; general.

(deftype frontier () (list 'integer -2 most-positive-fixnum))

Next, we define a class to hold instances of a Knight's Tour problem as well as working data, namely height, width, a matrix representing the board, containing either 0 (empty) or 1 (visited), as well as the current tour, represented by a vector of size height x width with a fill-pointer initialized to zero. The dimensions are not strictly necessary in this class since the internal board already stores them.

(defclass knights-tour ()
  ((visited-cells :accessor visited-cells)
   (board :accessor board)
   (height :accessor height :initarg :height :initform 8)
   (width :accessor width :initarg :width :initform 8)))

(defmethod initialize-instance :after ((knight knights-tour)
                                       &key &allow-other-keys)
  (with-slots (height width board visited-cells) knight
    (setf board (make-array (list height width)
                            :element-type 'bit
                            :initial-element 0)

          visited-cells (make-array (* height width)
                                    :element-type `(integer ,(* height width))
                                    :fill-pointer 0))))

By the way, we also specialize print-object:

(defmethod print-object ((knight knights-tour) stream)
  (with-slots (width height visited-cells) knight
    (format stream "#<knight's tour: ~dx~d, tour: ~d>" width height visited-cells)))

Auxiliary functions

(declaim (inline visit unvisit))

Visiting a cell at position x and y means setting a one at the appropriate location in the board and pushing current cell's coordinate into the visited-cell vector. I store the row-major index instead of a couple of coordinates because it allocates less memory (in fact the difference is not important).

(defmethod visit ((knight knights-tour) x y)
  (let ((board (board knight)))
    (declare (board board))
    (setf (aref board y x) 1)
    (vector-push-extend (array-row-major-index board y x)
                        (visited-cells knight))))

Unvisiting a cell means setting a zero in the board and decreasing the fill-pointer of the sequence of visited cells.

(defun unvisit (knight x y)
  (let ((board (board knight)))
    (declare (board board))
    (setf (aref board y x) 0)
    (decf (fill-pointer (visited-cells knight)))))

Exhaustive search

The recursive visiting function is the following one. It first visits current cell, recursively calls itself on each free valid neighbour and finally unvisits itself before exiting. The function accepts a callback function to be called whenever a solution is found (edit: I won't refactor, but I think the callback function should be stored in a slot of the knights-tour class).

(declaim (ftype
          (function (knights-tour fixnum fixnum function)
                    (values &optional))
          brute-visit))

(defun brute-visit (knight x y callback
                    &aux (board (board knight))
                      (cells (visited-cells knight)))
  (declare (function callback)
           (board board)
           (type (vector * *) cells)
           (fixnum x y))
  (visit knight x y)
  (if (= (fill-pointer cells) (array-total-size cells))
        (funcall callback knight)
        (loop for (i j) of-type delta
                in '((-1 -2) (1 -2) (-2 -1) (2 -1)
                     (-2 1) (2 1) (-1 2) (1 2))
              for xx = (the frontier (+ i x))
              for yy = (the frontier (+ j y))
              when (and (array-in-bounds-p board yy xx)
                        (zerop (aref board yy xx)))
                do (brute-visit knight xx yy callback)))
  (unvisit knight x y)
  (values))

Entry point

(defun knights-tour (x y callback &optional (h 8) (w 8))
  (let ((board (make-instance 'knights-tour :height h :width w)))
    (brute-visit board x y callback)))

Tests 1

The following test asks to find a solution for a 6x6 board:

(time (block nil
        (knights-tour 0 0 (lambda (k) (return k)) 6 6)))

Evaluation took:
  0.097 seconds of real time
  0.096006 seconds of total run time (0.096006 user, 0.000000 system)
  [ Run times consist of 0.008 seconds GC time, and 0.089 seconds non-GC time. ]
  98.97% CPU
  249,813,780 processor cycles
  47,005,168 bytes consed

Comparatively, the version from the other versions runs as follows (the origin point is the same, but we index cells differently):

(time (knights-tour-brute 1 1 6 6))

Evaluation took:
  0.269 seconds of real time
  0.268017 seconds of total run time (0.268017 user, 0.000000 system)
  99.63% CPU
  697,461,700 processor cycles
  17,072,128 bytes consed

Tests 2

For larger boards, the difference is more visible. If we ask to find a solution for an 8x8 board, the above versions acts as follows on my machine:

> (time (block nil (knights-tour 0 0 (lambda (k) (return k)) 8 8)))

Evaluation took:
  8.416 seconds of real time
  8.412526 seconds of total run time (8.412526 user, 0.000000 system)
  [ Run times consist of 0.524 seconds GC time, and 7.889 seconds non-GC time. ]
  99.96% CPU
  21,808,379,860 processor cycles
  4,541,354,592 bytes consed

#<knight's tour: 8x8, tour: #(0 10 4 14 20 3 9 19 2 8 18 1 11 5 15 21 6 12 22 7
                              13 23 29 35 25 40 34 17 27 33 16 26 32 49 43 28
                              38 55 61 44 59 53 63 46 31 37 47 30 36 51 57 42
                              48 58 52 62 45 39 54 60 50 56 41 24)>

The original list-based approach did not return and after ten minutes I killed the worker thread.

Heuristics

There are still room for improvements (see actual research papers to have more information) and here I'll sort the neighbors like @jkiiski's updated version to see what happens. What follows is just a way to abstract iterating over neighbours, because we will use it more than once, and differently:

(defmacro do-neighbourhood ((xx yy) (board x y) &body body)
  (alexandria:with-unique-names (i j tx ty)
    `(loop for (,i ,j) of-type delta
             in '((-1 -2) (1 -2) (-2 -1) (2 -1)
                  (-2 1) (2 1) (-1 2) (1 2))
           for ,tx = (the frontier (+ ,i ,x))
           for ,ty = (the frontier (+ ,j ,y))
           when (and (array-in-bounds-p ,board ,ty ,tx)
                     (zerop (aref ,board ,ty ,tx)))
             do (let ((,xx ,tx)
                      (,yy ,ty))
                  ,@body))))

We need a way to count the number of possible neighbors:

(declaim (inline count-neighbours)
         (ftype (function (board fixnum fixnum ) fixnum)
                count-neighbours))

(defun count-neighbours (board x y &aux (count 0)) 
  (declare (fixnum count x y)
           (board board))
  (do-neighbourhood (xx yy) (board x y)
    (declare (ignore xx yy))
    (incf count))
  count)

And here is the alternative search implementation:

(defstruct next
  (count 0 :type fixnum)
  (x 0 :type fixnum)
  (y 0 :type fixnum))

(defun brute-visit (knight x y callback
                    &aux (board (board knight))
                      (cells (visited-cells knight)))
  (declare (function callback)
           (board board)
           (type (vector * *) cells)
           (fixnum x y))
  (visit knight x y)
  (if (= (fill-pointer cells) (array-total-size cells))
      (funcall callback knight)
      (let ((moves (make-array 8 :element-type 'next
                                 :fill-pointer 0)))
        (do-neighbourhood (xx yy) (board x y)
          (vector-push-extend (make-next :count (count-neighbours board xx yy)
                                         :x xx
                                         :y yy)
                              moves))
        (map nil
             (lambda (next)
               (brute-visit knight
                            (next-x next)
                            (next-y next)
                            callback)
               (cerror "CONTINUE" "Backtrack detected"))
             (sort moves
                   (lambda (u v)
                     (declare (fixnum u v))
                     (<= u v))
                   :key #'next-count)
             )))
  (unvisit knight x y)
  (values))

The results are immediate when trying previous tests. For example, with a 64x64 board:

knight> (time
         (block nil
           (knights-tour
            0 0
            (lambda (k) (return))
            64 64)))

Evaluation took:
  0.012 seconds of real time
  0.012001 seconds of total run time (0.012001 user, 0.000000 system)
  100.00% CPU
  29,990,030 processor cycles
  6,636,048 bytes consed

Finding the 1728 solutions for a 5x5 board takes 42 seconds.

Here I keep the backtrack mechanism, and in order to see if we need it, I added a cerror expression in the search, so that we are notified as soon as the search tries another path. The following test triggers the error:

(time
 (dotimes (x 8)
   (dotimes (y 8)
     (block nil
       (knights-tour
        x y
        (lambda (k) (return))
        8 8)))))

The values for x and y for which the error is reported are respectively 2 and 1.


1 For reference, see Daft Punk.

like image 28
coredump Avatar answered Dec 08 '25 06:12

coredump



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!