Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Max path sum in a 2D array

Consider the following question:

Given a 2D array of unsigned integers and a maximum length n, find a path in that matrix that is not longer than n and which maximises the sum. The output should consist of both the path and the sum.

A path consists of neighbouring integers that are either all in the same row, or in the same column, or down a diagonal in the down-right direction.

For example, consider the following matrix and a given path length limit of 3:

 1  2  3  4  5    
 2  1  2  2  1   
 3  4  5* 6  5    
 3  3  5 10* 5    
 1  2  5  7 15* 

The most optimal path would be 5 + 10 + 15 (nodes are marked with *).

Now, upon seeing this problem, immediately a Dynamic Programming solution seems to be most appropriate here, given this problem's similarity to other problems like Min Cost Path or Maximum Sum Rectangular Submatrix. The issue is that in order to correctly solve this problem, you need to start building up the paths from every integer (node) in the matrix and not just start the path from the top left and end on the bottom right.

I was initially thinking of an approach similar to that of the solution for Maximum Sum Rectangular Submatrix in which I could store each possible path from every node (with path length less than n, only going right/down), but the only way I can envision that approach is by making recursive calls for down and right from each node which would seem to defeat the purpose of DP. Also, I need to be able to store the max path.

Another possible solution I was thinking about was somehow adapting a longest path search and running it from each int in the graph where each int is like an edge weight.

What would be the most efficient way to find the max path?

like image 478
loremIpsum1771 Avatar asked Dec 04 '25 15:12

loremIpsum1771


1 Answers

The challenge here is to avoid to sum the same nodes more than once. For that you could apply the following algorithm:

Algorithm

  1. For each of the 3 directions (down, down+right, right) perform steps 2 and 3:

  2. Determine the number of lines that exist in this direction. For the downward direction, this is the number of columns. For the rightward direction, this is the number of rows. For the diagonal direction, this is the number of diagonal lines, i.e. the sum of the number of rows and columns minus 1, as depicted by the red lines below:

enter image description here

  1. For each line do:

    • Determine the first node on that line (call it the "head"), and also set the "tail" to that same node. These two references refer to the end points of the "current" path. Also set both the sum and path-length to zero.

    • For each head node on the current line perform the following bullet points:

      • Add the head node's value to the sum and increase the path length

      • If the path length is larger than the allowed maximum, subtract the tail's value from the sum, and set the tail to the node that follows it on the current line

      • Whenever the sum is greater than the greatest sum found so far, remember it together with the path's location.

      • Set the head to the node that follows it on the current line

At the end return the greatest sum and the path that generated this sum.

Code

Here is an implementation in basic JavaScript:

function maxPathSum(matrix, maxLen) {
    var row, rows, col, cols, line, lines, dir, dirs, len,
        headRow, headCol, tailRow, tailCol, sum, maxSum;

    rows = matrix.length;
    cols = matrix[0].length;
    maxSum = -1;
    dirs = 3; // Number of directions that paths can follow
    if (maxLen == 1 || cols == 1) 
        dirs = 1; // Only need to check downward directions

    for (dir = 1; dir <= 3; dir++) {
        // Number of lines in this direction to try paths on
        lines = [cols, rows, rows + cols - 1][dir-1];
        for (line = 0; line < lines; line++) {
            sum = 0;
            len = 0;
            // Set starting point depending on the direction
            headRow = [0, line, line >= rows ? 0 : line][dir-1];
            headCol = [line, 0, line >= rows ? line - rows : 0][dir-1];
            tailRow = headRow;
            tailCol = headCol;
            // Traverse this line
            while (headRow < rows && headCol < cols) {
                // Lengthen the path at the head
                sum += matrix[headRow][headCol];
                len++;
                if (len > maxLen) {
                    // Shorten the path at the tail
                    sum -= matrix[tailRow][tailCol];
                    tailRow += dir % 2;
                    tailCol += dir >> 1;
                }
                if (sum > maxSum) {
                    // Found a better path
                    maxSum = sum;
                    path = '(' + tailRow + ',' + tailCol + ') - ' 
                         + '(' + headRow + ',' + headCol + ')';
                }
                headRow += dir % 2;
                headCol += dir >> 1;
            }
        }
    }
    // Return the maximum sum and the string representation of
    // the path that has this sum
    return { maxSum, path };
} 

// Sample input
var matrix = [
    [1,  2,  3,  4,  5],
    [2,  1,  2,  2,  1],
    [3,  4,  5,  5,  5],
    [3,  3,  5, 10,  5],
    [1,  2,  5,  5, 15],
];

var best = maxPathSum(matrix, 3);

console.log(best);

Some details about the code

  1. Be aware that row/column indexes start at 0.

  2. The way the head and tail coordinates are incremented is based on the binary representation of the dir variable: it takes these three values (binary notation): 01, 10, 11

    You can then take the first bit to indicate whether the next step in the direction is on the next column (1) or not (0), and the second bit to indicate whether it is on the next row (1) or not (0). You can depict it like this, where 00 represents the "current" node:

    00 10
    01 11
    

    So we have this meaning to the values of dir:

    • 01: walk along the column
    • 10: walk along the row
    • 11: walk diagonally

    The code uses >>1 for extracting the first bit, and % 2 for extracting the last bit. That operation will result in a 0 or 1 in both cases, and is the value that needs to be added to either the column or the row.

  3. The following expression creates a 1D array and takes one of its values on-the-fly:

    headRow = [0, line, line >= rows ? 0 : line][dir-1];
    

    It is short for:

    switch (dir) {
    case 1:
        headRow = 0;
        break;
    case 2:
        headRow = line;
        break;
    case 3:
        if (line >= rows) 
            headRow = 0
        else 
            headRow = line;
        break;  
    }
    

Time and space complexity

The head will visit each node exactly once per direction. The tail will visit fewer nodes. The number of directions is constant, and the max path length value does not influence the number of head visits, so the time complexity is:

        Θ(rows * columns)

There are no additional arrays used in this algorithm, just a few primitive variables. So the additional space complexity is:

        Θ(1)

which both are the best you could hope for.

Is it Dynamic Programming?

In a DP solution you would typically use some kind of tabulation or memoization, possibly in the form of a matrix, where each sub-result found for a particular node is input for determining the result for neighbouring nodes.

Such solutions could need Θ(rows*columns) extra space. But this problem can be solved without such (extensive) space usage. When looking at one line at a time (a row, a column or a diagonal), the algorithm has some similarities with Kadane's algorithm:

One difference is that here the choice to extend or shorten the path/subarray is not dependent on the matrix data itself, but on the given maximum length. This is also related to the fact that here all values are guaranteed to be non-negative, while Kadane's algorithm is suitable for signed numbers.

Just like with Kadane's algorithm the best solution so far is maintained in a separate variable.

Another difference is that here we need to look in three directions. But that just means repeating the same algorithm in those three directions, while carrying over the best solution found so far.

This is a very basic use of Dynamic Programming, since you don't need the tabulation or memoization techniques here. We only keep the best results in the variables sum and maxSum. That cannot be viewed as tabluation or memoization, which typically keep track of several competing results that must be compared at some time. See this interesting answer on the subject.

like image 162
trincot Avatar answered Dec 07 '25 08:12

trincot