Get shortest distance from src to all nodes optimization

Question

My input is a 2D array, int[][] edges. Each row has exactly two elements, and each element is a vertex. Every edge is bidirectional, so the array as a whole describes an undirected graph.

I am trying to find the shortest distance from one particular node to every other node; in this case, from node 0 to all the nodes that exist.

I have code that works, but I think I am re-computing things, which I want to avoid. I call computeDistPerNode(m, 0, key) again and again, and I am sure each call re-computes the distance from 0 to nodes it has already seen in prior calls. I am unable to optimize it so that it reuses the past computation. How do I do that?

Here is the working code without optimization:

public Map<Integer, List<Integer>> createUnDirectedGraph(int[][] edges) {
    Map<Integer, List<Integer>> m = new HashMap<>();
    for(var i = 0; i<edges.length; i++) {
        m.put(edges[i][0], new ArrayList<>());
        m.put(edges[i][1], new ArrayList<>());
    }
    for(var edge:edges) {
        var v1 = edge[0];
        var v2 = edge[1];
        m.get(v1).add(v2);
        m.get(v2).add(v1);
    }
    return m;
}

public int[] getShortestDistances(Map<Integer, List<Integer>> m) {
    int distance[] = new int[m.size()];
    for(Integer key:m.keySet()) {
       var d = computeDistPerNode(m,0,key);
       distance[key] = d;
    }
    return distance;
}

public int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest) {
    Queue<Integer> q = new LinkedList<>();
    Integer dist[] = new Integer[m.size()];
    Set<Integer> visited = new HashSet<>();
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;
    q.add(src);
    while(!q.isEmpty()) {
       var currNode = q.poll();
       if(visited.contains(currNode)) continue;
       visited.add(currNode);
       if(currNode == dest) {
           return dist[dest];
       }


       for(var child: m.get(currNode)) {
           if (visited.contains(child)) {
               continue;
           }

           q.offer(child);
           var newDist = 1 + dist[currNode];
           if(newDist<dist[child]) {
               dist[child] = newDist;
           }
       }
    }
    return -1;
}

public int[][] getsample() {
    int [][] edges = {
            {0,1},
            {0,2},
            {1,4},
            {2,3},
            {4,3},
            {0,4},
    };
    return edges;
}

AKSingh · Accepted Answer

You can calculate distance from the source node to all the other nodes in one go.

The method int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest) returns as soon as it reaches the destination node. Change it to return the whole dist array once the queue is empty. Here is your modified method:

public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
    Set<Integer> visited = new HashSet<>();

    Integer[] dist = new Integer[m.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;

    Queue<Integer> q = new LinkedList<>();
    q.add(src);

    while(!q.isEmpty()) {
        var currNode = q.poll();
        if(visited.contains(currNode)) continue;
        visited.add(currNode);

        for(var child: m.get(currNode)) {
            if (visited.contains(child)) continue;

            q.offer(child);
            var newDist = 1 + dist[currNode];

            if(newDist < dist[child]) {
                dist[child] = newDist;
            }
        }
    }

    return dist;
}

Improvements

If you rearrange a few lines, you can replace the three if checks with a single one. This makes the code cleaner and more readable.

public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
    Set<Integer> visited = new HashSet<>();

    Integer[] dist = new Integer[m.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;

    Queue<Integer> q = new LinkedList<>();
    visited.add(src);   // mark source visited here
    q.add(src);

    while(!q.isEmpty()) {
        var currNode = q.poll();

        for(var child: m.get(currNode)) {
            if (!visited.contains(child)) {
                visited.add(child);
                q.offer(child);
                dist[child] = 1 + dist[currNode];
            }
        }
    }

    return dist;
}
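
To see the single-pass version end to end, here is a minimal, self-contained sketch that builds the adjacency map from the sample edges in the question and runs the optimized BFS once from node 0. The class name BfsAllDistances and its method names are made up for this illustration. It assumes vertices are numbered 0..n-1 and that every vertex appears in at least one edge, as in the sample.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical wrapper class, not part of the original question or answer.
class BfsAllDistances {

    // Builds an undirected adjacency map; each edge is stored in both directions.
    static Map<Integer, List<Integer>> buildGraph(int[][] edges) {
        Map<Integer, List<Integer>> m = new HashMap<>();
        for (int[] edge : edges) {
            m.computeIfAbsent(edge[0], k -> new ArrayList<>()).add(edge[1]);
            m.computeIfAbsent(edge[1], k -> new ArrayList<>()).add(edge[0]);
        }
        return m;
    }

    // One BFS from src; assumes vertices are numbered 0..m.size()-1.
    static int[] distancesFrom(Map<Integer, List<Integer>> m, int src) {
        int[] dist = new int[m.size()];
        Arrays.fill(dist, Integer.MAX_VALUE); // MAX_VALUE means "not reached"
        dist[src] = 0;

        Set<Integer> visited = new HashSet<>();
        Queue<Integer> q = new ArrayDeque<>();
        visited.add(src);
        q.add(src);

        while (!q.isEmpty()) {
            int curr = q.poll();
            for (int child : m.get(curr)) {
                if (visited.add(child)) { // add() returns false if already present
                    dist[child] = dist[curr] + 1;
                    q.offer(child);
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        int[] dist = distancesFrom(buildGraph(edges), 0);
        System.out.println(Arrays.toString(dist)); // [0, 1, 1, 2, 1]
    }
}
```

getShortestDistances from the question then reduces to this one call, instead of one BFS per key.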

Analysis

The algorithm employed is breadth-first search (BFS). According to Wikipedia:

The time complexity can be expressed as O(|V| + |E|), since every vertex and every edge will be explored in the worst case. |V| is the number of vertices and |E| is the number of edges in the graph. Note that O(|E|) may vary between O(1) and O(|V|^2), depending on how sparse the input graph is.
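
The bound can be checked with a short count, assuming an adjacency-list representation like the Map<Integer, List<Integer>> used here: each vertex is dequeued at most once, and dequeuing a vertex v scans its deg(v) neighbors. In an undirected graph the degrees sum to 2|E| because each edge is stored in both directions.

```latex
\sum_{v \in V} \bigl(1 + \deg(v)\bigr)
  = |V| + \sum_{v \in V} \deg(v)
  = |V| + 2|E|
  = O(|V| + |E|)
```

For the sample graph in the question (5 vertices, 6 edges) this gives 5 + 12 = 17 basic steps.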

Question

Can you help me understand how a larger value of newDist might not get written to dist[child] without that check? I think the reason is that, due to the nature of BFS and its queue, a child is first reached when an unvisited node is pulled out, and hence the check is not required?

The if(newDist < dist[child]) condition is necessary for your code to work correctly, but it is not required in the optimized code. The reason is where a node gets marked as visited. In your code, a node is marked only after it is polled from the queue. In the optimized code, it is marked (visited.add(child)) as soon as it is discovered. This makes a big difference.

Consider your input graph:

0 ------- 1
|\        |
|  \      |
|    \    | 
|      \  |
|        \|
|         4
|         |
|         |
|         |
2 ------- 3

Working of your code

The source vertex is 0. Before the while (!q.isEmpty()) loop begins, we add it to the queue.

In the while loop, we remove 0 and mark it as visited. We explore its neighbors 1, 2 and 4 in that order. We update their distance to 1 and add all of them to the queue. However, none of them have been marked as visited.

Now we go back to the start of the while loop, poll 1, mark it as visited and explore its neighbors 0 and 4. We do not update the distance of 0 since it is visited. We add 4 to the queue again even though it is already in the queue, which is wasteful in itself. Notice that without the if(newDist < dist[child]) condition, the distance of 4 would be overwritten with 2, which is wrong.
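
The effect can be reproduced with a small sketch of the mark-on-poll BFS in which the check is made optional. The class MarkOnPollDemo and the keepCheck flag are hypothetical and exist only for this demonstration. The adjacency lists are hard-coded in the order createUnDirectedGraph would produce them for the sample edges.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical demo class: the question's BFS with the check made optional.
class MarkOnPollDemo {

    // Adjacency lists of the sample graph, indexed by vertex.
    static final List<List<Integer>> GRAPH = List.of(
            List.of(1, 2, 4), // neighbors of 0
            List.of(0, 4),    // neighbors of 1
            List.of(0, 3),    // neighbors of 2
            List.of(2, 4),    // neighbors of 3
            List.of(1, 3, 0)  // neighbors of 4
    );

    static int[] bfs(boolean keepCheck) {
        int[] dist = new int[GRAPH.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[0] = 0;
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> q = new ArrayDeque<>();
        q.add(0);
        while (!q.isEmpty()) {
            int curr = q.poll();
            if (!visited.add(curr)) continue; // a node is marked only when polled
            for (int child : GRAPH.get(curr)) {
                if (visited.contains(child)) continue;
                q.offer(child);
                int newDist = 1 + dist[curr];
                // Without the comparison, a later and longer path overwrites dist[child].
                if (!keepCheck || newDist < dist[child]) {
                    dist[child] = newDist;
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(bfs(true)));  // [0, 1, 1, 2, 1]
        System.out.println(Arrays.toString(bfs(false))); // [0, 1, 1, 3, 2]
    }
}
```

Without the check, node 4 ends up at distance 2, and the error propagates to node 3, which ends up at 3 instead of 2.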

Working of the optimized code

The source vertex is 0. Before the while (!q.isEmpty()) loop begins, we add it to the queue and mark it as visited right away.

In the while loop, we remove 0. We explore its neighbors 1, 2 and 4 in that order. We update their distance to 1 and add all of them to the queue and mark all of them as visited. Hence their distance can never be updated again.

Now we go back to the start of the while loop, poll 1 and explore its neighbors 0 and 4. We do not update the distance of either 0 or 4, since both are already visited. Node 4 is also not added to the queue a second time.
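
The difference in queue traffic can also be measured. The following sketch counts how many times a node is put into the queue under each strategy on the sample graph. EnqueueCountDemo and the markOnDiscovery flag are names invented for this illustration.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Hypothetical demo class that counts queue insertions for both strategies.
class EnqueueCountDemo {

    // Adjacency lists of the sample graph, indexed by vertex.
    static final List<List<Integer>> GRAPH = List.of(
            List.of(1, 2, 4), List.of(0, 4), List.of(0, 3),
            List.of(2, 4), List.of(1, 3, 0));

    // markOnDiscovery = true  -> optimized code (mark when enqueued)
    // markOnDiscovery = false -> question's code (mark when polled)
    static int countEnqueues(boolean markOnDiscovery) {
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> q = new ArrayDeque<>();
        int enqueues = 1; // the source itself
        q.add(0);
        if (markOnDiscovery) visited.add(0);
        while (!q.isEmpty()) {
            int curr = q.poll();
            // In mark-on-poll mode, duplicates are only filtered out here.
            if (!markOnDiscovery && !visited.add(curr)) continue;
            for (int child : GRAPH.get(curr)) {
                if (visited.contains(child)) continue;
                if (markOnDiscovery) visited.add(child);
                q.offer(child);
                enqueues++;
            }
        }
        return enqueues;
    }

    public static void main(String[] args) {
        System.out.println(countEnqueues(false)); // 7
        System.out.println(countEnqueues(true));  // 5
    }
}
```

With mark-on-discovery, each of the 5 nodes enters the queue exactly once. With mark-on-poll, nodes 4 and 3 are each queued twice.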

RandomBits · Answer

Another option is a min-priority-queue (min-heap), which is the basis of Dijkstra's algorithm. With a binary heap it runs in O((|V| + |E|) log |V|). Note that for an unweighted graph like yours, the BFS in @AKSingh's answer is already O(|V| + |E|), so the priority queue does not lower the complexity here; it becomes necessary once the edges carry different weights.

Wikipedia has a good description of Dijkstra's algorithm, which is the standard technique for solving the shortest-path problem with a min-priority-queue. Here is a more tutorial-oriented description with a lot of figures to visualize the algorithm.

The following is some sample code that implements the algorithm. I apologize that it is not in Java, but the translation should be straightforward.

Sample Code

#include <algorithm>
#include <functional>
#include <iostream>
#include <limits>
#include <map>
#include <queue>
#include <utility>
#include <vector>

using NodePair = std::pair<int,int>;
using NodePairs = std::vector<NodePair>;

// Queue entries are (distance, vertex); std::greater turns the queue into a min-heap.
using DistanceVertex = std::pair<int, int>;
using MinQueue = std::priority_queue<DistanceVertex,
                                  std::vector<DistanceVertex>,
                                  std::greater<DistanceVertex>>;

int main() {
    // The sample problem as a list of undirected edges.
    NodePairs input {
        { 0, 1 },
        { 0, 2 },
        { 1, 4 },
        { 2, 3 },
        { 4, 3 },
        { 0, 4 }
    };

    // We store the graph as an adjacency list using a multimap. Each edge
    // is inserted in both directions because the graph is undirected.
    std::multimap<int, int> edges;
    for (auto [a, b] : input) {
        edges.emplace(a, b);
        edges.emplace(b, a);
    }

    // How many vertices?
    int max_vertex{};
    for (auto [a, b] : input) {
        max_vertex = std::max(max_vertex, a);
        max_vertex = std::max(max_vertex, b);
    }
    int number_vertices = max_vertex + 1;

    // Initialize the distance from source to each vertex as the largest int.
    int source{};
    std::vector<int> distance(number_vertices, std::numeric_limits<int>::max());

    // Initialize distance to source and priority queue
    MinQueue pq;
    distance[source] = 0;
    pq.emplace(0, source);

    // The graph is unweighted, so every edge has weight 1.
    const int edge_weight = 1;

    while (!pq.empty()) {
        auto [udist, udx] = pq.top();
        pq.pop();

        // Skip stale entries that were superseded by a shorter path.
        if (udist > distance[udx]) continue;

        // Iterate over all neighbors of udx
        auto [begin, end] = edges.equal_range(udx);
        for (auto iter = begin; iter != end; ++iter) {
            auto vdx = iter->second;

            // If there is a shorter path, record it
            if (udist + edge_weight < distance[vdx]) {
                distance[vdx] = udist + edge_weight;
                pq.emplace(distance[vdx], vdx);
            }
        }
    }

    // distance now contains the shortest distance between source and each node
    for (auto i = 0; i < number_vertices; ++i)
        std::cout << distance[i] << std::endl;

    return 0;
}
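
Since the question is in Java, here is a rough Java sketch of the same idea using java.util.PriorityQueue. It is not a line-for-line port of the C++ above. The class name DijkstraSketch, the method shortestDistances and its parameter n (the number of vertices) are assumptions made for this example, and every edge is given weight 1 because the graph is unweighted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical class name; a sketch of Dijkstra's algorithm with unit weights.
class DijkstraSketch {

    static int[] shortestDistances(int[][] edges, int n, int source) {
        // Undirected adjacency list: store every edge in both directions.
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
        for (int[] e : edges) {
            adj.get(e[0]).add(e[1]);
            adj.get(e[1]).add(e[0]);
        }

        int[] dist = new int[n];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[source] = 0;

        // Queue entries are {distance, vertex}, ordered by distance.
        PriorityQueue<int[]> pq =
                new PriorityQueue<>((a, b) -> Integer.compare(a[0], b[0]));
        pq.add(new int[] {0, source});

        while (!pq.isEmpty()) {
            int[] top = pq.poll();
            int d = top[0], u = top[1];
            if (d > dist[u]) continue; // stale entry, a shorter path was found

            for (int v : adj.get(u)) {
                int candidate = d + 1; // every edge has weight 1 here
                if (candidate < dist[v]) {
                    dist[v] = candidate;
                    pq.add(new int[] {candidate, v});
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        int[][] edges = {{0, 1}, {0, 2}, {1, 4}, {2, 3}, {4, 3}, {0, 4}};
        System.out.println(Arrays.toString(shortestDistances(edges, 5, 0))); // [0, 1, 1, 2, 1]
    }
}
```

To support weighted edges, the edge array would need a third column for the weight, and candidate would become d plus that weight.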

Tags: algorithm, graph, graph-theory, breadth-first-search

curiousengineer
