I have an input as [][]edges. The col length of array is 2. Each row of the 2D array hence has 2 elements. Each element is a vertex. And it is bidirectional i.e we can say the edge is in both directions. Hence if we go through this 2D array, we can say we have an undirected graph.
I am trying to find the shortest distance from one particular node to all nodes. In this case say from node 0 to all the nodes that exist.
I have code that works but I think I am re-computing things which I want to avoid. I call the function computeDistPerNode(m,0,key); again and again and I am sure it is doing re-computation of distance from 0 to nodes that it has seen in prior calls. I am unable to optimize it and leverage the past computation. How do I do it?
Here is the working code without optimization
    public Map<Integer, List<Integer>> createUnDirectedGraph(int [][]edges) {
    Map<Integer, List<Integer>> m = new HashMap<>();
    for(var i = 0; i<edges.length; i++) {
        m.put(edges[i][0], new ArrayList<>());
        m.put(edges[i][1], new ArrayList<>());
    }
    for(var edge:edges) {
        var v1 = edge[0];
        var v2 = edge[1];
        m.get(v1).add(v2);
        m.get(v2).add(v1);
    }
    return m;
}
public int[] getShortestDistances(Map<Integer, List<Integer>> m) {
    int distance[] = new int[m.size()];
    for(Integer key:m.keySet()) {
       var d = computeDistPerNode(m,0,key);
       distance[key] = d;
    }
    return distance;
}
public int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest) {
    Queue<Integer> q = new LinkedList<>();
    Integer dist[] = new Integer[m.size()];
    Set<Integer> visited = new HashSet<>();
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;
    q.add(src);
    while(!q.isEmpty()) {
       var currNode = q.poll();
       if(visited.contains(currNode)) continue;
       visited.add(currNode);
       if(currNode == dest) {
           return dist[dest];
       }
       for(var child: m.get(currNode)) {
           if (visited.contains(child)) {
               continue;
           }
           q.offer(child);
           var newDist = 1 + dist[currNode];
           if(newDist<dist[child]) {
               dist[child] = newDist;
           }
       }
    }
    return -1;
}
public int[][] getsample() {
    int [][] edges = {
            {0,1},
            {0,2},
            {1,4},
            {2,3},
            {4,3},
            {0,4},
    };
    return edges;
}
You can calculate distance from the source node to all the other nodes in one go.
The method int computeDistPerNode(Map<Integer, List<Integer>> m, int src, int dest) returns as soon as you reach the destination node. Change that to return the dist array when the queue is empty. Here is your modified method
public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
    Set<Integer> visited = new HashSet<>();
    Integer[] dist = new Integer[m.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;
    Queue<Integer> q = new LinkedList<>();
    q.add(src);
    while(!q.isEmpty()) {
        var currNode = q.poll();
        if(visited.contains(currNode)) continue;
        visited.add(currNode);
        for(var child: m.get(currNode)) {
            if (visited.contains(child)) continue;
            q.offer(child);
            var newDist = 1 + dist[currNode];
            if(newDist < dist[child]) {
                dist[child] = newDist;
            }
        }
    }
    return dist;
}
If you re-position your lines a little, you can avoid three if calls. This results in a more clean and readable code.
public Integer[] computeDistFromSource(Map<Integer, List<Integer>> m, int src) {
    Set<Integer> visited = new HashSet<>();
    Integer[] dist = new Integer[m.size()];
    Arrays.fill(dist, Integer.MAX_VALUE);
    dist[src] = 0;
    Queue<Integer> q = new LinkedList<>();
    visited.add(src);   // mark source visited here
    q.add(src);
    while(!q.isEmpty()) {
        var currNode = q.poll();
        for(var child: m.get(currNode)) {
            if (!visited.contains(child)) {
                visited.add(child);
                q.offer(child);
                dist[child] = 1 + dist[currNode];
            }
        }
    }
    return dist;
}
The algorithm employed is Breadth-first search. According to Wikipedia
The time complexity can be expressed as
O(|V| + |E|), since every vertex and every edge will be explored in the worst case.|V|is the number of vertices and|E|is the number of edges in the graph. Note thatO(|E|)may vary betweenO(1)andO(|V|^2), depending on how sparse the input graph is.
Can you help me understand how a larger value of newDist might not get written in current dist[child] without that check? I think the reason is that a child due to the nature of BFS/using queue will be visited first when an univisited node is pulled out and hence the check is not required?
The if(newDist < dist[child]) condition is necessary in your code for correct working. It is not required in the optimized code. The reason is the placement of visited.add(child). In your code, that check happens after a node is polled from queue. In the optimized code, this happens immediately after a node is discovered. This creates a big difference.
Consider your input graph
0 ------- 1
|\        |
|  \      |
|    \    | 
|      \  |
|        \|
|         4
|         |
|         |
|         |
2 ------- 3
The source vertex is 0. Before the beginning of the loop while (!q.isEmpty() we add it to the queue.
In the while loop, we remove 0 and mark it as visited. We explore its neighbors 1, 2 and 4 in that order. We update their distance to 1 and add all of them to the queue. However, none of them have been marked as visited.
Now we go back to the start of the while loop, poll 1, mark it as visited and again explore its neighbors 0 and 4. We do not update the distance of 0 since it is visited. We add 4 to the queue again even though it is already part of the queue. We have added the same node in the queue again this is not a good thing in itself. Notice if there is no if(newDist < dist[child]) condition, its distance will be updated to 2 which is wrong.
The source vertex is 0. Before the beginning of the loop while (!q.isEmpty() we add it to queue and mark it as visited here only.
In the while loop, we remove 0. We explore its neighbors 1, 2 and 4 in that order. We update their distance to 1 and add all of them to the queue and mark all of them as visited. Hence their distance can never be updated again.
Now we go back to the start of the while loop, poll 1 and again explore its neighbors 0 and 4. We do not update the distance of 0 as well as 1 since both of them are visited. The node 4 is also not added to the queue twice.
If you use a min-priority-queue or min-heap, you can reduce the algorithmic complexity to O(|V| * |E|), i.e. the produce of the number of vertices and number of edges. Even with the improvements to your algorithm from @AKSingh's answer, I think it is still O(|V|^2).
Wikipedia has is a good description of Dijkstra's algorithm which is the standard technique for solving the min-path problem with a min-priority-queue. Here is a more tutorial oriented description with a lot of figures to visualize the algorithm.
The following is some sample code that implements the algorithm. I apologize that it is not in Java, but the translation should be straight forward.
#include <iostream>
#include <map>
#include <queue>
#include <set>
#include <vector>
using NodePair = std::pair<int,int>;
using NodePairs = std::vector<NodePair>;
using DistanceVertex = std::pair<int, int>;
using MinQueue = std::priority_queue<DistanceVertex,
                                  std::vector<DistanceVertex>,
                                  std::greater<DistanceVertex>>;
int main(int argc, const char *argv[]) {
    // The sample problem. We store the graph as a adjacency list
    // using a multimap.
    std::multimap<int, int> edges {
        { 0, 1 },
        { 0, 2 },
        { 1, 4 },
        { 2, 3 },
        { 4, 3 },
        { 0, 4 }
    };
    // How many vertices?
    int max_vertex{};
    for (auto [a, b] : edges) {
        max_vertex = std::max(max_vertex, a);
        max_vertex = std::max(max_vertex, b);
    }
    int number_vertices = max_vertex + 1;
    // Initialize the distance from source to each vertex as MAX_INT.
    int source{};
    std::vector<int> distance(number_vertices, std::numeric_limits<int>::max());
    // Initialize distance to source and priority queue
    MinQueue pq;
    distance[source] = 0;
    pq.emplace(0, source);
    while (!pq.empty()) {
        auto [udist, udx] = pq.top();
        pq.pop();
        // Iterate over all neighbors of vdx
        auto [begin, end] = edges.equal_range(udx);
        for (auto iter = begin; iter != end; ++iter) {
            auto vdx = iter->second, vdist = iter->first;
            // If there is a shorter path, record it
            if (udist + vdist < distance[vdx]) {
                distance[vdx] = udist + vdist;
                pq.push({udist, vdx});
            }
        }
    }
    // distance now contains the shortest distance between source and each node
    for (auto i = 0; i < number_vertices; ++i)
        std::cout << distance[i] << std::endl;
    return 0;
}
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With