This codebase contains the following optimized algorithms for finding the minimum cost spanning tree (MST) of an undirected weighted graph:
- Kruskal's algorithm that uses Union-Find data structure. The running time complexity of this algorithm is O(mlogm); m = # edges. For a graph with 500 nodes and ~125,000 edges this optimized (union-find) implementation took 25 milliseconds to execute and the standard implementation 865 milliseconds. Plotting the graph shows that the minimum spanning tree generated by this code is indeed a tree as there are no cycles in the plotted graph.
- Prim's algorithm to find the minimum spanning tree (MST) that uses Heap data structure. The heap implements a min priority queue and stores the node with (current) min cost to be added to the MST in the next iteration. The running time complexity of this algorithm is O(mlogn); m = # edges, n = #nodes. For a graph with 500 nodes and ~125,000 edges this optimized implementation takes ~2 seconds to execute and the standard implementation ~60 seconds. Plot of the MST is also in the codebase.
This codebase also has an implementation of Kruskal's algorithm that generates disjoint k-clusters for the nodes in the graph (where k is the numbers of clusters). This approach of clustering ensures max-spacing between clusters where the spacing is defined as minimum distance between any two nodes of disjoint clusters. The running time complexity of this algorithm is O(mlogm); m = # edges. In this example the code was executed on a graph of 500 nodes and ~125,000 edges and 200 clusters were generated. The max spacing for the graph used was 26. Plotting the graph shows disjoint clusters generated by the algorithm.