adityamuralidaran/Connected-Components-Using-MapReduce
An implementation, in Python on Spark, of an algorithm that uses large-star and small-star operations to find the connected-component label of every vertex in a large undirected graph. The graph is assumed to be too large to fit in the memory of a single compute node.

Run it with the following command:

`PATH=$PATH:/opt/spark-2.2.0-bin-hadoop2.7/bin spark-submit a2.py input.txt output`

A sample input is provided in `input.txt`.
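To illustrate what the two operations do, here is a minimal single-machine sketch in plain Python. It is not the code in `a2.py` and does not use Spark. The function names (`large_star`, `small_star`, `connected_components`) and the edge-list input are assumptions made for this example, and it assumes the common formulation in which the two operations are alternated until the edge set stops changing. Vertices without any edge are not represented.

```python
from collections import defaultdict


def large_star(edges):
    """Connect every strictly larger neighbour of u to the minimum of u and its neighbours."""
    nbrs = defaultdict(set)
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    out = set()
    for u, ns in nbrs.items():
        m = min(ns | {u})
        for v in ns:
            if v > u:
                out.add((v, m))
    return out


def small_star(edges):
    """Connect u and its smaller neighbours to the minimum of those smaller neighbours."""
    smaller = defaultdict(set)
    for u, v in edges:
        if u != v:  # self-loops carry no information here
            smaller[max(u, v)].add(min(u, v))
    out = set()
    for u, ns in smaller.items():
        m = min(ns)  # ns is non-empty and every element is < u
        for v in ns | {u}:
            if v != m:
                out.add((v, m))
    return out


def connected_components(edges):
    """Alternate the two operations until the edge set is stable, then read off labels."""
    current = set(edges)
    while True:
        updated = small_star(large_star(current))
        if updated == current:
            break
        current = updated
    # At convergence each component is a star centred on its smallest vertex,
    # so every remaining edge is (vertex, label).
    labels = {}
    for v, m in current:
        labels[v] = m
        labels[m] = m
    return labels


if __name__ == "__main__":
    # Hypothetical toy graph: a chain 10-7-3-8-1 and a separate edge 20-21.
    print(connected_components([(10, 7), (7, 3), (3, 8), (8, 1), (20, 21)]))
```

In a Spark version, building the neighbour sets would correspond to a map step keyed by vertex followed by a group-by-key, and each per-vertex loop to a reduce step. That mapping is a general observation, not a description of how `a2.py` is written.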