My Summary on Graph Algorithms02 Oct 2016
In this article, I will summarize the some topics on graph theories and algorithms.
In a nutshell, the topics on graph data structure include: traversal, connectivity, shortest path, minimum spanning tree, topology sorting.
DFS, BFS. Too boring to elaborate.
A typical description for this type of problem is finding the number of components in an undirected graph.
Given n nodes labeled from 0 to n - 1 and a list of undirected edges (each edge is a pair of nodes), write a function to find the number of connected components in an undirected graph.
Union-Find algorithm is a very common solution for this type of problems. Here I want to spent a few paragraphs to talk about union-find and disjoint set.
1. Disjoint set
We can just imagine a disjoint set data structure is a group of sets. This structure supports three major methods: union, find, parent. A very nice introduction can be found here on Youtube by Tushar Roy.
2. Methods: find, union, parent
Find method is finding the representative element of the given element’s set.
Union method is join the sets of the given two element together.
Parent is returning the parent of given elements. In most algorithm excercise, we just use a list and its index to perform it. A snippet for the quoted problem above:
def number_of_connected_graph(n, edges): def find(v): while v != parent[v]: # following line is path compression. #A optimization for find parent[v] = parent[parent[v]] v = parent[v] return v def union(e): # e is an edge. parent_u, parent_v = map(find, e) parent[parent_u] = parent_v parent = list(range(n)) for e in edges: union(e) return len(set(map(find,parent))) # Note the last line here: # One more find operation on nodes is to compress the path.
This topic can be categorized into several sub-topics: directed edges vs undirected edges; weighted edges vs unweighted edges.
BFS for unweightededges. Dijkstra for weighted edges, complexity O(EV + V^2 * logV) Floyd for weighted edges, but time complexity is O(V^3). Choosing between Dijkstra and Floyd depends on the density of the graph.
Minimum Spanning Tree
T <- spanning tree S <- any node Q <- minQueue for all neighbor of S while T not complete: n = Q.heappop T.add(n) for all n.neighbors: if not in T: Q.heappush
Topology sort of a directed graph is a linear ordering of its vertices such that for every directed edge uv, u comes before v in the ordering.
DFS vs BFS (Kahn’s algorithm)
There are a total of n courses you have to take, labeled from 0 to n - 1.
Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: [0,1]
Given the total number of courses and a list of prerequisite pairs, is it possible for you to finish all courses?
L ← Empty list that will contain the sorted elements S ← Set of all nodes with no incoming edges while S is non-empty do remove a node n from S add n to tail of L for each node m with an edge e from n to m do remove edge e from the graph if m has no other incoming edges then insert m into S if graph has edges then return error (graph has at least one cycle) else return L (a topologically sorted order)