[toc]
A graph is a fundamental and versatile data structure used in computer science, mathematics, and various fields to represent and analyze relationships between objects or entities. It is a collection of nodes (vertices) connected by edges, where each edge represents a relationship or connection between two nodes. Graphs provide a powerful way to model and study complex systems, networks, and structures, making them a core concept in many areas of computing and science.
Here are some key characteristics and components of graphs:
- Nodes (Vertices): Nodes are the fundamental building blocks of a graph and represent individual entities or elements. Each node can contain data or attributes associated with it.
- Edges: Edges are connections or links between nodes and define the relationships between them. Edges may have weights or labels to represent the strength or type of connection.
- Directed vs. Undirected Graphs: In a directed graph (digraph), edges have a direction, indicating a one-way relationship from one node to another. In an undirected graph, edges have no direction, representing a mutual relationship between nodes.
- Weighted vs. Unweighted Graphs: In a weighted graph, each edge is assigned a numerical weight, which can represent various properties such as distance, cost, or capacity. Unweighted graphs have no weights associated with their edges.
- Cyclic vs. Acyclic Graphs: A graph that contains cycles (closed loops) is called cyclic, while a graph without any cycles is called acyclic. Directed acyclic graphs (DAGs) are particularly important in various applications.
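In code, a graph is most often stored as an adjacency list. The sketch below shows one common way to write this in Python; the specific graphs are small illustrative examples, not data from this article:

```python
# Undirected, unweighted graph: each node maps to a list of its neighbors
undirected = {
    'A': ['B', 'C'],
    'B': ['A', 'C'],
    'C': ['A', 'B'],
}

# Directed, weighted graph: each node maps to {neighbor: edge weight}
weighted_digraph = {
    'A': {'B': 5, 'C': 2},
    'B': {'C': 1},
    'C': {},
}
```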
Breadth-First Search (BFS) traverses a graph level by level, starting from a source node:
- Start at the source node and mark it as visited.
- Enqueue the source node into a queue data structure.
- While the queue is not empty:
  a. Dequeue the front node from the queue.
  b. Visit and process the node (e.g., print it or perform some operation).
  c. Enqueue all unvisited neighbors of the current node.
  d. Mark these neighbors as visited to avoid revisiting them.
```python
from collections import deque

def bfs(graph, u):
    visited = set()
    queue = deque()
    queue.append(u)
    visited.add(u)
    while queue:
        node = queue.popleft()
        print(node)  # process the node
        for v in graph[node]:
            if v not in visited:
                queue.append(v)
                visited.add(v)
```
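A quick usage sketch; the adjacency-list graph below is a hypothetical example:

```python
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
bfs(graph, 'A')  # prints A, B, C, D (one node per line)
```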
Depth-First Search (DFS) explores as far as possible along one branch before backtracking:
- Start at the source node and mark it as visited.
- Visit the node (e.g., print it or perform some operation).
- Recursively apply DFS to one unvisited neighbor of the current node.
- If no unvisited neighbors are left, backtrack to the previous node.
- Continue the process until all reachable nodes have been visited.
```python
def dfs(graph, u, visited):
    print(u)  # process the node
    visited.add(u)
    for neighbor in graph[u]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
```
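Using the same hypothetical graph as in the BFS sketch above:

```python
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
dfs(graph, 'A', set())  # prints A, B, D, C (one node per line)
```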
A Strongly Connected Component (SCC) is a fundamental concept in graph theory and network analysis, primarily applied to directed graphs, also known as digraphs. An SCC is a subset of vertices within a directed graph with the distinctive property that there exists a directed path from any vertex in the subset to any other vertex within the same subset. In other words, every vertex in an SCC can be reached from every other vertex within the same SCC, making it a highly connected and cohesive substructure.
Kosaraju's algorithm finds all SCCs of a directed graph with two passes of depth-first search:
- First Pass (Forward Pass):
  - Perform a depth-first search (DFS) on the original directed graph and record the order in which vertices finish.
  - Keep a stack of vertices ordered by their finishing times.
- Second Pass (Backward Pass):
  - Reverse all the edges of the original graph to create the transpose graph.
  - Perform another DFS on the transposed graph, starting vertices in decreasing order of finishing time (popping the stack built in the first pass).
  - Each DFS tree found in this pass groups vertices that are reachable from each other; these groups are the strongly connected components of the original graph.
```python
from collections import defaultdict

class Graph:
    def __init__(self, vertices):
        self.graph = defaultdict(list)
        self.V = vertices

    def add_edge(self, u, v):
        self.graph[u].append(v)

    def _dfs(self, v, visited, stack):
        visited[v] = True
        for i in self.graph[v]:
            if not visited[i]:
                self._dfs(i, visited, stack)
        stack.append(v)  # push after all descendants are finished

    def get_transpose(self):
        # Build a new graph with every edge reversed
        g = Graph(self.V)
        for i in self.graph:
            for j in self.graph[i]:
                g.add_edge(j, i)
        return g

    def kosaraju_scc(self):
        stack = []
        visited = [False] * self.V
        # Fill the stack using the first DFS (forward pass)
        for i in range(self.V):
            if not visited[i]:
                self._dfs(i, visited, stack)
        # Create the transpose of the graph
        transposed = self.get_transpose()
        # Reset visited array for the second DFS (backward pass)
        visited = [False] * self.V
        sccs = []
        while stack:
            v = stack.pop()
            if not visited[v]:
                scc = []
                transposed._dfs(v, visited, scc)
                sccs.append(scc)
        return sccs

# Example usage:
g = Graph(8)
g.add_edge(0, 1)
g.add_edge(1, 2)
g.add_edge(2, 0)
g.add_edge(2, 3)
g.add_edge(3, 4)
g.add_edge(4, 5)
g.add_edge(5, 3)
g.add_edge(6, 7)
sccs = g.kosaraju_scc()
print(sccs)  # [[6], [7], [1, 2, 0], [4, 5, 3]]
```
Tarjan's algorithm finds all SCCs in a single DFS pass:
- Initialize data structures (stack, discovery times, low-link values, and SCC list).
- Start a DFS traversal from each unvisited vertex:
  - Assign the vertex a unique discovery time and an initial low-link value.
  - Push the vertex onto the stack.
- Recursively explore neighbors:
  - After visiting an unvisited neighbor, update the current vertex's low-link value from the neighbor's low-link value.
  - For neighbors already on the stack (back edges), update the low-link value from their discovery time.
- Identify SCCs:
  - When a vertex's low-link value equals its discovery time, it is the root of an SCC.
  - Pop vertices from the stack until that vertex is reached; together they form one SCC.
- Repeat the traversal for all unvisited vertices; the SCCs found are stored for analysis.
```python
class TarjanSCC:
    def __init__(self, graph):
        self.graph = graph
        self.index = 0
        self.stack = []      # Stack of vertices on the current DFS path
        self.low_link = {}   # Mapping of vertices to their low-link values
        self.index_map = {}  # Mapping of vertices to their discovery times
        self.sccs = []       # List of strongly connected components (SCCs)

    def find_sccs(self):
        for vertex in self.graph:
            if vertex not in self.index_map:
                self._dfs(vertex)
        return self.sccs

    def _dfs(self, v):
        # Initialize the discovery time and low-link value for the current vertex
        self.index_map[v] = self.index
        self.low_link[v] = self.index
        self.index += 1
        self.stack.append(v)
        for neighbor in self.graph[v]:
            if neighbor not in self.index_map:
                self._dfs(neighbor)
                # Update the low-link value based on the neighbor's low-link
                self.low_link[v] = min(self.low_link[v], self.low_link[neighbor])
            elif neighbor in self.stack:
                # The neighbor is on the current stack, so this is a back edge
                self.low_link[v] = min(self.low_link[v], self.index_map[neighbor])
        # If the low-link value equals the discovery time, v is the root of an SCC
        if self.low_link[v] == self.index_map[v]:
            scc = []
            while True:
                u = self.stack.pop()
                scc.append(u)
                if u == v:
                    break
            self.sccs.append(scc)

# Example usage:
graph = {
    0: [1, 2],
    1: [3],
    2: [0],
    3: [4],
    4: [5, 6],
    5: [7],
    6: [3],
    7: [6, 8],
    8: [5]
}
tarjan = TarjanSCC(graph)
sccs = tarjan.find_sccs()
print(sccs)  # [[8, 6, 7, 5, 4, 3], [1], [2, 0]]
```
For shortest paths, BFS is highly effective whenever all edges have equal weight, and it is well suited to a wide range of applications such as navigation, network routing, and game development. Its level-order traversal guarantees that a node is first reached along a shortest path, before any longer path is explored, making it an essential tool for unweighted pathfinding and optimization problems.
```python
from collections import deque

def bfs(graph, u, v):
    visited = set()
    queue = deque()
    parent = {}
    queue.append(u)
    visited.add(u)
    while queue:
        n = queue.popleft()
        if n == v:
            # Reconstruct the path from v back to u using parent pointers
            path = []
            while n is not None:
                path.append(n)
                n = parent.get(n)
            return len(path) - 1, path[::-1]  # Distance and path
        for neighbor in graph[n]:
            if neighbor not in visited:
                visited.add(neighbor)
                parent[neighbor] = n
                queue.append(neighbor)
    return -1, []  # No path found
```
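A small usage sketch, reusing the hypothetical graph from the earlier traversal examples:

```python
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(bfs(graph, 'A', 'D'))  # (2, ['A', 'B', 'D'])
```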
Dijkstra's algorithm computes shortest paths from a single source in a graph with non-negative edge weights:
- Initialization:
  - Set all vertex distances to infinity, except for the source vertex, which is set to 0.
  - Create a priority queue containing the source vertex with distance 0.
- Iterate:
  - While the priority queue is not empty, dequeue the vertex with the smallest known distance.
- Relaxation:
  - For each neighbor of the dequeued vertex:
    - Calculate the distance from the source to the neighbor through the dequeued vertex.
    - If it is shorter than the known distance, update the neighbor's distance.
    - Enqueue the neighbor with the updated distance.
- Repeat:
  - Continue until the priority queue is empty.
- Path Reconstruction (Optional):
  - To find the shortest path, backtrack from the destination to the source using the parent pointers recorded during relaxation.
```python
import heapq

def dijkstra(graph, s, d):
    # graph: mapping of node -> {neighbor: edge weight}
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    parent = {}
    visited = set()
    queue = [(0, s)]
    while queue:
        curr, node = heapq.heappop(queue)
        if node in visited:
            continue  # skip stale queue entries
        visited.add(node)
        for neighbor, weight in graph[node].items():
            distance = curr + weight
            if distance < dist[neighbor]:
                dist[neighbor] = distance
                parent[neighbor] = node
                heapq.heappush(queue, (distance, neighbor))
    if dist[d] == float('inf'):
        return dist, []  # destination unreachable
    # Reconstruct the path from the destination back to the source
    path = []
    while d != s:
        path.append(d)
        d = parent[d]
    path.append(s)
    return dist, path[::-1]
```
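A brief usage sketch with a hypothetical weighted graph stored as nested dictionaries (node -> {neighbor: weight}), matching the format the function above expects:

```python
graph = {
    'A': {'B': 1, 'C': 4},
    'B': {'C': 2, 'D': 5},
    'C': {'D': 1},
    'D': {},
}
dist, path = dijkstra(graph, 'A', 'D')
print(dist['D'], path)  # 4 ['A', 'B', 'C', 'D']
```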
The Bellman-Ford algorithm also computes single-source shortest paths but, unlike Dijkstra's algorithm, it handles negative edge weights and can detect negative weight cycles:
- Initialization:
  - Create a list or dictionary to store the shortest distance from the source vertex to all other vertices. Initialize the distance to the source as 0 and all other distances as infinity.
  - Create a list of edges representing the graph's weighted edges.
- Relaxation:
  - Iterate through all edges and update the distance to each vertex if a shorter path is found through the current edge. This step is repeated for a total of V-1 iterations, where V is the number of vertices, which is enough for shortest distances to propagate along any simple path.
- Negative Weight Cycle Detection:
  - After the V-1 iterations, make one more pass over the edges. If any distance can still be reduced, the graph contains a negative weight cycle and the shortest distances are not well defined.
- Path Reconstruction (Optional):
  - To recover the actual shortest path from the source to a specific destination, backtrack through the parent vertices recorded during relaxation.
```python
def bellman_ford(graph, s):
    # graph: mapping of node -> {neighbor: edge weight}
    dist = {v: float('inf') for v in graph}
    dist[s] = 0
    edges = [(u, v, graph[u][v]) for u in graph for v in graph[u]]
    # Relax every edge V-1 times
    for _ in range(len(graph) - 1):
        for u, v, weight in edges:
            if dist[u] + weight < dist[v]:
                dist[v] = dist[u] + weight
    # One extra pass: any further improvement means a negative weight cycle
    for u, v, weight in edges:
        if dist[u] + weight < dist[v]:
            raise Exception("Negative weight cycle detected")
    return dist
```
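A quick usage sketch with a hypothetical graph that includes a negative edge weight:

```python
graph = {
    'A': {'B': 4, 'C': 2},
    'B': {'C': -1, 'D': 2},
    'C': {'D': 3},
    'D': {},
}
print(bellman_ford(graph, 'A'))  # {'A': 0, 'B': 4, 'C': 2, 'D': 5}
```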
The Floyd-Warshall algorithm computes shortest path distances between all pairs of vertices using dynamic programming:
- Optimal Substructure:
  - The algorithm relies on the optimal substructure property: the optimal solution to a problem can be constructed from optimal solutions to its subproblems.
  - In the context of shortest paths, if the shortest path from vertex i to vertex j passes through an intermediate vertex k, then the subpaths from i to k and from k to j must themselves be shortest paths.
- Recurrence Relation:
  - Let `dist[i][j]` be the shortest distance from vertex i to vertex j using only the first k vertices as intermediate vertices. The recurrence relation is `dist[i][j] = min(dist[i][j], dist[i][k] + dist[k][j])`.
  - This states that the shortest path from i to j either avoids vertex k entirely or passes through it.
- Initialization:
  - The `dist` matrix is initialized with the direct edge weights between vertices. If there is no direct edge, the distance is set to infinity.
  - The diagonal elements (distances from a vertex to itself) are set to 0.
- Dynamic Programming Iteration:
  - A triple-nested loop iterates over all vertices as potential intermediate vertices (k), updating the `dist` matrix according to the recurrence relation.
- Result:
  - After the iteration, the `dist` matrix contains the shortest path distances between all pairs of vertices in the graph.
```python
def floyd_warshall(graph):
    # graph: mapping of vertex index -> {neighbor index: edge weight},
    # with vertices labeled 0 .. len(graph) - 1
    num_vertices = len(graph)
    dist = [[float('inf') if i != j else 0 for j in range(num_vertices)]
            for i in range(num_vertices)]
    # Initialize with direct edge weights
    for u in graph:
        for v in graph[u]:
            dist[u][v] = graph[u][v]
    # Try every vertex k as an intermediate vertex
    for k in range(num_vertices):
        for i in range(num_vertices):
            for j in range(num_vertices):
                if dist[i][j] > dist[i][k] + dist[k][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```
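A small usage sketch; the graph below is a hypothetical example with integer-labeled vertices, in the format the function above expects:

```python
graph = {
    0: {1: 3, 2: 8},
    1: {2: 2},
    2: {3: 1},
    3: {0: 4},
}
for row in floyd_warshall(graph):
    print(row)
# [0, 3, 5, 6]
# [7, 0, 2, 3]
# [5, 8, 0, 1]
# [4, 7, 9, 0]
```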
A Minimum Spanning Tree (MST) is a fundamental concept in graph theory and network design. It is a subset of the edges of an undirected graph that connects all the vertices together without forming any cycles and has the minimum possible total edge weight. In other words, a minimum spanning tree is a tree that spans all the vertices of the graph while minimizing the sum of edge weights. Two classic ways to build an MST are Kruskal's algorithm and Prim's algorithm; the two implementations below compute the MST cost for a set of 2-D points connected by Manhattan distances.
```python
class UnionFind:
    '''Minimal union-find (disjoint set) structure used by Kruskal's algorithm.'''
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path compression
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

    def connected(self, x, y):
        return self.find(x) == self.find(y)

def MST(points):
    '''
    Kruskal's Algorithm for Minimum Spanning Tree.
    Distances are described by Manhattan distance.
    Union-Find decides whether two nodes are already connected.
    '''
    size = len(points)
    if size <= 1:
        return 0
    dists = []
    for i in range(size - 1):
        for j in range(i + 1, size):
            dist = abs(points[j][0] - points[i][0]) + abs(points[j][1] - points[i][1])
            dists.append((i, j, dist))
    dists = sorted(dists, key=lambda x: x[2])
    res = 0
    edges = 0
    uf = UnionFind(size)
    # Take the cheapest edges that do not form a cycle
    for i, j, dist in dists:
        if uf.connected(i, j):
            continue
        uf.union(i, j)
        res += dist
        edges += 1
        if edges == size - 1:
            break
    return res
```
```python
import heapq

def MST(points):
    '''
    Prim's Algorithm for Minimum Spanning Tree.
    Distances are described by Manhattan distance.
    Grows the tree from node 0, always adding the cheapest edge
    that connects a new point to the tree.
    '''
    size = len(points)
    if size <= 1:
        return 0
    dists = []
    visited = set()
    visited.add(0)
    # Start with all edges from node 0
    for j in range(1, size):
        dist = abs(points[j][0] - points[0][0]) + abs(points[j][1] - points[0][1])
        dists.append((dist, 0, j))
    heapq.heapify(dists)
    res = 0
    while dists and len(visited) < size:
        dist, _, j = heapq.heappop(dists)
        if j not in visited:
            visited.add(j)
            res += dist
            # Push edges from the newly added point to all unvisited points
            for i in range(size):
                if i in visited:
                    continue
                dist = abs(points[j][0] - points[i][0]) + abs(points[j][1] - points[i][1])
                heapq.heappush(dists, (dist, j, i))
    return res
```
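A quick usage sketch with a hypothetical set of 2-D points; either of the two `MST` implementations above can be called the same way:

```python
points = [[0, 0], [2, 2], [3, 10], [5, 2], [7, 0]]
print(MST(points))  # 20 (total Manhattan length of the minimum spanning tree)
```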
Topological sorting orders the vertices of a directed acyclic graph (DAG) so that every edge points from an earlier vertex to a later one. A DFS-based implementation pushes each vertex onto a stack only after all of its descendants are finished:

```python
def topological_sort(graph):
    def helper(v, visited, stack):
        visited.add(v)
        for neighbor in graph[v]:
            if neighbor not in visited:
                helper(neighbor, visited, stack)
        stack.append(v)  # push after all descendants are processed

    visited = set()
    stack = []
    for v in graph:
        if v not in visited:
            helper(v, visited, stack)
    return stack[::-1]  # reversed finishing order is a topological order
```
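A brief usage sketch on a hypothetical DAG:

```python
graph = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print(topological_sort(graph))  # ['A', 'C', 'B', 'D'] (one valid ordering)
```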
An alternative is Kahn's algorithm, which repeatedly removes vertices with in-degree 0. The example below orders courses given prerequisite pairs and returns an empty list if the prerequisites contain a cycle:

```python
import collections

def findOrder(numCourses, prerequisites):
    graph = collections.defaultdict(list)
    indegree = [0] * numCourses
    # Build the graph and calculate in-degrees
    for prerequisite in prerequisites:
        a, b = prerequisite[0], prerequisite[1]
        graph[b].append(a)
        indegree[a] += 1
    # Initialize the queue with vertices with in-degree 0
    queue = collections.deque()
    for i, item in enumerate(indegree):
        if item == 0:
            queue.append(i)
    res = []
    # Perform topological sort
    while queue:
        u = queue.popleft()
        res.append(u)
        for v in graph[u]:
            indegree[v] -= 1
            if indegree[v] == 0:
                queue.append(v)
    # Check if all courses can be taken; otherwise there is a cycle
    if len(res) == numCourses:
        return res
    else:
        return []

# Example usage:
numCourses = 4
prerequisites = [[1, 0], [2, 0], [3, 1], [3, 2]]
result = findOrder(numCourses, prerequisites)
print(result)  # [0, 1, 2, 3]
```