Hasuer's Studio.

24. Pattern Topological Sort (Graph)

Word count: 7kReading time: 42 min
2024/05/24

416. Partition Equal Subset Sum Design Gurus Educative.io

Introduction

Topological Sort is used to find a linear ordering of elements that have dependencies on each other. For example, if event ‘B’ is dependent on event ‘A’, ‘A’ comes before ‘B’ in topological ordering.

This pattern defines an easy way to understand the technique for performing topological sorting of a set of elements and then solves a few problems using it.

Let’s see this pattern in action.

*Topological Sort (medium)

Design Gurus Educative.io

Problem Statement

Topological Sort of a directed graph (a graph with unidirectional edges) is a linear ordering of its vertices such that for every directed edge (U, V) from vertex U to vertex V, U comes before V in the ordering.

Given a directed graph, find the topological ordering of its vertices.

Example 1:

1
2
3
4
Input: Vertices=4, Edges=[3, 2], [3, 0], [2, 0], [2, 1]
Output: Following are the two valid topological sorts for the given graph:
1) 3, 2, 0, 1
2) 3, 2, 1, 0

Example 2:

1
2
3
4
5
6
7
Input: Vertices=5, Edges=[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]
Output: Following are all valid topological sorts for the given graph:
1) 4, 2, 3, 0, 1
2) 4, 3, 2, 0, 1
3) 4, 3, 2, 1, 0
4) 4, 2, 3, 1, 0
5) 4, 2, 0, 3, 1

Example 3:

1
2
3
4
5
6
7
8
9
10
Input: Vertices=7, Edges=[6, 4], [6, 2], [5, 3], [5, 4], [3, 0], [3, 1], [3, 2], [4, 1]
Output: Following are all valid topological sorts for the given graph:
1) 5, 6, 3, 4, 0, 1, 2
2) 6, 5, 3, 4, 0, 1, 2
3) 5, 6, 4, 3, 0, 2, 1
4) 6, 5, 4, 3, 0, 1, 2
5) 5, 6, 3, 4, 0, 2, 1
6) 5, 6, 3, 4, 1, 2, 0

There are other valid topological ordering of the graph too.

Solution

The basic idea behind the topological sort is to provide a partial ordering among the vertices of the graph such that if there is an edge from U to V then U≤V i.e., U comes before V in the ordering. Here are a few fundamental concepts related to topological sort:

  1. Source: Any node that has no incoming edge and has only outgoing edges is called a source.
  2. Sink: Any node that has only incoming edges and no outgoing edge is called a sink.
  3. So, we can say that a topological ordering starts with one of the sources and ends at one of the sinks.
  4. A topological ordering is possible only when the graph has no directed cycles, i.e. if the graph is a Directed Acyclic Graph (DAG). If the graph has a cycle, some vertices will have cyclic dependencies which makes it impossible to find a linear ordering among vertices.

To find the topological sort of a graph we can traverse the graph in a Breadth First Search (BFS) way. We will start with all the sources, and in a stepwise fashion, save all sources to a sorted list. We will then remove all sources and their edges from the graph. After the removal of the edges, we will have new sources, so we will repeat the above process until all vertices are visited.

Here is the visual representation of this algorithm for Example-3:

This is how we can implement this algorithm:

a. Initialization

  1. We will store the graph in Adjacency Lists, which means each parent vertex will have a list containing all of its children. We will do this using a HashMap where the ‘key’ will be the parent vertex number and the value will be a List containing children vertices.
  2. To find the sources, we will keep a HashMap to count the in-degrees i.e., count of incoming edges of each vertex. Any vertex with ‘0’ in-degree will be a source.

b. Build the graph and find in-degrees of all vertices

  1. We will build the graph from the input and populate the in-degrees HashMap.

c. Find all sources

  1. All vertices with ‘0’ in-degrees will be our sources and we will store them in a Queue.

d. Sort

  1. For each source, do the following things:
    • Add it to the sorted list.
    • Get all of its children from the graph.
    • Decrement the in-degree of each child by 1.
    • If a child’s in-degree becomes ‘0’, add it to the sources Queue.
  2. Repeat step 1, until the source Queue is empty.

Code

Here is what our algorithm will look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
from collections import deque

class Solution:
def sort(self, vertices, edges):
sortedOrder = []
if vertices <= 0:
return sortedOrder

# a. Initialize the graph
inDegree = {i: 0 for i in range(vertices)} # count of incoming edges
graph = {i: [] for i in range(vertices)} # adjacency list graph

# b. Build the graph
for edge in edges:
parent, child = edge[0], edge[1]
graph[parent].append(child) # put the child into it's parent's list
inDegree[child] += 1 # increment child's inDegree

# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)

# d. For each source, add it to the sortedOrder and subtract '1' from all of its
# children's in-degrees if a child's in-degree becomes zero, add it to sources queue
while sources:
vertex = sources.popleft()
sortedOrder.append(vertex)
for child in graph[vertex]: # get the node's children to decrement their in-degrees
inDegree[child] -= 1
if inDegree[child] == 0:
sources.append(child)

# topological sort is not possible as the graph has a cycle
if len(sortedOrder) != vertices:
return []

return sortedOrder


def main():
sol = Solution()
print("Topological sort: " +
str(sol.sort(4, [[3, 2], [3, 0], [2, 0], [2, 1]])))
print("Topological sort: " +
str(sol.sort(5, [[4, 2], [4, 3], [2, 0], [2, 1], [3, 1]])))
print("Topological sort: " +
str(sol.sort(7, [[6, 4], [6, 2], [5, 3], [5, 4], \
[3, 0], [3, 1], [3, 2], [4, 1]])))


main()

Time Complexity

In step ‘d’, each vertex will become a source only once and each edge will be accessed and removed once. Therefore, the time complexity of the above algorithm will be O(V+E), where ‘V’ is the total number of vertices and ‘E’ is the total number of edges in the graph.

Space Complexity

The space complexity will be O(V+E), since we are storing all of the edges for each vertex in an adjacency list.


Similar Problems

Problem 1: Find if a given Directed Graph has a cycle in it or not.

Solution: If we can’t determine the topological ordering of all the vertices of a directed graph, the graph has a cycle in it. This was also referred to in the above code:

1
2
if (sortedOrder.size() != vertices) // topological sort is not possible as the graph has a cycle
return new ArrayList<>();

Tasks Scheduling (medium)

Top Interview 150 | 207. Course Schedule Design Gurus Educative.io

Problem Statement

There are ‘N’ tasks, labeled from ‘0’ to ‘N-1’. Each task can have some prerequisite tasks which need to be completed before it can be scheduled. Given the number of tasks and a list of prerequisite pairs, find out if it is possible to schedule all the tasks.

Example 1:

1
2
3
4
Input: Tasks=3, Prerequisites=[0, 1], [1, 2]
Output: true
Explanation: To execute task '1', task '0' needs to finish first. Similarly, task '1' needs to finish
before '2' can be scheduled. A possible sceduling of tasks is: [0, 1, 2]

Example 2:

1
2
3
Input: Tasks=3, Prerequisites=[0, 1], [1, 2], [2, 0]
Output: false
Explanation: The tasks have cyclic dependency, therefore they cannot be sceduled.

Example 3:

1
2
3
Input: Tasks=6, Prerequisites=[2, 5], [0, 5], [0, 4], [1, 4], [3, 2], [1, 3]
Output: true
Explanation: A possible sceduling of tasks is: [0 1 4 3 2 5]

Solution

This problem is asking us to find out if it is possible to find a topological ordering of the given tasks. The tasks are equivalent to the vertices and the prerequisites are the edges.

We can use a similar algorithm as described in Topological Sort to find the topological ordering of the tasks. If the ordering does not include all the tasks, we will conclude that some tasks have cyclic dependencies.

Code

Here is what our algorithm will look like (only the highlighted lines have changed):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
# 和上一题一摸一样
from collections import deque

class Solution:
def isSchedulingPossible(self, tasks, prerequisites):
sortedOrder = []
if tasks <= 0:
return False

# a. Initialize the graph
inDegree = {i: 0 for i in range(tasks)} # count of incoming edges
graph = {i: [] for i in range(tasks)} # adjacency list graph

# b. Build the graph
for prerequisite in prerequisites:
parent, child = prerequisite[0], prerequisite[1]
graph[parent].append(child) # put the child into it's parent's list
inDegree[child] += 1 # increment child's inDegree

# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)

# d. For each source, add it to the sortedOrder and subtract one from all of its
# children's in-degrees if a child's in-degree becomes zero, add it to sources queue
while sources:
vertex = sources.popleft()
sortedOrder.append(vertex)
for child in graph[vertex]: # get the node's children to decrement their in-degrees
inDegree[child] -= 1
if inDegree[child] == 0:
sources.append(child)

# if sortedOrder doesn't contain all tasks, there is a cyclic dependency between
# tasks, therefore, we will not be able to schedule all tasks
return len(sortedOrder) == tasks


def main():
sol = Solution()
print("Is scheduling possible: " +
str(sol.isSchedulingPossible(3, [[0, 1], [1, 2]])))
print("Is scheduling possible: " +
str(sol.isSchedulingPossible(3, [[0, 1], [1, 2], [2, 0]])))
print("Is scheduling possible: " +
str(sol.isSchedulingPossible(6, [[2, 5], [0, 5], [0, 4], [1, 4], [3, 2], [1, 3]])))

main()

Time complexity

In step ‘d’, each task can become a source only once and each edge (prerequisite) will be accessed and removed once. Therefore, the time complexity of the above algorithm will be O(V+E), where ‘V’ is the total number of tasks and ‘E’ is the total number of prerequisites.

Space complexity

The space complexity will be O(V+E), since we are storing all of the prerequisites for each task in an adjacency list.


Similar Problems

Course Schedule: There are ‘N’ courses, labeled from ‘0’ to ‘N-1’. Each course can have some prerequisite courses which need to be completed before it can be taken. Given the number of courses and a list of prerequisite pairs, find if it is possible for a student to take all the courses.

Solution: This problem is exactly similar to our parent problem. In this problem, we have courses instead of tasks.

Tasks Scheduling Order (medium)

Top Interview 150 | 210. Course Schedule II Design Gurus Educative.io

Problem Statement

There are ‘N’ tasks, labeled from ‘0’ to ‘N-1’. Each task can have some prerequisite tasks which need to be completed before it can be scheduled. Given the number of tasks and a list of prerequisite pairs, write a method to find the ordering of tasks we should pick to finish all tasks.

Example 1:

1
2
3
4
Input: Tasks=3, Prerequisites=[0, 1], [1, 2]
Output: [0, 1, 2]
Explanation: To execute task '1', task '0' needs to finish first. Similarly, task '1' needs to finish
before '2' can be scheduled. A possible scheduling of tasks is: [0, 1, 2]

Example 2:

1
2
3
Input: Tasks=3, Prerequisites=[0, 1], [1, 2], [2, 0]
Output: []
Explanation: The tasks have cyclic dependency, therefore they cannot be scheduled.

Example 3:

1
2
3
Input: Tasks=6, Prerequisites=[2, 5], [0, 5], [0, 4], [1, 4], [3, 2], [1, 3]
Output: [0 1 4 3 2 5]
Explanation: A possible scheduling of tasks is: [0 1 4 3 2 5]

Solution

This problem is similar to Tasks Scheduling, the only difference being that we need to find the best ordering of tasks so that it is possible to schedule them all.

Code

Here is what our algorithm will look like (only the highlighted lines have changed):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
from collections import deque

class Solution:
def findOrder(self, tasks, prerequisites):
sortedOrder = []
if tasks <= 0:
return sortedOrder

# a. Initialize the graph
inDegree = {i: 0 for i in range(tasks)} # count of incoming edges
graph = {i: [] for i in range(tasks)} # adjacency list graph

# b. Build the graph
for prerequisite in prerequisites:
parent, child = prerequisite[0], prerequisite[1]
graph[parent].append(child) # put the child into it's parent's list
inDegree[child] += 1 # increment child's inDegree

# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)

# d. For each source, add it to the sortedOrder and subtract one from all of its
# children's in-degrees if a child's in-degree becomes zero, add it to sources queue
while sources:
vertex = sources.popleft()
sortedOrder.append(vertex)
for child in graph[vertex]: # get the node's children to decrement their in-degrees
inDegree[child] -= 1
if inDegree[child] == 0:
sources.append(child)

# if sortedOrder doesn't contain all tasks, there is a cyclic dependency between
# tasks, therefore, we will not be able to schedule all tasks
if len(sortedOrder) != tasks:
return []

return sortedOrder


def main():
sol = Solution()
print("Is scheduling possible: " + str(sol.findOrder(3, [[0, 1], [1, 2]])))
print("Is scheduling possible: " +
str(sol.findOrder(3, [[0, 1], [1, 2], [2, 0]])))
print("Is scheduling possible: " +
str(sol.findOrder(6, [[2, 5], [0, 5], [0, 4], [1, 4], [3, 2], [1, 3]])))


main()

Time complexity

In step ‘d’, each task can become a source only once and each edge (prerequisite) will be accessed and removed once. Therefore, the time complexity of the above algorithm will be O(V+E), where ‘V’ is the total number of tasks and ‘E’ is the total number of prerequisites.

Space complexity

The space complexity will be O(V+E), since we are storing all of the prerequisites for each task in an adjacency list.


Similar Problems

Course Schedule: There are ‘N’ courses, labeled from ‘0’ to ‘N-1’. Each course has some prerequisite courses which need to be completed before it can be taken. Given the number of courses and a list of prerequisite pairs, write a method to find the best ordering of the courses that a student can take in order to finish all courses.

Solution: This problem is exactly similar to our parent problem. In this problem, we have courses instead of tasks.

*All Tasks Scheduling Orders (hard)

Design Gurus Educative.io

Problem Statement

There are ‘N’ tasks, labeled from ‘0’ to ‘N-1’. Each task can have some prerequisite tasks which need to be completed before it can be scheduled. Given the number of tasks and a list of prerequisite pairs, write a method to print all possible ordering of tasks meeting all prerequisites.

Example 1:

1
2
3
Input: Tasks=3, Prerequisites=[0, 1], [1, 2]
Output: [0, 1, 2]
Explanation: There is only possible ordering of the tasks.

Example 2:

1
2
3
4
5
Input: Tasks=4, Prerequisites=[3, 2], [3, 0], [2, 0], [2, 1]
Output:
1) [3, 2, 0, 1]
2) [3, 2, 1, 0]
Explanation: There are two possible orderings of the tasks meeting all prerequisites.

Example 3:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
Input: Tasks=6, Prerequisites=[2, 5], [0, 5], [0, 4], [1, 4], [3, 2], [1, 3]
Output:
1) [0, 1, 4, 3, 2, 5]
2) [0, 1, 3, 4, 2, 5]
3) [0, 1, 3, 2, 4, 5]
4) [0, 1, 3, 2, 5, 4]
5) [1, 0, 3, 4, 2, 5]
6) [1, 0, 3, 2, 4, 5]
7) [1, 0, 3, 2, 5, 4]
8) [1, 0, 4, 3, 2, 5]
9) [1, 3, 0, 2, 4, 5]
10) [1, 3, 0, 2, 5, 4]
11) [1, 3, 0, 4, 2, 5]
12) [1, 3, 2, 0, 5, 4]
13) [1, 3, 2, 0, 4, 5]

Solution

This problem is similar to Tasks Scheduling Order the only difference is that we need to find all the topological orderings of the tasks.

At any stage, if we have more than one source available and since we can choose any source, therefore, in this case, we will have multiple orderings of the tasks. We can use a recursive approach with Backtracking to consider all sources at any step.

Code

Here is what our algorithm will look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
from collections import deque

class Solution:
def __init__(self):
self.orders = []


def printOrders(self, tasks, prerequisites):
sortedOrder = []
if tasks <= 0:
return self.orders

# a. Initialize the graph
inDegree = {i: 0 for i in range(tasks)} # count of incoming edges
graph = {i: [] for i in range(tasks)} # adjacency list graph

# b. Build the graph
for prerequisite in prerequisites:
parent, child = prerequisite[0], prerequisite[1]
graph[parent].append(child) # put the child into it's parent's list
inDegree[child] += 1 # increment child's inDegree

# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)

self.print_all_topological_sorts(graph, inDegree, sources, sortedOrder)

return self.orders


def print_all_topological_sorts(self, graph, inDegree, sources, sortedOrder):
if sources:
for vertex in sources:
sortedOrder.append(vertex)
sourcesForNextCall = deque(sources) # make a copy of sources
# only remove the current source, all other sources should remain in the queue for
# the next call
sourcesForNextCall.remove(vertex)
# get the node's children to decrement their in-degrees
for child in graph[vertex]:
inDegree[child] -= 1
if inDegree[child] == 0:
sourcesForNextCall.append(child)

# recursive call to print other orderings from the remaining (and new) sources
self.print_all_topological_sorts(
graph, inDegree, sourcesForNextCall, sortedOrder)

# backtrack, remove the vertex from the sorted order and put all of its children
# back to consider the next source instead of the current vertex
sortedOrder.remove(vertex)
for child in graph[vertex]:
inDegree[child] += 1

# if sortedOrder doesn't contain all tasks, either we've a cyclic dependency between
# tasks, or we have not processed all the tasks in this recursive call
if len(sortedOrder) == len(inDegree):
self.orders.append(sortedOrder.copy())


def main():
sol = Solution()
print("Task Orders: ")
result1 = sol.printOrders(3, [[0, 1], [1, 2]])
for order in result1:
print(order)

print("Task Orders: ")
result2 = sol.printOrders(4, [[3, 2], [3, 0], [2, 0], [2, 1]])
for order in result2:
print(order)

print("Task Orders: ")
result3 = sol.printOrders(6, [[2, 5], [0, 5], [0, 4], [1, 4], [3, 2], [1, 3]])
for order in result3:
print(order)

if __name__ == "__main__":
main()

Time and Space Complexity

If we don’t have any prerequisites, all combinations of the tasks can represent a topological ordering. As we know, that there can be N!N! combinations for ‘N’ numbers, therefore the time and space complexity of our algorithm will be O(V! E)* where ‘V’ is the total number of tasks and ‘E’ is the total prerequisites. We need the ‘E’ part because in each recursive call, at max, we remove (and add back) all the edges.

*Alien Dictionary (hard)

没咋看懂题目:应该是对于给定的单词列表,相邻的单词对比,找出第一个不同的字符,这个字符的顺序将定义一条有向边。
Leetcode 269 会员 Design Gurus Educative.io

Problem Statement

There is a dictionary containing words from an alien language for which we don’t know the ordering of the letters. Write a method to find the correct order of the letters in the alien language. It is given that the input is a valid dictionary and there exists an ordering among its letters.

Example 1:

1
2
3
4
5
6
7
8
9
Input: Words: ["ba", "bc", "ac", "cab"]
Output: bac
Explanation: Given that the words are sorted lexicographically by the rules of the alien language, so
from the given words we can conclude the following ordering among its characters:

1. From "ba" and "bc", we can conclude that 'a' comes before 'c'.
2. From "bc" and "ac", we can conclude that 'b' comes before 'a'

From the above two points, we can conclude that the correct character order is: "bac"

Example 2:

1
2
3
4
5
6
7
8
Input: Words: ["cab", "aaa", "aab"]
Output: cab
Explanation: From the given words we can conclude the following ordering among its characters:

1. From "cab" and "aaa", we can conclude that 'c' comes before 'a'.
2. From "aaa" and "aab", we can conclude that 'a' comes before 'b'

From the above two points, we can conclude that the correct character order is: "cab"

Example 3:

1
2
3
4
5
6
7
8
9
10
11
Input: Words: ["ywx", "wz", "xww", "xz", "zyy", "zwz"]
Output: ywxz
Explanation: From the given words we can conclude the following ordering among its characters:

1. From "ywx" and "wz", we can conclude that 'y' comes before 'w'.
2. From "wz" and "xww", we can conclude that 'w' comes before 'x'.
3. From "xww" and "xz", we can conclude that 'w' comes before 'z'
4. From "xz" and "zyy", we can conclude that 'x' comes before 'z'
5. From "zyy" and "zwz", we can conclude that 'y' comes before 'w'

From the above five points, we can conclude that the correct character order is: "ywxz"

Constraints:

  • 1 <= words.length <= 100
  • 1 <= words[i].length <= 100
  • words[i] consists of only lowercase English letters.

Solution

Since the given words are sorted lexicographically by the rules of the alien language, we can always compare two adjacent words to determine the ordering of the characters. Take Example-1 above: [“ba”, “bc”, “ac”, “cab”]

  1. Take the first two words “ba” and “bc”. Starting from the beginning of the words, find the first character that is different in both words: it would be ‘a’ from “ba” and ‘c’ from “bc”. Because of the sorted order of words (i.e. the dictionary!), we can conclude that ‘a’ comes before ‘c’ in the alien language.
  2. Similarly, from “bc” and “ac”, we can conclude that ‘b’ comes before ‘a’.

These two points tell us that we are actually asked to find the topological ordering of the characters, and that the ordering rules should be inferred from adjacent words from the alien dictionary.

This makes the current problem similar to Tasks Scheduling Order, the only difference being that we need to build the graph of the characters by comparing adjacent words first, and then perform the topological sort for the graph to determine the order of the characters.

Code

Here is what our algorithm will look like (only the highlighted lines have changed):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
# 但是有一点不太懂就是为什么。只要选取相邻的两个就可以建立邻接表
from collections import deque

class Solution:
def findOrder(self, words):
if len(words) == 0:
return ""

# a. Initialize the graph
inDegree = {} # count of incoming edges
graph = {} # adjacency list graph
for word in words:
for character in word:
inDegree[character] = 0
graph[character] = []

# b. Build the graph
for i in range(0, len(words)-1):
# find ordering of characters from adjacent words
w1, w2 = words[i], words[i + 1]
for j in range(0, min(len(w1), len(w2))):
parent, child = w1[j], w2[j]
if parent != child: # if the two characters are different
# put the child into it's parent's list
graph[parent].append(child)
inDegree[child] += 1 # increment child's inDegree
break # only the first different character between the two words will help us
# find the order

# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)

# d. For each source, add it to the sortedOrder and subtract one from all of its
# children's in-degrees if a child's in-degree becomes zero, add it to sources queue
sortedOrder = []
while sources:
vertex = sources.popleft()
sortedOrder.append(vertex)
for child in graph[vertex]: # get the node's children to decrement their in-degrees
inDegree[child] -= 1
if inDegree[child] == 0:
sources.append(child)

# if sortedOrder doesn't contain all characters, there is a cyclic dependency between
# characters, therefore, we'll not be able to find the correct ordering of characters
if len(sortedOrder) != len(inDegree):
return ""

return ''.join(sortedOrder)


def main():
sol = Solution()
print("Character order: " + sol.findOrder(["ba", "bc", "ac", "cab"]))
print("Character order: " + sol.findOrder(["cab", "aaa", "aab"]))
print("Character order: " + sol.findOrder(["ywx", "wz", "xww", "xz", "zyy", "zwz"]))


main()


Time complexity

In step ‘d’, each task can become a source only once and each edge (a rule) will be accessed and removed once. Therefore, the time complexity of the above algorithm will be O(V+E), where ‘V’ is the total number of different characters and ‘E’ is the total number of the rules in the alien language. Since, at most, each pair of words can give us one rule, therefore, we can conclude that the upper bound for the rules is O(N) where ‘N’ is the number of words in the input. So, we can say that the time complexity of our algorithm is O(V+N).

Space complexity

The space complexity will be O(V+N), since we are storing all of the rules for each character in an adjacency list.

*Problem Challenge 1

Leetcode 444 会员 Design Gurus Educative.io

Reconstructing a Sequence (hard)

Given a sequence originalSeq and an array of sequences, write a method to find if originalSeq can be uniquely reconstructed from the array of sequences.

Unique reconstruction means that we need to find if originalSeq is the only sequence such that all sequences in the array are subsequences of it.

Example 1:

1
2
3
4
5
Input: originalSeq: [1, 2, 3, 4], seqs: [[1, 2], [2, 3], [3, 4]]
Output: true
Explanation: The sequences [1, 2], [2, 3], and [3, 4] can uniquely reconstruct
[1, 2, 3, 4], in other words, all the given sequences uniquely define the order of numbers
in the 'originalSeq'.

Example 2:

1
2
3
4
5
6
Input: originalSeq: [1, 2, 3, 4], seqs: [[1, 2], [2, 3], [2, 4]]
Output: false
Explanation: The sequences [1, 2], [2, 3], and [2, 4] cannot uniquely reconstruct
[1, 2, 3, 4]. There are two possible sequences we can construct from the given sequences:
1) [1, 2, 3, 4]
2) [1, 2, 4, 3]

Example 3:

1
2
3
4
Input: originalSeq: [3, 1, 4, 2, 5], seqs: [[3, 1, 5], [1, 4, 2, 5]]
Output: true
Explanation: The sequences [3, 1, 5] and [1, 4, 2, 5] can uniquely reconstruct
[3, 1, 4, 2, 5].

Constraints:

  • n == originalSeq.length
  • 1 <= n <= 10^4
  • originalSeq is a permutation of all the integers in the range [1, n].
  • 1 <= seqs.length <= 10^4
  • 1 <= seqs[i].length <= 10^4
  • `1 <= sum(seqs[i].length) <= 10^5
  • 1 <= seqs[i][j] <= n
  • All the arrays of sequences are unique.
  • seqs[i] is a subsequence of nums.

Try it yourself

Try solving this question here:

1
2
# 我的想法:什么时候sources中的元素超过了1一个,那就是有有问题的,输出false
# 漏了一种情况,就是可以构建出唯一的序列,但是和给的origin_seq不一样

Solution

Since each sequence in the given array defines the ordering of some numbers, we need to combine all these ordering rules to find two things:

  1. Is it possible to construct the originalSeq from all these rules?
  2. Are these ordering rules not sufficient enough to define the unique ordering of all the numbers in the originalSeq? In other words, can these rules result in more than one sequence?

Take Example-1:

1
originalSeq: [1, 2, 3, 4], seqs:[[1, 2], [2, 3], [3, 4]]

The first sequence tells us that ‘1’ comes before ‘2’; the second sequence tells us that ‘2’ comes before ‘3’; the third sequence tells us that ‘3’ comes before ‘4’. Combining all these sequences will result in a unique sequence: [1, 2, 3, 4].

The above explanation tells us that we are actually asked to find the topological ordering of all the numbers and also to verify that there is only one topological ordering of the numbers possible from the given array of the sequences.

This makes the current problem similar to Tasks Scheduling Order with two differences:

  1. We need to build the graph of the numbers by comparing each pair of numbers in the given array of sequences.
  2. We must perform the topological sort for the graph to determine two things:
    • Can the topological ordering construct the originalSeq?
    • That there is only one topological ordering of the numbers possible. This can be confirmed if we do not have more than one source at any time while finding the topological ordering of numbers.

Code

Here is what our algorithm will look like (only the highlighted lines have changed):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
# 我的想法是:什么时候sources中的元素超过了1一个,那就是有有问题的,输出false
# 漏了一种情况,就是可以构建出唯一的序列,但是和给的origin_seq不一样
# 以下是官方solution
from collections import deque

class Solution:
def canConstruct(self, originalSeq, sequences):
sortedOrder = []
if len(originalSeq) <= 0:
return False

# a. Initialize the graph
inDegree = {} # count of incoming edges
graph = {} # adjacency list graph
for sequence in sequences:
for num in sequence:
inDegree[num] = 0
graph[num] = []

# b. Build the graph
for sequence in sequences:
for i in range(1, len(sequence)):
parent, child = sequence[i - 1], sequence[i]
graph[parent].append(child)
inDegree[child] += 1

# if we don't have ordering rules for all the numbers we'll not able to uniquely
# construct the sequence
if len(inDegree) != len(originalSeq):
return False

# c. Find all sources i.e., all vertices with 0 in-degrees
sources = deque()
for key in inDegree:
if inDegree[key] == 0:
sources.append(key)

# d. For each source, add it to the sortedOrder and subtract one from all of its
# children's in-degrees if a child's in-degree becomes zero, add it to sources queue
while sources:
if len(sources) > 1:
return False # more than one sources mean, there is more than one way to
# reconstruct the sequence
if originalSeq[len(sortedOrder)] != sources[0]:
return False # the next source(or number) is different from the original sequence

vertex = sources.popleft()
sortedOrder.append(vertex)
for child in graph[vertex]: # get the node's children to decrement their in-degrees
inDegree[child] -= 1
if inDegree[child] == 0:
sources.append(child)

# if sortedOrder's size is not equal to original sequence's size, there is no unique
# way to construct
return len(sortedOrder) == len(originalSeq)


def main():
sol = Solution()
print("Can construct: " +
str(sol.canConstruct([1, 2, 3, 4], [[1, 2], [2, 3], [3, 4]])))
print("Can construct: " +
str(sol.canConstruct([1, 2, 3, 4], [[1, 2], [2, 3], [2, 4]])))
print("Can construct: " +
str(sol.canConstruct([3, 1, 4, 2, 5], [[3, 1, 5], [1, 4, 2, 5]])))


main()

Time complexity

In step ‘d’, each number can become a source only once and each edge (a rule) will be accessed and removed once. Therefore, the time complexity of the above algorithm will be O(V+E), where ‘V’ is the count of distinct numbers and ‘E’ is the total number of the rules. Since, at most, each pair of numbers can give us one rule, we can conclude that the upper bound for the rules is O(N) where ‘N’ is the count of numbers in all sequences. So, we can say that the time complexity of our algorithm is O(V+N).

Space complexity

The space complexity will be O(V+N), since we are storing all of the rules for each number in an adjacency list.

*Problem Challenge 2

310. Minimum Height Trees Design Gurus Educative.io

Minimum Height Trees (hard)

We are given an undirected graph that has characteristics of a k-ary tree. In such a graph, we can choose any node as the root to make a k-ary tree. The root (or the tree) with the minimum height will be called Minimum Height Tree (MHT). There can be multiple MHTs for a graph. In this problem, we need to find all those roots which give us MHTs. Write a method to find all MHTs of the given graph and return a list of their roots.

Example 1:

1
2
3
4
Input: vertices: 5, Edges: [[0, 1], [1, 2], [1, 3], [2, 4]]
Output:[1, 2]
Explanation: Choosing '1' or '2' as roots give us MHTs. In the below diagram, we can see that the
height of the trees with roots '1' or '2' is three which is minimum.

Example 2:

1
2
3
4
Input: vertices: 4, Edges: [[0, 1], [0, 2], [2, 3]]
Output:[0, 2]
Explanation: Choosing '0' or '2' as roots give us MHTs. In the below diagram, we can see that the
height of the trees with roots '0' or '2' is three which is minimum.

Example 3:

1
2
Input: vertices: 4, Edges: [[0, 1], [1, 2], [1, 3]]
Output:[1]

Constraints:

  • 1 <= vertices <= 2 * 10^4
  • edges.length == n - 1
  • 0 <= ai, bi < n
  • ai != bi
  • All the pairs (ai, bi) are distinct.
  • The given input is guaranteed to be a tree and there will be no repeated edges.

Solution

The key intuition behind solving this problem is based on the definition of a tree’s height: the height of a tree is the number of edges on the longest path between the root and any leaf. So, an MHT is a tree that minimizes this longest path.

Imagine we have a longest path P in the tree. The path P has two ends; let’s call them end A and end B. Now, let’s consider what the root of an MHT can be:

  • If we select a root that is not on the path P, the height of the tree would at least be the length of P, because there would be a path from the root to either A or B that is longer than P (as it includes P plus some additional edges). Therefore, the root of the MHT must be on P.
  • If the root is on P, but not in the middle of P, then the height of the tree will be larger than if we selected the root in the middle of P, because the longest path will be from the root to either end of P. Therefore, the root of the MHT must be in the middle of P.

So, the problem of finding the MHT root(s) reduces to finding the middle node(s) of the longest path in the tree.

We can find the middle node(s) of the longest path by using an algorithm called ‘leaf pruining’. Let’s look into this.

From the above discussion, we can deduce that the leaves can’t give us MHT, hence, we can remove them from the graph and remove their edges too. Once we remove the leaves, we will have new leaves. Since these new leaves can’t give us MHT, we will repeat the process and remove them from the graph too. We will prune the leaves until we are left with one or two nodes which will be our answer and the roots for MHTs.

The algorithm works because when you trim leaves, you’re essentially trimming the ends of all the longest paths in the tree. If there’s one longest path, you’re trimming it from both ends, and if there are multiple longest paths, you’re trimming them all. Eventually, you’re left with one or two nodes, which must be the middle of the longest path(s), and those are the roots of the MHTs.

We can implement the above process using the topological sort. Any node with only one edge (i.e., a leaf) can be our source and, in a stepwise fashion, we can remove all sources from the graph to find new sources. We will repeat this process until we are left with one or two nodes in the graph, which will be our answer.

This Java algorithm is used to find the root nodes of the Minimum Height Trees (MHTs) in a graph. An MHT is a tree rooted at a specific node that minimizes the tree’s height. In a graph with ‘n’ nodes, there can be one or two MHTs.

Here’s a breakdown of the algorithm:

  1. It starts by checking if the number of nodes is less than or equal to 0, returning an empty list if true, as there would be no trees in the graph. If the graph contains only one node, it returns that single node as an MHT.
  2. Next, it initializes two HashMaps, inDegree to store the count of incoming edges for every vertex and graph as an adjacency list representation of the graph. It populates these HashMaps with initial values.
  3. The algorithm then constructs the graph. As it’s an undirected graph, each edge connects two nodes bi-directionally, meaning it adds a link for both nodes and increments the in-degrees of the two nodes.
  4. The algorithm finds all leaf nodes (nodes with only one in-degree) and adds them to a queue.
  5. Next, it iteratively removes the leaf nodes level by level, subtracting one from the in-degree of each leaf node’s children. If a child node becomes a leaf node as a result, it is added to the queue of leaf nodes. This process repeats until the graph has been reduced to one or two nodes, which represent the roots of the MHTs.
  6. Finally, the algorithm adds the remaining nodes in the leaves queue to minHeightTrees and returns this list. These nodes are the roots of the MHTs in the graph.

Code

Here is what our algorithm will look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
from collections import deque

class Solution:
def findTrees(self, nodes, edges):
if nodes <= 0:
return []

# with only one node, since its in-degrees will be 0, therefore, we need to handle it
# separately
if nodes == 1:
return [0]

# a. Initialize the graph
inDegree = {i: 0 for i in range(nodes)} # count of incoming edges
graph = {i: [] for i in range(nodes)} # adjacency list graph

# b. Build the graph
for edge in edges:
n1, n2 = edge[0], edge[1]
# since this is an undirected graph, therefore, add a link for both the nodes
graph[n1].append(n2)
graph[n2].append(n1)
# increment the in-degrees of both the nodes
inDegree[n1] += 1
inDegree[n2] += 1

# c. Find all leaves i.e., all nodes with 1 in-degrees
leaves = deque()
for key in inDegree:
if inDegree[key] == 1:
leaves.append(key)

# d. Remove leaves level by level and subtract each leave's children's in-degrees.
# Repeat this until we are left with 1 or 2 nodes, which will be our answer.
# Any node that has already been a leaf cannot be the root of a minimum height tree,
# because its adjacent non-leaf node will always be a better candidate.
totalNodes = nodes
while totalNodes > 2:
leavesSize = len(leaves)
totalNodes -= leavesSize
for i in range(0, leavesSize):
vertex = leaves.popleft()
# get the node's children to decrement their in-degrees
for child in graph[vertex]:
inDegree[child] -= 1
if inDegree[child] == 1:
leaves.append(child)

return list(leaves)


def main():
sol = Solution()
print("Roots of MHTs: " +
str(sol.findTrees(5, [[0, 1], [1, 2], [1, 3], [2, 4]])))
print("Roots of MHTs: " +
str(sol.findTrees(4, [[0, 1], [0, 2], [2, 3]])))
print("Roots of MHTs: " +
str(sol.findTrees(4, [[1, 2], [1, 3]])))


main()

Time complexity

In step ‘d’, each node can become a source only once and each edge will be accessed and removed once. Therefore, the time complexity of the above algorithm will be O(V+E), where ‘V’ is the total nodes and ‘E’ is the total number of the edges.

Space complexity

The space complexity will be O(V+E), since we are storing all of the edges for each node in an adjacency list.

CATALOG
  1. 1. Introduction
  2. 2. *Topological Sort (medium)
    1. 2.1. Problem Statement
    2. 2.2. Solution
      1. 2.2.1. Code
      2. 2.2.2. Time Complexity
      3. 2.2.3. Space Complexity
    3. 2.3. Similar Problems
  3. 3. Tasks Scheduling (medium)
    1. 3.1. Problem Statement
    2. 3.2. Solution
      1. 3.2.1. Code
      2. 3.2.2. Time complexity
      3. 3.2.3. Space complexity
    3. 3.3. Similar Problems
  4. 4. Tasks Scheduling Order (medium)
    1. 4.1. Problem Statement
    2. 4.2. Solution
      1. 4.2.1. Code
      2. 4.2.2. Time complexity
      3. 4.2.3. Space complexity
    3. 4.3. Similar Problems
  5. 5. *All Tasks Scheduling Orders (hard)
    1. 5.1. Problem Statement
    2. 5.2. Solution
      1. 5.2.1. Code
      2. 5.2.2. Time and Space Complexity
  6. 6. *Alien Dictionary (hard)
    1. 6.1. Problem Statement
    2. 6.2. Solution
      1. 6.2.1. Code
      2. 6.2.2. Time complexity
      3. 6.2.3. Space complexity
  7. 7. *Problem Challenge 1
    1. 7.1. Reconstructing a Sequence (hard)
    2. 7.2. Try it yourself
    3. 7.3. Solution
      1. 7.3.1. Code
      2. 7.3.2. Time complexity
      3. 7.3.3. Space complexity
  8. 8. *Problem Challenge 2
    1. 8.1. Minimum Height Trees (hard)
    2. 8.2. Solution
    3. 8.3. Code
      1. 8.3.1. Time complexity
      2. 8.3.2. Space complexity