Union-Find (known in Chinese as 并查集) is an algorithm for solving the Dynamic Connectivity problem. The author uses it as a worked example of how to analyze and improve an algorithm. This section covers three implementations: Quick Find, Quick Union, and Weighted Quick Union.
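Dynamic Connectivity
In graph theory, dynamic connectivity refers to a data structure that dynamically maintains information about which vertices of a graph are connected into groups.
Put simply, it tracks whether the points in a graph are connected to each other and how many groups they form once connected. Points that are connected together form something like a social circle, which we call a component. Each component has its own characteristics; for example, all members of a component carry the same label.
Circles are easy to relate to. In a social network, people who know each other form their own circles, and in computing terms "knowing each other" means being "connected". Circles change: today you meet someone new, tomorrow you fall out with someone else. Because these changes are dynamic, we arrive at the dynamic connectivity data structure and problem.
A common application is a social network such as LinkedIn deciding whether one user knows another: if you know user A and user A knows user B, then you are considered connected to user B and can see user B's information. Computer networks pose a similar question, namely whether two nodes in a network are connected.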
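Dynamic Connectivity in Computational Terms
Given a pair of integers (p, q): if p and q are not yet connected, connect them. Once p and q are connected, we say that p and q are in the same component.
"Is connected to" is an equivalence relation, so the following properties hold:
- Reflexive: p is connected to itself
- Symmetric: if p is connected to q, then q is connected to p
- Transitive: if p is connected to q and q is connected to r, then p is connected to r
A network contains many such integer pairs (p, q). Assuming the network holds N nodes, we can define an array of the integers 0 to N-1, from which p and q take their values. The operations we may need are:
- Determine whether p and q are connected
- If they are not connected, connect p and q; if they already are, do nothing
- Find which component (e.g. circle) p or q belongs to
The key question here is how to determine that p and q are in the same component. This means that each component needs some identifying property, which the algorithms below take into account.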
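The operations listed above can be illustrated with a deliberately naive model that stores every component as a Python set. This is only a sketch of the specification, not one of the three algorithms in this section; the class name `NaiveComponents` and its `groups` attribute are hypothetical names chosen for this example.

```python
class NaiveComponents(object):
    """Naive dynamic connectivity: each component is stored as a set of nodes."""

    def __init__(self, n):
        # Initially each of the n nodes is alone in its own component.
        self.groups = [{i} for i in range(n)]

    def find(self, p):
        # Return the set (component) that contains p.
        for group in self.groups:
            if p in group:
                return group
        raise ValueError("unknown node %r" % (p,))

    def connected(self, p, q):
        # Two nodes are connected when they live in the very same set object.
        return self.find(p) is self.find(q)

    def union(self, p, q):
        gp, gq = self.find(p), self.find(q)
        if gp is gq:
            return  # already connected: nothing to do
        gp.update(gq)  # merge q's component into p's
        self.groups = [g for g in self.groups if g is not gq]


nc = NaiveComponents(5)
nc.union(0, 1)
nc.union(1, 2)
print(nc.connected(0, 0))                        # reflexive: True
print(nc.connected(0, 1) == nc.connected(1, 0))  # symmetric: True
print(nc.connected(0, 2))                        # transitive: True
print(len(nc.groups))                            # 3 components: {0,1,2}, {3}, {4}
```

Scanning every set in find is slow, which is exactly why the algorithms below represent a component by an integer identifier instead.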
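Describing Dynamic Connectivity with Union-Find
The Union-Find algorithm provides a method for each of the operations mentioned above:
connected(): determines whether p and q are connected. It calls find(p) and find(q); if both belong to the same component, p and q are connected and connected() returns true.
union(): connects p and q if they are not yet connected; if they already are, it does nothing.
find(): finds which component (e.g. circle) p or q belongs to. The return value is an integer that serves as the component identifier.
count(): returns the number of components.
The API from Algorithms, 4th Edition: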
class UF:
    def __init__(self, N):         # initialize N sites with integer names (0 to N-1)
        pass
    def union(self, p, q):         # add a connection between p and q
        pass
    def find(self, p):             # return the component identifier for p
        pass
    def connected(self, p, q):     # return True if p and q are in the same component
        pass
    def count(self):               # return the number of components
        pass
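Union-Find Algorithms and Implementations
From the description above, the crucial point is how to determine each component's identifier: once that is settled, we can tell whether two nodes are connected.
So what should serve as the identifier that distinguishes one component from another?
The simplest approach is to give every node an ID. When two nodes are connected, their IDs are set to the same value, so the two nodes belong to the same component. Every component in the network then has a unique ID, and nodes p and q are connected exactly when their IDs are equal. We store the node IDs in an array, so find() can return an ID quickly, which is why our first algorithm is called QuickFind.
The QuickFind Algorithm
In QuickFind, find is simple. The point to watch in union(p, q) is that the id of every node connected to p must be set to q's current id, so that p's component and q's component merge into a single component. (Note: equally, the ids of all nodes connected to q could be set to p's id.)
At the start, no two nodes are connected. We assume the nodes are represented by the integers 0 to N-1, and each node's id is initially the node itself.
Code: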
# -*- coding: utf-8 -*-
class QuickFind(object):
    def __init__(self, n):
        self.count = n            # number of components
        self.id = list(range(n))  # per-instance list: id[i] is the component id of node i
    def connected(self, p, q):
        return self.find(p) == self.find(q)
    def find(self, p):
        return self.id[p]
    def union(self, p, q):
        idp = self.find(p)
        idq = self.find(q)
        if idp == idq:
            return  # already connected, nothing to do
        for i in range(len(self.id)):
            if self.id[i] == idp:  # set the id of every node in p's component to q's current id
                self.id[i] = idq
        self.count -= 1
Our test client code is as follows:
# -*- coding: utf-8 -*-
import quickfind
qf = quickfind.QuickFind(10)
print("initial id list is %s" % ",".join(str(x) for x in qf.id))
pairs = [
    (4,3),
    (3,8),
    (6,5),
    (9,4),
    (2,1),
    (8,9),
    (5,0),
    (7,2),
    (6,1),
    (1,0),
    (6,7)
]
for p, q in pairs:
    qf.union(p, q)
    print("%d and %d is connected? %s" % (p, q, str(qf.connected(p, q))))
print("final id list is %s" % ",".join(str(x) for x in qf.id))
print("count of components is: %d" % qf.count)
Output:
initial id list is 0,1,2,3,4,5,6,7,8,9
4 and 3 is connected? True
3 and 8 is connected? True
6 and 5 is connected? True
9 and 4 is connected? True
2 and 1 is connected? True
8 and 9 is connected? True
5 and 0 is connected? True
7 and 2 is connected? True
6 and 1 is connected? True
1 and 0 is connected? True
6 and 7 is connected? True
final id list is 1,1,1,8,8,1,1,1,8,8
count of components is: 2
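To read the final id list as components, the nodes can be grouped by their id. The snippet below is a small sketch that does this for the list printed above; the helper name `components` is hypothetical and not part of the QuickFind class.

```python
from collections import defaultdict

ids = [1, 1, 1, 8, 8, 1, 1, 1, 8, 8]  # the final id list printed above


def components(ids):
    """Group node numbers by their component id."""
    groups = defaultdict(list)
    for node, cid in enumerate(ids):
        groups[cid].append(node)
    return dict(groups)


print(components(ids))  # {1: [0, 1, 2, 5, 6, 7], 8: [3, 4, 8, 9]}
```

The two keys correspond to the two components reported by count.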
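The figure below, from Algorithms, 4th Edition, is provided for reference:
Analysis of QuickFind:
find returns an array entry immediately, but every union that actually merges two components has to scan the entire array. When the array is large (for example, a huge social network) and many pairs need to be connected, QuickFind becomes very expensive: M merging unions on N nodes cost on the order of M*N array accesses. So we need to improve the union method.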
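The cost of union can be made concrete by counting how many array entries it inspects. The sketch below is a stripped-down re-implementation for measurement only; the function name `quick_find_union_cost` and the `examined` counter are hypothetical, and only the comparisons inside the scanning loop are counted.

```python
def quick_find_union_cost(n, pairs):
    """Run QuickFind on n nodes; return (ids, entries examined by union)."""
    ids = list(range(n))
    examined = 0
    for p, q in pairs:
        idp, idq = ids[p], ids[q]
        if idp == idq:
            continue  # already connected, no scan needed
        for i in range(n):
            examined += 1  # a merging union looks at all n entries
            if ids[i] == idp:
                ids[i] = idq
    return ids, examined


n = 1000
pairs = [(i, i + 1) for i in range(n - 1)]  # n - 1 unions, each one merges
ids, examined = quick_find_union_cost(n, pairs)
print(examined)  # (n - 1) * n = 999000
```

With N nodes and N-1 merging unions the count is N(N-1), i.e. quadratic in N.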
The QuickUnion Algorithm
In QuickFind, union may have to traverse the entire array, which hurts performance. Is there a way to avoid traversing the whole array while still guaranteeing that all nodes in a component share a common property? Yes: a tree. All nodes of a tree share a common root, and each tree has exactly one root, so each tree can represent one component. For union(p, q) we simply attach the tree containing p to the root of the tree containing q, and p and q are then in the same tree.
The improved algorithm is QuickUnion. We again use an id array, but here id[i] holds the parent of node i in its tree, and a root is a node whose id is itself.
find(p): returns the root of the tree containing p
union(p,q): sets the id of the root of p's tree to the root of q's tree
Implementation:
# -*- coding: utf-8 -*-
class QuickUnion(object):
    def __init__(self, n):
        self.count = n            # number of components
        self.id = list(range(n))  # per-instance list: id[i] is the parent of node i
    def connected(self, p, q):
        return self.find(p) == self.find(q)
    def find(self, p):
        while p != self.id[p]:  # follow parent links until reaching a root
            p = self.id[p]
        return p
    def union(self, p, q):
        idp = self.find(p)
        idq = self.find(q)
        if idp == idq:
            return  # already in the same tree
        self.id[idp] = idq  # attach the root of p's tree to the root of q's tree
        self.count -= 1
Similar test client code:
# -*- coding: utf-8 -*-
import quickunion
qf = quickunion.QuickUnion(10)
print("initial id list is %s" % ",".join(str(x) for x in qf.id))
pairs = [
    (4,3),
    (3,8),
    (6,5),
    (9,4),
    (2,1),
    (8,9),
    (5,0),
    (7,2),
    (6,1),
    (1,0),
    (6,7)
]
for p, q in pairs:
    qf.union(p, q)
    print("%d and %d is connected? %s" % (p, q, str(qf.connected(p, q))))
print("final root list is %s" % ",".join(str(x) for x in qf.id))
print("count of components is: %d" % qf.count)
Output:
initial id list is 0,1,2,3,4,5,6,7,8,9
4 and 3 is connected? True
3 and 8 is connected? True
6 and 5 is connected? True
9 and 4 is connected? True
2 and 1 is connected? True
8 and 9 is connected? True
5 and 0 is connected? True
7 and 2 is connected? True
6 and 1 is connected? True
1 and 0 is connected? True
6 and 7 is connected? True
final root list is 1,1,1,8,3,0,5,1,8,8
count of components is: 2
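Although the output labels it a "root list", each entry of the final list is the node's parent link, so a root is only reached by following the links. The sketch below walks the list printed above; the helper name `root_and_depth` is hypothetical.

```python
parent = [1, 1, 1, 8, 3, 0, 5, 1, 8, 8]  # the final list printed above


def root_and_depth(parent, p):
    """Follow parent links from p; return (root, number of links followed)."""
    depth = 0
    while p != parent[p]:
        p = parent[p]
        depth += 1
    return p, depth


roots = [root_and_depth(parent, i)[0] for i in range(len(parent))]
depths = [root_and_depth(parent, i)[1] for i in range(len(parent))]
print(roots)   # [1, 1, 1, 8, 8, 1, 1, 1, 8, 8]
print(depths)  # [1, 0, 1, 1, 2, 2, 3, 1, 0, 1]
```

The roots match the final id list that QuickFind produced for the same pairs, and the depths show how many links a find has to follow for each node (node 6 needs three).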
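The figure from Algorithms, 4th Edition is provided as an aid to understanding:
Analysis of QuickUnion:
union itself is now fast, but find is slower than in QuickFind. In the worst case, shown in the figure below, the tree degenerates into a chain of N nodes, so a single find has to follow up to N-1 links, which is on the order of N array accesses. union calls find twice, so one union is also linear in the worst case. Processing N pairs that keep extending such a chain therefore costs on the order of 1+2+...+N = N(N+1)/2 array accesses, which is quadratic in N.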
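The worst case can be checked empirically. The sketch below feeds QuickUnion the pairs (0,1), (0,2), ..., (0,N-1), which make the tree grow into a single chain, and counts how many parent links find follows in total. The function name `quick_union_cost` and the counting convention (links followed, not exact array accesses) are assumptions of this sketch.

```python
def quick_union_cost(n, pairs):
    """Run QuickUnion on n nodes; return (parent, total links followed by find)."""
    parent = list(range(n))
    followed = 0
    for p, q in pairs:
        rp = p
        while rp != parent[rp]:  # find(p)
            rp = parent[rp]
            followed += 1
        rq = q
        while rq != parent[rq]:  # find(q)
            rq = parent[rq]
            followed += 1
        if rp != rq:
            parent[rp] = rq  # attach the root of p's tree to the root of q's tree
    return parent, followed


n = 1000
worst = [(0, i) for i in range(1, n)]  # each union makes the chain one link longer
parent, followed = quick_union_cost(n, worst)
print(followed)  # 0 + 1 + ... + (n - 2) = 498501
```

Doubling n roughly quadruples the count, which is the quadratic growth described above.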
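The Weighted Quick Union Algorithm
In QuickUnion, union simply joins the two trees without considering their sizes, which is what allows the worst case to occur. An improvement is to compare the sizes (node counts) of the two trees before the union and attach the smaller tree to the larger one, so that the merged tree does not become very deep. With this rule, the depth of any node is at most log2 N.
An example is shown below:
To compare tree sizes we introduce a new array, the size array (sz in the code), which stores the size of each tree. At initialization every entry of the size array is set to 1.
Code: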
# -*- coding: utf-8 -*-
class WeightedQuickUnion(object):
    def __init__(self, n):
        self.count = n            # number of components
        self.id = list(range(n))  # per-instance list: id[i] is the parent of node i
        self.sz = [1] * n         # initial size of each tree is 1
    def connected(self, p, q):
        return self.find(p) == self.find(q)
    def find(self, p):
        while p != self.id[p]:  # follow parent links until reaching a root
            p = self.id[p]
        return p
    def union(self, p, q):
        idp = self.find(p)
        print("id of %d is: %d" % (p, idp))
        idq = self.find(q)
        print("id of %d is: %d" % (q, idq))
        if idp == idq:
            return  # already in the same tree
        print("Before Connected: tree size of %d's id is: %d" % (p, self.sz[idp]))
        print("Before Connected: tree size of %d's id is: %d" % (q, self.sz[idq]))
        if self.sz[idp] < self.sz[idq]:
            # p's tree is smaller: attach it below the root of q's tree
            print("tree size of %d's id is smaller than %d's id" % (p, q))
            print("id of %d's id (%d) is set to %d" % (p, idp, idq))
            self.id[idp] = idq
            print("tree size of %d's id is incremented by tree size of %d's id" % (q, p))
            self.sz[idq] += self.sz[idp]
        else:
            # q's tree is smaller or equal: attach it below the root of p's tree
            print("tree size of %d's id is larger than or equal with %d's id" % (p, q))
            print("id of %d's id (%d) is set to %d" % (q, idq, idp))
            self.id[idq] = idp
            print("tree size of %d's id is incremented by tree size of %d's id" % (p, q))
            self.sz[idp] += self.sz[idq]
        print("After Connected: tree size of %d's id is: %d" % (p, self.sz[idp]))
        print("After Connected: tree size of %d's id is: %d" % (q, self.sz[idq]))
        self.count -= 1
Test client code:
# -*- coding: utf-8 -*-
import weightedquickunion
qf = weightedquickunion.WeightedQuickUnion(10)
print("initial id list is %s" % ",".join(str(x) for x in qf.id))
pairs = [
    (4,3),
    (3,8),
    (6,5),
    (9,4),
    (2,1),
    (8,9),
    (5,0),
    (7,2),
    (6,1),
    (1,0),
    (6,7)
]
for p, q in pairs:
    print("." * 10 + "unioning %d and %d" % (p, q) + "." * 10)
    qf.union(p, q)
    print("%d and %d is connected? %s" % (p, q, str(qf.connected(p, q))))
print("final id list is %s" % ",".join(str(x) for x in qf.id))
print("count of components is: %d" % qf.count)
Output:
initial id list is 0,1,2,3,4,5,6,7,8,9
..........unioning 4 and 3..........
id of 4 is: 4
id of 3 is: 3
Before Connected: tree size of 4's id is: 1
Before Connected: tree size of 3's id is: 1
tree size of 4's id is larger than or equal with 3's id
id of 3's id (3) is set to 4
tree size of 4's id is incremented by tree size of 3's id
After Connected: tree size of 4's id is: 2
After Connected: tree size of 3's id is: 1
4 and 3 is connected? True
..........unioning 3 and 8..........
id of 3 is: 4
id of 8 is: 8
Before Connected: tree size of 3's id is: 2
Before Connected: tree size of 8's id is: 1
tree size of 3's id is larger than or equal with 8's id
id of 8's id (8) is set to 4
tree size of 3's id is incremented by tree size of 8's id
After Connected: tree size of 3's id is: 3
After Connected: tree size of 8's id is: 1
3 and 8 is connected? True
..........unioning 6 and 5..........
id of 6 is: 6
id of 5 is: 5
Before Connected: tree size of 6's id is: 1
Before Connected: tree size of 5's id is: 1
tree size of 6's id is larger than or equal with 5's id
id of 5's id (5) is set to 6
tree size of 6's id is incremented by tree size of 5's id
After Connected: tree size of 6's id is: 2
After Connected: tree size of 5's id is: 1
6 and 5 is connected? True
..........unioning 9 and 4..........
id of 9 is: 9
id of 4 is: 4
Before Connected: tree size of 9's id is: 1
Before Connected: tree size of 4's id is: 3
tree size of 9's id is smaller than 4's id
id of 9's id (9) is set to 4
tree size of 4's id is incremented by tree size of 9's id
After Connected: tree size of 9's id is: 1
After Connected: tree size of 4's id is: 4
9 and 4 is connected? True
..........unioning 2 and 1..........
id of 2 is: 2
id of 1 is: 1
Before Connected: tree size of 2's id is: 1
Before Connected: tree size of 1's id is: 1
tree size of 2's id is larger than or equal with 1's id
id of 1's id (1) is set to 2
tree size of 2's id is incremented by tree size of 1's id
After Connected: tree size of 2's id is: 2
After Connected: tree size of 1's id is: 1
2 and 1 is connected? True
..........unioning 8 and 9..........
id of 8 is: 4
id of 9 is: 4
8 and 9 is connected? True
..........unioning 5 and 0..........
id of 5 is: 6
id of 0 is: 0
Before Connected: tree size of 5's id is: 2
Before Connected: tree size of 0's id is: 1
tree size of 5's id is larger than or equal with 0's id
id of 0's id (0) is set to 6
tree size of 5's id is incremented by tree size of 0's id
After Connected: tree size of 5's id is: 3
After Connected: tree size of 0's id is: 1
5 and 0 is connected? True
..........unioning 7 and 2..........
id of 7 is: 7
id of 2 is: 2
Before Connected: tree size of 7's id is: 1
Before Connected: tree size of 2's id is: 2
tree size of 7's id is smaller than 2's id
id of 7's id (7) is set to 2
tree size of 2's id is incremented by tree size of 7's id
After Connected: tree size of 7's id is: 1
After Connected: tree size of 2's id is: 3
7 and 2 is connected? True
..........unioning 6 and 1..........
id of 6 is: 6
id of 1 is: 2
Before Connected: tree size of 6's id is: 3
Before Connected: tree size of 1's id is: 3
tree size of 6's id is larger than or equal with 1's id
id of 1's id (2) is set to 6
tree size of 6's id is incremented by tree size of 1's id
After Connected: tree size of 6's id is: 6
After Connected: tree size of 1's id is: 3
6 and 1 is connected? True
..........unioning 1 and 0..........
id of 1 is: 6
id of 0 is: 6
1 and 0 is connected? True
..........unioning 6 and 7..........
id of 6 is: 6
id of 7 is: 6
6 and 7 is connected? True
final id list is 6,2,6,4,4,6,6,2,4,4
count of components is: 2
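The trace above is verbose, so here is a compact sketch of the same weighting rule without the debug output, together with an empirical check of tree depth. The class name `CompactWeightedQuickUnion`, the `depth` method, and the pairing pattern used to stress it are assumptions made for this sketch, not code from the book.

```python
class CompactWeightedQuickUnion(object):
    def __init__(self, n):
        self.count = n            # number of components
        self.id = list(range(n))  # id[i] is the parent of node i
        self.sz = [1] * n         # sz[r] is the size of the tree rooted at r

    def find(self, p):
        while p != self.id[p]:
            p = self.id[p]
        return p

    def union(self, p, q):
        rp, rq = self.find(p), self.find(q)
        if rp == rq:
            return
        if self.sz[rp] < self.sz[rq]:
            rp, rq = rq, rp        # make rp the root of the larger tree
        self.id[rq] = rp           # attach the smaller tree below the larger root
        self.sz[rp] += self.sz[rq]
        self.count -= 1

    def depth(self, p):
        # Number of links between p and the root of its tree.
        d = 0
        while p != self.id[p]:
            p = self.id[p]
            d += 1
        return d


# Same pairs as the test client above: the parent list matches the trace.
doc = CompactWeightedQuickUnion(10)
for p, q in [(4,3), (3,8), (6,5), (9,4), (2,1), (8,9), (5,0), (7,2), (6,1), (1,0), (6,7)]:
    doc.union(p, q)
print(doc.id)  # [6, 2, 6, 4, 4, 6, 6, 2, 4, 4]

# Always merging two equal-sized trees is the input that makes trees deepest.
n = 1024
uf = CompactWeightedQuickUnion(n)
step = 1
while step < n:
    for i in range(0, n, 2 * step):
        uf.union(i, i + step)
    step *= 2
deepest = max(uf.depth(i) for i in range(n))
print(deepest)  # 10, i.e. log2(1024)
```

Even on this unfavorable input the deepest node is only 10 links from its root for 1024 nodes, compared with a chain of up to N-1 links in plain QuickUnion.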
The figure from Algorithms, 4th Edition: