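1. Distributed locks
When developing a single-node application, we usually rely on synchronized or Lock to synchronize code across threads. Once the application is deployed in a distributed fashion, however, we need a higher-level lock mechanism to synchronize code across processes. That brings us to the distributed lock implementations in common use today, shown in the figure below:
Each of these implementations has its own strengths and weaknesses; the comparison follows: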
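2. Introduction to Curator's distributed locks
Today we focus on the ZooKeeper-based distributed lock implementation, Curator. As you get to know Curator, you will be pleasantly surprised to find that it offers much more than distributed locks. I won't cover that here and will write it up in a separate post; interested readers can download the project and read through it themselves: apache/curator
First, let's look at the lock recipes Curator offers:
- InterProcessMutex: a distributed reentrant exclusive lock
- InterProcessSemaphoreMutex: a distributed exclusive lock
- InterProcessReadWriteLock: a distributed read-write lock
- InterProcessMultiLock: a container that manages multiple locks as a single entity
Next, taking InterProcessMutex as the example, we walk through how this distributed reentrant exclusive lock is implemented.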
3. Tracing the InterProcessMutex code
I. Acquiring the lock
1). Instantiating InterProcessMutex:
// Source: InterProcessMutex.java
/**
* @param client client
* @param path the path to lock
*/
public InterProcessMutex(CuratorFramework client, String path)
{
this(client, path, new StandardLockInternalsDriver());
}
/**
* @param client client
* @param path the path to lock
* @param driver lock driver
*/
public InterProcessMutex(CuratorFramework client, String path, LockInternalsDriver driver)
{
this(client, path, LOCK_NAME, 1, driver);
}
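Parameters shared by both constructors:
- client: the ZooKeeper client implemented by Curator
- path: the path to lock in ZooKeeper, i.e. the parent of the ephemeral nodes created later
As the two constructors above show, the first simply delegates to the second, passing in a default StandardLockInternalsDriver object, the standard lock driver class (its role is described later). In other words, InterProcessMutex also lets you pass in a custom lock driver class to extend its behavior.
// Source: InterProcessMutex.java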
InterProcessMutex(CuratorFramework client, String path, String lockName, int maxLeases, LockInternalsDriver driver)
{
basePath = PathUtils.validatePath(path);
internals = new LockInternals(client, driver, path, lockName, maxLeases);
}
// Source: LockInternals.java
LockInternals(CuratorFramework client, LockInternalsDriver driver, String path, String lockName, int maxLeases)
{
this.driver = driver;
this.lockName = lockName;
this.maxLeases = maxLeases;
this.client = client.newWatcherRemoveCuratorFramework();
this.basePath = PathUtils.validatePath(path);
this.path = ZKPaths.makePath(path, lockName);
}
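Following the constructor code to the end, it does two things: it validates the path argument and instantiates a LockInternals object.
2). The acquire method:
Once the InterProcessMutex object has been instantiated, we call acquire() to try to take the lock:
// Source: InterProcessMutex.java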
/**
* Acquire the mutex - blocking until it's available. Note: the same thread
* can call acquire re-entrantly. Each call to acquire must be balanced by a call
* to {@link #release()}
*
* @throws Exception ZK errors, connection interruptions
*/
@Override
public void acquire() throws Exception
{
if ( !internalLock(-1, null) )
{
throw new IOException("Lost connection while trying to acquire lock: " + basePath);
}
}
/**
* Acquire the mutex - blocks until it's available or the given time expires. Note: the same thread
* can call acquire re-entrantly. Each call to acquire that returns true must be balanced by a call
* to {@link #release()}
*
* @param time time to wait
* @param unit time unit
* @return true if the mutex was acquired, false if not
* @throws Exception ZK errors, connection interruptions
*/
@Override
public boolean acquire(long time, TimeUnit unit) throws Exception
{
return internalLock(time, unit);
}
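- acquire(): takes no arguments. The call blocks until the lock is obtained, or throws an exception if the ZooKeeper connection is lost.
- acquire(long time, TimeUnit unit): takes a timeout and its unit. If the call is still blocked when the timeout elapses, it returns false.
Compare the two and pick whichever suits your business logic. In general I recommend the latter: passing a timeout avoids a build-up of ephemeral nodes and blocked threads.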
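The timed variant pairs naturally with a try/finally block, so that every successful acquire is balanced by exactly one release. The following is a minimal sketch of that pattern, not Curator code: `TimedLock`, `TimedLockRunner` and `localLock()` are hypothetical names introduced here, and the interface only mirrors the `acquire(long, TimeUnit)` and `release()` signatures quoted above. The stand-in lock is backed by a JDK `ReentrantLock` so the sketch runs without ZooKeeper.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical minimal view of a lock; mirrors the two method signatures quoted above. */
interface TimedLock {
    boolean acquire(long time, TimeUnit unit) throws Exception;
    void release() throws Exception;
}

final class TimedLockRunner {
    /**
     * Runs the task only if the lock is obtained within the timeout.
     * Returns true when the task ran, false when the wait timed out.
     */
    static boolean runWithLock(TimedLock lock, long time, TimeUnit unit, Runnable task) throws Exception {
        if (!lock.acquire(time, unit)) {
            return false; // timed out: we never held the lock, so there is nothing to release
        }
        try {
            task.run();
            return true;
        } finally {
            lock.release(); // balance the successful acquire, even if the task throws
        }
    }

    /** Local stand-in (assumption for illustration) backed by a JDK lock instead of ZooKeeper. */
    static TimedLock localLock() {
        ReentrantLock delegate = new ReentrantLock();
        return new TimedLock() {
            @Override
            public boolean acquire(long time, TimeUnit unit) throws InterruptedException {
                return delegate.tryLock(time, unit);
            }

            @Override
            public void release() {
                delegate.unlock();
            }
        };
    }
}
```

With Curator, a small adapter could presumably delegate these two methods to an InterProcessMutex instance; what to do when `runWithLock` returns false (retry, skip, report) is a business decision.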
3). Lock reentrancy:
// Source: InterProcessMutex.java
private boolean internalLock(long time, TimeUnit unit) throws Exception
{
/*
Note on concurrency: a given lockData instance
can be only acted on by a single thread so locking isn't necessary
*/
Thread currentThread = Thread.currentThread();
LockData lockData = threadData.get(currentThread);
if ( lockData != null )
{
// re-entering
lockData.lockCount.incrementAndGet();
return true;
}
String lockPath = internals.attemptLock(time, unit, getLockNodeBytes());
if ( lockPath != null )
{
LockData newLockData = new LockData(currentThread, lockPath);
threadData.put(currentThread, newLockData);
return true;
}
return false;
}
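This code is what makes the lock reentrant. Each InterProcessMutex instance holds a threadData object of type ConcurrentMap, keyed by the Thread object with LockData as the value. The method checks whether threadData has an entry for the current thread. If it does, the thread is re-entering the lock, so lockData's lockCount is incremented; if it does not, the thread competes for the lock.
When internals.attemptLock returns lockPath != null, the thread has successfully acquired the lock, so a new LockData object is created and stored in threadData.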
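The per-thread bookkeeping can be isolated from ZooKeeper entirely. Below is a simplified model of it, under the assumption that the real lock attempt is replaced by a `Supplier<String>` returning a node path (or null on failure). `ReentrantBookkeeping`, `holdCount()` and the supplier are names invented for this sketch; only `threadData`, `LockData` and `lockCount` follow the source above.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class ReentrantBookkeeping {
    /** Per-thread record: the lock node path and how many times the thread has entered. */
    private static final class LockData {
        final String lockPath;
        final AtomicInteger lockCount = new AtomicInteger(1);

        LockData(String lockPath) {
            this.lockPath = lockPath;
        }
    }

    private final ConcurrentMap<Thread, LockData> threadData = new ConcurrentHashMap<>();
    private final Supplier<String> attemptLock; // hypothetical stand-in for internals.attemptLock(...)

    ReentrantBookkeeping(Supplier<String> attemptLock) {
        this.attemptLock = attemptLock;
    }

    boolean acquire() {
        Thread current = Thread.currentThread();
        LockData data = threadData.get(current);
        if (data != null) {
            data.lockCount.incrementAndGet(); // re-entering: no new lock attempt is made
            return true;
        }
        String path = attemptLock.get();
        if (path == null) {
            return false;
        }
        threadData.put(current, new LockData(path));
        return true;
    }

    /** Returns the node path to delete once the last hold is released, or null while still held. */
    String release() {
        Thread current = Thread.currentThread();
        LockData data = threadData.get(current);
        if (data == null) {
            throw new IllegalMonitorStateException("You do not own the lock");
        }
        if (data.lockCount.decrementAndGet() > 0) {
            return null; // still held by an outer acquire
        }
        threadData.remove(current);
        return data.lockPath;
    }

    int holdCount() {
        LockData data = threadData.get(Thread.currentThread());
        return data == null ? 0 : data.lockCount.get();
    }
}
```

Because each thread only ever touches its own map entry, the counter needs no extra locking, which matches the concurrency note in the quoted source.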
4). Competing for the lock:
Now for the main event: attemptLock is the core of it all. Straight to the code:
// Source: LockInternals.java
String attemptLock(long time, TimeUnit unit, byte[] lockNodeBytes) throws Exception
{
final long startMillis = System.currentTimeMillis();
final Long millisToWait = (unit != null) ? unit.toMillis(time) : null;
final byte[] localLockNodeBytes = (revocable.get() != null) ? new byte[0] : lockNodeBytes;
int retryCount = 0;
String ourPath = null;
boolean hasTheLock = false;
boolean isDone = false;
while ( !isDone )
{
isDone = true;
try
{
ourPath = driver.createsTheLock(client, path, localLockNodeBytes);
hasTheLock = internalLockLoop(startMillis, millisToWait, ourPath);
}
catch ( KeeperException.NoNodeException e )
{
// gets thrown by StandardLockInternalsDriver when it can't find the lock node
// this can happen when the session expires, etc. So, if the retry allows, just try it all again
if ( client.getZookeeperClient().getRetryPolicy().allowRetry(retryCount++, System.currentTimeMillis() - startMillis, RetryLoop.getDefaultRetrySleeper()) )
{
isDone = false;
}
else
{
throw e;
}
}
}
if ( hasTheLock )
{
return ourPath;
}
return null;
}
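Note three points here:
- 1. The while loop
Normally this loop ends after a single pass. If a NoNodeException occurs, however, the lock acquisition is retried a limited number of times, as allowed by the ZooKeeper client's retry policy.
- 2. driver.createsTheLock
As the name suggests, the driver's createsTheLock method creates the lock: it creates an ephemeral sequential node under the specified ZooKeeper path. Note: at this point a node has merely been created; the thread does not yet hold the lock.
// Source: StandardLockInternalsDriver.java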
@Override
public String createsTheLock(CuratorFramework client, String path, byte[] lockNodeBytes) throws Exception
{
String ourPath;
if ( lockNodeBytes != null )
{
ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path, lockNodeBytes);
}
else
{
ourPath = client.create().creatingParentContainersIfNeeded().withProtection().withMode(CreateMode.EPHEMERAL_SEQUENTIAL).forPath(path);
}
return ourPath;
}
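To make the naming of sequential nodes concrete, here is a small in-memory model, not ZooKeeper itself. It assumes what ZooKeeper documents for sequential nodes (a 10-digit zero-padded counter appended to the requested name) and a prefix of the form `_c_<uuid>-lock-`, which is my reading of what Curator's withProtection() together with the lock name produces. `SequentialNodeModel` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

final class SequentialNodeModel {
    private final List<String> children = new ArrayList<>();
    private int nextSequence = 0;

    /** Adds a child named prefix + 10-digit zero-padded sequence number and returns its name. */
    synchronized String createSequential(String prefix) {
        String name = prefix + String.format("%010d", nextSequence++);
        children.add(name);
        return name; // creating the node does not mean the caller holds the lock
    }

    /** Prefix shaped like a protected lock node name (assumed format): _c_<uuid>-lock- */
    static String protectedPrefix() {
        return "_c_" + UUID.randomUUID() + "-lock-";
    }

    /** Returns a snapshot of the child names in creation order. */
    synchronized List<String> children() {
        return new ArrayList<>(children);
    }
}
```

Each contender gets a different UUID, so only the trailing counter reflects creation order.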
- 3. internalLockLoop
This determines whether we can hold the lock. If not, the thread calls wait and waits to be woken.
// Source: LockInternals.java
private boolean internalLockLoop(long startMillis, Long millisToWait, String ourPath) throws Exception
{
boolean haveTheLock = false;
boolean doDelete = false;
try
{
if ( revocable.get() != null )
{
client.getData().usingWatcher(revocableWatcher).forPath(ourPath);
}
while ( (client.getState() == CuratorFrameworkState.STARTED) && !haveTheLock )
{
List<String> children = getSortedChildren();
String sequenceNodeName = ourPath.substring(basePath.length() + 1); // +1 to include the slash
PredicateResults predicateResults = driver.getsTheLock(client, children, sequenceNodeName, maxLeases);
if ( predicateResults.getsTheLock() )
{
haveTheLock = true;
}
else
{
String previousSequencePath = basePath + "/" + predicateResults.getPathToWatch();
synchronized(this)
{
try
{
// use getData() instead of exists() to avoid leaving unneeded watchers which is a type of resource leak
client.getData().usingWatcher(watcher).forPath(previousSequencePath);
if ( millisToWait != null )
{
millisToWait -= (System.currentTimeMillis() - startMillis);
startMillis = System.currentTimeMillis();
if ( millisToWait <= 0 )
{
doDelete = true; // timed out - delete our node
break;
}
wait(millisToWait);
}
else
{
wait();
}
}
catch ( KeeperException.NoNodeException e )
{
// it has been deleted (i.e. lock released). Try to acquire again
}
}
}
}
}
catch ( Exception e )
{
ThreadUtils.checkInterrupted(e);
doDelete = true;
throw e;
}
finally
{
if ( doDelete )
{
deleteOurPath(ourPath);
}
}
return haveTheLock;
}
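Whew, another big chunk of code. Let's go through it in sections and pick out the important parts.
- The while loop
If you called the no-argument acquire method, this loop can in principle run indefinitely. As long as the ZooKeeper client is in the STARTED state and the current thread has not yet obtained the lock, another iteration begins.
- getSortedChildren
This method is fairly simple: it fetches the list of all child nodes and sorts them in ascending order by the 10-digit number at the end of each node name. As mentioned above, the nodes created are sequential nodes. Generated examples look like this:
[Figure: ZooKeeper sequential nodes]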
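Sorting on the full node name would not work, because each name starts with a different prefix; only the trailing sequence digits are comparable. The sketch below shows one way to sort on that suffix. It is a simplified stand-in for what getSortedChildren and fixForSorting are described as doing, not the Curator implementation, and the sample names in the usage note are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SequenceSort {
    /** Returns the part of the node name after the lock name, i.e. the sequence digits. */
    static String fixForSorting(String nodeName, String lockName) {
        int index = nodeName.lastIndexOf(lockName);
        if (index < 0) {
            return nodeName; // lock name not found: fall back to comparing the whole name
        }
        return nodeName.substring(index + lockName.length());
    }

    /** Returns a copy of the children sorted ascending by their sequence suffix. */
    static List<String> sortChildren(List<String> children, String lockName) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparing((String name) -> fixForSorting(name, lockName)));
        return sorted;
    }
}
```

For example, sorting `_c_ffff-lock-0000000002`, `_c_aaaa-lock-0000000010` and `_c_zzzz-lock-0000000001` with lock name `lock-` puts the `zzzz` node first and the `aaaa` node last. Comparing the suffixes as strings is safe here because they are zero-padded to the same width.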
- driver.getsTheLock
// Source: StandardLockInternalsDriver.java
@Override
public PredicateResults getsTheLock(CuratorFramework client, List<String> children, String sequenceNodeName, int maxLeases) throws Exception
{
int ourIndex = children.indexOf(sequenceNodeName);
validateOurIndex(sequenceNodeName, ourIndex);
boolean getsTheLock = ourIndex < maxLeases;
String pathToWatch = getsTheLock ? null : children.get(ourIndex - maxLeases);
return new PredicateResults(pathToWatch, getsTheLock);
}
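This decides whether the lock can be held. The rule: is the node we just created first in the child list obtained in the previous step? (More precisely, its index must be less than maxLeases, which InterProcessMutex sets to 1.)
If it is, we can hold the lock, so getsTheLock = true, which is wrapped in a PredicateResults and returned.
If it is not, another thread got there first, so getsTheLock = false, and we also need the name of the ephemeral node just ahead of ours, pathToWatch. (Keep pathToWatch in mind; it plays a key role later.)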
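The rule is easy to check on plain lists. The following sketch re-implements the predicate from the quoted getsTheLock on an already sorted list of names; `LockPredicate` and `Result` are illustrative names, and an IllegalStateException stands in for whatever validateOurIndex throws in Curator.

```java
import java.util.List;

final class LockPredicate {
    /** Outcome of the check: whether we hold the lock and, if not, which node to watch. */
    static final class Result {
        final boolean getsTheLock;
        final String pathToWatch; // null when the lock is held

        Result(boolean getsTheLock, String pathToWatch) {
            this.getsTheLock = getsTheLock;
            this.pathToWatch = pathToWatch;
        }
    }

    static Result getsTheLock(List<String> sortedChildren, String ourName, int maxLeases) {
        int ourIndex = sortedChildren.indexOf(ourName);
        if (ourIndex < 0) {
            // stand-in for validateOurIndex: our node vanished (e.g. the session expired)
            throw new IllegalStateException("Our node is missing: " + ourName);
        }
        boolean gets = ourIndex < maxLeases;
        // watch the node that must go away before our index drops below maxLeases
        String watch = gets ? null : sortedChildren.get(ourIndex - maxLeases);
        return new Result(gets, watch);
    }
}
```

With children `[a-0, b-1, c-2]` and maxLeases = 1, `a-0` holds the lock, `b-1` watches `a-0`, and `c-2` watches `b-1`, so each release wakes only one waiter.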
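- synchronized(this)
This block belongs to the logic that runs after failing to win the lock. So what should the thread do here?
First it registers a watcher, and the path it watches is the pathToWatch returned by the previous step, prefixed with basePath + "/". In other words, the current thread watches only the node immediately ahead of its own, not every node under the parent. Then it simply calls wait(millisToWait), or wait() when no timeout was given: the thread gives up the CPU and waits to be woken.
What follows is natural: when the watched node changes, the thread is woken from its wait and starts another round of competing for the lock.
That completes the whole lock acquisition process.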
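The wait-until-notified-or-timed-out step can be tried out with plain Java monitors. This is a rough sketch rather than Curator's code: it uses an explicit flag and a deadline, whereas the quoted loop re-reads the children after waking. `PredecessorWaiter`, `onWatchedNodeEvent` and `await` are hypothetical names; the watcher callback is assumed to run on another thread.

```java
final class PredecessorWaiter {
    private boolean predecessorGone = false;

    /** Called from the watcher callback when the watched node changes or disappears. */
    synchronized void onWatchedNodeEvent() {
        predecessorGone = true;
        notifyAll(); // wake the thread parked in await()
    }

    /**
     * Blocks until the event arrives or the time budget runs out.
     * A null budget means wait indefinitely. Returns true if woken by the event.
     */
    synchronized boolean await(Long millisToWait) throws InterruptedException {
        long deadline = (millisToWait == null) ? 0L : System.nanoTime() + millisToWait * 1_000_000L;
        while (!predecessorGone) { // loop guards against spurious wakeups
            if (millisToWait == null) {
                wait();
            } else {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return false; // timed out: the caller should delete its own node
                }
                wait(remainingMillis);
            }
        }
        return true;
    }
}
```

A caller would invoke `await` right after registering the watcher and, on a true result, go back to re-check the sorted children.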
II. Releasing the lock
Compared with the lengthy acquisition path above, the release logic is simple.
// Source: InterProcessMutex.java
/**
* Perform one release of the mutex if the calling thread is the same thread that acquired it. If the
* thread had made multiple calls to acquire, the mutex will still be held when this method returns.
*
* @throws Exception ZK errors, interruptions, current thread does not own the lock
*/
@Override
public void release() throws Exception
{
/*
Note on concurrency: a given lockData instance
can be only acted on by a single thread so locking isn't necessary
*/
Thread currentThread = Thread.currentThread();
LockData lockData = threadData.get(currentThread);
if ( lockData == null )
{
throw new IllegalMonitorStateException("You do not own the lock: " + basePath);
}
int newLockCount = lockData.lockCount.decrementAndGet();
if ( newLockCount > 0 )
{
return;
}
if ( newLockCount < 0 )
{
throw new IllegalMonitorStateException("Lock count has gone negative for lock: " + basePath);
}
try
{
internals.releaseLock(lockData.lockPath);
}
finally
{
threadData.remove(currentThread);
}
}
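- Decrement the reentrancy count until it reaches 0.
- Release the lock, i.e. remove the watchers & delete the node that was created.
- Remove the current thread's entry from threadData.
III. The lock driver class
At the start we mentioned StandardLockInternalsDriver, the standard lock driver class, and that a custom one can be passed in to extend it.
So let's look at the interface methods it provides:
// Source: LockInternalsDriver.java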
public PredicateResults getsTheLock(CuratorFramework client, List<String> children, String sequenceNodeName, int maxLeases) throws Exception;
public String createsTheLock(CuratorFramework client, String path, byte[] lockNodeBytes) throws Exception;
// Source: LockInternalsSorter.java
public String fixForSorting(String str, String lockName);
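- getsTheLock: decides whether the lock has been obtained.
- createsTheLock: creates an ephemeral sequential node under the specified ZooKeeper path.
- fixForSorting: normalizes a name for sorting. In the StandardLockInternalsDriver implementation, it extracts the trailing sequence number of the ephemeral node so the nodes can be sorted by it.
With this class we can try to implement our own lock mechanism: for example, changing the policy that decides who gets the lock, or customizing how the child list is sorted.
4. Summary of how InterProcessMutex works
InterProcessMutex implements a distributed lock by creating ephemeral sequential nodes under a given ZooKeeper path. Before acquiring the same lock, every thread (including threads in other processes) must create a node under the same path, with a name made up of a uuid plus an increasing sequence number. It then checks whether its own sequence number is first among all the children to decide whether it has acquired the lock. If acquisition fails, it adds a watcher on the preceding node and enters a waiting state, until the watcher event wakes it up or the timeout elapses and it returns.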
5. References
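6. Closing words
A book I read recently, "From Paxos to Zookeeper: Distributed Consistency Principles and Practice" (《從Paxos到Zookeeper 分布式一致性原理與實踐》), also describes a ZooKeeper-based exclusive lock. The rough idea is to rely on the fact that a ZooKeeper node path cannot be created twice to decide whether the lock has been obtained. Compared with that approach, InterProcessMutex is more flexible, and it watches only the preceding node; the narrower watch scope gives better performance.
If this post has been useful to you, I am glad.
You are welcome to repost this article; please credit the source and the author.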