Writing for Hibiscus, the Timer Chapter: HashedWheelTimer

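Notes

Last year I wrote about 25 posts on Jianshu and 5 blog posts on the company intranet. This year I'm setting a small goal: 50 high-quality posts on Jianshu. Let's go!

First of all, this article draws on [How to efficiently trigger timeouts for 100,000 scheduled tasks](http://chuansong.me/n/1650380646616). Thanks to the author!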

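Preface

At work you often run into scenarios that need scheduled or timeout tasks. For example, in RPC frameworks or in IM and PUSH frameworks, a long-lived connection is usually maintained between server and client. That connection is normally kept alive by heartbeats: the client (or server) periodically sends heartbeat messages to the server (or client), and the server closes the connection if it receives no heartbeat from the client within a certain period.

Common approaches

For the heartbeat handling described above, the server usually updates the channel's last read/write time whenever it receives a heartbeat. Heartbeat timeouts are then typically handled in one of two ways:

  • Use a single Timer (or a ScheduledThreadPoolExecutor) to periodically iterate over all channels and decide, from each channel's last read/write time and the timeout, whether it has timed out.
  • Use one Timer per channel, or start one scheduled task per channel, to periodically check whether that channel has timed out.

In Dubbo, client-side timeouts use approach 2 and server-side timeouts use approach 1 (strictly speaking, this distinction is not entirely accurate). The code is as follows: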

private void startHeatbeatTimer() {
        stopHeartbeatTimer();
        if ( heartbeat > 0 ) {
            heatbeatTimer = scheduled.scheduleWithFixedDelay(
                    new HeartBeatTask( new HeartBeatTask.ChannelProvider() {
                        public Collection<Channel> getChannels() {
                            return Collections.<Channel>singletonList( HeaderExchangeClient.this );
                        }
                    }, heartbeat, heartbeatTimeout),
                    heartbeat, heartbeat, TimeUnit.MILLISECONDS );
        }
    }

Each HeaderExchangeClient creates its own HeartBeatTask, and HeartBeatTask handles timeouts as follows:

public void run() {
        try {
            long now = System.currentTimeMillis();
            for ( Channel channel : channelProvider.getChannels() ) {
                if (channel.isClosed()) {
                    continue;
                }
                try {
                    Long lastRead = ( Long ) channel.getAttribute(
                            HeaderExchangeHandler.KEY_READ_TIMESTAMP );
                    Long lastWrite = ( Long ) channel.getAttribute(
                            HeaderExchangeHandler.KEY_WRITE_TIMESTAMP );
                    if ( ( lastRead != null && now - lastRead > heartbeat )
                            || ( lastWrite != null && now - lastWrite > heartbeat ) ) {
                        Request req = new Request();
                        req.setVersion( "2.0.0" );
                        req.setTwoWay( true );
                        req.setEvent( Request.HEARTBEAT_EVENT );
                        channel.send( req );
                        if ( logger.isDebugEnabled() ) {
                            logger.debug( "Send heartbeat to remote channel " + channel.getRemoteAddress()
                                                  + ", cause: The channel has no data-transmission exceeds a heartbeat period: " + heartbeat + "ms" );
                        }
                    }
                    if ( lastRead != null && now - lastRead > heartbeatTimeout ) {
                        logger.warn( "Close channel " + channel
                                             + ", because heartbeat read idle time out: " + heartbeatTimeout + "ms" );
                        if (channel instanceof Client) {
                            try {
                                ((Client)channel).reconnect();
                            }catch (Exception e) {
                                //do nothing
                            }
                        } else {
                            channel.close();
                        }
                    }
                } catch ( Throwable t ) {
                    logger.warn( "Exception when heartbeat to remote channel " + channel.getRemoteAddress(), t );
                }
            }
        } catch ( Throwable t ) {
            logger.warn( "Unhandled exception when heartbeat, cause: " + t.getMessage(), t );
        }
    }

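For the client, channelProvider.getChannels() really returns just one channel, the HeaderExchangeClient itself; for the server, channelProvider.getChannels() returns all channels connected to the server.

Both approaches have trade-offs. Approach 1 has to iterate over every channel on each run, which is inefficient. Approach 2 may waste resources (it usually means multiple threads; with a single thread it effectively degenerates into approach 1).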
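To make the cost of approach 1 concrete, here is a minimal JDK-only sketch of the scan that a single periodic task would perform. The class name `IdleScanSketch`, the map from channel id to last-read timestamp, and the method `findTimedOut` are hypothetical names chosen for illustration; they are not Dubbo APIs.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class IdleScanSketch {
    // Returns the ids of channels whose last read is older than timeoutMillis.
    // Every run walks the whole map, so the cost is O(n) in the number of channels.
    static List<String> findTimedOut(Map<String, Long> lastReadMillis, long nowMillis, long timeoutMillis) {
        List<String> timedOut = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastReadMillis.entrySet()) {
            if (nowMillis - entry.getValue() > timeoutMillis) {
                timedOut.add(entry.getKey());
            }
        }
        return timedOut;
    }

    public static void main(String[] args) {
        Map<String, Long> lastRead = new LinkedHashMap<>();
        lastRead.put("channel-a", 1_000L); // idle for 9 s at now = 10 s
        lastRead.put("channel-b", 9_500L); // idle for 0.5 s
        System.out.println(findTimedOut(lastRead, 10_000L, 3_000L)); // prints [channel-a]
    }
}
```

Even when almost no channel is idle, every run still touches every entry, which is the inefficiency mentioned above.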

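A better approach

The industry has in fact proposed a more efficient and more elegant approach, described in a paper, and Netty implemented and uses HashedWheelTimer based on that paper. Let's look at how HashedWheelTimer is used and how it is implemented.

Put simply, HashedWheelTimer maintains a circular queue (the wheel). When a timeout task is added, its delay determines which slot of the wheel it lands in, and the number of full rounds it still has to wait is recorded as well. Each tick advances to the next slot and iterates over all timeout tasks in it: if a task's remaining rounds is 0, its time has come and it is expired; if the remaining rounds is greater than 0, the count is decremented by 1. Then the timer moves on to the next tick.

Note that HashedWheelTimer is not a precise timer; its accuracy depends on tickDuration.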
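The wheel idea can be sketched in a few lines. What follows is a simplified, single-threaded model under stated assumptions, not Netty's implementation: ticks are advanced by hand instead of by a sleeping worker, delays are given directly in ticks, and `SimpleWheelSketch`, `add`, and `advance` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class SimpleWheelSketch {
    private static final class Entry {
        final String name;
        long remainingRounds;
        Entry(String name, long remainingRounds) {
            this.name = name;
            this.remainingRounds = remainingRounds;
        }
    }

    private final List<List<Entry>> slots = new ArrayList<>();
    private final int mask;
    private long tick; // index of the next tick to process

    // size must be a power of two so that (tick & mask) is a valid slot index
    SimpleWheelSketch(int size) {
        for (int i = 0; i < size; i++) {
            slots.add(new ArrayList<>());
        }
        mask = size - 1;
    }

    // Schedules a task to fire when tick number (current tick + delayTicks) is processed.
    void add(String name, long delayTicks) {
        long target = tick + delayTicks;
        long rounds = delayTicks / slots.size(); // full revolutions to wait first
        slots.get((int) (target & mask)).add(new Entry(name, rounds));
    }

    // Processes the current slot, then moves on to the next tick.
    List<String> advance() {
        List<String> expired = new ArrayList<>();
        Iterator<Entry> it = slots.get((int) (tick & mask)).iterator();
        while (it.hasNext()) {
            Entry entry = it.next();
            if (entry.remainingRounds <= 0) {
                expired.add(entry.name);
                it.remove();
            } else {
                entry.remainingRounds--;
            }
        }
        tick++;
        return expired;
    }

    public static void main(String[] args) {
        SimpleWheelSketch wheel = new SimpleWheelSketch(8);
        wheel.add("a", 3);  // slot 3, 0 rounds
        wheel.add("b", 11); // slot 3 as well, but 1 extra round
        for (int i = 0; i < 12; i++) {
            List<String> expired = wheel.advance();
            if (!expired.isEmpty()) {
                System.out.println("tick " + i + " -> " + expired);
            }
        }
    }
}
```

Running `main` prints `tick 3 -> [a]` and `tick 11 -> [b]`: both tasks share slot 3, and only the round counter tells them apart. Each tick touches one slot only, rather than every pending task.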

The constructor

First, look at HashedWheelTimer's constructor:

public HashedWheelTimer(
            ThreadFactory threadFactory,
            long tickDuration, TimeUnit unit, int ticksPerWheel) {

        if (threadFactory == null) {
            throw new NullPointerException("threadFactory");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }
        if (tickDuration <= 0) {
            throw new IllegalArgumentException("tickDuration must be greater than 0: " + tickDuration);
        }
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException("ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }

        // Normalize ticksPerWheel to power of two and initialize the wheel.
        wheel = createWheel(ticksPerWheel);
        mask = wheel.length - 1;

        // Convert tickDuration to nanos.
        this.tickDuration = unit.toNanos(tickDuration);

        // Prevent overflow.
        if (this.tickDuration >= Long.MAX_VALUE / wheel.length) {
            throw new IllegalArgumentException(String.format(
                    "tickDuration: %d (expected: 0 < tickDuration in nanos < %d",
                    tickDuration, Long.MAX_VALUE / wheel.length));
        }
        workerThread = threadFactory.newThread(worker);

        leak = leakDetector.open(this);
    }

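We pass in a threadFactory, which is used to create the worker thread. The second parameter, tickDuration, is the time span of each tick. The third parameter, unit, is the time unit of tickDuration. The fourth parameter, ticksPerWheel, is the size of the wheel.
The method to pay attention to here is createWheel(ticksPerWheel):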

private static HashedWheelBucket[] createWheel(int ticksPerWheel) {
        if (ticksPerWheel <= 0) {
            throw new IllegalArgumentException(
                    "ticksPerWheel must be greater than 0: " + ticksPerWheel);
        }
        if (ticksPerWheel > 1073741824) {
            throw new IllegalArgumentException(
                    "ticksPerWheel may not be greater than 2^30: " + ticksPerWheel);
        }

        ticksPerWheel = normalizeTicksPerWheel(ticksPerWheel);
        HashedWheelBucket[] wheel = new HashedWheelBucket[ticksPerWheel];
        for (int i = 0; i < wheel.length; i ++) {
            wheel[i] = new HashedWheelBucket();
        }
        return wheel;
    }
    
  private static int normalizeTicksPerWheel(int ticksPerWheel) {
        int normalizedTicksPerWheel = 1;
        while (normalizedTicksPerWheel < ticksPerWheel) {
            normalizedTicksPerWheel <<= 1;
        }
        return normalizedTicksPerWheel;
    }

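In the code above, normalizeTicksPerWheel determines the wheel size: the smallest power of two that is greater than or equal to ticksPerWheel. Why a power of two? Because computing an index on a wheel whose size is a power of two is cheap: a & (b - 1) == a % b holds for non-negative a when b is a power of two. This is why the constructor stores mask = wheel.length - 1.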
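The identity is easy to check by hand. The sketch below reuses the same doubling loop as normalizeTicksPerWheel and compares the mask with the modulo for a few tick values; `MaskSketch` and `nextPowerOfTwo` are illustrative names, not part of Netty.

```java
final class MaskSketch {
    // Smallest power of two >= n. Assumes 0 < n <= 2^30, which is the range
    // createWheel enforces; beyond that the shift would overflow.
    static int nextPowerOfTwo(int n) {
        int p = 1;
        while (p < n) {
            p <<= 1;
        }
        return p;
    }

    public static void main(String[] args) {
        int size = nextPowerOfTwo(500); // 512
        int mask = size - 1;            // binary 1_1111_1111
        for (long t : new long[] {0L, 7L, 511L, 512L, 1000L, 123456789L}) {
            System.out.println(t + ": mask=" + (t & mask) + " mod=" + (t % size));
        }
    }
}
```

For every non-negative tick value the two columns agree, and the mask needs only a single AND instruction.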

The start method

public void start() {
        switch (WORKER_STATE_UPDATER.get(this)) {
            case WORKER_STATE_INIT:
                if (WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_INIT, WORKER_STATE_STARTED)) {
                    workerThread.start();
                }
                break;
            case WORKER_STATE_STARTED:
                break;
            case WORKER_STATE_SHUTDOWN:
                throw new IllegalStateException("cannot be started once stopped");
            default:
                throw new Error("Invalid WorkerState");
        }

        // Wait until the startTime is initialized by the worker.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignore) {
                // Ignore - it will be ready very soon.
            }
        }
    }

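The start method is carefully written. You can think of WORKER_STATE_UPDATER as an AtomicInteger-style variable that holds the current state of the HashedWheelTimer. When the state is WORKER_STATE_INIT, a successful compare-and-set to WORKER_STATE_STARTED starts workerThread, so the thread is started only once even if several threads call start() concurrently. After that, the caller waits until startTime becomes non-zero. This code shows real craftsmanship. The role of startTimeInitialized is explained later, in the analysis of workerThread.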
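The start-once-then-wait pattern can be reproduced with plain JDK classes. The sketch below is a simplification under stated assumptions: it uses an AtomicInteger instead of a field updater, the worker only records startTime and exits, and `LazyStartSketch`, `launches`, and the state constants are hypothetical names.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

final class LazyStartSketch {
    static final int INIT = 0;
    static final int STARTED = 1;
    static final int SHUTDOWN = 2;

    private final AtomicInteger state = new AtomicInteger(INIT);
    private final CountDownLatch startTimeInitialized = new CountDownLatch(1);
    private final AtomicInteger launches = new AtomicInteger();
    private volatile long startTime; // 0 means "not initialized yet"

    private final Thread worker = new Thread(() -> {
        launches.incrementAndGet();
        long now = System.nanoTime();
        startTime = (now == 0) ? 1 : now; // never publish the sentinel value 0
        startTimeInitialized.countDown(); // release callers blocked in start()
    });

    void start() {
        switch (state.get()) {
            case INIT:
                // Only the thread that wins the CAS actually starts the worker.
                if (state.compareAndSet(INIT, STARTED)) {
                    worker.start();
                }
                break;
            case STARTED:
                break;
            default:
                throw new IllegalStateException("cannot be started once stopped");
        }
        // Every caller, winner or not, waits until the worker has set startTime.
        while (startTime == 0) {
            try {
                startTimeInitialized.await();
            } catch (InterruptedException ignore) {
                // Keep waiting; the worker sets startTime almost immediately.
            }
        }
    }

    int launches() { return launches.get(); }

    long startTime() { return startTime; }

    public static void main(String[] args) {
        LazyStartSketch sketch = new LazyStartSketch();
        sketch.start();
        sketch.start(); // second call finds STARTED and returns without starting again
        System.out.println("worker launched " + sketch.launches() + " time(s)");
    }
}
```

The wait at the end matters because newTimeout subtracts startTime when it computes a deadline; without it, a caller could read startTime while it is still 0.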

The newTimeout method

 public Timeout newTimeout(TimerTask task, long delay, TimeUnit unit) {
        if (task == null) {
            throw new NullPointerException("task");
        }
        if (unit == null) {
            throw new NullPointerException("unit");
        }
        start();

        // Add the timeout to the timeout queue which will be processed on the next tick.
        // During processing all the queued HashedWheelTimeouts will be added to the correct HashedWheelBucket.
        long deadline = System.nanoTime() + unit.toNanos(delay) - startTime;
        HashedWheelTimeout timeout = new HashedWheelTimeout(this, task, deadline);
        timeouts.add(timeout);
        return timeout;
    }

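This is a very important method: we call it to add a timed task. It takes three parameters. The first describes the task; when the task times out, its run(Timeout timeout) method is executed. The second is the delay, that is, how long from now the task should fire. The third is the time unit of the delay. The method itself is simple. It first computes deadline, the point at which the task should fire, in nanoseconds relative to startTime, then builds a HashedWheelTimeout and puts it into the timeouts queue. Note that the HashedWheelTimeout is not placed on the wheel at this point. As the comment says ("Add the timeout to the timeout queue which will be processed on the next tick"), the tasks in the timeouts queue are moved into the correct bucket on the next tick.

Pay special attention to the fact that newTimeout calls start(). The best practice is not to call start() directly but to let newTimeout trigger it when there is a timeout task to run, so that the worker thread does not spin needlessly.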

HashedWheelBucket

HashedWheelBucket is a nested class that represents a slot on the wheel. The constructor builds an array of HashedWheelBucket.

private static final class HashedWheelBucket {
    // Used for the linked-list datastructure
    private HashedWheelTimeout head;
    private HashedWheelTimeout tail;
}

HashedWheelBucket maintains a linked list to store timeout tasks.

The Worker thread

public void run() {
            // Initialize the startTime.
            startTime = System.nanoTime();
            if (startTime == 0) {
                // We use 0 as an indicator for the uninitialized value here, so make sure it's not 0 when initialized.
                startTime = 1;
            }

            // Notify the other threads waiting for the initialization at start().
            startTimeInitialized.countDown();

            do {
                final long deadline = waitForNextTick();
                if (deadline > 0) {
                    int idx = (int) (tick & mask);
                    processCancelledTasks();
                    HashedWheelBucket bucket =
                            wheel[idx];
                    transferTimeoutsToBuckets();
                    bucket.expireTimeouts(deadline);
                    tick++;
                }
            } while (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_STARTED);

            // Fill the unprocessedTimeouts so we can return them from stop() method.
            for (HashedWheelBucket bucket: wheel) {
                bucket.clearTimeouts(unprocessedTimeouts);
            }
            for (;;) {
                HashedWheelTimeout timeout = timeouts.poll();
                if (timeout == null) {
                    break;
                }
                if (!timeout.isCancelled()) {
                    unprocessedTimeouts.add(timeout);
                }
            }
            processCancelledTasks();
        }

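The most important part of the whole HashedWheelTimer is the Worker thread. As mentioned earlier, start() starts the worker thread and waits for startTime to become non-zero. The worker sets startTime to the current nanosecond time and calls startTimeInitialized.countDown() to wake up the threads blocked in start().

After that, as long as the timer remains in the WORKER_STATE_STARTED state (the state is currently changed only in start and stop), the worker keeps looping: it waits for the next tick, processes cancelled tasks, transfers newly added timeouts into their buckets, expires the due timeouts in the current bucket, and increments tick.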

waitForNextTick()

private long waitForNextTick() {
            long deadline = tickDuration * (tick + 1);

            for (;;) {
                final long currentTime = System.nanoTime() - startTime;
                long sleepTimeMs = (deadline - currentTime + 999999) / 1000000;

                if (sleepTimeMs <= 0) {
                    if (currentTime == Long.MIN_VALUE) {
                        return -Long.MAX_VALUE;
                    } else {
                        return currentTime;
                    }
                }

                // Check if we run on windows, as if thats the case we will need
                // to round the sleepTime as workaround for a bug that only affect
                // the JVM if it runs on windows.
                //
                // See https://github.com/netty/netty/issues/356
                if (PlatformDependent.isWindows()) {
                    sleepTimeMs = sleepTimeMs / 10 * 10;
                }

                try {
                    Thread.sleep(sleepTimeMs);
                } catch (InterruptedException ignored) {
                    if (WORKER_STATE_UPDATER.get(HashedWheelTimer.this) == WORKER_STATE_SHUTDOWN) {
                        return Long.MIN_VALUE;
                    }
                }
            }
        }

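waitForNextTick() is fairly simple: it puts the worker thread to sleep until the next tick is due, that is, until tickDuration * (tick + 1) nanoseconds have elapsed since startTime, and then returns the elapsed time relative to startTime (not the absolute nanosecond clock). The sleep time is rounded up to whole milliseconds. If the worker is interrupted after the timer has been shut down, the method returns Long.MIN_VALUE, and the run loop skips that tick because the deadline > 0 check fails.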
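The expression (deadline - currentTime + 999999) / 1000000 is a ceiling division from nanoseconds to milliseconds. The short sketch below isolates it and works through a few values for a 100 ms tick; `TickSleepSketch` and `sleepTimeMs` are names made up for this example.

```java
final class TickSleepSketch {
    // Milliseconds to sleep until deadlineNanos, rounded up so the worker
    // never wakes before the tick boundary while time remains.
    static long sleepTimeMs(long deadlineNanos, long elapsedNanos) {
        return (deadlineNanos - elapsedNanos + 999_999) / 1_000_000;
    }

    public static void main(String[] args) {
        long tickDuration = 100_000_000L;          // 100 ms in nanoseconds
        long tick = 0;
        long deadline = tickDuration * (tick + 1); // end of the first tick

        System.out.println(sleepTimeMs(deadline, 0L));           // 100
        System.out.println(sleepTimeMs(deadline, 99_500_000L));  // 1 (0.5 ms left, rounded up)
        System.out.println(sleepTimeMs(deadline, 100_000_000L)); // 0 (deadline reached)
    }
}
```

A result of 0 or less is the signal in waitForNextTick() to stop sleeping and return the elapsed time.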

processCancelledTasks()

private void processCancelledTasks() {
            for (;;) {
                Runnable task = cancelledTimeouts.poll();
                if (task == null) {
                    // all processed
                    break;
                }
                try {
                    task.run();
                } catch (Throwable t) {
                    if (logger.isWarnEnabled()) {
                        logger.warn("An exception was thrown while process a cancellation task", t);
                    }
                }
            }
        }

HashedWheelTimer maintains a cancelledTimeouts queue, and every tick processes all the entries in it. When and how tasks get added to the cancelledTimeouts queue is covered later.

transferTimeoutsToBuckets()

 private void transferTimeoutsToBuckets() {
            // transfer only max. 100000 timeouts per tick to prevent a thread to stale the workerThread when it just
            // adds new timeouts in a loop.
            for (int i = 0; i < 100000; i++) {
                HashedWheelTimeout timeout = timeouts.poll();
                if (timeout == null) {
                    // all processed
                    break;
                }
                if (timeout.state() == HashedWheelTimeout.ST_CANCELLED) {
                    // Was cancelled in the meantime.
                    continue;
                }

                long calculated = timeout.deadline / tickDuration;
                timeout.remainingRounds = (calculated - tick) / wheel.length;

                final long ticks = Math.max(calculated, tick); // Ensure we don't schedule for past.
                int stopIndex = (int) (ticks & mask);

                HashedWheelBucket bucket = wheel[stopIndex];
                bucket.addTimeout(timeout);
            }
        }

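As mentioned earlier, newTimeout does not add the task to the wheel immediately; it first puts it into the timeouts queue. When each tick arrives, the worker moves the tasks from timeouts onto the wheel (at most 100000 per tick, so that a thread adding timeouts in a loop cannot stall the worker). The way remainingRounds and stopIndex are computed is quite neat:

long calculated = timeout.deadline / tickDuration;
timeout.remainingRounds = (calculated - tick) / wheel.length;

final long ticks = Math.max(calculated, tick); // Ensure we don't schedule for past.
int stopIndex = (int) (ticks & mask);

Here calculated is the absolute tick number at which the task becomes due, remainingRounds is the number of full revolutions the wheel must still make before then, and Math.max ensures that a deadline already in the past lands in the current bucket. The timeout task is then added to the corresponding HashedWheelBucket.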
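A worked example makes the arithmetic easier to follow. The sketch below copies the two formulas into standalone methods and evaluates them for a 100 ms tick and a 512-slot wheel; the class and method names are illustrative, and the numbers are chosen arbitrarily.

```java
final class SlotMathSketch {
    static long remainingRounds(long deadlineNanos, long tickDurationNanos, long tick, int wheelLength) {
        long calculated = deadlineNanos / tickDurationNanos; // absolute tick at which the task is due
        return (calculated - tick) / wheelLength;
    }

    // wheelLength must be a power of two so that (wheelLength - 1) works as a mask.
    static int stopIndex(long deadlineNanos, long tickDurationNanos, long tick, int wheelLength) {
        long calculated = deadlineNanos / tickDurationNanos;
        long ticks = Math.max(calculated, tick); // never schedule into the past
        return (int) (ticks & (wheelLength - 1));
    }

    public static void main(String[] args) {
        long tickDuration = 100_000_000L; // 100 ms
        int wheelLength = 512;
        long tick = 10;                   // the worker has processed ticks 0..9

        long deadline = 60_050_000_000L;  // due 60.05 s after startTime -> tick 600
        System.out.println(remainingRounds(deadline, tickDuration, tick, wheelLength)); // 1
        System.out.println(stopIndex(deadline, tickDuration, tick, wheelLength));       // 88

        long overdue = 500_000_000L;      // tick 5, already behind the worker
        System.out.println(remainingRounds(overdue, tickDuration, tick, wheelLength));  // 0
        System.out.println(stopIndex(overdue, tickDuration, tick, wheelLength));        // 10
    }
}
```

The first task lands in slot 88 with one round left: the worker passes slot 88 once at tick 88, decrementing the counter, and expires the task at tick 600. The overdue task goes straight into the slot currently being processed.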

bucket.expireTimeouts(deadline);

public void expireTimeouts(long deadline) {
            HashedWheelTimeout timeout = head;

            // process all timeouts
            while (timeout != null) {
                boolean remove = false;
                if (timeout.remainingRounds <= 0) {
                    if (timeout.deadline <= deadline) {
                        timeout.expire();
                    } else {
                        // The timeout was placed into a wrong slot. This should never happen.
                        throw new IllegalStateException(String.format(
                                "timeout.deadline (%d) > deadline (%d)", timeout.deadline, deadline));
                    }
                    remove = true;
                } else if (timeout.isCancelled()) {
                    remove = true;
                } else {
                    timeout.remainingRounds --;
                }
                // store reference to next as we may null out timeout.next in the remove block.
                HashedWheelTimeout next = timeout.next;
                if (remove) {
                    remove(timeout);
                }
                timeout = next;
            }
        }

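This processes all timeout tasks in the current bucket. If remainingRounds is less than or equal to 0, the timeout is due and timeout.expire() is executed; otherwise, if the task has been cancelled, it is simply dropped; otherwise remainingRounds is decremented by 1. Tasks that have expired or been cancelled are removed from the bucket.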

HashedWheelTimeout#cancel

public boolean cancel() {
            // only update the state it will be removed from HashedWheelBucket on next tick.
            if (!compareAndSetState(ST_INIT, ST_CANCELLED)) {
                return false;
            }
            // If a task should be canceled we create a new Runnable for this to another queue which will
            // be processed on each tick. So this means that we will have a GC latency of max. 1 tick duration
            // which is good enough. This way we can make again use of our MpscLinkedQueue and so minimize the
            // locking / overhead as much as possible.
            //
            // It is important that we not just add the HashedWheelTimeout itself again as it extends
            // MpscLinkedQueueNode and so may still be used as tombstone.
            timer.cancelledTimeouts.add(new Runnable() {
                @Override
                public void run() {
                    HashedWheelBucket bucket = HashedWheelTimeout.this.bucket;
                    if (bucket != null) {
                        bucket.remove(HashedWheelTimeout.this);
                    }
                }
            });
            return true;
        }

The cancelledTimeouts queue was mentioned earlier. Calling HashedWheelTimeout#cancel adds a task to the cancelledTimeouts queue, and that task removes the timeout from its bucket.
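The point of this indirection is that the bucket's linked list is only ever modified by the worker thread, so it needs no lock. Below is a minimal sketch of that hand-off. It is an illustration under assumptions, not Netty's code: a JDK ConcurrentLinkedQueue stands in for Netty's MPSC queue, a HashSet stands in for the bucket, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentLinkedQueue;

final class DeferredCancelSketch {
    // Owned by the worker thread; never touched directly by other threads.
    private final Set<String> bucket = new HashSet<>();
    // Any thread may enqueue a removal; only the worker drains the queue.
    private final Queue<Runnable> cancelledTimeouts = new ConcurrentLinkedQueue<>();

    // Called on the worker thread.
    void addOnWorker(String task) {
        bucket.add(task);
    }

    // Safe from any thread: records the removal, does not modify the bucket.
    void cancel(String task) {
        cancelledTimeouts.add(() -> bucket.remove(task));
    }

    // Called by the worker once per tick.
    void processCancelledTasks() {
        Runnable removal;
        while ((removal = cancelledTimeouts.poll()) != null) {
            removal.run();
        }
    }

    int size() {
        return bucket.size();
    }

    public static void main(String[] args) {
        DeferredCancelSketch sketch = new DeferredCancelSketch();
        sketch.addOnWorker("a");
        sketch.addOnWorker("b");
        sketch.cancel("a");
        System.out.println(sketch.size()); // still 2: removal is deferred
        sketch.processCancelledTasks();
        System.out.println(sketch.size()); // 1 after the next "tick"
    }
}
```

The trade-off, as the comment in cancel() notes, is that a cancelled timeout can stay reachable for up to one tick before it is unlinked.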

stop()

public Set<Timeout> stop() {
    if (Thread.currentThread() == workerThread) {
        throw new IllegalStateException(
                HashedWheelTimer.class.getSimpleName() +
                        ".stop() cannot be called from " +
                        TimerTask.class.getSimpleName());
    }

    if (!WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_STARTED, WORKER_STATE_SHUTDOWN)) {
        // workerState can be 0 or 2 at this moment - let it always be 2.
        WORKER_STATE_UPDATER.set(this, WORKER_STATE_SHUTDOWN);

        if (leak != null) {
            leak.close();
        }

        return Collections.emptySet();
    }

    boolean interrupted = false;
    while (workerThread.isAlive()) {
        workerThread.interrupt();
        try {
            workerThread.join(100);
        } catch (InterruptedException ignored) {
            interrupted = true;
        }
    }

    if (interrupted) {
        Thread.currentThread().interrupt();
    }

    if (leak != null) {
        leak.close();
    }
    return worker.unprocessedTimeouts();
}
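
I have always thought two things really test a programmer's skill: lifecycle management and the handling of exceptional conditions.

Since HashedWheelTimer has a start() method it should also have a stop() method, and this stop method contains quite a few techniques worth learning: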

if (!WORKER_STATE_UPDATER.compareAndSet(this, WORKER_STATE_STARTED, WORKER_STATE_SHUTDOWN)) {
    // workerState can be 0 or 2 at this moment - let it always be 2.
    WORKER_STATE_UPDATER.set(this, WORKER_STATE_SHUTDOWN);

    if (leak != null) {
        leak.close();
    }

    return Collections.emptySet();
}

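When several threads call stop() at the same time, only one of them succeeds in moving the state from WORKER_STATE_STARTED to WORKER_STATE_SHUTDOWN. A thread whose compare-and-set fails forces the state to WORKER_STATE_SHUTDOWN and returns an empty set, meaning it has nothing more to do; the thread that succeeded takes care of the rest. At first glance the forced set looks unnecessary, but as the code comment notes, the state at that point is either 0 or 2, which appears to mean the timer was never started or has already been shut down. In the never-started case the forced set is what makes a later start() throw.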

while (workerThread.isAlive()) {
    workerThread.interrupt();
    try {
        workerThread.join(100);
    } catch (InterruptedException ignored) {
        interrupted = true;
    }
}

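If the worker thread is still alive, stop() calls workerThread.interrupt(). (A common way to stop a thread is to call xxxThread.interrupt() and have the thread check xxxThread.isInterrupted(). The worker here does not poll isInterrupted(); instead, the interrupt wakes it from Thread.sleep() in waitForNextTick(), which returns Long.MIN_VALUE once the state is WORKER_STATE_SHUTDOWN, and the run loop then exits.) On stop, HashedWheelTimer wants workerThread to finish up gracefully and hand back the tasks it could not process before exiting, so it calls workerThread.join(100) to wait up to 100 ms at a time for workerThread to terminate, repeating until it does.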

// Fill the unprocessedTimeouts so we can return them from stop() method.
for (HashedWheelBucket bucket: wheel) {
    bucket.clearTimeouts(unprocessedTimeouts);
}
for (;;) {
    HashedWheelTimeout timeout = timeouts.poll();
    if (timeout == null) {
        break;
    }
    if (!timeout.isCancelled()) {
        unprocessedTimeouts.add(timeout);
    }
}
processCancelledTasks();

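At the end of its run, the worker thread puts all tasks still sitting in the buckets, plus the non-cancelled tasks remaining in the timeouts queue, into unprocessedTimeouts. It then processes the cancelled tasks, and with that its job is done and it waits to be garbage collected.

This involves some handling of InterruptedException; I will probably write a separate article covering InterruptedException handling in detail.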
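The interrupt-and-join loop in stop() is a reusable pattern. Here is a minimal JDK-only sketch of it, assuming a worker whose only job is to sleep in a loop; a volatile boolean stands in for the WORKER_STATE_SHUTDOWN check, and `GracefulStopSketch` and its members are hypothetical names.

```java
final class GracefulStopSketch {
    private volatile boolean shutdown;

    private final Thread worker = new Thread(() -> {
        while (!shutdown) {
            try {
                Thread.sleep(50); // stands in for waitForNextTick()
            } catch (InterruptedException e) {
                // Woken early by stop(); the loop condition decides whether to exit.
            }
        }
        // A real worker would collect its unprocessed tasks here before exiting.
    });

    void start() {
        worker.start();
    }

    void stop() {
        shutdown = true;
        boolean interrupted = false;
        while (worker.isAlive()) {
            worker.interrupt();       // cut the worker's sleep short
            try {
                worker.join(100);     // wait up to 100 ms, then re-check
            } catch (InterruptedException e) {
                interrupted = true;   // remember it, but keep waiting for the worker
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt(); // restore the caller's interrupt flag
        }
    }

    boolean workerAlive() {
        return worker.isAlive();
    }

    public static void main(String[] args) {
        GracefulStopSketch sketch = new GracefulStopSketch();
        sketch.start();
        sketch.stop();
        System.out.println("worker alive after stop: " + sketch.workerAlive());
    }
}
```

Two details carry over from the original: the caller's own interruption is recorded and restored rather than swallowed, and stop() does not return until the worker has actually terminated.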

Summary

The code is very well written and there is a lot to learn from it. Time to start using HashedWheelTimer.

