Reading the Disruptor Source Code

I recently needed a queue while handling HTTP requests with netty. I had long heard that Disruptor's RingBuffer outperforms the JDK queues, so I decided to get a rough understanding of how it is implemented first.

RingBuffer

RingBuffer is essentially a queue. Its member variables include:

private final Object[] entries;
protected final int bufferSize;
protected final Sequencer sequencer;

entries holds bufferSize slots, and indexing wraps around so that the head and tail join to form a ring. Sequence is one of the more important concepts in Disruptor and is covered later.
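To make the ring idea concrete, here is a minimal sketch of how an ever-growing sequence number can be mapped onto a fixed array. It assumes bufferSize is a power of two so that a bit mask can replace the modulo operation; the class and method names (RingIndexSketch, slotFor) are hypothetical and not taken from the Disruptor source.

```java
// Hypothetical illustration: maps a monotonically increasing sequence onto a fixed-size array.
class RingIndexSketch {
    private final Object[] entries;
    private final int indexMask;

    RingIndexSketch(int bufferSize) {
        // Assumption: a power-of-two size lets "sequence & mask" stand in for "sequence % size".
        if (bufferSize < 1 || Integer.bitCount(bufferSize) != 1) {
            throw new IllegalArgumentException("bufferSize must be a power of 2");
        }
        this.entries = new Object[bufferSize];
        this.indexMask = bufferSize - 1;
    }

    // Slot index for a sequence; sequences that differ by bufferSize share a slot.
    int slotFor(long sequence) {
        return (int) (sequence & indexMask);
    }

    void put(long sequence, Object value) {
        entries[slotFor(sequence)] = value;
    }

    Object get(long sequence) {
        return entries[slotFor(sequence)];
    }
}
```

With a buffer of 8 slots, sequences 3 and 11 land in the same slot, which is why a producer must not run a full lap ahead of the slowest consumer.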
A few points about Disruptor's RingBuffer implementation deserve attention:

CacheLine Padding:

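Eight long fields are declared, 8*8=64 bytes in total, while a CPU cache line is typically 64 bytes (possible values range from 32 to 256). The padding therefore keeps the hot fields on a cache line of their own and avoids cache-line invalidation between cores (false sharing).
Check the CPU L1 cache size: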

cat /sys/devices/system/cpu/cpu0/cache/index0/size
cat /sys/devices/system/cpu/cpu0/cache/index0/type
cat /sys/devices/system/cpu/cpu0/cache/index0/level

Check the cache line size:

cat /sys/devices/system/cpu/cpu0/cache/index0/coherency_line_size
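The padding technique described above can be sketched in plain Java as follows. This is an illustration only: the class names are hypothetical (loosely modeled on the LhsPadding/RhsPadding naming that appears around Sequence), and it relies on the JVM laying out superclass fields before subclass fields, which is common HotSpot behavior rather than a language guarantee.

```java
// Hypothetical sketch of cache-line padding through a class hierarchy.
// 7 longs before and 7 longs after the hot field keep it away from neighbors.
class LhsPadding {
    protected long p1, p2, p3, p4, p5, p6, p7;
}

class PaddedValue extends LhsPadding {
    protected volatile long value; // the hot field shared between threads
}

class RhsPadding extends PaddedValue {
    protected long p9, p10, p11, p12, p13, p14, p15;
}

class PaddedCounter extends RhsPadding {
    long get() {
        return value;
    }

    void set(long newValue) {
        value = newValue;
    }
}
```

The padding fields are never read; their only job is to occupy space so that two frequently written counters do not end up on the same cache line.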

Producer:

RingBuffer has producers and consumers, and a producer adds elements to the queue. Concretely, this is done through the next and publish methods:

  1. The next method:
public long next()
{
    return sequencer.next();
}

As shown, the actual work is delegated to the next method of either MultiProducerSequencer or SingleProducerSequencer:

  • SingleProducerSequencer
public long next(int n)
    {
        if (n < 1)
        {
            throw new IllegalArgumentException("n must be > 0");
        }

        long nextValue = this.nextValue;

        long nextSequence = nextValue + n;
        long wrapPoint = nextSequence - bufferSize;
        long cachedGatingSequence = this.cachedValue;

        if (wrapPoint > cachedGatingSequence || cachedGatingSequence > nextValue)
        {
            long minSequence;
            while (wrapPoint > (minSequence = Util.getMinimumSequence(gatingSequences, nextValue)))
            {
                LockSupport.parkNanos(1L); // TODO: Use waitStrategy to spin?
            }

            this.cachedValue = minSequence;
        }

        this.nextValue = nextSequence;

        return nextSequence;
    }

Because there is only one producer thread, no concurrency control is needed. The condition wrapPoint > (minSequence = Util.getMinimumSequence(gatingSequences, nextValue)) tests whether the queue is full (consumers are slow, so the buffer has filled up). If it is full, LockSupport.parkNanos(1L) pauses the thread and the loop re-checks until a slot becomes free. The author left a TODO comment here, so later versions may use a WaitStrategy instead.
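The wrap-point check can be illustrated with a small, non-blocking sketch. It is a simplification under stated assumptions: there is a single consumer whose position is held in an AtomicLong, only one slot is claimed at a time, and a full ring is reported by returning -1 instead of parking. The names (SingleProducerClaimSketch, tryNext) are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-producer claim logic; must only be called from one producer thread.
class SingleProducerClaimSketch {
    private final int bufferSize;
    private final AtomicLong consumerSequence; // position of the (single) consumer
    private long nextValue = -1;               // last sequence handed out
    private long cachedGating = -1;            // last consumer position we looked up

    SingleProducerClaimSketch(int bufferSize, AtomicLong consumerSequence) {
        this.bufferSize = bufferSize;
        this.consumerSequence = consumerSequence;
    }

    // Returns the claimed sequence, or -1 when claiming would overwrite an unconsumed slot.
    long tryNext() {
        long nextSequence = nextValue + 1;
        long wrapPoint = nextSequence - bufferSize;
        if (wrapPoint > cachedGating) {
            // The cached position is too old to prove there is room; re-read the consumer.
            long min = consumerSequence.get();
            if (wrapPoint > min) {
                return -1; // full: the consumer has not yet released the slot we would reuse
            }
            cachedGating = min;
        }
        nextValue = nextSequence;
        return nextSequence;
    }
}
```

Caching the consumer position means the producer only touches the shared consumerSequence when the cached value can no longer prove that space is available.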

  • MultiProducerSequencer
public long next(int n)
    {
        if (n < 1)
        {
            throw new IllegalArgumentException("n must be > 0");
        }

        long current;
        long next;

        do
        {
            current = cursor.get();
            next = current + n;

            long wrapPoint = next - bufferSize;
            long cachedGatingSequence = gatingSequenceCache.get();

            if (wrapPoint > cachedGatingSequence || cachedGatingSequence > current)
            {
                long gatingSequence = Util.getMinimumSequence(gatingSequences, current);

                if (wrapPoint > gatingSequence)
                {
                    LockSupport.parkNanos(1); // TODO, should we spin based on the wait strategy?
                    continue;
                }

                gatingSequenceCache.set(gatingSequence);
            }
            else if (cursor.compareAndSet(current, next))
            {
                break;
            }
        }
        while (true);

        return next;
    }

As shown, with multiple producers the slot is claimed through cursor.compareAndSet(current, next), that is, with CAS.
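Stripped of the capacity check, the CAS claim loop looks roughly like the sketch below. It uses java.util.concurrent.atomic.AtomicLong in place of Disruptor's Sequence, and the class and helper names are hypothetical; the helper only exists to show that concurrent claims never hand out the same sequence twice.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical multi-producer claim: a CAS retry loop on a shared cursor (no capacity check).
class CasClaimSketch {
    private final AtomicLong cursor = new AtomicLong(-1);

    // Claims n consecutive sequences and returns the highest one.
    long next(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be > 0");
        }
        long current;
        long next;
        do {
            current = cursor.get();
            next = current + n;
        } while (!cursor.compareAndSet(current, next)); // retry if another thread won the race
        return next;
    }

    // Runs several claiming threads and counts how many distinct sequences were handed out.
    static int distinctClaims(int threads, int perThread) {
        CasClaimSketch sketch = new CasClaimSketch();
        Set<Long> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    seen.add(sketch.next(1));
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return seen.size();
    }
}
```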

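Once the producer has claimed a slot and filled in the event, it calls publish, which makes the element visible to consumers and updates the state. The logic is as follows: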

SingleProducerSequencer:

    public void publish(long sequence)
    {
        cursor.set(sequence);
        waitStrategy.signalAllWhenBlocking();
    }

MultiProducerSequencer:

    public void publish(final long sequence)
    {
        setAvailable(sequence);
        waitStrategy.signalAllWhenBlocking();
    }

As shown, the only place where Disruptor producers contend with each other is acquiring the sequence.
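The two publish methods differ because, with several producers, slots can be published out of order: a producer holding sequence 2 may finish before the one holding sequence 1. One way to handle this is to record availability per slot and let consumers read only up to the first gap. The sketch below illustrates that idea under my own assumptions; the names are hypothetical and it is not the library's code.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

// Hypothetical per-slot availability tracking for out-of-order publishing.
class AvailabilitySketch {
    private final AtomicIntegerArray availableLap; // for each slot: the lap number last published
    private final int indexMask;
    private final int indexShift;

    AvailabilitySketch(int bufferSize) { // assumption: bufferSize is a power of 2
        availableLap = new AtomicIntegerArray(bufferSize);
        for (int i = 0; i < bufferSize; i++) {
            availableLap.set(i, -1); // nothing published yet
        }
        indexMask = bufferSize - 1;
        indexShift = Integer.numberOfTrailingZeros(bufferSize);
    }

    void publish(long sequence) {
        availableLap.set((int) (sequence & indexMask), (int) (sequence >>> indexShift));
    }

    boolean isAvailable(long sequence) {
        return availableLap.get((int) (sequence & indexMask)) == (int) (sequence >>> indexShift);
    }

    // Highest sequence in [lowerBound, upperBound] with no unpublished gap before it.
    long highestPublished(long lowerBound, long upperBound) {
        for (long s = lowerBound; s <= upperBound; s++) {
            if (!isAvailable(s)) {
                return s - 1;
            }
        }
        return upperBound;
    }
}
```

Storing the lap number rather than a boolean means a slot does not need to be reset when the ring wraps around.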

Consumers

Straight to the code; see the Disruptor class:

EventHandlerGroup<T> createEventProcessors(final Sequence[] barrierSequences,
                                               final EventHandler<? super T>[] eventHandlers)
    {
        checkNotStarted();

        final Sequence[] processorSequences = new Sequence[eventHandlers.length];
        final SequenceBarrier barrier = ringBuffer.newBarrier(barrierSequences);

        for (int i = 0, eventHandlersLength = eventHandlers.length; i < eventHandlersLength; i++)
        {
            final EventHandler<? super T> eventHandler = eventHandlers[i];

            final BatchEventProcessor<T> batchEventProcessor = new BatchEventProcessor<T>(ringBuffer, barrier, eventHandler);

            if (exceptionHandler != null)
            {
                batchEventProcessor.setExceptionHandler(exceptionHandler);
            }

            consumerRepository.add(batchEventProcessor, eventHandler, barrier);
            processorSequences[i] = batchEventProcessor.getSequence();
        }

        if (processorSequences.length > 0)
        {
            consumerRepository.unMarkEventProcessorsAsEndOfChain(barrierSequences);
        }

        return new EventHandlerGroup<T>(this, consumerRepository, processorSequences);
    }

As shown, every EventProcessor (BatchEventProcessor by default) has its own Sequence, which means that every message in the queue is consumed once by each EventProcessor. So when there are several handlers, in what order do they process events? That question is answered further on.

Disruptor starts the consumers through its start method:

public RingBuffer<T> start()
    {
        final Sequence[] gatingSequences = consumerRepository.getLastSequenceInChain(true);
        ringBuffer.addGatingSequences(gatingSequences);

        checkOnlyStartedOnce();
        for (final ConsumerInfo consumerInfo : consumerRepository)
        {
            consumerInfo.start(executor);
        }

        return ringBuffer;
    }
public void start(final Executor executor)
{
    executor.execute(eventprocessor);
}

executor is a java.util.concurrent.Executor; you can use the JDK's ThreadPoolExecutor or provide your own implementation. The code above also shows that each event processor is started with executor.execute, so Disruptor normally runs every handler on its own thread. This raises a question: with HTTP requests, if a single handler is used, all requests are processed by one thread, so wouldn't performance actually get worse? Where, then, does Disruptor's advantage lie?

Continuing with the code, consumerInfo.start(executor) ends up running the run method of BatchEventProcessor:

public void run()
    {
        if (!running.compareAndSet(false, true)) // guard against starting twice
        {
            throw new IllegalStateException("Thread is already running");
        }
        sequenceBarrier.clearAlert(); // clear the alerted flag

        notifyStart(); // fire onStart if the handler implements LifecycleAware

        T event = null;
        long nextSequence = sequence.get() + 1L; // index of the next event to read
        try
        {
            while (true)
            {
                try
                {
                    // Get the highest available sequence; if the queue is empty this waits, as dictated by the WaitStrategy
                    final long availableSequence = sequenceBarrier.waitFor(nextSequence);

                    while (nextSequence <= availableSequence)
                    {
                        event = dataProvider.get(nextSequence);
                        eventHandler.onEvent(event, nextSequence, nextSequence == availableSequence);
                        nextSequence++;
                    }

                    sequence.set(availableSequence);
                }
                catch (final TimeoutException e)
                {
                    notifyTimeout(sequence.get());
                }
                catch (final AlertException ex)
                {
                    if (!running.get())
                    {
                        break;
                    }
                }
                catch (final Throwable ex)
                { // on an exception, exceptionHandler is invoked and the loop continues (unless the handler rethrows)
                    exceptionHandler.handleEventException(ex, nextSequence, event);
                    sequence.set(nextSequence);
                    nextSequence++;
                }
            }
        }
        finally
        {
            notifyShutdown();
            running.set(false);
        }
    }

As shown, the run method obtains the next available events through sequenceBarrier.waitFor(nextSequence).
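The inner while loop of run is what gives the processor its batching behavior: one call to waitFor may return a sequence well ahead of nextSequence, and everything up to it is handled before the consumer's own sequence is updated once. A minimal sketch of that loop, with hypothetical names, a plain String array standing in for the ring, and a "!" suffix marking the endOfBatch flag:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical batch-draining loop; the ring length is assumed to be a power of 2.
class BatchLoopSketch {
    private final String[] ring;
    private final int mask;
    private long sequence = -1; // last sequence this consumer has fully processed
    final List<String> handled = new ArrayList<>();

    BatchLoopSketch(String[] ring) {
        this.ring = ring;
        this.mask = ring.length - 1;
    }

    // Processes every event up to availableSequence in one batch; returns the batch size.
    int drain(long availableSequence) {
        long next = sequence + 1;
        int count = 0;
        while (next <= availableSequence) {
            boolean endOfBatch = next == availableSequence;
            handled.add(ring[(int) (next & mask)] + (endOfBatch ? "!" : ""));
            next++;
            count++;
        }
        if (count > 0) {
            sequence = availableSequence; // one update for the whole batch
        }
        return count;
    }
}
```

Updating the sequence once per batch rather than once per event reduces the number of writes that producers have to observe.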
Its implementation is in the ProcessingSequenceBarrier class:

public long waitFor(final long sequence)
        throws AlertException, InterruptedException, TimeoutException
    {
        checkAlert();

        long availableSequence = waitStrategy.waitFor(sequence, cursorSequence, dependentSequence, this);

        if (availableSequence < sequence)
        {
            return availableSequence;
        }

        return sequencer.getHighestPublishedSequence(sequence, availableSequence);
    }

This is where waitStrategy makes its appearance. It has the following implementations:

  • BlockingWaitStrategy
  • BusySpinWaitStrategy
  • LiteBlockingWaitStrategy
  • PhasedBackoffWaitStrategy
  • SleepingWaitStrategy
  • TimeoutBlockingWaitStrategy
  • YieldingWaitStrategy

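There are quite a few strategies, so I won't go through them one by one. Take BlockingWaitStrategy and BusySpinWaitStrategy as examples, starting with BlockingWaitStrategy: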

public long waitFor(long sequence, Sequence cursorSequence, Sequence dependentSequence, SequenceBarrier barrier)
        throws AlertException, InterruptedException
    {
        long availableSequence;
        if ((availableSequence = cursorSequence.get()) < sequence)
        {
            lock.lock();
            try
            {
                while ((availableSequence = cursorSequence.get()) < sequence)
                {
                    barrier.checkAlert();
                    processorNotifyCondition.await();
                }
            }
            finally
            {
                lock.unlock();
            }
        }

        while ((availableSequence = dependentSequence.get()) < sequence)
        {
            barrier.checkAlert();
        }

        return availableSequence;
    }

As shown, BlockingWaitStrategy is implemented with a lock, so its performance is mediocre, but CPU usage stays fairly stable.
Also note the variable dependentSequence here: handlers can depend on one another. Sample code that defines such a dependency:

Executor executor = Executors.newFixedThreadPool(4); // not used below: this constructor takes a ThreadFactory
Disruptor<DataEvent> disruptor = new Disruptor<DataEvent>(
        DataEvent.FACTORY, 1024, DaemonThreadFactory.INSTANCE);

TransformingHandler handler1 = new TransformingHandler(0);
TransformingHandler handler2 = new TransformingHandler(1);
TransformingHandler handler3 = new TransformingHandler(2);
CollatingHandler collator = new CollatingHandler();

// handler1..handler3 run in parallel; collator sees an event only after all three have processed it
disruptor.handleEventsWith(handler1, handler2, handler3).then(collator);
disruptor.start();
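The effect of such a dependency can be reduced to one rule: a downstream handler may only advance to the smallest sequence reached by the handlers it depends on, and never past what has been published. The sketch below expresses that rule with a hypothetical helper; it is an illustration of the idea, not the library's barrier code.

```java
// Hypothetical helper: how far a dependent consumer is allowed to read.
class DependencySketch {
    // cursor: highest published sequence; upstreamSequences: progress of the handlers depended on.
    static long availableFor(long cursor, long... upstreamSequences) {
        long min = cursor;
        for (long upstream : upstreamSequences) {
            min = Math.min(min, upstream);
        }
        return min;
    }
}
```

For instance, if the cursor is at 10 and the three transforming handlers have reached 7, 9 and 4, the collator may process events up to sequence 4 only.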

BusySpinWaitStrategy implementation:

 public long waitFor(final long sequence, Sequence cursor, final Sequence dependentSequence, final SequenceBarrier barrier)
        throws AlertException, InterruptedException
    {
        long availableSequence;

        while ((availableSequence = dependentSequence.get()) < sequence)
        {
            barrier.checkAlert();
        }

        return availableSequence;
    }

As shown, when the queue has no data to consume, the loop keeps spinning, which drives CPU usage very high. This strategy needs to be used with care.
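The trade-off between the two strategies can be reproduced with JDK primitives alone. The following sketch puts a lock-and-condition wait next to a busy-spin wait on the same cursor; the class and method names are hypothetical, and alert handling and dependent sequences are left out.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical side-by-side of a blocking wait and a busy-spin wait.
class WaitSketch {
    private final AtomicLong cursor = new AtomicLong(-1);
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition published = lock.newCondition();

    // Blocking: the thread sleeps until a publisher signals; low CPU use, higher latency.
    long blockingWaitFor(long sequence) throws InterruptedException {
        long available;
        if ((available = cursor.get()) < sequence) {
            lock.lock();
            try {
                while ((available = cursor.get()) < sequence) {
                    published.await();
                }
            } finally {
                lock.unlock();
            }
        }
        return available;
    }

    // Busy spin: the thread polls continuously; lowest latency, but occupies a core.
    long busySpinWaitFor(long sequence) {
        long available;
        while ((available = cursor.get()) < sequence) {
            // intentionally empty: keep polling
        }
        return available;
    }

    void publish(long sequence) {
        cursor.set(sequence);
        lock.lock();
        try {
            published.signalAll(); // wake any blocked waiters
        } finally {
            lock.unlock();
        }
    }

    // Small demo: a producer publishes sequence 5 after a short delay while we block for 3.
    static long demo() {
        WaitSketch sketch = new WaitSketch();
        Thread producer = new Thread(() -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                return;
            }
            sketch.publish(5);
        });
        producer.start();
        try {
            return sketch.blockingWaitFor(3);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
    }
}
```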

Sequence

Sequence is used in many places in Disruptor, for example when a producer obtains the next writable index, so it is worth looking at how its implementation deals with concurrency:

public class Sequence extends RhsPadding
{
    private static final Unsafe UNSAFE;
    private static final long VALUE_OFFSET; // offset of the value field within the object

    static
    {
        UNSAFE = Util.getUnsafe();
        try
        {
            VALUE_OFFSET = UNSAFE.objectFieldOffset(Value.class.getDeclaredField("value"));
        }
        catch (final Exception e)
        {
            throw new RuntimeException(e);
        }
    }
 
    public Sequence(final long initialValue)
    {
        UNSAFE.putOrderedLong(this, VALUE_OFFSET, initialValue);
    }

    /**
     * Perform an ordered write of this sequence.  The intent is
     * a Store/Store barrier between this write and any previous
     * store.
     * This method inserts a StoreStore memory barrier, ensuring that all stores
     * issued before it are committed first.
     *
     * @param value The new value for the sequence.
     */
    public void set(final long value)
    {
        UNSAFE.putOrderedLong(this, VALUE_OFFSET, value);
    }

    /**
     * Performs a volatile write of this sequence.  The intent is
     * a Store/Store barrier between this write and any previous
     * write and a Store/Load barrier between this write and any
     * subsequent volatile read.
     * Inserts StoreStore and StoreLoad memory barriers, so the value is flushed to
     * main memory right after the write, which guarantees visibility.
     * @param value The new value for the sequence.
     */
    public void setVolatile(final long value)
    {
        UNSAFE.putLongVolatile(this, VALUE_OFFSET, value);
    }

    /**
     * Perform a compare and set operation on the sequence.
     *
     * @param expectedValue The expected current value.
     * @param newValue The value to update to.
     * @return true if the operation succeeds, false otherwise.
     */
    public boolean compareAndSet(final long expectedValue, final long newValue)
    {
        return UNSAFE.compareAndSwapLong(this, VALUE_OFFSET, expectedValue, newValue);
    }

    /**
     * Atomically increment the sequence by one.
     *
     * @return The value after the increment
     */
    public long incrementAndGet()
    {
        return addAndGet(1L);
    }

    /**
     * Atomically add the supplied value.
     * Uses CAS to avoid locking.
     * @param increment The value to add to the sequence.
     * @return The value after the increment.
     */
    public long addAndGet(final long increment)
    {
        long currentValue;
        long newValue;

        do
        {
            currentValue = get();
            newValue = currentValue + increment;
        }
        while (!compareAndSet(currentValue, newValue));

        return newValue;
    }
}

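The source is fairly long, so I removed the methods that are not of interest here. The key points of the implementation:
1. CacheLine padding: 15 long fields, 15*8=120 bytes; with the object header added, the object spans at least 128 bytes even with compressed pointers;
2. It uses the JDK's low-level Unsafe class to read and write the value field directly at its memory offset;
3. It uses CAS to avoid locking.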
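Points 2 and 3 can be approximated without Unsafe by wrapping java.util.concurrent.atomic.AtomicLong, whose lazySet, set and compareAndSet offer an ordered write, a volatile write and CAS respectively. This is a sketch of an equivalent, not the Disruptor class: SimpleSequence is a hypothetical name, and the cache-line padding is omitted.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical unpadded stand-in for Sequence, built on AtomicLong instead of Unsafe.
class SimpleSequence {
    private final AtomicLong value;

    SimpleSequence(long initialValue) {
        this.value = new AtomicLong(initialValue);
    }

    long get() {
        return value.get();
    }

    // Ordered write: cheaper than a volatile write, may become visible slightly later.
    void set(long newValue) {
        value.lazySet(newValue);
    }

    // Full volatile write.
    void setVolatile(long newValue) {
        value.set(newValue);
    }

    boolean compareAndSet(long expectedValue, long newValue) {
        return value.compareAndSet(expectedValue, newValue);
    }

    // Lock-free add: retry the CAS until no other thread has changed the value in between.
    long addAndGet(long increment) {
        long currentValue;
        long newValue;
        do {
            currentValue = get();
            newValue = currentValue + increment;
        } while (!compareAndSet(currentValue, newValue));
        return newValue;
    }
}
```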

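Conclusion

As shown, Disruptor follows something like a pub-sub model in which an event is consumed by multiple consumers. The reason is that createEventProcessors gives every BatchEventProcessor its own Sequence.
To avoid an event being consumed more than once, you can use a WorkerPool: the WorkProcessors in one WorkerPool share a single Sequence. Since Disruptor starts one thread per WorkProcessor, several WorkProcessors have to be registered to process events in parallel.
In my project, the number of WorkProcessors is chosen according to the number of CPUs. They all run the same logic (multiple instances of the same class) and compete with each other to take events from the RingBuffer.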
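The difference between the two consumption modes comes down to whether consumers keep separate sequences or race on a shared one. The sketch below models both with JDK classes only; the names are hypothetical, and event handling is reduced to counting.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.AtomicIntegerArray;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical comparison of broadcast consumption and work-pool consumption.
class ConsumptionModesSketch {
    // Broadcast: every consumer has its own sequence, so each one handles every event.
    static int[] broadcast(int events, int consumers) {
        long[] sequences = new long[consumers];
        Arrays.fill(sequences, -1L);
        int[] handled = new int[consumers];
        for (int c = 0; c < consumers; c++) {
            while (sequences[c] < events - 1) {
                sequences[c]++;
                handled[c]++;
            }
        }
        return handled;
    }

    // Work pool: workers claim from one shared sequence, so each event goes to exactly one worker.
    static int[] workPool(int events, int workers) {
        AtomicLong workSequence = new AtomicLong(-1);
        AtomicIntegerArray handled = new AtomicIntegerArray(workers);
        Thread[] threads = new Thread[workers];
        for (int w = 0; w < workers; w++) {
            final int id = w;
            threads[w] = new Thread(() -> {
                while (workSequence.incrementAndGet() < events) {
                    handled.incrementAndGet(id);
                }
            });
            threads[w].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        int[] result = new int[workers];
        for (int w = 0; w < workers; w++) {
            result[w] = handled.get(w);
        }
        return result;
    }
}
```

In broadcast mode the per-consumer counts each equal the number of events; in work-pool mode only their sum does, and how the events are split between workers varies from run to run.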
