Netty Thread Model Source Code Analysis (Part 1)

1. NioEventLoopGroup

Inheritance diagram (Figure 1-1):



Netty allows the same EventLoopGroup to be used both for handling I/O and for accepting connections.



1.1 How are NioEventLoopGroup and NioEventLoop related?

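NioEventLoopGroup is effectively a thread group of NioEventLoops: it contains one or more EventLoops, and each EventLoop is the thread that performs the actual work for a Channel. When a Channel is registered, Netty binds it to one EventLoop, and the Channel stays bound to that EventLoop for its entire lifetime.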

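As Figure 1-2 shows, many Channels can share one EventLoop. This means that a Channel that keeps its EventLoop busy delays the processing of every other Channel bound to the same EventLoop. We can think of an EventLoop as an event-loop thread and of an EventLoopGroup as a collection of event loops.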

Figure 1-2:


1.2 NioEventLoopGroup initialization

First, look at the NioEventLoopGroup constructors:

public NioEventLoopGroup() {
    this(0);
}

public NioEventLoopGroup(int nThreads) {
    this(nThreads, null);
}

public NioEventLoopGroup(int nThreads, ThreadFactory threadFactory) {
    this(nThreads, threadFactory, SelectorProvider.provider());
}

public NioEventLoopGroup(
        int nThreads, ThreadFactory threadFactory, final SelectorProvider selectorProvider) {
    super(nThreads, threadFactory, selectorProvider);
}

The code above shows that although NioEventLoopGroup has four constructors, they all end up calling the MultithreadEventLoopGroup constructor:

protected MultithreadEventLoopGroup(int nThreads, ThreadFactory threadFactory, Object... args) {  
    super(nThreads == 0? DEFAULT_EVENT_LOOP_THREADS : nThreads, threadFactory, args);  
}

private static final int DEFAULT_EVENT_LOOP_THREADS;

    static {
        DEFAULT_EVENT_LOOP_THREADS = Math.max(1, SystemPropertyUtil.getInt(
                "io.netty.eventLoopThreads", Runtime.getRuntime().availableProcessors() * 2));

        if (logger.isDebugEnabled()) {
            logger.debug("-Dio.netty.eventLoopThreads: {}", DEFAULT_EVENT_LOOP_THREADS);
        }
    }

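Before a NioEventLoopGroup is initialized, the static initializer block of its parent class MultithreadEventLoopGroup runs first. The default number of NioEventLoops is twice the number of available processors, unless the system property io.netty.eventLoopThreads overrides it. This is not a universally optimal value: to avoid thread context switching, the smaller the value the better, as long as it still meets the workload's requirements.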
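To make the default-size rule concrete, here is a minimal sketch of the same arithmetic using only JDK calls. The class name DefaultThreadsSketch and its methods are hypothetical, and Integer.getInteger stands in for Netty's SystemPropertyUtil.getInt; unlike Netty, which reads the property once in a static block, this sketch reads it on every call.

```java
final class DefaultThreadsSketch {
    // Same expression as the static block above, with Integer.getInteger
    // standing in for Netty's SystemPropertyUtil.getInt (an assumption of this sketch).
    static int defaultEventLoopThreads(int availableProcessors) {
        int configured = Integer.getInteger("io.netty.eventLoopThreads", availableProcessors * 2);
        return Math.max(1, configured);
    }

    // Same rule as the MultithreadEventLoopGroup constructor: 0 means "use the default".
    static int resolve(int nThreads, int availableProcessors) {
        return nThreads == 0 ? defaultEventLoopThreads(availableProcessors) : nThreads;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("default threads  = " + resolve(0, cores));
        System.out.println("explicit threads = " + resolve(3, cores));
    }
}
```

Passing an explicit thread count bypasses both the system property and the processor count, which is usually the simplest way to keep the number of event loops small.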

The MultithreadEventExecutorGroup constructor:

// Array of EventExecutors that holds the event loops
private final EventExecutor[] children;
// Strategy for picking one event loop from children
private final EventExecutorChooser chooser;

protected MultithreadEventExecutorGroup(int nThreads, ThreadFactory threadFactory, Object... args) {
    if (nThreads <= 0) {
        throw new IllegalArgumentException(String.format("nThreads: %d (expected: > 0)", nThreads));
    }

    if (threadFactory == null) {
        // A general-purpose ThreadFactory implementation that makes the threads easy to configure
        threadFactory = newDefaultThreadFactory();
    }
    
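    // Create the SingleThreadEventExecutor array sized by the thread count; as the name
    // suggests, a SingleThreadEventExecutor is an executor backed by exactly one thread
    children = new SingleThreadEventExecutor[nThreads];
    // Initialize the chooser with a strategy that depends on the array length.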
    if (isPowerOfTwo(children.length)) {
        chooser = new PowerOfTwoEventExecutorChooser();
    } else {
        chooser = new GenericEventExecutorChooser();
    }

    for (int i = 0; i < nThreads; i ++) {
        boolean success = false;
        try {
            // newChild is implemented by the subclass; NioEventLoopGroup returns a NioEventLoop
            children[i] = newChild(threadFactory, args);
            success = true;
        } catch (Exception e) {
            // TODO: Think about if this is a good exception type
            throw new IllegalStateException("failed to create a child event loop", e);
        } finally {
            // If creation failed, shut down every SingleThreadEventExecutor created so far
            if (!success) {
                for (int j = 0; j < i; j ++) {
                    children[j].shutdownGracefully();
                }

                // Wait until they have terminated
                for (int j = 0; j < i; j ++) {
                    EventExecutor e = children[j];
                    try {
                        while (!e.isTerminated()) {
                            e.awaitTermination(Integer.MAX_VALUE, TimeUnit.SECONDS);
                        }
                    } catch (InterruptedException interrupted) {
                        Thread.currentThread().interrupt();
                        break;
                    }
                }
            }
        }
    }

    final FutureListener<Object> terminationListener = new FutureListener<Object>() {
        @Override
        public void operationComplete(Future<Object> future) throws Exception {
            if (terminatedChildren.incrementAndGet() == children.length) {
                terminationFuture.setSuccess(null);
            }
        }
    };

    for (EventExecutor e: children) {
        e.terminationFuture().addListener(terminationListener);
    }
}

Back in NioEventLoopGroup, the newChild method is overridden:

@Override
protected EventExecutor newChild(
        ThreadFactory threadFactory, Object... args) throws Exception {
    return new NioEventLoop(this, threadFactory, (SelectorProvider) args[0]);
}

The newChild call inside the MultithreadEventExecutorGroup constructor dispatches to the implementation in NioEventLoopGroup, so the actual type of each element of children is NioEventLoop.

How the EventExecutorChooser is selected:

    // Checks whether a number is a power of two
    private static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }
    
    private final class PowerOfTwoEventExecutorChooser implements EventExecutorChooser {
        @Override
        public EventExecutor next() {
            return children[childIndex.getAndIncrement() & children.length - 1];
        }
    }

    private final class GenericEventExecutorChooser implements EventExecutorChooser {
        @Override
        public EventExecutor next() {
            return children[Math.abs(childIndex.getAndIncrement() % children.length)];
        }
    }

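The chooser is picked according to whether the length of the children array is a power of two. If it is, PowerOfTwoEventExecutorChooser is used; otherwise GenericEventExecutorChooser is used. Note that the condition is a power of two (1, 2, 4, 8, ...), not merely a multiple of two. With a power-of-two length the index can be computed with & instead of %, and & is faster (Netty goes to great lengths for performance).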
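The following sketch, assuming nothing beyond the JDK, shows why the mask trick is only valid for power-of-two lengths. The class ChooserSketch and the helper names maskIndex and modIndex are hypothetical; only isPowerOfTwo is copied from the listing above.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ChooserSketch {
    // Copied from the listing above: true when val has exactly one bit set.
    // It also returns true for 0, but the constructor rejects nThreads <= 0.
    static boolean isPowerOfTwo(int val) {
        return (val & -val) == val;
    }

    // Index computed the way PowerOfTwoEventExecutorChooser does it.
    static int maskIndex(int counter, int length) {
        return counter & (length - 1);
    }

    // Index computed the way GenericEventExecutorChooser does it.
    static int modIndex(int counter, int length) {
        return Math.abs(counter % length);
    }

    public static void main(String[] args) {
        AtomicInteger childIndex = new AtomicInteger();
        int length = 8; // a power of two, so both strategies agree
        for (int i = 0; i < 10; i++) {
            int c = childIndex.getAndIncrement();
            System.out.println(c + " -> mask " + maskIndex(c, length) + ", mod " + modIndex(c, length));
        }
        // For a length of 6 the mask is 0b101, which can never produce 2 or 3.
        System.out.println("9 with length 6: mask " + maskIndex(9, 6) + ", mod " + modIndex(9, 6));
    }
}
```

Both strategies hand out children in round-robin order; the mask version simply replaces a division with a single bitwise operation.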

2. NioEventLoop

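A NioEventLoop maintains a single thread together with a task queue and a delayed (scheduled) task queue, and each EventLoop has its own Selector. The diagram here was borrowed from another source, with no link recorded.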

image

Inheritance hierarchy:



NioEventLoop initialization

NioEventLoop(NioEventLoopGroup parent, ThreadFactory threadFactory, SelectorProvider selectorProvider) {
    super(parent, threadFactory, false);
    if (selectorProvider == null) {
        throw new NullPointerException("selectorProvider");
    }
    provider = selectorProvider;
    selector = openSelector();
}

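1. The superclass constructor creates the taskQueue. The default implementation in SingleThreadEventExecutor returns a LinkedBlockingQueue (subclasses can override newTaskQueue() to supply a different queue).

2. openSelector(): Netty is built on NIO, so it cannot do without a Selector.

3. DISABLE_KEYSET_OPTIMIZATION: decides whether to optimize the selectedKeys of sun.nio.ch.SelectorImpl. Unless configured otherwise, the optimization is enabled: reflection is used to assign selectedKeySet to two fields of sun.nio.ch.SelectorImpl, selectedKeys and publicSelectedKeys.

4. What the optimization gains: the original selectedKeys and publicSelectedKeys in SelectorImpl are HashSets. A HashSet is backed by a HashMap, that is, an array of buckets plus linked lists. The replacement structure consists of two arrays, A and B, with an initial size of 1024, which avoids the cost of HashSet resizing in the common case. Iteration efficiency is another reason: when every element of selectedKeys has to be traversed, a plain array is the most efficient layout.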
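The idea behind the replacement set can be sketched with a single append-only array. This is a simplified illustration under stated assumptions, not Netty's SelectedSelectionKeySet: the class ArrayBackedSetSketch is hypothetical, it uses one array instead of the two arrays A and B, and it skips duplicate checks on the assumption that each ready key is added at most once per select.

```java
import java.util.AbstractSet;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

final class ArrayBackedSetSketch<E> extends AbstractSet<E> {
    private Object[] items = new Object[1024]; // same initial size as in the text
    private int size;

    // Appends without hashing; doubles the array only when it is full.
    @Override
    public boolean add(E e) {
        if (e == null) {
            return false;
        }
        if (size == items.length) {
            items = Arrays.copyOf(items, size << 1);
        }
        items[size++] = e;
        return true;
    }

    @Override
    public int size() {
        return size;
    }

    // Iterates in insertion order by walking the array.
    @Override
    public Iterator<E> iterator() {
        return new Iterator<E>() {
            private int idx;

            @Override
            public boolean hasNext() {
                return idx < size;
            }

            @Override
            @SuppressWarnings("unchecked")
            public E next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return (E) items[idx++];
            }
        };
    }

    // Clears the used slots so the set can be reused for the next select.
    void reset() {
        Arrays.fill(items, 0, size, null);
        size = 0;
    }

    public static void main(String[] args) {
        ArrayBackedSetSketch<String> keys = new ArrayBackedSetSketch<>();
        keys.add("key-1");
        keys.add("key-2");
        for (String k : keys) {
            System.out.println(k);
        }
    }
}
```

Adding an element is a single array store and iterating is a linear scan over contiguous memory, which is the cost profile the optimization is after.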

private Selector openSelector() {
    final Selector selector;
    try {
        selector = provider.openSelector();
    } catch (IOException e) {
        throw new ChannelException("failed to open a new selector", e);
    }

    if (DISABLE_KEYSET_OPTIMIZATION) {
        return selector;
    }

    try {
        SelectedSelectionKeySet selectedKeySet = new SelectedSelectionKeySet();

        Class<?> selectorImplClass =
                Class.forName("sun.nio.ch.SelectorImpl", false, PlatformDependent.getSystemClassLoader());

        // Ensure the current selector implementation is what we can instrument.
        if (!selectorImplClass.isAssignableFrom(selector.getClass())) {
            return selector;
        }

        Field selectedKeysField = selectorImplClass.getDeclaredField("selectedKeys");
        Field publicSelectedKeysField = selectorImplClass.getDeclaredField("publicSelectedKeys");

        selectedKeysField.setAccessible(true);
        publicSelectedKeysField.setAccessible(true);

        selectedKeysField.set(selector, selectedKeySet);
        publicSelectedKeysField.set(selector, selectedKeySet);

        selectedKeys = selectedKeySet;
        logger.trace("Instrumented an optimized java.util.Set into: {}", selector);
    } catch (Throwable t) {
        selectedKeys = null;
        logger.trace("Failed to instrument an optimized java.util.Set into: {}", selector, t);
    }

    return selector;
}

NioEventLoop startup

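As mentioned above, a NioEventLoop maintains one thread. When that thread starts, it calls NioEventLoop's run method, and the loop keeps repeating one cycle: select -> processSelectedKeys (I/O tasks) -> runAllTasks (non-I/O tasks)

  • I/O tasks: events that are ready on a SelectionKey, such as accept, connect, read, and write

  • Non-I/O tasks: tasks added to the taskQueue, such as bind and channelActive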

@Override
protected void run() {
    for (;;) {
        boolean oldWakenUp = wakenUp.getAndSet(false);
        try {
            // If non-I/O tasks are pending, select without blocking and return immediately
            if (hasTasks()) {
                selectNow();
            } else {
                select(oldWakenUp);

                if (wakenUp.get()) {
                    selector.wakeup();
                }
            }

            cancelledKeys = 0;
            needsToSelectAgain = false;
            final int ioRatio = this.ioRatio;
            if (ioRatio == 100) {
                // I/O tasks
                processSelectedKeys();
                // Non-I/O tasks
                runAllTasks();
            } else {
                // Controls the ratio of time spent on I/O tasks versus non-I/O tasks
                final long ioStartTime = System.nanoTime();
                // I/O tasks
                processSelectedKeys();

                final long ioTime = System.nanoTime() - ioStartTime;
                // Non-I/O tasks
                runAllTasks(ioTime * (100 - ioRatio) / ioRatio);
            }

            if (isShuttingDown()) {
                closeAll();
                if (confirmShutdown()) {
                    break;
                }
            }
        } catch (Throwable t) {
            // Prevent possible consecutive immediate failures that lead to
            // excessive CPU consumption.
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                // Ignore.
            }
        }
    }
}

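1. wakenUp: decides whether selector.wakeup() is called. It is called only when wakenUp is true, which reduces wake-up overhead, because Selector.wakeup() is an expensive operation.

2. hasTasks(): checks whether non-I/O tasks are pending. If so, the non-blocking selectNow() is called so that the select returns immediately; otherwise the blocking select is called, with timeoutMillis as the blocking time.

3. ioRatio: controls how execution time is split between the two kinds of tasks, so you can use it to cap the time spent on non-I/O tasks. The default is 50, which gives non-I/O tasks the same amount of time as I/O tasks. Tune it for your own workload.

4. processSelectedKeys(): handles I/O events

5. runAllTasks(): handles non-I/O tasks

6. isShuttingDown(): checks whether state has been marked as ST_SHUTTING_DOWN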
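The effect of ioRatio is easiest to see with numbers. The sketch below reuses the expression ioTime * (100 - ioRatio) / ioRatio from run(); the class IoRatioSketch and the method taskBudgetNanos are hypothetical names, and the range check is an assumption of this sketch (in run(), an ioRatio of 100 takes the branch with no time budget at all).

```java
final class IoRatioSketch {
    // Time budget handed to runAllTasks for a given I/O time and ioRatio.
    // Values outside 1..99 are rejected here because the formula does not apply to them.
    static long taskBudgetNanos(long ioTimeNanos, int ioRatio) {
        if (ioRatio <= 0 || ioRatio >= 100) {
            throw new IllegalArgumentException("ioRatio: " + ioRatio + " (expected: 1..99)");
        }
        return ioTimeNanos * (100 - ioRatio) / ioRatio;
    }

    public static void main(String[] args) {
        long ioTime = 4_000_000L; // pretend processSelectedKeys() took 4 ms
        for (int ratio : new int[] {20, 50, 80}) {
            System.out.println("ioRatio=" + ratio + " -> task budget "
                    + taskBudgetNanos(ioTime, ratio) + " ns");
        }
    }
}
```

With 4 ms spent on I/O, an ioRatio of 50 grants 4 ms to non-I/O tasks, 80 grants 1 ms, and 20 grants 16 ms, so a higher ioRatio means less time for queued tasks.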

private void select(boolean oldWakenUp) throws IOException {
        Selector selector = this.selector;
        try {
            int selectCnt = 0;
            long currentTimeNanos = System.nanoTime();
            long selectDeadLineNanos = currentTimeNanos + delayNanos(currentTimeNanos);
            for (;;) {
                long timeoutMillis = (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
                if (timeoutMillis <= 0) {
                    if (selectCnt == 0) {
                        selector.selectNow();
                        selectCnt = 1;
                    }
                    break;
                }

                int selectedKeys = selector.select(timeoutMillis);
                selectCnt ++;

                if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                    // - Selected something,
                    // - waken up by user, or
                    // - the task queue has a pending task.
                    // - a scheduled task is ready for processing
                    break;
                }
                if (Thread.interrupted()) {
                    // Thread was interrupted so reset selected keys and break so we not run into a busy loop.
                    // As this is most likely a bug in the handler of the user or it's client library we will
                    // also log it.
                    //
                    // See https://github.com/netty/netty/issues/2426
                    if (logger.isDebugEnabled()) {
                        logger.debug("Selector.select() returned prematurely because " +
                                "Thread.currentThread().interrupt() was called. Use " +
                                "NioEventLoop.shutdownGracefully() to shutdown the NioEventLoop.");
                    }
                    selectCnt = 1;
                    break;
                }

                long time = System.nanoTime();
                if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
                    // timeoutMillis elapsed without anything selected.
                    selectCnt = 1;
                } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                        selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                    // The selector returned prematurely many times in a row.
                    // Rebuild the selector to work around the problem.
                    logger.warn(
                            "Selector.select() returned prematurely {} times in a row; rebuilding selector.",
                            selectCnt);

                    rebuildSelector();
                    selector = this.selector;

                    // Select again to populate selectedKeys.
                    selector.selectNow();
                    selectCnt = 1;
                    break;
                }

                currentTimeNanos = time;
            }

            if (selectCnt > MIN_PREMATURE_SELECTOR_RETURNS) {
                if (logger.isDebugEnabled()) {
                    logger.debug("Selector.select() returned prematurely {} times in a row.", selectCnt - 1);
                }
            }
        } catch (CancelledKeyException e) {
            if (logger.isDebugEnabled()) {
                logger.debug(CancelledKeyException.class.getSimpleName() + " raised by a Selector - JDK bug?", e);
            }
        }
    }
protected long delayNanos(long currentTimeNanos) {
    ScheduledFutureTask<?> scheduledTask = peekScheduledTask();
    if (scheduledTask == null) {
        return SCHEDULE_PURGE_INTERVAL;
    }

    return scheduledTask.delayNanos(currentTimeNanos);
}

public long delayNanos(long currentTimeNanos) {
    return Math.max(0, deadlineNanos() - (currentTimeNanos - START_TIME));
}

public long deadlineNanos() {
    return deadlineNanos;
}

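1. delayNanos(currentTimeNanos): the parent class SingleThreadEventExecutor holds a queue of delayed tasks, and delayNanos checks that queue for a scheduled non-I/O task that has not run yet.

  • If there is none, it returns 1 second (SCHEDULE_PURGE_INTERVAL).
  • If the queue has a task and the remaining time (selectDeadLineNanos - currentTimeNanos) is less than 500000L nanoseconds (0.5 ms), timeoutMillis rounds to 0 and the method returns without blocking, calling selectNow() if nothing has been selected yet; otherwise the blocking select is executed.

2. select returns immediately in any of the following cases: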
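The 500000L in the timeout expression rounds the remaining nanoseconds to the nearest millisecond. A minimal sketch of that arithmetic follows; the class SelectTimeoutSketch and its method name are hypothetical, while the expression itself is taken from select().

```java
final class SelectTimeoutSketch {
    // Same expression as in select(): adding half a millisecond before the
    // integer division rounds to the nearest millisecond instead of truncating.
    static long timeoutMillis(long selectDeadLineNanos, long currentTimeNanos) {
        return (selectDeadLineNanos - currentTimeNanos + 500000L) / 1000000L;
    }

    public static void main(String[] args) {
        long now = 0L;
        long[] remaining = {499_999L, 500_000L, 1_499_999L, 1_500_000L, 1_000_000_000L};
        for (long r : remaining) {
            System.out.println(r + " ns remaining -> timeoutMillis = " + timeoutMillis(now + r, now));
        }
    }
}
```

Anything below 0.5 ms yields a timeout of 0, which is the case where the loop skips the blocking select.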

if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
    // - Selected something,
    // - waken up by user, or
    // - the task queue has a pending task.
    // - a scheduled task is ready for processing
    break;
}
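  1. Selected something: the select found ready channels (selectedKeys > 0)
  2. Waken up by user: the selector was woken up by the user
  3. The task queue has a pending task: a new task arrived in the task queue
  4. A scheduled task is ready for processing: a scheduled task in the delayed queue is due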

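3. selectCnt: counts how many times in a row select returned without selecting anything (spinning). This works around the notorious NIO bug in which Selector.select() returns prematurely over and over and drives CPU usage to 100%. When the count reaches the threshold (512 by default, configurable through the system property io.netty.selectorAutoRebuildThreshold), Netty builds a new Selector, re-registers the Channels of the old Selector on the new one, closes the old Selector, and uses the new one in its place. See the rebuildSelector() method below for details.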
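The counting rule from item 3 can be isolated from the selector itself. The sketch below is a simulation under stated assumptions: the class SpinDetectorSketch and the method onEmptySelect are hypothetical, the caller supplies elapsed and timeout durations instead of a real Selector, and a counter increment stands in for the call to rebuildSelector().

```java
final class SpinDetectorSketch {
    private final int threshold; // plays the role of SELECTOR_AUTO_REBUILD_THRESHOLD
    private int selectCnt;
    private int rebuilds;

    SpinDetectorSketch(int threshold) {
        this.threshold = threshold;
    }

    // Records one select() call that returned with nothing selected.
    void onEmptySelect(long elapsedNanos, long timeoutNanos) {
        selectCnt++;
        if (elapsedNanos >= timeoutNanos) {
            // The full timeout elapsed: a normal return, so restart the count.
            selectCnt = 1;
        } else if (threshold > 0 && selectCnt >= threshold) {
            // Too many premature returns in a row: this is where Netty rebuilds the selector.
            rebuilds++;
            selectCnt = 1;
        }
    }

    int rebuilds() {
        return rebuilds;
    }

    public static void main(String[] args) {
        SpinDetectorSketch detector = new SpinDetectorSketch(512);
        for (int i = 0; i < 512; i++) {
            detector.onEmptySelect(0L, 1_000_000L); // returned instantly despite a 1 ms timeout
        }
        System.out.println("rebuilds after 512 premature returns: " + detector.rebuilds());
    }
}
```

A single select that honours its timeout resets the count, so only an uninterrupted run of premature returns triggers a rebuild.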

4. rebuildSelector(): performs the Selector rebuild described in item 3.

public void rebuildSelector() {
    if (!inEventLoop()) {
        execute(new Runnable() {
            @Override
            public void run() {
                rebuildSelector();
            }
        });
        return;
    }

    final Selector oldSelector = selector;
    final Selector newSelector;

    if (oldSelector == null) {
        return;
    }

    try {
        newSelector = openSelector();
    } catch (Exception e) {
        logger.warn("Failed to create a new Selector.", e);
        return;
    }

    // Register all channels to the new Selector.
    int nChannels = 0;
    for (;;) {
        try {
            for (SelectionKey key: oldSelector.keys()) {
                Object a = key.attachment();
                try {
                    if (!key.isValid() || key.channel().keyFor(newSelector) != null) {
                        continue;
                    }

                    int interestOps = key.interestOps();
                    key.cancel();
                    SelectionKey newKey = key.channel().register(newSelector, interestOps, a);
                    if (a instanceof AbstractNioChannel) {
                        // Update SelectionKey
                        ((AbstractNioChannel) a).selectionKey = newKey;
                    }
                    nChannels ++;
                } catch (Exception e) {
                    logger.warn("Failed to re-register a Channel to the new Selector.", e);
                    if (a instanceof AbstractNioChannel) {
                        AbstractNioChannel ch = (AbstractNioChannel) a;
                        ch.unsafe().close(ch.unsafe().voidPromise());
                    } else {
                        @SuppressWarnings("unchecked")
                        NioTask<SelectableChannel> task = (NioTask<SelectableChannel>) a;
                        invokeChannelUnregistered(task, key, e);
                    }
                }
            }
        } catch (ConcurrentModificationException e) {
            // Probably due to concurrent modification of the key set.
            continue;
        }

        break;
    }

    selector = newSelector;

    try {
        // time to close the old selector as everything else is registered to the new one
        oldSelector.close();
    } catch (Throwable t) {
        if (logger.isWarnEnabled()) {
            logger.warn("Failed to close the old Selector.", t);
        }
    }

    logger.info("Migrated " + nChannels + " channel(s) to the new Selector.");
}