Thread Pools: An Overview of ThreadPoolExecutor

The Java source code is full of detailed comments; read them carefully and you will grasp most of how a class works. Our analysis of ThreadPoolExecutor therefore starts from its class-level javadoc.


[Figure: ThreadPoolExecutor.png]

A rough translation of that javadoc follows:

An ExecutorService (the top-level interface implemented by ThreadPoolExecutor) executes each submitted task using one of the threads in its pool; such a service is normally configured using the Executors factory methods.

Thread pools address two different problems:

  1. Improved performance: when executing large numbers of asynchronous tasks, they reduce per-task invocation overhead;
  2. Resource management: they provide a means of bounding and managing the resources, including threads, consumed when executing a collection of tasks.

In addition, each ThreadPoolExecutor maintains some basic statistics, such as the number of completed tasks.

To be useful across a wide range of contexts, this class provides many adjustable parameters and extensibility hooks. For the most common scenarios, however, several thread pools come preconfigured, and programmers are urged to use the more convenient Executors factory methods directly:

  • Executors.newCachedThreadPool (unbounded thread pool, with automatic thread reclamation);
  • Executors.newFixedThreadPool (fixed-size thread pool);
  • Executors.newSingleThreadExecutor (single background thread).

Note: ScheduledExecutorService is not mentioned here; it will be analyzed in a later article.

When configuring a thread pool manually, use the following guide:

1. Core and maximum pool sizes

  • corePoolSize: the core pool size
  • maximumPoolSize: the maximum pool size

A ThreadPoolExecutor automatically adjusts the pool size according to the bounds set by corePoolSize and maximumPoolSize.

When a new task is submitted via execute(Runnable) and fewer than corePoolSize threads are running, a new thread is created to handle the request, even if other worker threads are idle. If more than corePoolSize but fewer than maximumPoolSize threads are running, a new thread is created only when the queue is full. By setting corePoolSize and maximumPoolSize to the same value, you create a fixed-size thread pool. By setting maximumPoolSize to an essentially unbounded value such as Integer.MAX_VALUE, you allow the pool to accommodate an arbitrary number of concurrent tasks. Typically, the core and maximum pool sizes are set only at construction time, but they may also be changed dynamically using setCorePoolSize and setMaximumPoolSize.

This passage describes in detail how the pool handles a submitted task; the flow is summarized in the figure below.

[Figure: 線程任務處理流程.png (task-processing flow of the thread pool)]
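The sizing rules above can be observed directly. The following minimal sketch (all parameter values chosen arbitrarily for the demonstration) saturates a pool with core size 2, maximum size 4, and a queue of capacity 2: the first two tasks occupy the core threads, the next two fill the queue, and the last two force non-core threads to be created up to maximumPoolSize.

```java
import java.util.concurrent.*;

public class PoolSizingDemo {
    // Returns the pool size after saturating a 2-core/4-max pool whose queue holds 2 tasks.
    public static int saturatedPoolSize() throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 60L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(2));
        Runnable blocker = () -> {
            try { release.await(); } catch (InterruptedException ignored) { }
        };
        // 2 tasks occupy the core threads, 2 fill the queue,
        // 2 more force creation of non-core threads up to maximumPoolSize.
        for (int i = 0; i < 6; i++) pool.execute(blocker);
        int size = pool.getPoolSize(); // 4: both core and both non-core threads exist
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("pool size at saturation: " + saturatedPoolSize());
    }
}
```

A seventh task submitted at that point would be rejected, which is covered in the "Rejected tasks" section below.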

2. prestartCoreThread: prestarting core threads
By default, even core threads are created and started only when new tasks arrive, but this can be overridden dynamically using the prestartCoreThread() or prestartAllCoreThreads() methods.
You probably want to prestart threads if you construct the pool with a non-empty queue.

  • prestartCoreThread(): starts one idle core thread, which waits for work to arrive;
  • prestartAllCoreThreads(): starts all core threads (up to corePoolSize), which wait idly for work to arrive.

3. ThreadFactory: the thread factory

New threads are created using a ThreadFactory. If not otherwise specified, the Executors.defaultThreadFactory is used, which creates threads that all belong to the same ThreadGroup, have the same NORM_PRIORITY priority, and are non-daemon.

By supplying a different ThreadFactory, you can alter the thread name, thread group, priority, daemon status, and so on. If a ThreadFactory fails to create a thread when asked, by returning null from newThread, the executor will continue, but it might not be able to execute any tasks.

Threads should possess the "modifyThread" RuntimePermission. If worker threads or other threads using the pool do not possess this permission, service may be degraded: configuration changes may not take effect in a timely manner, and a shut-down pool may remain in a state in which termination is possible but not completed.
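A common reason to supply a custom factory is to give worker threads recognizable names for debugging. A minimal sketch (the class and prefix names are my own, not part of the JDK):

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// A ThreadFactory that names threads "prefix-1", "prefix-2", ...
public class NamedThreadFactory implements ThreadFactory {
    private final AtomicInteger counter = new AtomicInteger(1);
    private final String prefix;

    public NamedThreadFactory(String prefix) { this.prefix = prefix; }

    @Override
    public Thread newThread(Runnable r) {
        Thread t = new Thread(r, prefix + "-" + counter.getAndIncrement());
        t.setDaemon(false);                  // non-daemon, like the default factory
        t.setPriority(Thread.NORM_PRIORITY); // reset any priority inherited from the creator
        return t;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, new NamedThreadFactory("worker"));
        String name = pool.submit(() -> Thread.currentThread().getName()).get();
        System.out.println(name); // worker-1 or worker-2
        pool.shutdown();
    }
}
```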

4. Keep-alive times

If the pool currently has more than corePoolSize threads, excess threads are terminated after being idle for longer than the keepAliveTime (see getKeepAliveTime(TimeUnit)). This provides a means of reducing resource consumption when the pool is not being actively used.

If the pool becomes more active later, new threads will be constructed. The value can also be changed dynamically using setKeepAliveTime(long, TimeUnit).

防止空閑線程在關閉之前終止,可以使用如下方法:

setKeepAliveTime(Long.MAX_VALUE,TimeUnit.NANOSECONDS);

By default, the keep-alive policy applies only when there are more than corePoolSize threads. However, the allowCoreThreadTimeOut(boolean) method can be used to apply this timeout policy to core threads as well, as long as the keepAliveTime value is non-zero.
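The effect of allowCoreThreadTimeOut on core threads can be demonstrated as follows. This is a timing-based sketch (the 100 ms keep-alive and 500 ms wait are arbitrary demonstration values): with the flag enabled, even prestarted core threads are reclaimed after sitting idle past the keepAliveTime.

```java
import java.util.concurrent.*;

public class KeepAliveDemo {
    // Returns the pool size after the idle keep-alive timeout has elapsed.
    public static int idleCorePoolSize() throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2, 4, 100L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        pool.allowCoreThreadTimeOut(true); // apply the timeout policy to core threads too
        pool.prestartAllCoreThreads();     // 2 idle core threads now exist
        // Wait well past keepAliveTime; the idle core threads should be reclaimed.
        Thread.sleep(500);
        int size = pool.getPoolSize();     // 0 once the idle threads have timed out
        pool.shutdown();
        return size;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("idle pool size: " + idleCorePoolSize());
    }
}
```

Note that allowCoreThreadTimeOut(true) throws IllegalArgumentException if the keepAliveTime is zero, which is why a non-zero value is required above.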

5. Queuing

Any BlockingQueue may be used to transfer and hold submitted tasks; the use of this queue interacts with pool sizing.

  • If fewer than corePoolSize threads are running, the executor always prefers adding a new thread rather than queuing the task.

  • If corePoolSize or more threads are running, the executor always prefers queuing the task rather than adding a new thread.

  • If the task cannot be queued (the queue is full), a new thread is created unless that would exceed maximumPoolSize; if it would, the task is rejected.

This process is shown in the task-processing flow figure (線程任務處理流程圖.png) above.

There are three general queuing strategies:

  1. Direct handoffs
    A good default choice for a work queue is a SynchronousQueue, which hands tasks off to threads without otherwise holding them. Here, an attempt to queue a task will fail if no thread is immediately available to run it, so a new thread will be constructed.
    This policy avoids lockups when handling sets of requests that might have internal dependencies. Direct handoffs generally require unbounded maximumPoolSizes to avoid rejection of newly submitted tasks. Note, however, that when tasks continue to arrive, on average, faster than they can be processed, the number of threads can grow without bound.

  2. Unbounded queues
    Using an unbounded queue (for example, a LinkedBlockingQueue without a predefined capacity) causes new tasks to wait in the queue whenever all corePoolSize threads are busy, so the value of maximumPoolSize has no effect. This may be appropriate when each task is completely independent of the others, so tasks cannot affect each other's execution; for example, in a web server, this style of queuing can help smooth out transient bursts of requests. Note, however, that when tasks continue to arrive, on average, faster than they can be processed, the queue can grow without bound.

  3. Bounded queues
    A bounded queue (for example, an ArrayBlockingQueue) used with a finite maximumPoolSize helps prevent resource exhaustion, but can be more difficult to tune and control. Queue sizes and maximum pool sizes must be traded off against each other:

  • Using large queues and small maximumPoolSizes minimizes CPU usage, OS resources, and context-switching overhead, but can lead to artificially low throughput. If tasks frequently block (for example, if they are I/O bound), the system may be able to schedule time for more threads than you would otherwise allow.
  • Using small queues generally requires larger maximumPoolSizes, which keeps CPUs busier but may incur unacceptable scheduling overhead, which also decreases throughput.
    The point is that queue capacity and maximumPoolSize must be balanced against each other, with the goal of reducing resource consumption while maintaining throughput.
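The three strategies correspond to three ways of constructing a pool. A sketch of each (the concrete sizes and timeouts are arbitrary example values; the direct-handoff configuration happens to match how Executors.newCachedThreadPool is built):

```java
import java.util.concurrent.*;

public class QueueStrategies {
    // 1. Direct handoff: SynchronousQueue with an effectively unbounded maximum.
    static ThreadPoolExecutor directHandoff() {
        return new ThreadPoolExecutor(0, Integer.MAX_VALUE,
                60L, TimeUnit.SECONDS, new SynchronousQueue<>());
    }

    // 2. Unbounded queue: maximumPoolSize beyond corePoolSize has no effect.
    static ThreadPoolExecutor unbounded() {
        return new ThreadPoolExecutor(4, 4,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    // 3. Bounded queue: queue capacity is traded off against maximumPoolSize.
    static ThreadPoolExecutor bounded() {
        return new ThreadPoolExecutor(2, 8,
                30L, TimeUnit.SECONDS, new ArrayBlockingQueue<>(100));
    }

    public static void main(String[] args) {
        ThreadPoolExecutor p = bounded();
        System.out.println("bounded queue capacity: " + p.getQueue().remainingCapacity());
        p.shutdown();
        directHandoff().shutdown();
        unbounded().shutdown();
    }
}
```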

6. Rejected tasks
A task is rejected in two situations: (1) the executor has been shut down; (2) both the work queue and the maximum number of threads are saturated.
In either case, the execute method invokes the rejectedExecution method of its RejectedExecutionHandler. Four predefined handler policies are provided:

  1. AbortPolicy: the default policy; it throws a runtime RejectedExecutionException;
  2. CallerRunsPolicy: the thread that invokes execute runs the task itself, providing a simple feedback-control mechanism that slows down the rate at which new tasks are submitted;
  3. DiscardPolicy: the newly submitted task is simply dropped;
  4. DiscardOldestPolicy: if the executor is not shut down, the task at the head of the work queue is dropped and execution is retried (which can fail again, causing this to be repeated).
    You can also define your own RejectedExecutionHandler to fit particular capacity or queuing policies.
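As a sketch of a custom handler (the class names here are my own, not JDK policies), the following handler simply counts rejections instead of throwing; the demo deliberately saturates a 1-thread pool with a 1-slot queue so that the third submission is rejected:

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class RejectionDemo {
    // A custom handler that counts rejections instead of throwing.
    static class CountingHandler implements RejectedExecutionHandler {
        final AtomicInteger rejected = new AtomicInteger();
        @Override
        public void rejectedExecution(Runnable r, ThreadPoolExecutor e) {
            rejected.incrementAndGet();
        }
    }

    public static int rejectedCount() throws InterruptedException {
        CountingHandler handler = new CountingHandler();
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1), handler);
        Runnable blocker = () -> {
            try { release.await(); } catch (InterruptedException ignored) { }
        };
        pool.execute(blocker); // occupies the single thread
        pool.execute(blocker); // fills the one queue slot
        pool.execute(blocker); // saturated -> handler is invoked instead of running
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return handler.rejected.get(); // 1
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("rejections: " + rejectedCount());
    }
}
```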

7. Hook methods
ThreadPoolExecutor provides hook methods that are called before and after the execution of each task. You can override beforeExecute(Thread, Runnable) and afterExecute(Runnable, Throwable) to manipulate the execution environment; for example, to reinitialize ThreadLocals, gather statistics, or add log entries. In addition, terminated() is invoked once the executor has fully terminated, and can be overridden to perform any special processing.
Note: if a hook or callback method throws an exception, internal worker threads may in turn fail and abruptly terminate.
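A minimal sketch of the three hooks, here used only to collect simple counters (the subclass name and pool sizes are arbitrary choices for the demonstration):

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

// A pool that overrides the hook methods to collect simple statistics.
public class HookedPool extends ThreadPoolExecutor {
    final AtomicInteger started = new AtomicInteger();
    final AtomicInteger finished = new AtomicInteger();

    public HookedPool() {
        super(2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    @Override
    protected void beforeExecute(Thread t, Runnable r) {
        super.beforeExecute(t, r);
        started.incrementAndGet(); // e.g. reinitialize ThreadLocals or log here
    }

    @Override
    protected void afterExecute(Runnable r, Throwable t) {
        super.afterExecute(r, t);
        finished.incrementAndGet(); // t is non-null if the task threw
    }

    @Override
    protected void terminated() {
        super.terminated();
        System.out.println("pool fully terminated");
    }

    public static int[] runTasks() throws InterruptedException {
        HookedPool pool = new HookedPool();
        for (int i = 0; i < 5; i++) pool.execute(() -> { });
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return new int[]{pool.started.get(), pool.finished.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] s = runTasks();
        System.out.println(s[0] + " started, " + s[1] + " finished");
    }
}
```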

8. Queue maintenance
The getQueue() method allows access to the work queue, mainly for monitoring and debugging; use of this method for any other purpose is strongly discouraged. When large numbers of queued tasks are cancelled, the remove(Runnable) and purge() methods can assist in reclaiming storage.

9. Finalization
A pool that is no longer referenced in a program and has no remaining threads will be shut down automatically. If you want to ensure that an unreferenced pool is reclaimed even if users forget to call shutdown(), you must arrange for unused threads to eventually die: set an appropriate keep-alive time together with allowCoreThreadTimeOut(boolean), or use a lower bound of zero core threads.
In general, though, it is recommended to call shutdown() explicitly once the pool is no longer needed.
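A common explicit-shutdown pattern (a sketch only; the method name and timeout are my own choices, roughly following the two-phase shutdown idiom from the ExecutorService javadoc) is to call shutdown(), wait for termination, and fall back to shutdownNow() if tasks do not finish in time:

```java
import java.util.concurrent.*;

public class ShutdownDemo {
    // Two-phase graceful shutdown: shutdown(), wait, then shutdownNow() as a fallback.
    public static boolean shutdownGracefully(ExecutorService pool, long timeoutSeconds) {
        pool.shutdown(); // stop accepting new tasks, let queued tasks finish
        try {
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                pool.shutdownNow(); // cancel lingering tasks via interruption
                return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
            }
            return true;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("task done"));
        System.out.println("terminated: " + shutdownGracefully(pool, 5));
    }
}
```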

To summarize: by reading through the class javadoc of ThreadPoolExecutor, we now have a fairly complete picture of what it does. How it is implemented will be analyzed in detail in later chapters.

This multithreading series (continuously updated):
How thread startup works
The thread interruption mechanism
Ways to implement multithreading
How FutureTask works
Thread pools: an overview of ThreadPoolExecutor
Thread pools: using ThreadPoolExecutor
Thread pools: ThreadPoolExecutor state control
Thread pools: how ThreadPoolExecutor executes tasks
Thread pools: an overview of ScheduledThreadPoolExecutor
Shutting down a thread pool gracefully

The original English javadoc follows:

/**
 * An {@link ExecutorService} that executes each submitted task using
 * one of possibly several pooled threads, normally configured
 * using {@link Executors} factory methods.
 *
 * <p>Thread pools address two different problems: they usually
 * provide improved performance when executing large numbers of
 * asynchronous tasks, due to reduced per-task invocation overhead,
 * and they provide a means of bounding and managing the resources,
 * including threads, consumed when executing a collection of tasks.
 * Each {@code ThreadPoolExecutor} also maintains some basic
 * statistics, such as the number of completed tasks.
 *
 * <p>To be useful across a wide range of contexts, this class
 * provides many adjustable parameters and extensibility
 * hooks. However, programmers are urged to use the more convenient
 * {@link Executors} factory methods {@link
 * Executors#newCachedThreadPool} (unbounded thread pool, with
 * automatic thread reclamation), {@link Executors#newFixedThreadPool}
 * (fixed size thread pool) and {@link
 * Executors#newSingleThreadExecutor} (single background thread), that
 * preconfigure settings for the most common usage
 * scenarios. Otherwise, use the following guide when manually
 * configuring and tuning this class:
 *
 * <dl>
 *
 * <dt>Core and maximum pool sizes</dt>
 *
 * <dd>A {@code ThreadPoolExecutor} will automatically adjust the
 * pool size (see {@link #getPoolSize})
 * according to the bounds set by
 * corePoolSize (see {@link #getCorePoolSize}) and
 * maximumPoolSize (see {@link #getMaximumPoolSize}).
 *
 * When a new task is submitted in method {@link #execute(Runnable)},
 * and fewer than corePoolSize threads are running, a new thread is
 * created to handle the request, even if other worker threads are
 * idle.  If there are more than corePoolSize but less than
 * maximumPoolSize threads running, a new thread will be created only
 * if the queue is full.  By setting corePoolSize and maximumPoolSize
 * the same, you create a fixed-size thread pool. By setting
 * maximumPoolSize to an essentially unbounded value such as {@code
 * Integer.MAX_VALUE}, you allow the pool to accommodate an arbitrary
 * number of concurrent tasks. Most typically, core and maximum pool
 * sizes are set only upon construction, but they may also be changed
 * dynamically using {@link #setCorePoolSize} and {@link
 * #setMaximumPoolSize}. </dd>
 *
 * <dt>On-demand construction</dt>
 *
 * <dd>By default, even core threads are initially created and
 * started only when new tasks arrive, but this can be overridden
 * dynamically using method {@link #prestartCoreThread} or {@link
 * #prestartAllCoreThreads}.  You probably want to prestart threads if
 * you construct the pool with a non-empty queue. </dd>
 *
 * <dt>Creating new threads</dt>
 *
 * <dd>New threads are created using a {@link ThreadFactory}.  If not
 * otherwise specified, a {@link Executors#defaultThreadFactory} is
 * used, that creates threads to all be in the same {@link
 * ThreadGroup} and with the same {@code NORM_PRIORITY} priority and
 * non-daemon status. By supplying a different ThreadFactory, you can
 * alter the thread's name, thread group, priority, daemon status,
 * etc. If a {@code ThreadFactory} fails to create a thread when asked
 * by returning null from {@code newThread}, the executor will
 * continue, but might not be able to execute any tasks. Threads
 * should possess the "modifyThread" {@code RuntimePermission}. If
 * worker threads or other threads using the pool do not possess this
 * permission, service may be degraded: configuration changes may not
 * take effect in a timely manner, and a shutdown pool may remain in a
 * state in which termination is possible but not completed.</dd>
 *
 * <dt>Keep-alive times</dt>
 *
 * <dd>If the pool currently has more than corePoolSize threads,
 * excess threads will be terminated if they have been idle for more
 * than the keepAliveTime (see {@link #getKeepAliveTime(TimeUnit)}).
 * This provides a means of reducing resource consumption when the
 * pool is not being actively used. If the pool becomes more active
 * later, new threads will be constructed. This parameter can also be
 * changed dynamically using method {@link #setKeepAliveTime(long,
 * TimeUnit)}.  Using a value of {@code Long.MAX_VALUE} {@link
 * TimeUnit#NANOSECONDS} effectively disables idle threads from ever
 * terminating prior to shut down. By default, the keep-alive policy
 * applies only when there are more than corePoolSize threads. But
 * method {@link #allowCoreThreadTimeOut(boolean)} can be used to
 * apply this time-out policy to core threads as well, so long as the
 * keepAliveTime value is non-zero. </dd>
 *
 * <dt>Queuing</dt>
 *
 * <dd>Any {@link BlockingQueue} may be used to transfer and hold
 * submitted tasks.  The use of this queue interacts with pool sizing:
 *
 * <ul>
 *
 * <li> If fewer than corePoolSize threads are running, the Executor
 * always prefers adding a new thread
 * rather than queuing.</li>
 *
 * <li> If corePoolSize or more threads are running, the Executor
 * always prefers queuing a request rather than adding a new
 * thread.</li>
 *
 * <li> If a request cannot be queued, a new thread is created unless
 * this would exceed maximumPoolSize, in which case, the task will be
 * rejected.</li>
 *
 * </ul>
 *
 * There are three general strategies for queuing:
 * <ol>
 *
 * <li> <em> Direct handoffs.</em> A good default choice for a work
 * queue is a {@link SynchronousQueue} that hands off tasks to threads
 * without otherwise holding them. Here, an attempt to queue a task
 * will fail if no threads are immediately available to run it, so a
 * new thread will be constructed. This policy avoids lockups when
 * handling sets of requests that might have internal dependencies.
 * Direct handoffs generally require unbounded maximumPoolSizes to
 * avoid rejection of new submitted tasks. This in turn admits the
 * possibility of unbounded thread growth when commands continue to
 * arrive on average faster than they can be processed.  </li>
 *
 * <li><em> Unbounded queues.</em> Using an unbounded queue (for
 * example a {@link LinkedBlockingQueue} without a predefined
 * capacity) will cause new tasks to wait in the queue when all
 * corePoolSize threads are busy. Thus, no more than corePoolSize
 * threads will ever be created. (And the value of the maximumPoolSize
 * therefore doesn't have any effect.)  This may be appropriate when
 * each task is completely independent of others, so tasks cannot
 * affect each others execution; for example, in a web page server.
 * While this style of queuing can be useful in smoothing out
 * transient bursts of requests, it admits the possibility of
 * unbounded work queue growth when commands continue to arrive on
 * average faster than they can be processed.  </li>
 *
 * <li><em>Bounded queues.</em> A bounded queue (for example, an
 * {@link ArrayBlockingQueue}) helps prevent resource exhaustion when
 * used with finite maximumPoolSizes, but can be more difficult to
 * tune and control.  Queue sizes and maximum pool sizes may be traded
 * off for each other: Using large queues and small pools minimizes
 * CPU usage, OS resources, and context-switching overhead, but can
 * lead to artificially low throughput.  If tasks frequently block (for
 * example if they are I/O bound), a system may be able to schedule
 * time for more threads than you otherwise allow. Use of small queues
 * generally requires larger pool sizes, which keeps CPUs busier but
 * may encounter unacceptable scheduling overhead, which also
 * decreases throughput.  </li>
 *
 * </ol>
 *
 * </dd>
 *
 * <dt>Rejected tasks</dt>
 *
 * <dd>New tasks submitted in method {@link #execute(Runnable)} will be
 * <em>rejected</em> when the Executor has been shut down, and also when
 * the Executor uses finite bounds for both maximum threads and work queue
 * capacity, and is saturated.  In either case, the {@code execute} method
 * invokes the {@link
 * RejectedExecutionHandler#rejectedExecution(Runnable, ThreadPoolExecutor)}
 * method of its {@link RejectedExecutionHandler}.  Four predefined handler
 * policies are provided:
 *
 * <ol>
 *
 * <li> In the default {@link ThreadPoolExecutor.AbortPolicy}, the
 * handler throws a runtime {@link RejectedExecutionException} upon
 * rejection. </li>
 *
 * <li> In {@link ThreadPoolExecutor.CallerRunsPolicy}, the thread
 * that invokes {@code execute} itself runs the task. This provides a
 * simple feedback control mechanism that will slow down the rate that
 * new tasks are submitted. </li>
 *
 * <li> In {@link ThreadPoolExecutor.DiscardPolicy}, a task that
 * cannot be executed is simply dropped.  </li>
 *
 * <li>In {@link ThreadPoolExecutor.DiscardOldestPolicy}, if the
 * executor is not shut down, the task at the head of the work queue
 * is dropped, and then execution is retried (which can fail again,
 * causing this to be repeated.) </li>
 *
 * </ol>
 *
 * It is possible to define and use other kinds of {@link
 * RejectedExecutionHandler} classes. Doing so requires some care
 * especially when policies are designed to work only under particular
 * capacity or queuing policies. </dd>
 *
 * <dt>Hook methods</dt>
 *
 * <dd>This class provides {@code protected} overridable
 * {@link #beforeExecute(Thread, Runnable)} and
 * {@link #afterExecute(Runnable, Throwable)} methods that are called
 * before and after execution of each task.  These can be used to
 * manipulate the execution environment; for example, reinitializing
 * ThreadLocals, gathering statistics, or adding log entries.
 * Additionally, method {@link #terminated} can be overridden to perform
 * any special processing that needs to be done once the Executor has
 * fully terminated.
 *
 * <p>If hook or callback methods throw exceptions, internal worker
 * threads may in turn fail and abruptly terminate.</dd>
 *
 * <dt>Queue maintenance</dt>
 *
 * <dd>Method {@link #getQueue()} allows access to the work queue
 * for purposes of monitoring and debugging.  Use of this method for
 * any other purpose is strongly discouraged.  Two supplied methods,
 * {@link #remove(Runnable)} and {@link #purge} are available to
 * assist in storage reclamation when large numbers of queued tasks
 * become cancelled.</dd>
 *
 * <dt>Finalization</dt>
 *
 * <dd>A pool that is no longer referenced in a program <em>AND</em>
 * has no remaining threads will be {@code shutdown} automatically. If
 * you would like to ensure that unreferenced pools are reclaimed even
 * if users forget to call {@link #shutdown}, then you must arrange
 * that unused threads eventually die, by setting appropriate
 * keep-alive times, using a lower bound of zero core threads and/or
 * setting {@link #allowCoreThreadTimeOut(boolean)}.  </dd>
 *
 * </dl>
 *
 * @since 1.5
 * @author Doug Lea
 */
