Introduction
The previous article, DAGScheduler源碼淺析, walked through the important functions and key points of the DAGScheduler source from the angle of the job submission flow. This follow-up, DAGScheduler源碼淺析2, draws mainly on fxjwind's post Spark源碼分析 – DAGScheduler and covers several important functions in the DAGScheduler file that the previous article did not touch on.
Event Handling
Before Spark 1.0, the DAGScheduler class held a private eventQueue member, with an eventLoop thread reading events from the queue in a loop and processing them. In the Spark 1.0 source, event handling moved to the Actor model, with the DAGEventProcessActor class doing the main event-handling work.
Possibly because Scala dropped its native actors and adopted Akka actors as the official standard, in the Spark 1.4 source I examined, DAGScheduler has gone back to an eventQueue-based approach. To keep the logic clearer and the coupling lower, the 1.4 source implements a DAGSchedulerEventProcessLoop class for event handling.
private[scheduler] class DAGSchedulerEventProcessLoop(dagScheduler: DAGScheduler)
  extends EventLoop[DAGSchedulerEvent]("dag-scheduler-event-loop") with Logging {
Here DAGSchedulerEventProcessLoop extends the EventLoop class, which looks like this:
private[spark] abstract class EventLoop[E](name: String) extends Logging {

  private val eventQueue: BlockingQueue[E] = new LinkedBlockingDeque[E]()

  private val stopped = new AtomicBoolean(false)

  private val eventThread = new Thread(name) {
    setDaemon(true)

    override def run(): Unit = {
      try {
        while (!stopped.get) {
          val event = eventQueue.take()
          try {
            onReceive(event)
          } catch {
            case NonFatal(e) => {
              try {
                onError(e)
              } catch {
                case NonFatal(e) => logError("Unexpected error in " + name, e)
              }
            }
          }
        }
      } catch {
        case ie: InterruptedException => // exit even if eventQueue is not empty
        case NonFatal(e) => logError("Unexpected error in " + name, e)
      }
    }
  }

  ......
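The elided part of EventLoop includes the post method that callers use to enqueue an event. As far as I can tell from the 1.4 source it is essentially a one-line put onto the blocking queue; treat the following as a sketch rather than a verbatim quote:

def post(event: E): Unit = {
  // hand the event to eventThread; take() on the other side unblocks
  eventQueue.put(event)
}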
As we can see, DAGScheduler delivers work by posting events to the DAGSchedulerEventProcessLoop object, i.e. by putting events onto eventQueue; eventThread keeps taking events off eventQueue and passes each one to the onReceive function for processing.
override def onReceive(event: DAGSchedulerEvent): Unit = event match {
  case JobSubmitted(jobId, rdd, func, partitions, allowLocal, callSite, listener, properties) =>
    dagScheduler.handleJobSubmitted(jobId, rdd, func, partitions, allowLocal, callSite,
      listener, properties)
  ......
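To make this pattern concrete outside of Spark, here is a minimal, self-contained Scala sketch of the same structure. ToyEventLoop and ToyEventLoopDemo are hypothetical names of my own, not Spark classes:

import java.util.concurrent.LinkedBlockingDeque
import java.util.concurrent.atomic.AtomicBoolean

// Toy event loop mirroring EventLoop's structure: a daemon thread drains
// a blocking queue and hands each event to onReceive.
class ToyEventLoop(name: String) {
  private val eventQueue = new LinkedBlockingDeque[String]()
  private val stopped = new AtomicBoolean(false)

  private val eventThread = new Thread(name) {
    setDaemon(true)
    override def run(): Unit = {
      try {
        while (!stopped.get) {
          onReceive(eventQueue.take()) // take() blocks until an event arrives
        }
      } catch {
        case _: InterruptedException => // stop() interrupts us; just exit
      }
    }
  }

  def onReceive(event: String): Unit = println(s"[$name] got: $event")

  def start(): Unit = eventThread.start()
  def post(event: String): Unit = eventQueue.put(event)
  def stop(): Unit = { stopped.set(true); eventThread.interrupt() }
}

object ToyEventLoopDemo extends App {
  val loop = new ToyEventLoop("demo-loop")
  loop.start()
  loop.post("JobSubmitted")
  loop.post("CompletionEvent")
  Thread.sleep(100) // give the daemon thread time to drain the queue
  loop.stop()
}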
JobWaiter
JobWaiter implements JobListener's taskSucceeded and jobFailed functions; when DAGScheduler receives a task-success or task-failure event, it calls the corresponding one. taskSucceeded checks whether every task has succeeded, and if so marks the job finished; awaitResult simply waits until that job-finished flag is set.
As we can see, the submitJob function creates a JobWaiter instance and passes it along inside the event it posts; later, if an error occurs while handleJobSubmitted runs, JobWaiter's jobFailed function is called.
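For context, here is an abbreviated sketch of submitJob based on my reading of the 1.4 source, with argument checks and the zero-partition shortcut trimmed:

def submitJob[T, U](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    callSite: CallSite,
    allowLocal: Boolean,
    resultHandler: (Int, U) => Unit,
    properties: Properties): JobWaiter[U] = {
  // ... partition-range checks elided ...
  val jobId = nextJobId.getAndIncrement()
  val func2 = func.asInstanceOf[(TaskContext, Iterator[_]) => _]
  // The waiter rides along inside the event, so that handleJobSubmitted
  // (and later task-completion handling) can report back to it
  val waiter = new JobWaiter(this, jobId, partitions.size, resultHandler)
  eventProcessLoop.post(JobSubmitted(
    jobId, rdd, func2, partitions.toArray, allowLocal, callSite, waiter, properties))
  waiter
}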
Here is the code of the JobWaiter class:
private[spark] class JobWaiter[T](
    dagScheduler: DAGScheduler,
    val jobId: Int,
    totalTasks: Int,
    resultHandler: (Int, T) => Unit)
  extends JobListener {

  private var finishedTasks = 0

  // Is the job as a whole finished (succeeded or failed)?
  @volatile
  private var _jobFinished = totalTasks == 0

  def jobFinished = _jobFinished

  // If the job is finished, this will be its result. In the case of 0 task jobs (e.g. zero
  // partition RDDs), we set the jobResult directly to JobSucceeded.
  private var jobResult: JobResult = if (jobFinished) JobSucceeded else null

  /**
   * Sends a signal to the DAGScheduler to cancel the job. The cancellation itself is handled
   * asynchronously. After the low level scheduler cancels all the tasks belonging to this job, it
   * will fail this job with a SparkException.
   */
  def cancel() {
    dagScheduler.cancelJob(jobId)
  }

  override def taskSucceeded(index: Int, result: Any): Unit = synchronized {
    if (_jobFinished) {
      throw new UnsupportedOperationException("taskSucceeded() called on a finished JobWaiter")
    }
    resultHandler(index, result.asInstanceOf[T])
    finishedTasks += 1
    if (finishedTasks == totalTasks) {
      _jobFinished = true
      jobResult = JobSucceeded
      this.notifyAll()
    }
  }

  override def jobFailed(exception: Exception): Unit = synchronized {
    _jobFinished = true
    jobResult = JobFailed(exception)
    this.notifyAll()
  }

  def awaitResult(): JobResult = synchronized {
    while (!_jobFinished) {
      this.wait()
    }
    return jobResult
  }
}
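The counterpart on the calling side is DAGScheduler.runJob, which blocks on awaitResult and turns the JobResult into either a normal return or a thrown exception. Roughly, as a sketch of the 1.4 code with logging trimmed:

def runJob[T, U](
    rdd: RDD[T],
    func: (TaskContext, Iterator[T]) => U,
    partitions: Seq[Int],
    callSite: CallSite,
    allowLocal: Boolean,
    resultHandler: (Int, U) => Unit,
    properties: Properties): Unit = {
  val waiter = submitJob(rdd, func, partitions, callSite, allowLocal, resultHandler, properties)
  waiter.awaitResult() match {
    case JobSucceeded =>
      // results were already delivered, one task at a time, via resultHandler
    case JobFailed(exception: Exception) =>
      throw exception
  }
}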
Summary
This installment covered a few small details in the DAGScheduler.scala file. In the next article I will analyze how DAGScheduler.scala splits stages and handles dependencies.