Analysis of the HDFS delegation token expiration problem

What is a delegation token

A delegation token is a lightweight authentication mechanism in Hadoop that complements Kerberos. In theory Kerberos alone would be sufficient, so why did Hadoop build its own delegation-token-based scheme? In a large distributed system, if every node authenticated to every service through Kerberos, the KDC would come under very heavy load and become a bottleneck for the whole system.

Differences from Kerberos

Kerberos authentication involves three parties: the client, the KDC, and the server, which cooperate to complete authentication. It usually consists of three sub-steps:

  • The client requests a TGT (Ticket Granting Ticket) from the KDC. The TGT contains the client's information and a session key shared between the client and the KDC, and is encrypted with the KDC's master key.
  • The client uses the TGT to request a Ticket for a specific service from the KDC. The Ticket contains the client's information and a session key shared between the client and the server, and is encrypted with the server's master key.
  • The client uses the Ticket to access the service.

Delegation token authentication involves only two parties: the client and the server. The server generates the token and sends it to the client; the client then presents the token when accessing the server, and the server verifies it.
A delegation token can be passed on to other services, which is why it is called a delegation token. For example, after the client obtains an HDFS delegation token, it can distribute it to the mapper and reducer sides, so map and reduce tasks can access HDFS with that token instead of going through Kerberos. A delegation token can also designate a renewer, for example YARN or the client itself. When the token is about to expire it must be renewed, and the renewal only involves the renewer and the server; everyone else using the token is unaffected.
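As a quick illustration of how a client obtains an HDFS delegation token and names a renewer, here is a minimal sketch against the public Hadoop API (the renewer string "yarn" is only illustrative; in a real cluster it is derived from the ResourceManager principal):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.Credentials

// Minimal sketch: ask the NameNode for delegation tokens and designate a renewer.
// Assumes the client has already authenticated via Kerberos (kinit or keytab login).
val conf = new Configuration()
val fs = FileSystem.get(conf)
val creds = new Credentials()
// addDelegationTokens requests tokens from the server and stores them in `creds`
val tokens = fs.addDelegationTokens("yarn", creds)
tokens.foreach(t => println(s"kind=${t.getKind}, service=${t.getService}"))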

Delegation token lifetime

A delegation token has an expiration time and must be renewed periodically to stay valid. The number of renewals is not unlimited: each token also has a maximum lifetime, after which it becomes invalid no matter what. For example, a token may need to be renewed every 24 hours or it expires, while its maximum lifetime is 7 days; after those 7 days the token can no longer be used at all.
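The renewal interval and maximum lifetime are NameNode-side settings, controlled by dfs.namenode.delegation.token.renew-interval (default one day) and dfs.namenode.delegation.token.max-lifetime (default seven days) in hdfs-site.xml. As a hedged sketch of what a renewer does, Token.renew returns the new expiration time and fails once the maximum lifetime has passed:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.token.{Token, TokenIdentifier}

// Minimal sketch: the designated renewer calls renew() periodically before the token expires.
// `token` is assumed to be an HDFS delegation token obtained earlier.
def renewOnce(token: Token[_ <: TokenIdentifier], conf: Configuration): Long = {
  val newExpiry = token.renew(conf) // throws once the max lifetime has been reached
  println(s"token renewed, next expiration at $newExpiry ms since epoch")
  newExpiry
}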

What a delegation token contains

Token.java

  private byte[] identifier;
  private byte[] password;
  private Text kind;
  private Text service;
  private TokenRenewer renewer;

Here identifier identifies the token, and password is what the server uses to authenticate it. kind is the token type, for example HDFS_DELEGATION_TOKEN; service is the service the token is for, for example ha-hdfs:<nameservice>; renewer is the party allowed to renew the token.
These are the fields the client-side token carries. The token's current expiration time is tracked on the server side, while information such as owner, realUser and the maximum lifetime is encoded in the identifier bytes.
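As a hedged sketch (assuming an HDFS delegation token obtained as above), the identifier bytes can be decoded on the client side to inspect fields such as owner, renewer and max lifetime:

import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier
import org.apache.hadoop.security.token.Token

// Minimal sketch: decode the identifier of an HDFS delegation token.
// `token` is assumed to be a Token[DelegationTokenIdentifier] returned by addDelegationTokens.
def describe(token: Token[DelegationTokenIdentifier]): Unit = {
  val ident = token.decodeIdentifier()
  println(s"owner=${ident.getOwner}, renewer=${ident.getRenewer}, realUser=${ident.getRealUser}")
  println(s"issueDate=${ident.getIssueDate}, maxDate=${ident.getMaxDate}")
}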

Delegation token lifecycle

The figure above shows the lifecycle of a delegation token in a YARN application.
1) The client first authenticates to the NameNode via Kerberos and obtains a DT (delegation token).
2) The client submits the application to YARN and passes the DT to the RM, designating YARN as the token's renewer.
3) The RM picks a node to start the AM; the AM then requests resources from the RM and launches the worker containers. In this step the DT is distributed to the corresponding containers.
4) All worker nodes use the DT to access HDFS.
5) When the job finishes, the RM cancels the DT.

What to do when a delegation token expires

Delegation tokens do expire. With the default cluster configuration the renewal interval is one day and the maximum token lifetime is seven days. Batch jobs such as MapReduce rarely run into this, but long-running applications such as Spark Streaming or Storm have to face the fact that the token has a maximum lifetime. Once the token reaches it (say, seven days), the token used by every worker (for example every Spark Streaming executor) becomes invalid; any further access to HDFS with that token is rejected by the NameNode and the application exits abnormally.

One approach is to distribute the keytab file to the AM and every container and let them authenticate against the KDC directly. But this brings back the problem described at the beginning: it puts heavy load on the KDC (which may even mistake the traffic for a DDoS attack) and hurts application performance.
Another approach is for the client to upload the keytab to HDFS first. The AM then logs in with the keytab and requests a delegation token, distributing it to the containers it launches. When the token is about to expire, the AM logs in again, obtains a new delegation token, and tells all workers to use the refreshed token when accessing the service.

How Spark solves the delegation token expiration problem

Spark uses the second approach. Let's look in detail at how Spark 1.6 solves the token expiration problem.
To deal with DT expiration, Spark adds two parameters, "--keytab" and "--principal", which specify the keytab file and the principal used for the Kerberos login. The class Spark uses to submit a YARN application is Client.

org.apache.spark.deploy.yarn.Client

def submitApplication(): ApplicationId = {
   var appId: ApplicationId = null
   try {
     launcherBackend.connect()
     // Setup the credentials before doing anything else,
     // so we have don't have issues at any point.
     setupCredentials()
     yarnClient.init(yarnConf)
     yarnClient.start()

     logInfo("Requesting a new application from cluster with %d NodeManagers"
       .format(yarnClient.getYarnClusterMetrics.getNumNodeManagers))

     // Get a new application from our RM
     val newApp = yarnClient.createApplication()
     val newAppResponse = newApp.getNewApplicationResponse()
     appId = newAppResponse.getApplicationId()
     reportLauncherState(SparkAppHandle.State.SUBMITTED)
     launcherBackend.setAppId(appId.toString())

     // Verify whether the cluster has enough resources for our AM
     verifyClusterResources(newAppResponse)

     // Set up the appropriate contexts to launch our AM
     val containerContext = createContainerLaunchContext(newAppResponse)
     val appContext = createApplicationSubmissionContext(newApp, containerContext)

     // Finally, submit and monitor the application
     logInfo(s"Submitting application ${appId.getId} to ResourceManager")
     yarnClient.submitApplication(appContext)
     appId
   } catch {
     case e: Throwable =>
       if (appId != null) {
         cleanupStagingDir(appId)
       }
       throw e
   }
}

submitApplication is the entry point for submitting a YARN application. The very first thing it does is call setupCredentials to set up the credentials.

def setupCredentials(): Unit = {
    loginFromKeytab = args.principal != null || sparkConf.contains("spark.yarn.principal")
    ...
    // Defensive copy of the credentials
    credentials = new Credentials(UserGroupInformation.getCurrentUser.getCredentials)
  }

It first checks whether the "--principal" argument was given; if so, the member variable loginFromKeytab is set to true, and it is used in several later checks. It also takes a copy of the Credentials object held by the current UGI.
After the credentials are set up, the client calls the YARN API to create an application and obtain an appId, and then creates a ContainerLaunchContext. This is the standard YARN application flow; let's look further into that function.

private def createContainerLaunchContext(newAppResponse: GetNewApplicationResponse)
    : ContainerLaunchContext = {
    logInfo("Setting up container launch context for our AM")
    val appId = newAppResponse.getApplicationId
    val appStagingDir = getAppStagingDir(appId)
    val pySparkArchives =
      if (sparkConf.getBoolean("spark.yarn.isPython", false)) {
        findPySparkArchives()
      } else {
        Nil
      }
    val launchEnv = setupLaunchEnv(appStagingDir, pySparkArchives)
    val localResources = prepareLocalResources(appStagingDir, pySparkArchives)

    // Set the environment variables to be passed on to the executors.
    distCacheMgr.setDistFilesEnv(launchEnv)
    distCacheMgr.setDistArchivesEnv(launchEnv)

    val amContainer = Records.newRecord(classOf[ContainerLaunchContext])
    amContainer.setLocalResources(localResources.asJava)
    amContainer.setEnvironment(launchEnv.asJava)
    ...
    setupSecurityToken(amContainer)

    amContainer
  }

This function is long and some code is omitted here; it mainly sets up the container launch command line, environment variables, classpath and so on. Inside setupLaunchEnv it configures the credentials file that the AM will later use to publish refreshed tokens to the executors.

private def setupLaunchEnv(
      stagingDir: String,
      pySparkArchives: Seq[String]): HashMap[String, String] = {
    ...
    if (loginFromKeytab) {
      val remoteFs = FileSystem.get(hadoopConf)
      val stagingDirPath = new Path(remoteFs.getHomeDirectory, stagingDir)
      val credentialsFile = "credentials-" + UUID.randomUUID().toString
      sparkConf.set(
        "spark.yarn.credentials.file", new Path(stagingDirPath, credentialsFile).toString)
      logInfo(s"Credentials file set to: $credentialsFile")
      val renewalInterval = getTokenRenewalInterval(stagingDirPath)
      sparkConf.set("spark.yarn.token.renewal.interval", renewalInterval.toString)
    }

    ...
}

It first checks whether loginFromKeytab is true; as we saw, it is true whenever "--principal" was passed on the command line. The code then decides where the token file will live: by default a file whose name starts with credentials- under /user/{user}/.sparkStaging/{appid} on HDFS. That path is stored in sparkConf under the key "spark.yarn.credentials.file", which is used later. It also computes the DT renewal interval and stores it in sparkConf under "spark.yarn.token.renewal.interval".
After the launch environment is set up, the flow enters another function, prepareLocalResources, which contains a key step: obtaining the HDFS delegation token.

YarnSparkHadoopUtil.get.obtainTokensForNamenodes(nns, hadoopConf, credentials)

def obtainTokensForNamenodes(
    paths: Set[Path],
    conf: Configuration,
    creds: Credentials,
    renewer: Option[String] = None
  ): Unit = {
    if (UserGroupInformation.isSecurityEnabled()) {
      val delegTokenRenewer = renewer.getOrElse(getTokenRenewer(conf))
      paths.foreach { dst =>
        val dstFs = dst.getFileSystem(conf)
        logInfo("getting token for namenode: " + dst)
        dstFs.addDelegationTokens(delegTokenRenewer, creds)
      }
    }
  }

This function asks the NameNode for HDFS delegation tokens and adds them to the Credentials object. As mentioned earlier, the credentials object is initialized from UserGroupInformation.getCurrentUser.getCredentials, and the UGI does not contain an HDFS delegation token by default, so this call is what puts the HDFS delegation token into credentials.
Back in createContainerLaunchContext, once the preparation is done it creates the amContainer and calls setupSecurityToken to attach the freshly obtained tokens to it. As a result, when the AM starts it does not need to go through Kerberos; it can talk to the NameNode directly with the HDFS delegation token.

private def setupSecurityToken(amContainer: ContainerLaunchContext): Unit = {
    val dob = new DataOutputBuffer
    credentials.writeTokenStorageToStream(dob)
    amContainer.setTokens(ByteBuffer.wrap(dob.getData))
  }
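For context (this is YARN background, not part of Spark's code): when the NodeManager launches the AM container, the credentials serialized here are written to a token file and the HADOOP_TOKEN_FILE_LOCATION environment variable points at it; UserGroupInformation loads it automatically. A hedged sketch of reading such a file by hand:

import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.security.{Credentials, UserGroupInformation}
import scala.collection.JavaConverters._

// Minimal sketch: read the token file YARN hands to a container
// (normally the UGI does this transparently when security is enabled).
val tokenFile = System.getenv(UserGroupInformation.HADOOP_TOKEN_FILE_LOCATION)
if (tokenFile != null) {
  val creds = Credentials.readTokenStorageFile(new File(tokenFile), new Configuration())
  creds.getAllTokens.asScala.foreach(t => println(s"container token: kind=${t.getKind}"))
}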

At this point everything token-related is ready, and yarnClient.submitApplication(appContext) submits the application to YARN. YARN then picks a machine to launch the AM container; the launch command is essentially what the client passed to YARN, roughly "bin/java xxx org.apache.spark.deploy.yarn.ApplicationMaster xxx". The AM's entry point is ApplicationMaster's run function.

org.apache.spark.deploy.yarn.ApplicationMaster

final def run(): Int = {
    try {
      ...
      // If the credentials file config is present, we must periodically renew tokens. So create
      // a new AMDelegationTokenRenewer
      if (sparkConf.contains("spark.yarn.credentials.file")) {
        delegationTokenRenewerOption = Some(new AMDelegationTokenRenewer(sparkConf, yarnConf))
        // If a principal and keytab have been set, use that to create new credentials for executors
        // periodically
        delegationTokenRenewerOption.foreach(_.scheduleLoginFromKeytab())
      }

      if (isClusterMode) {
        runDriver(securityMgr)
      } else {
        runExecutorLauncher(securityMgr)
      }
    } catch {
      case e: Exception =>
        // catch everything else if not specifically handled
        logError("Uncaught exception: ", e)
        finish(FinalApplicationStatus.FAILED,
          ApplicationMaster.EXIT_UNCAUGHT_EXCEPTION,
          "Uncaught exception: " + e)
    }
    exitCode
  }

Here we meet the familiar configuration "spark.yarn.credentials.file" again. Remember what it was set to? It is the location of the file the AM uses to publish tokens. So, if "--principal" was passed to spark-submit, sparkConf contains "spark.yarn.credentials.file"; and if sparkConf contains that key, the AM (in its run function) creates an AMDelegationTokenRenewer. As the name suggests, this object periodically refreshes the tokens and writes them to an HDFS file, from which the executors pick up the new tokens, thereby preventing token expiration.

AMDelegationTokenRenewer

private[spark] def scheduleLoginFromKeytab(): Unit = {
    val principal = sparkConf.get("spark.yarn.principal")
    val keytab = sparkConf.get("spark.yarn.keytab")

    /**
     * Schedule re-login and creation of new tokens. If tokens have already expired, this method
     * will synchronously create new ones.
     */
    def scheduleRenewal(runnable: Runnable): Unit = {
      val credentials = UserGroupInformation.getCurrentUser.getCredentials
      val renewalInterval = hadoopUtil.getTimeFromNowToRenewal(sparkConf, 0.75, credentials)
      // Run now!
      if (renewalInterval <= 0) {
        logInfo("HDFS tokens have expired, creating new tokens now.")
        runnable.run()
      } else {
        logInfo(s"Scheduling login from keytab in $renewalInterval millis.")
        delegationTokenRenewer.schedule(runnable, renewalInterval, TimeUnit.MILLISECONDS)
      }
    }

    // This thread periodically runs on the driver to update the delegation tokens on HDFS.
    val driverTokenRenewerRunnable =
      new Runnable {
        override def run(): Unit = {
          try {
            writeNewTokensToHDFS(principal, keytab)
            cleanupOldFiles()
          } catch {
            case e: Exception =>
              // Log the error and try to write new tokens back in an hour
              logWarning("Failed to write out new credentials to HDFS, will try again in an " +
                "hour! If this happens too often tasks will fail.", e)
              delegationTokenRenewer.schedule(this, 1, TimeUnit.HOURS)
              return
          }
          scheduleRenewal(this)
        }
      }
    // Schedule update of credentials. This handles the case of updating the tokens right now
    // as well, since the renenwal interval will be 0, and the thread will get scheduled
    // immediately.
    scheduleRenewal(driverTokenRenewerRunnable)
  }

It first checks whether the token is about to expire. If so, it calls writeNewTokensToHDFS to obtain new tokens and write them to HDFS; otherwise it schedules a task to check again after a while.
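getTimeFromNowToRenewal decides how long to wait. As a hedged sketch of the idea (not Spark's exact code), it can be computed from the renewal interval stored earlier in spark.yarn.token.renewal.interval and the issue date decoded from each HDFS token identifier:

import org.apache.hadoop.hdfs.security.token.delegation.DelegationTokenIdentifier
import org.apache.hadoop.security.Credentials
import scala.collection.JavaConverters._

// Minimal sketch: milliseconds from now until the tokens should be refreshed,
// taken as a fraction of the renewal interval measured from each token's issue date.
def timeToRenewal(creds: Credentials, renewalIntervalMs: Long, fraction: Double): Long = {
  val now = System.currentTimeMillis()
  val hdfsTokens = creds.getAllTokens.asScala
    .filter(_.getKind == DelegationTokenIdentifier.HDFS_DELEGATION_KIND)
  val nextRenewals = hdfsTokens.map { t =>
    val issueDate = t.decodeIdentifier().asInstanceOf[DelegationTokenIdentifier].getIssueDate
    issueDate + (fraction * renewalIntervalMs).toLong - now
  }
  if (nextRenewals.isEmpty) Long.MaxValue else nextRenewals.min
}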

private def writeNewTokensToHDFS(principal: String, keytab: String): Unit = {
    // Keytab is copied by YARN to the working directory of the AM, so full path is
    // not needed.

    // HACK:
    // HDFS will not issue new delegation tokens, if the Credentials object
    // passed in already has tokens for that FS even if the tokens are expired (it really only
    // checks if there are tokens for the service, and not if they are valid). So the only real
    // way to get new tokens is to make sure a different Credentials object is used each time to
    // get new tokens and then the new tokens are copied over the the current user's Credentials.
    // So:
    // - we login as a different user and get the UGI
    // - use that UGI to get the tokens (see doAs block below)
    // - copy the tokens over to the current user's credentials (this will overwrite the tokens
    // in the current user's Credentials object for this FS).
    // The login to KDC happens each time new tokens are required, but this is rare enough to not
    // have to worry about (like once every day or so). This makes this code clearer than having
    // to login and then relogin every time (the HDFS API may not relogin since we don't use this
    // UGI directly for HDFS communication.
    logInfo(s"Attempting to login to KDC using principal: $principal")
    // 1) Re-login to the KDC
    val keytabLoggedInUGI = UserGroupInformation.loginUserFromKeytabAndReturnUGI(principal, keytab)
    logInfo("Successfully logged into KDC.")
    val tempCreds = keytabLoggedInUGI.getCredentials
    val credentialsPath = new Path(credentialsFile)
    val dst = credentialsPath.getParent
    // 2) Use the freshly logged-in identity to fetch new HDFS delegation tokens from the NameNode and add them to tempCreds
    keytabLoggedInUGI.doAs(new PrivilegedExceptionAction[Void] {
      // Get a copy of the credentials
      override def run(): Void = {
        val nns = YarnSparkHadoopUtil.get.getNameNodesToAccess(sparkConf) + dst
        hadoopUtil.obtainTokensForNamenodes(nns, freshHadoopConf, tempCreds)
        null
      }
    })
    // Add the temp credentials back to the original ones.
    // 3) Add the newly obtained tokens to the currently logged-in user
    UserGroupInformation.getCurrentUser.addCredentials(tempCreds)
    val remoteFs = FileSystem.get(freshHadoopConf)
    // If lastCredentialsFileSuffix is 0, then the AM is either started or restarted. If the AM
    // was restarted, then the lastCredentialsFileSuffix might be > 0, so find the newest file
    // and update the lastCredentialsFileSuffix.
    if (lastCredentialsFileSuffix == 0) {
      hadoopUtil.listFilesSorted(
        remoteFs, credentialsPath.getParent,
        credentialsPath.getName, SparkHadoopUtil.SPARK_YARN_CREDS_TEMP_EXTENSION)
        .lastOption.foreach { status =>
        lastCredentialsFileSuffix = hadoopUtil.getSuffixForCredentialsPath(status.getPath)
      }
    }
    val nextSuffix = lastCredentialsFileSuffix + 1
    val tokenPathStr =
      credentialsFile + SparkHadoopUtil.SPARK_YARN_CREDS_COUNTER_DELIM + nextSuffix
    val tokenPath = new Path(tokenPathStr)
    val tempTokenPath = new Path(tokenPathStr + SparkHadoopUtil.SPARK_YARN_CREDS_TEMP_EXTENSION)
    logInfo("Writing out delegation tokens to " + tempTokenPath.toString)
    val credentials = UserGroupInformation.getCurrentUser.getCredentials
    // 4) Write the credentials out to the target file
    credentials.writeTokenStorageFile(tempTokenPath, freshHadoopConf)
    logInfo(s"Delegation Tokens written out successfully. Renaming file to $tokenPathStr")
    remoteFs.rename(tempTokenPath, tokenPath)
    logInfo("Delegation token file rename complete.")
    lastCredentialsFileSuffix = nextSuffix
  }

All the token-refresh logic lives in this function. It boils down to the following steps:
1) Log in to Kerberos again with the keytab and principal and obtain the resulting UGI, keytabLoggedInUGI. Note that this is purely a Kerberos authentication step and does not involve HDFS delegation tokens yet, i.e. keytabLoggedInUGI contains no token. loginUserFromKeytabAndReturnUGI returns a new user object and does not affect the currently logged-in user.
2) Take the credentials object from keytabLoggedInUGI, then use the keytabLoggedInUGI identity to request new HDFS delegation tokens from the NameNode and add them to a temporary credentials object.
3) Add the tokens from the temporary credentials object to the current UGI. At this point the token used inside the AM has been refreshed, so the AM itself will not hit a token-expired error, but the token still has to be propagated to the executors.
4) Build the path for the token file. The directory is /user/{user}/.sparkStaging/{appid}, and the file name has the form "credentials-UUID-suffix", where suffix is a counter that increases with each file. Old token files are kept for five days by default.
At this point the AM has written the refreshed token to an HDFS file. Next we look at how the executors read the latest token file and merge the token into their own UGI.
We will not go through the executor launch process here; in short, the AM builds the executor launch context and sends it to the NodeManager, which starts the executor container. In Spark, the class that ultimately represents an executor is CoarseGrainedExecutorBackend.

CoarseGrainedExecutorBackend

private def run(
      driverUrl: String,
      executorId: String,
      hostname: String,
      cores: Int,
      appId: String,
      workerUrl: Option[String],
      userClassPath: Seq[URL]) {

     ...
      if (driverConf.contains("spark.yarn.credentials.file")) {
        logInfo("Will periodically update credentials from: " +
          driverConf.get("spark.yarn.credentials.file"))
        SparkHadoopUtil.get.startExecutorDelegationTokenRenewer(driverConf)
      }

      ...
    }
  }

Here, too, it first checks whether sparkConf contains "spark.yarn.credentials.file". If so, it creates an ExecutorDelegationTokenUpdater and calls its updateCredentialsIfRequired method to refresh the tokens.

ExecutorDelegationTokenUpdater

try {
      val credentialsFilePath = new Path(credentialsFile)
      val remoteFs = FileSystem.get(freshHadoopConf)
      SparkHadoopUtil.get.listFilesSorted(
        remoteFs, credentialsFilePath.getParent,
        credentialsFilePath.getName, SparkHadoopUtil.SPARK_YARN_CREDS_TEMP_EXTENSION)
        .lastOption.foreach { credentialsStatus =>
        val suffix = SparkHadoopUtil.get.getSuffixForCredentialsPath(credentialsStatus.getPath)
        if (suffix > lastCredentialsFileSuffix) {
          logInfo("Reading new delegation tokens from " + credentialsStatus.getPath)
          val newCredentials = getCredentialsFromHDFSFile(remoteFs, credentialsStatus.getPath)
          lastCredentialsFileSuffix = suffix
          UserGroupInformation.getCurrentUser.addCredentials(newCredentials)
          logInfo("Tokens updated from credentials file.")
        } else {
          // Check every hour to see if new credentials arrived.
          logInfo("Updated delegation tokens were expected, but the driver has not updated the " +
            "tokens yet, will check again in an hour.")
          delegationTokenRenewer.schedule(executorUpdaterRunnable, 1, TimeUnit.HOURS)
          return
        }
      }
      val timeFromNowToRenewal =
        SparkHadoopUtil.get.getTimeFromNowToRenewal(
          sparkConf, 0.8, UserGroupInformation.getCurrentUser.getCredentials)
      if (timeFromNowToRenewal <= 0) {
        // We just checked for new credentials but none were there, wait a minute and retry.
        // This handles the shutdown case where the staging directory may have been removed(see
        // SPARK-12316 for more details).
        delegationTokenRenewer.schedule(executorUpdaterRunnable, 1, TimeUnit.MINUTES)
      } else {
        logInfo(s"Scheduling token refresh from HDFS in $timeFromNowToRenewal millis.")
        delegationTokenRenewer.schedule(
          executorUpdaterRunnable, timeFromNowToRenewal, TimeUnit.MILLISECONDS)
      }
    } catch {
      // Since the file may get deleted while we are reading it, catch the Exception and come
      // back in an hour to try again
      case NonFatal(e) =>
        logWarning("Error while trying to update credentials, will try again in 1 hour", e)
        delegationTokenRenewer.schedule(executorUpdaterRunnable, 1, TimeUnit.HOURS)
    }

This function is the whole token-update flow on the executor side. It consists of these steps:
1) List the files in the credentials directory on HDFS, take the most recently updated one, and extract its suffix. If the suffix is larger than the lastCredentialsFileSuffix stored in the process, the AM has published new tokens, so the file is read and the tokens are applied.
2) If the AM has not updated the tokens yet, retry in an hour.
3) After the tokens are updated, compute the time of the next update and schedule a task to run then.
On the executor side, "updating" simply means reading the credentials file from HDFS and calling UserGroupInformation.getCurrentUser.addCredentials(newCredentials) to add the new tokens to the current UGI.
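As a hedged sketch of that read-and-merge step (readCredentials is a hypothetical helper name and the path is illustrative; this is not Spark's exact getCredentialsFromHDFSFile):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Minimal sketch: read a serialized Credentials file from HDFS and add its tokens to the current UGI.
def readCredentials(fs: FileSystem, path: Path): Credentials = {
  val in = fs.open(path) // FSDataInputStream extends DataInputStream
  try {
    val creds = new Credentials()
    creds.readTokenStorageStream(in) // inverse of writeTokenStorageFile on the AM side
    creds
  } finally {
    in.close()
  }
}

val fs = FileSystem.get(new Configuration())
// hypothetical file name, for illustration only
val newCreds = readCredentials(fs, new Path("/user/someuser/.sparkStaging/appid/credentials-uuid-1"))
UserGroupInformation.getCurrentUser.addCredentials(newCreds)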
That concludes the analysis of how Spark handles HDFS delegation token expiration. The whole process is roughly as shown in the figure below:

To summarize: the AM refreshes the token and writes the refreshed token to HDFS; the executors read the updated token from HDFS and merge it into their own UGI. In theory this should solve the expiration problem, but users of Spark Streaming may still run into a strange issue: even when "--principal" is passed at submit time, the HDFS delegation token still expires. What is going on? The rest of this article looks at an HDFS bug.

The HDFS delegation token bug

As analyzed above, Spark refreshes the token in the AM, so in theory token expiration should no longer occur. Yet in practice we still saw tokens expire. It turns out this is caused by an HDFS bug: HDFS-9276.

https://issues.apache.org/jira/browse/HDFS-9276

To understand this bug, one concept matters: a token's service field is filled in by the client after it receives the token from the server; the client uses it to distinguish tokens for different services, and the server has no notion of a service field at all. The client requests HDFS delegation tokens from the NameNode via FileSystem.addDelegationTokens, and once a token comes back from the server, the client sets its service field:

DFSClient.java

public Token<DelegationTokenIdentifier> getDelegationToken(Text renewer)
      throws IOException {
    assert dtService != null;
    Token<DelegationTokenIdentifier> token =
      namenode.getDelegationToken(renewer);

    if (token != null) {
      token.setService(this.dtService);
      LOG.info("Created " + DelegationTokenIdentifier.stringifyToken(token));
    } else {
      LOG.info("Cannot get delegation token from " + renewer);
    }
    return token;

  }

Here the service field is set to dtService. In an HA setup the client accesses HDFS through the nameservice, so dtService has the value ha-hdfs:<nameservice>.
Let's call this the logical service. However, the client must ultimately talk to the server over IP:PORT. Once the client has determined which NameNode is active, how does it pick the token to authenticate with? As described earlier, the token's service field distinguishes servers, but it does not contain a concrete IP and port. To solve this, every time a DFSClient instance is created the token is copied twice, and the service field of each copy is replaced with a concrete IP and port:

HAUtil.java

public static void cloneDelegationTokenForLogicalUri(
      UserGroupInformation ugi, URI haUri,
      Collection<InetSocketAddress> nnAddrs) {
    // this cloning logic is only used by hdfs
    Text haService = HAUtil.buildTokenServiceForLogicalUri(haUri,
        HdfsConstants.HDFS_URI_SCHEME);
    Token<DelegationTokenIdentifier> haToken =
        tokenSelector.selectToken(haService, ugi.getTokens());
    if (haToken != null) {
      for (InetSocketAddress singleNNAddr : nnAddrs) {
        // this is a minor hack to prevent physical HA tokens from being
        // exposed to the user via UGI.getCredentials(), otherwise these
        // cloned tokens may be inadvertently propagated to jobs
        Token<DelegationTokenIdentifier> specificToken =
            new Token.PrivateToken<DelegationTokenIdentifier>(haToken);
        SecurityUtil.setTokenService(specificToken, singleNNAddr);
        Text alias = new Text(
            buildTokenServicePrefixForLogicalUri(HdfsConstants.HDFS_URI_SCHEME)
                + "//" + specificToken.getService());
        ugi.addToken(alias, specificToken);
        LOG.debug("Mapped HA service delegation token for logical URI " +
            haUri + " to namenode " + singleNNAddr);
      }
    } else {
      LOG.debug("No HA service delegation token found for logical URI " +
          haUri);
    }
  }

As a result, the client effectively holds three copies of each token: one HA token and two NameNode tokens bound to concrete NameNode addresses, so the client can pick the right token for whichever NameNode it wants to talk to.
Now HDFS-9276 becomes obvious: when the user updates tokens with UserGroupInformation.getCurrentUser().addCredentials(credentials), only the HA token gets updated, not the two NameNode tokens. So when the client selects a NameNode token by IP and port, it still gets the old, un-renewed token; using it against the server is rejected with a token-expired error.
The HDFS-9276 patch fixes this (the code is not analyzed here; interested readers can look it up): when addCredentials is called, the two NameNode tokens corresponding to the HA token are updated as well. Attentive readers will have noticed that every time a new DFSClient is instantiated it re-clones the HA token into two fresh NameNode tokens, so creating a new DFSClient each time is a way to work around the problem described in HDFS-9276.
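To make the three-copies picture concrete, here is a hedged sketch (the NameNode addresses are illustrative placeholders) of how the per-NameNode ip:port service strings are built and how a token is looked up by service:

import java.net.InetSocketAddress
import org.apache.hadoop.security.{SecurityUtil, UserGroupInformation}
import scala.collection.JavaConverters._

// Minimal sketch: the logical (HA) service vs. the per-NameNode ip:port services.
val logicalService = "ha-hdfs:<nameservice>" // what the HA token's service field looks like
val nnAddrs = Seq(new InetSocketAddress("10.0.0.1", 8020), new InetSocketAddress("10.0.0.2", 8020))

val ugi = UserGroupInformation.getCurrentUser
println(s"HA token service: $logicalService")
// After cloneDelegationTokenForLogicalUri has run, the UGI holds one token per service string:
nnAddrs.foreach { addr =>
  val service = SecurityUtil.buildTokenService(addr) // e.g. "10.0.0.1:8020"
  val found = ugi.getTokens.asScala.exists(_.getService == service)
  println(s"token for $service present: $found")
}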
Spark actually tries to do exactly this. Going back to the executor-side token update:

ExecutorDelegationTokenUpdater

try {
      val credentialsFilePath = new Path(credentialsFile)
      // A new FileSystem object is obtained here, but at this point the HA token in the UGI has not been updated yet
      val remoteFs = FileSystem.get(freshHadoopConf)
      SparkHadoopUtil.get.listFilesSorted(
        remoteFs, credentialsFilePath.getParent,
        credentialsFilePath.getName, SparkHadoopUtil.SPARK_YARN_CREDS_TEMP_EXTENSION)
        .lastOption.foreach { credentialsStatus =>
        val suffix = SparkHadoopUtil.get.getSuffixForCredentialsPath(credentialsStatus.getPath)
        if (suffix > lastCredentialsFileSuffix) {
          logInfo("Reading new delegation tokens from " + credentialsStatus.getPath)
          // Read the new HA token from HDFS
          val newCredentials = getCredentialsFromHDFSFile(remoteFs, credentialsStatus.getPath)
          lastCredentialsFileSuffix = suffix
          // Add the new HA token to the UGI
          UserGroupInformation.getCurrentUser.addCredentials(newCredentials)
          logInfo("Tokens updated from credentials file.")
        } else {
          // Check every hour to see if new credentials arrived.
          logInfo("Updated delegation tokens were expected, but the driver has not updated the " +
            "tokens yet, will check again in an hour.")
          delegationTokenRenewer.schedule(executorUpdaterRunnable, 1, TimeUnit.HOURS)
          return
        }
      }
      val timeFromNowToRenewal =
        SparkHadoopUtil.get.getTimeFromNowToRenewal(
          sparkConf, 0.8, UserGroupInformation.getCurrentUser.getCredentials)
      if (timeFromNowToRenewal <= 0) {
        // We just checked for new credentials but none were there, wait a minute and retry.
        // This handles the shutdown case where the staging directory may have been removed(see
        // SPARK-12316 for more details).
        delegationTokenRenewer.schedule(executorUpdaterRunnable, 1, TimeUnit.MINUTES)
      } else {
        // Right after the tokens are updated, the next update task is scheduled, and it will not run until the refreshed token is about to expire
        logInfo(s"Scheduling token refresh from HDFS in $timeFromNowToRenewal millis.")
        delegationTokenRenewer.schedule(
          executorUpdaterRunnable, timeFromNowToRenewal, TimeUnit.MILLISECONDS)
      }
    } catch {
      // Since the file may get deleted while we are reading it, catch the Exception and come
      // back in an hour to try again
      case NonFatal(e) =>
        logWarning("Error while trying to update credentials, will try again in 1 hour", e)
        delegationTokenRenewer.schedule(executorUpdaterRunnable, 1, TimeUnit.HOURS)
    }

At the beginning of the function FileSystem.get(freshHadoopConf) creates the remoteFs object; freshHadoopConf has "fs.hdfs.impl.disable.cache" set to true, so a brand-new FileSystem object is created. This is clearly meant to work around HDFS-9276, but unfortunately it happens in the wrong place: when this new object is created, the tokens in the UGI have not been updated yet. Only afterwards are the new tokens read from HDFS and added to the UGI, and then the next task is scheduled. Since no new FileSystem is created after the tokens are updated, the NameNode tokens in the UGI never get refreshed, and the token-expired error still occurs.
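The implication is that the order of operations matters. As a hedged sketch of the fix idea (an illustration of the reasoning above, not the actual Spark or HDFS patch): add the new credentials to the UGI first, and only then create a fresh, uncached FileSystem so that the new DFSClient re-clones the refreshed HA token into per-NameNode tokens:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.security.{Credentials, UserGroupInformation}

// Minimal sketch of the order-of-operations fix discussed above.
// `newCredentials` is assumed to hold the refreshed HA token just read from the credentials file.
def applyNewTokens(newCredentials: Credentials, hadoopConf: Configuration): FileSystem = {
  // 1) Merge the refreshed HA token into the current UGI first.
  UserGroupInformation.getCurrentUser.addCredentials(newCredentials)

  // 2) Only then create an uncached FileSystem: the new DFSClient clones the (now fresh)
  //    HA token into ip:port-specific NameNode tokens, sidestepping HDFS-9276.
  val freshConf = new Configuration(hadoopConf)
  freshConf.setBoolean("fs.hdfs.impl.disable.cache", true)
  FileSystem.get(freshConf)
}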

This concludes the analysis of the HDFS delegation token expiration problem.
