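I. Overall flow of sending a message in Kafka:

[Figure: overall architecture of the Kafka producer]

Steps:
1. ProducerInterceptors intercepts the message.
2. The Serializer serializes the message key and value.
3. The Partitioner selects a suitable partition for the message.
4. RecordAccumulator collects messages so that they can be sent in batches.
5. The Sender fetches messages from RecordAccumulator.
6. A ClientRequest is constructed.
7. The ClientRequest is handed to NetworkClient, ready to be sent.
8. NetworkClient places the request into the KafkaChannel's buffer.
9. Network I/O is performed and the request is sent.
10. When the response arrives, the ClientRequest's callback is invoked.
11. The RecordBatch's callback is invoked, which in turn invokes the callback registered on each message.
Two threads cooperate while a message is being sent. The main thread first wraps the business data in a ProducerRecord object and then calls send(), which stores the message in RecordAccumulator (the message collector, which is also the buffer shared by the main thread and the Sender thread). The Sender thread builds requests from the buffered messages and performs the network I/O: it takes messages out of RecordAccumulator and sends them in batches. KafkaProducer is thread-safe, so multiple threads can share a single KafkaProducer object.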
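The division of labour between the two threads can be illustrated with a minimal sketch. It assumes nothing about Kafka's real classes: TwoThreadSketch, its buffer, and the maxBatchSize parameter are hypothetical stand-ins for RecordAccumulator and the Sender thread, and "sending" a batch only records it in a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for RecordAccumulator plus the Sender thread; not Kafka's real classes.
final class TwoThreadSketch {
    // Buffer shared by the calling thread and the sender thread.
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    // Batches "sent" by the sender thread, kept only so they can be inspected.
    private final List<List<String>> sentBatches = new ArrayList<>();
    private volatile boolean running = true;
    private final Thread ioThread;

    TwoThreadSketch(final int maxBatchSize) {
        ioThread = new Thread(() -> {
            try {
                // Keep going until close() was requested and the buffer is empty.
                while (running || !buffer.isEmpty()) {
                    String first = buffer.poll(10, TimeUnit.MILLISECONDS);
                    if (first == null) continue;
                    List<String> batch = new ArrayList<>();
                    batch.add(first);
                    // Take whatever else is already buffered, up to the batch limit.
                    buffer.drainTo(batch, maxBatchSize - 1);
                    synchronized (sentBatches) { sentBatches.add(batch); }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "sketch-sender");
        ioThread.start();
    }

    // Called by application threads: only enqueues, never performs I/O itself.
    void send(String message) {
        buffer.add(message);
    }

    // Stops the sender after the buffer has drained and returns the batches it sent.
    List<List<String>> close() {
        running = false;
        try {
            ioThread.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        synchronized (sentBatches) { return new ArrayList<>(sentBatches); }
    }
}
```

How the messages are grouped into batches depends on thread timing, but in this sketch no batch exceeds maxBatchSize and the messages leave in the order they were appended.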
KafkaProducer implements the Producer interface, which defines the API that KafkaProducer exposes. The methods fall into four groups:

send(): sends a message. In practice it places the message in RecordAccumulator, where it waits to be sent.
flush(): waits until all messages in RecordAccumulator have been sent; the calling thread blocks until the flush completes.
partitionsFor(): KafkaProducer maintains a Metadata object that stores the Kafka cluster metadata, and its contents are refreshed periodically. partitionsFor() returns the partition information of a given topic from Metadata.
close(): closes the producer. It mainly sets the close flag, waits for the messages in RecordAccumulator to drain, and shuts down the Sender thread.

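II. KafkaProducer analysis:

Important fields of KafkaProducer:

[Figure: fields of the KafkaProducer class]

clientId: the unique identifier of this producer.
partitioner: the partition selector, which routes each message to a suitable partition according to some strategy.
maxRequestSize: the maximum size of a message, including the message header, the serialized key, and the serialized value.
totalMemorySize: the total size of the buffer the producer uses to hold messages waiting to be sent (the buffer.memory setting).
accumulator: the RecordAccumulator, which collects and buffers messages until the Sender thread sends them.
sender: the Sender task that sends messages. It implements Runnable and runs in the ioThread thread.
ioThread: the thread that runs the Sender task, referred to as the "Sender thread".
compressionType: the compression algorithm; the options are none, gzip, snappy, and lz4. Compression is applied to batches of messages in RecordAccumulator, so the more messages a batch holds, the better the compression ratio.
keySerializer: the serializer for keys.
valueSerializer: the serializer for values.
metadata: the metadata of the whole Kafka cluster.
maxBlockTimeMs: the maximum time to wait for the Kafka cluster metadata to be updated.
requestTimeoutMs: the message timeout, that is, the maximum time from sending a message to receiving its ACK response.
interceptors: the collection of ProducerInterceptor objects. A ProducerInterceptor can intercept or modify a message before it is sent, and can also pre-process the ACK response before the user's Callback runs.
producerConfig: the configuration object, from which the components configured for KafkaProducer are instantiated through reflection.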

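The KafkaProducer constructor:

Loading the KafkaProducer configuration

Custom properties

Custom properties are supplied by the application. One example is ProducerConfig.BOOTSTRAP_SERVERS_CONFIG ("A list of host/port pairs"): it gives the ip:port addresses used for the initial connection to the Kafka cluster, from which the producer discovers the full list of brokers. You therefore do not need to list every broker, but it is best to list several in case one of them is down.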

Properties props = new Properties();
props.put(ProducerConfig.CLIENT_ID_CONFIG, "testConstructorClose");
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
// MockMetricsReporter is a helper class from Kafka's own test sources
props.put(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG, MockMetricsReporter.class.getName());
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.IntegerSerializer");
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer");
// The type parameters match the configured IntegerSerializer and StringSerializer
KafkaProducer<Integer, String> producer = new KafkaProducer<>(props);

Loading default and custom properties

The constructor instantiates ProducerConfig. The ProducerConfig class has a static field CONFIG, a ConfigDef created with new ConfigDef().

/**
     * A producer is instantiated by providing a set of key-value pairs as configuration. Valid configuration strings
     * are documented <a >here</a>.
     * @param properties   The producer configs
     */
    public KafkaProducer(Properties properties) {
        this(new ProducerConfig(properties), null, null);
    }

Loading the default properties:

static {
        CONFIG = new ConfigDef().define(BOOTSTRAP_SERVERS_CONFIG, Type.LIST, Importance.HIGH, CommonClientConfigs.BOOSTRAP_SERVERS_DOC)
                                .define(BUFFER_MEMORY_CONFIG, Type.LONG, 32 * 1024 * 1024L, atLeast(0L), Importance.HIGH, BUFFER_MEMORY_DOC)
......

Finally, the custom property values override the default values:

public Map<String, Object> parse(Map<?, ?> props) {
        // Check all configurations are defined
        List<String> undefinedConfigKeys = undefinedDependentConfigs();
        if (!undefinedConfigKeys.isEmpty()) {
            String joined = Utils.join(undefinedConfigKeys, ",");
            throw new ConfigException("Some configurations in are referred in the dependents, but not defined: " + joined);
        }
        // parse all known keys
        Map<String, Object> values = new HashMap<>();
        for (ConfigKey key : configKeys.values()) {
            Object value;
            // props map contains setting - assign ConfigKey value
            if (props.containsKey(key.name)) {
                value = parseType(key.name, props.get(key.name), key.type);
                // props map doesn't contain setting, the key is required because no default value specified - its an error
            } else if (key.defaultValue == NO_DEFAULT_VALUE) {
                throw new ConfigException("Missing required configuration \"" + key.name + "\" which has no default value.");
            } else {
                // otherwise assign setting its default value
                value = key.defaultValue;
            }
            if (key.validator != null) {
                key.validator.ensureValid(key.name, value);
            }
            values.put(key.name, value);
        }
        return values;
    }
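The override rule implemented by parse() can be reduced to a few lines. The following is a minimal sketch, not Kafka's ConfigDef: ConfigSketch, define(), and defineRequired() are hypothetical names, and type parsing and validators are left out. It shows the three cases only: a user value wins, a default fills the gap, and a missing setting without a default is an error.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified analogue of ConfigDef; these names are not Kafka's API.
final class ConfigSketch {
    // Marker meaning "this key has no default, so the user must supply it".
    private static final Object NO_DEFAULT = new Object();
    private final Map<String, Object> defaults = new LinkedHashMap<>();

    ConfigSketch define(String name, Object defaultValue) {
        defaults.put(name, defaultValue);
        return this;
    }

    ConfigSketch defineRequired(String name) {
        defaults.put(name, NO_DEFAULT);
        return this;
    }

    Map<String, Object> parse(Map<String, ?> props) {
        Map<String, Object> values = new HashMap<>();
        for (Map.Entry<String, Object> e : defaults.entrySet()) {
            String name = e.getKey();
            if (props.containsKey(name)) {
                // The user-supplied value overrides the default.
                values.put(name, props.get(name));
            } else if (e.getValue() == NO_DEFAULT) {
                throw new IllegalArgumentException(
                        "Missing required configuration \"" + name + "\" which has no default value.");
            } else {
                // Otherwise fall back to the default.
                values.put(name, e.getValue());
            }
        }
        return values;
    }
}
```

As in the original method, keys that were never defined are simply not copied into the result.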

Start of the KafkaProducer constructor

The KafkaProducer constructor initializes the fields listed above. A few of the important ones are described here.

 private KafkaProducer(ProducerConfig config, Serializer<K> keySerializer, Serializer<V> valueSerializer) {
        try {
            log.trace("Starting the Kafka producer");
         ......
            // instantiate the configured partitioner class through reflection
            this.partitioner = config.getConfiguredInstance(ProducerConfig.PARTITIONER_CLASS_CONFIG, Partitioner.class);
 
            // create the Kafka cluster metadata and seed it with the bootstrap addresses
            this.metadata = new Metadata(retryBackoffMs, config.getLong(ProducerConfig.METADATA_MAX_AGE_CONFIG));
            List<InetSocketAddress> addresses = ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG));
            this.metadata.update(Cluster.bootstrap(addresses), time.milliseconds());
      
            // create the RecordAccumulator
            this.accumulator = new RecordAccumulator(config.getInt(ProducerConfig.BATCH_SIZE_CONFIG),
                    this.totalMemorySize,
                    this.compressionType,
                    config.getLong(ProducerConfig.LINGER_MS_CONFIG),
                    retryBackoffMs,
                    metrics,
                    time);
          
           ......
            ChannelBuilder channelBuilder = ClientUtils.createChannelBuilder(config.values());
            // create the NetworkClient, the core of KafkaProducer's network I/O (covered later)
            NetworkClient client = new NetworkClient(
                    new Selector(config.getLong(ProducerConfig.CONNECTIONS_MAX_IDLE_MS_CONFIG), this.metrics, time, "producer", channelBuilder),
                    this.metadata,
                    clientId,
                    config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION),
                    config.getLong(ProducerConfig.RECONNECT_BACKOFF_MS_CONFIG),
                    config.getInt(ProducerConfig.SEND_BUFFER_CONFIG),
                    config.getInt(ProducerConfig.RECEIVE_BUFFER_CONFIG),
                    this.requestTimeoutMs, time);
            this.sender = new Sender(client,
                    this.metadata,
                    this.accumulator,
                    config.getInt(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION) == 1,
                    config.getInt(ProducerConfig.MAX_REQUEST_SIZE_CONFIG),
                    (short) parseAcks(config.getString(ProducerConfig.ACKS_CONFIG)),
                    config.getInt(ProducerConfig.RETRIES_CONFIG),
                    this.metrics,
                    new SystemTime(),
                    clientId,
                    this.requestTimeoutMs);
           
            String ioThreadName = "kafka-producer-network-thread" + (clientId.length() > 0 ? " | " + clientId : "");
            // start the thread that runs the Sender
            this.ioThread = new KafkaThread(ioThreadName, this.sender, true);
            this.ioThread.start();

            // instantiate the configured keySerializer and valueSerializer classes through reflection
            if (keySerializer == null) {
                this.keySerializer = config.getConfiguredInstance(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                        Serializer.class);
                this.keySerializer.configure(config.originals(), true);
            } else {
                config.ignore(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG);
                this.keySerializer = keySerializer;
            }
            if (valueSerializer == null) {
                this.valueSerializer = config.getConfiguredInstance(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                        Serializer.class);
                this.valueSerializer.configure(config.originals(), false);
            } else {
                config.ignore(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG);
                this.valueSerializer = valueSerializer;
            }

            // load interceptors and make sure they get clientId
            userProvidedConfigs.put(ProducerConfig.CLIENT_ID_CONFIG, clientId);
            List<ProducerInterceptor<K, V>> interceptorList = (List) (new ProducerConfig(userProvidedConfigs)).getConfiguredInstances(ProducerConfig.INTERCEPTOR_CLASSES_CONFIG,
                    ProducerInterceptor.class);
            this.interceptors = interceptorList.isEmpty() ? null : new ProducerInterceptors<>(interceptorList);

            config.logUnused();
            AppInfoParser.registerAppInfo(JMX_PREFIX, clientId);
            log.debug("Kafka producer started");
        } catch (Throwable t) {
            // call close methods if internal objects are already constructed
            // this is to prevent resource leak. see KAFKA-2121
            close(0, TimeUnit.MILLISECONDS, true);
            // now propagate the exception
            throw new KafkaException("Failed to construct kafka producer", t);
        }
    }

The KafkaProducer send() method:

Call flow of KafkaProducer.send():

[Figure: call flow of the KafkaProducer send() method]

Call ProducerInterceptors.onSend() so that the ProducerInterceptor objects can intercept or modify the message.
Call waitOnMetadata() to obtain the Kafka cluster information; underneath, this wakes up the Sender thread to update the cluster metadata stored in Metadata.
Call Serializer.serialize() to serialize the message key and value.
Call partition() to select a suitable partition for the message.
Call RecordAccumulator.append() to append the message to RecordAccumulator.
Wake up the Sender thread, which sends the messages buffered in RecordAccumulator.

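III. ProducerInterceptors & ProducerInterceptor

ProducerInterceptors is a collection of ProducerInterceptor objects. Its onSend(), onAcknowledgement(), and onSendError() methods simply loop over the wrapped ProducerInterceptor objects and call the corresponding method on each.
A ProducerInterceptor can intercept or modify a message before it is sent, and can also pre-process the ACK response before the user's Callback runs. You can think of it as a filter in a Java web application. To create one, implement the ProducerInterceptor interface; an instance of the class is then created and added to ProducerInterceptors (the constructor shown earlier does this for the classes listed under ProducerConfig.INTERCEPTOR_CLASSES_CONFIG).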
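The "loop over the wrapped interceptors" idea can be sketched without the Kafka library. RecordInterceptor and InterceptorChain below are hypothetical types that work on plain strings; skipping an interceptor that throws is a choice made for this sketch, so that one faulty interceptor does not block the rest of the chain.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-method interface, loosely modelled on ProducerInterceptor.onSend().
interface RecordInterceptor {
    String onSend(String record);
}

// Hypothetical container that applies each interceptor in registration order.
final class InterceptorChain {
    private final List<RecordInterceptor> interceptors;

    InterceptorChain(List<RecordInterceptor> interceptors) {
        this.interceptors = new ArrayList<>(interceptors);
    }

    String onSend(String record) {
        String current = record;
        for (RecordInterceptor interceptor : interceptors) {
            try {
                // Each interceptor receives the output of the previous one.
                current = interceptor.onSend(current);
            } catch (RuntimeException e) {
                // Sketch assumption: ignore the failing interceptor and keep the last good value.
            }
        }
        return current;
    }
}
```

Because each interceptor sees the previous one's output, the order of registration matters, much like the order of filters in a web application.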

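IV. Kafka cluster metadata

Leader replicas change dynamically

From the producer's point of view, the number of partitions and the placement of leader replicas change dynamically. For example:

At any time during operation a leader replica may fail, which triggers a leader election; the new leader then continues to serve requests from a different broker.
When a topic needs more parallelism for processing messages, we can increase its number of partitions.

How KafkaProducer routes a message:

KafkaProducer has to append the message to the leader replica of one partition of the given topic. It first needs to know how many partitions the topic has, so that routing can pick the target partition. It then needs the address, port, and other details of the server hosting that partition's leader replica before it can open a connection and send the message to Kafka.

KafkaProducer metadata:

KafkaProducer maintains the Kafka cluster metadata: which partitions each topic has, which node hosts the leader replica of each partition, which nodes host the follower replicas, which replicas are in the ISR set, and the network address and port of each of these nodes.
KafkaProducer uses three classes, Node, TopicPartition, and PartitionInfo, to hold this cluster data.

[Figures: the Node, TopicPartition, and PartitionInfo classes]

Node represents one node in the cluster. It records the node's id, host, port, and similar information.
TopicPartition represents one partition of a topic. Its topic field is the name of the topic, and its partition field is the number (ID) of this partition within the topic.
PartitionInfo holds the details of one partition. The topic and partition fields have the same meaning as in TopicPartition. In addition, the leader field holds the node that hosts the leader replica, the replicas field records the nodes that host all replicas, and the inSyncReplicas field records the nodes that host the replicas in the ISR set.
Together, these three classes can represent all the cluster metadata that KafkaProducer needs. The metadata is stored in the Cluster class, indexed in several different ways to make lookups convenient. The core fields of Cluster are:

[Figure: core fields of the Cluster class]
nodes: the list of nodes in the Kafka cluster.
nodesById: the mapping from broker id to Node, for lookup by broker id.
partitionsByTopicPartition: the mapping from TopicPartition to PartitionInfo.
partitionsByTopic: the mapping from topic name to PartitionInfo list, so that the details of all partitions of a topic can be looked up by topic name.
availablePartitionsByTopic: also a mapping from topic name to PartitionInfo list, but the List<PartitionInfo> here contains only partitions that currently have a leader replica. The partitions recorded in partitionsByTopic do not necessarily have one, because in some transient states, such as the election triggered when a leader's broker goes down, a partition may have no leader.
partitionsByNode: the mapping from node to PartitionInfo list, so that the details of all partitions located on a node can be looked up by node id.

The methods of Cluster are fairly simple. They mainly operate on the fields above to make querying the cluster metadata convenient, for example partitionsForTopic():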

/**
     * Get the list of partitions for this topic
     * @param topic The topic name
     * @return A list of partitions
     */
    public List<PartitionInfo> partitionsForTopic(String topic) {
        return this.partitionsByTopic.get(topic);
    }

Note: all fields of Node, TopicPartition, PartitionInfo, and Cluster are private final, and the classes expose only query methods, no mutators. Objects of these four classes are therefore immutable, which makes them thread-safe.
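A reduced model makes both points concrete: the same partition data indexed in more than one map, and immutability as the source of thread safety. PartitionSketch and ClusterSketch below are hypothetical classes written for this illustration; the leader is represented as a nullable broker id purely to keep the sketch short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, reduced version of PartitionInfo: final fields and getters only.
final class PartitionSketch {
    private final String topic;
    private final int partition;
    private final Integer leaderId; // null while the partition has no leader

    PartitionSketch(String topic, int partition, Integer leaderId) {
        this.topic = topic;
        this.partition = partition;
        this.leaderId = leaderId;
    }

    String topic() { return topic; }
    int partition() { return partition; }
    Integer leaderId() { return leaderId; }
}

// Hypothetical, reduced version of Cluster: indexes are built once and never modified.
final class ClusterSketch {
    private final Map<String, List<PartitionSketch>> partitionsByTopic;
    private final Map<String, List<PartitionSketch>> availablePartitionsByTopic;

    ClusterSketch(List<PartitionSketch> partitions) {
        Map<String, List<PartitionSketch>> all = new HashMap<>();
        Map<String, List<PartitionSketch>> available = new HashMap<>();
        for (PartitionSketch p : partitions) {
            all.computeIfAbsent(p.topic(), t -> new ArrayList<>()).add(p);
            List<PartitionSketch> withLeader = available.computeIfAbsent(p.topic(), t -> new ArrayList<>());
            // Only partitions that currently have a leader count as available.
            if (p.leaderId() != null) withLeader.add(p);
        }
        this.partitionsByTopic = freeze(all);
        this.availablePartitionsByTopic = freeze(available);
    }

    // Wraps every list and the map itself in unmodifiable views.
    private static Map<String, List<PartitionSketch>> freeze(Map<String, List<PartitionSketch>> source) {
        Map<String, List<PartitionSketch>> copy = new HashMap<>();
        for (Map.Entry<String, List<PartitionSketch>> e : source.entrySet())
            copy.put(e.getKey(), Collections.unmodifiableList(e.getValue()));
        return Collections.unmodifiableMap(copy);
    }

    List<PartitionSketch> partitionsForTopic(String topic) {
        return partitionsByTopic.get(topic);
    }

    List<PartitionSketch> availablePartitionsForTopic(String topic) {
        return availablePartitionsByTopic.get(topic);
    }
}
```

Since nothing can change after construction, a metadata update in this style means building a new object and swapping the reference, rather than editing the old one in place.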
Metadata wraps the Cluster object and also records the time of the last update to the Cluster data, the version number (version), whether an update is needed, and similar information.

Core fields of Metadata:

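[Figure: core fields of the Metadata class]

topics: the set of all topics currently known; the cluster field holds the latest metadata for these topics.
version: the version number of the Kafka cluster metadata. Each successful update increments version by 1. Comparing the old and new version numbers tells whether an update has completed.
metadataExpireMs: how often the metadata is refreshed. The default is 300 * 1000 ms, that is, 5 minutes.
refreshBackoffMs: the minimum interval between two requests to update the metadata held in Cluster; the default is 100 ms. This prevents overly frequent updates from congesting the network and adding load on the server. Wherever Kafka retries an operation, a backoff time of this kind appears in the design.
lastRefreshMs: the timestamp of the last metadata update attempt (failed attempts included).
lastSuccessfulRefreshMs: the timestamp of the last successful update. If every update succeeds, lastSuccessfulRefreshMs equals lastRefreshMs; otherwise lastRefreshMs > lastSuccessfulRefreshMs.
cluster: the Kafka cluster metadata.
needUpdate: flags whether Cluster must be force-updated; it is one of the conditions that trigger the Sender thread to update the cluster metadata.
listeners: the listeners that are notified of Metadata updates. A custom listener only needs to implement Metadata.Listener.onMetadataUpdate(); all Listener objects in the collection are notified before the cluster field of Metadata is updated.
needMetadataForAllTopics: whether metadata for all topics is needed. Normally KafkaProducer maintains metadata only for the topics it uses, which is a subset of all topics in the cluster.
The methods of Metadata are fairly simple and mostly operate on the fields above. The two used by the main thread are requestUpdate() and awaitUpdate(). requestUpdate() sets needUpdate to true, so that the Sender thread refreshes the cluster metadata the next time it runs, and returns the current value of version. awaitUpdate() uses version to decide whether the update has completed and blocks while it has not: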

/**
     * Request an update of the current cluster metadata info, return the current version before the update
     */
    public synchronized int requestUpdate() {
        // set needUpdate to true to force an update of Cluster
        this.needUpdate = true;
        // return the current version of the Kafka cluster metadata
        return this.version;
    }

 /**
     * Wait for metadata update until the current version is larger than the last version we know of
     */
    public synchronized void awaitUpdate(final int lastVersion, final long maxWaitMs) throws InterruptedException {
        if (maxWaitMs < 0) {
            throw new IllegalArgumentException("Max time to wait for metadata updates should not be < 0 milli seconds");
        }
        long begin = System.currentTimeMillis();
        long remainingWaitMs = maxWaitMs;
        while (this.version <= lastVersion) {
            if (remainingWaitMs != 0)
                wait(remainingWaitMs);
            long elapsed = System.currentTimeMillis() - begin;
            if (elapsed >= maxWaitMs)
                throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
            remainingWaitMs = maxWaitMs - elapsed;
        }
    }

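Explanation:
maxWaitMs: the maximum time to wait for the metadata version to be updated.
remainingWaitMs: the time that wait() still has to wait.
elapsed = System.currentTimeMillis() - begin: the time elapsed since the wait began.

The fields of Metadata can be read by the main thread and updated by the Sender thread, so the class must be thread-safe; that is why the methods above are all synchronized. The Sender thread itself is covered later.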

Analysis of KafkaProducer.waitOnMetadata():

This method triggers an update of the Kafka metadata and blocks the main thread until the update completes. Steps:

 /**
     * Wait for cluster metadata including partitions for the given topic to be available.
     * @param topic The topic we want metadata for
     * @param maxWaitMs The maximum time in ms for waiting on the metadata
     * @return The amount of time we waited in ms
     */
    private long waitOnMetadata(String topic, long maxWaitMs) throws InterruptedException {
        // if Metadata does not yet track this topic, add it to the topics set
        if (!this.metadata.containsTopic(topic))
            this.metadata.add(topic);
        // the partition details are already available
        if (metadata.fetch().partitionsForTopic(topic) != null)
            return 0;

        long begin = time.milliseconds();
        long remainingWaitMs = maxWaitMs;
        while (metadata.fetch().partitionsForTopic(topic) == null) {
            log.trace("Requesting metadata update for topic {}.", topic);
            // set needUpdate and get the current metadata version
            int version = metadata.requestUpdate();
            sender.wakeup(); // wake up the Sender thread
            // block until the metadata update completes
            metadata.awaitUpdate(version, remainingWaitMs);
            long elapsed = time.milliseconds() - begin;
            if (elapsed >= maxWaitMs) // timeout check
                throw new TimeoutException("Failed to update metadata after " + maxWaitMs + " ms.");
            // authorization check
            if (metadata.fetch().unauthorizedTopics().contains(topic))
                throw new TopicAuthorizationException(topic);
            remainingWaitMs = maxWaitMs - elapsed;
        }
        return time.milliseconds() - begin;
    }

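1. Check whether Metadata already tracks the given topic. If not, add the topic to the topics set so that the next update fetches its metadata from the server.
2. Try to get the partition details of the topic. If that fails, call requestUpdate() to set Metadata.needUpdate and obtain the current metadata version.
3. Wake up the Sender thread, which updates the cluster metadata stored in Metadata.
4. The main thread calls awaitUpdate() and waits for the Sender thread to finish the update.
5. Get the partition details of the topic (the list of PartitionInfo) from Metadata. If that fails, go back to step 2 and retry; if the wait times out, throw an exception.

The Sender thread woken by the main thread fetches the Cluster information from the server and then calls update() to store it: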

/**
     * Update the cluster metadata
     */
    public synchronized void update(Cluster cluster, long now) {
        this.needUpdate = false;
        this.lastRefreshMs = now;
        this.lastSuccessfulRefreshMs = now;
        this.version += 1;

        for (Listener listener: listeners)
            listener.onMetadataUpdate(cluster);

        // Do this after notifying listeners as subscribed topics' list can be changed by listeners
        this.cluster = this.needMetadataForAllTopics ? getClusterForCurrentTopics(cluster) : cluster;

        notifyAll();
        log.debug("Updated cluster metadata version {} to {}", this.version, this.cluster);
    }
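The handshake between requestUpdate(), awaitUpdate(), and update() can be exercised in isolation. The class below is a hypothetical reduction of Metadata to that handshake: the cluster is a plain string, awaitUpdate() returns false on timeout instead of throwing, and a deadline replaces the begin/elapsed bookkeeping. Checking the remaining time before every wait() also covers the case the original guards with remainingWaitMs != 0, because Object.wait(0) would block indefinitely.

```java
// Hypothetical reduction of Metadata to the version handshake only.
final class VersionedMetadataSketch {
    private int version = 0;
    private boolean needUpdate = false;
    private String cluster = "bootstrap";

    // Main thread: flag that an update is wanted and remember the version seen now.
    synchronized int requestUpdate() {
        needUpdate = true;
        return version;
    }

    // Main thread: block until the version moves past lastVersion or the time runs out.
    synchronized boolean awaitUpdate(int lastVersion, long maxWaitMs) {
        long deadline = System.currentTimeMillis() + maxWaitMs;
        try {
            while (version <= lastVersion) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) return false; // timed out
                wait(remaining);                  // the loop re-checks after every wake-up
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Sender thread: publish the new cluster, bump the version, wake all waiters.
    synchronized void update(String newCluster) {
        needUpdate = false;
        cluster = newCluster;
        version += 1;
        notifyAll();
    }

    synchronized boolean updateRequested() { return needUpdate; }

    synchronized String cluster() { return cluster; }
}
```

Comparing version numbers instead of waiting for a bare notification makes the wait robust: if update() runs before the caller reaches awaitUpdate(), the loop condition is already false and the caller returns at once.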

V. Serializer & Deserializer:

[Figure: the Serializer and Deserializer interfaces]

Kafka already provides Serializer and Deserializer implementations for the basic Java types. You can also write your own by implementing the Serializer and Deserializer interfaces. Only Serializer is described here; Deserializer is the inverse operation.
configure() performs the configuration that precedes serialization; for example, StringSerializer.configure() selects the encoding, which defaults to UTF-8. serialize() is where serialization actually happens: it converts the Java object passed in into a byte[].
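The configure-then-serialize split can be sketched in plain Java. EncodingStringSerializer below is a hypothetical class that mirrors the shape of the two methods but does not implement Kafka's Serializer interface, so it runs without the client library. The property names it reads ("key.serializer.encoding", "value.serializer.encoding", "serializer.encoding") are assumptions of this sketch; check the StringSerializer source of your Kafka version before relying on them.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Map;

// Hypothetical serializer with the same two-step shape: configure once, then serialize.
final class EncodingStringSerializer {
    // UTF-8 unless configure() finds an explicit encoding.
    private Charset charset = StandardCharsets.UTF_8;

    // isKey tells the serializer whether it handles keys or values.
    void configure(Map<String, ?> configs, boolean isKey) {
        // Assumed property names; a specific setting wins over the generic one.
        Object encoding = configs.get(isKey ? "key.serializer.encoding" : "value.serializer.encoding");
        if (encoding == null) encoding = configs.get("serializer.encoding");
        if (encoding instanceof String) charset = Charset.forName((String) encoding);
    }

    // Turns the Java object into the bytes that go on the wire; null stays null.
    byte[] serialize(String topic, String data) {
        return data == null ? null : data.getBytes(charset);
    }
}
```

A matching deserializer would apply the same charset in the opposite direction with new String(bytes, charset).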

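VI. Partitioner:

Partitioner is the component that selects a partition for each message.
The business logic can decide which partition a message is routed to, or it can leave the choice of partition to the producer.
KafkaProducer.partition() is called first: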

/**
     * computes partition for given record.
     * if the record has partition returns the value otherwise
     * calls configured partitioner class to compute the partition.
     */
    private int partition(ProducerRecord<K, V> record, byte[] serializedKey , byte[] serializedValue, Cluster cluster) {
        Integer partition = record.partition();
        if (partition != null) {
            List<PartitionInfo> partitions = cluster.partitionsForTopic(record.topic());
            int lastPartition = partitions.size() - 1;
            // they have given us a partition, use it
            if (partition < 0 || partition > lastPartition) {
                throw new IllegalArgumentException(String.format("Invalid partition given with record: %d is not in the range [0...%d].", partition, lastPartition));
            }
            return partition;
        }
        return this.partitioner.partition(record.topic(), record.key(), serializedKey, record.value(), serializedValue,
            cluster);
    }

If the business code does not specify which partition a message should go to, the configured Partitioner is called; its default implementation is DefaultPartitioner.
The key/value configuration entries passed in when KafkaProducer is created are stored in the originals field of AbstractConfig. The core method of AbstractConfig is getConfiguredInstance(), which uses reflection to instantiate the class named in the originals field.
The partitioner object is obtained through reflection from the default configuration, which is defined in the static CONFIG field of the ProducerConfig class:

 .define(PARTITIONER_CLASS_CONFIG,
                                        Type.CLASS,
                                        DefaultPartitioner.class.getName(),
                                        Importance.MEDIUM, PARTITIONER_CLASS_DOC)

Obtaining the object through reflection:

 this.partitioner = config.getConfiguredInstance(ProducerConfig.PARTITIONER_CLASS_CONFIG, Partitioner.class);


The Configurable interface exists to unify initialization after reflection by exposing a single initialization entry point. Objects created through reflection in AbstractConfig.getConfiguredInstance() are all built with a no-argument constructor, yet the number and types of the fields they need to initialize vary widely. The configure() method of Configurable wraps the initialization and takes a single parameter (the originals field), so the external interface is uniform. **This design is worth remembering.**
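The pattern is small enough to reproduce without Kafka. In the sketch below, ConfigurableSketch, PrefixingComponent, and ReflectiveFactory are hypothetical names: the factory creates an object through its no-argument constructor and then, if the object opts in, hands it the whole configuration map through one configure() call.

```java
import java.util.Map;

// Hypothetical local counterpart of the idea behind Configurable; not the Kafka interface.
interface ConfigurableSketch {
    void configure(Map<String, ?> configs);
}

// Example component: reflection can only call its no-arg constructor,
// so everything it needs arrives later through configure().
final class PrefixingComponent implements ConfigurableSketch {
    private String prefix = "";

    PrefixingComponent() { }

    @Override
    public void configure(Map<String, ?> configs) {
        Object value = configs.get("prefix"); // "prefix" is a made-up setting for this sketch
        if (value != null) prefix = value.toString();
    }

    String apply(String text) { return prefix + text; }
}

final class ReflectiveFactory {
    // Instantiate className reflectively, check its type, then run the uniform init step.
    static <T> T getConfiguredInstance(String className, Class<T> type, Map<String, ?> originals) {
        try {
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            if (!type.isInstance(instance))
                throw new IllegalArgumentException(className + " is not an instance of " + type.getName());
            if (instance instanceof ConfigurableSketch)
                ((ConfigurableSketch) instance).configure(originals);
            return type.cast(instance);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
    }
}
```

The factory never needs to know which settings a component reads; each component picks what it wants out of the map, which is what keeps the entry point uniform.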
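When the ProducerRecord does not explicitly specify a partition number, DefaultPartitioner.partition() selects a suitable partition for it. If the message has no key, the partition is the counter modulo the number of partitions (the available partitions, when there are any); the counter keeps increasing, which ensures that messages are not all sent to the same partition. If the message has a key, the partition is derived from a hash of the key.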

private final AtomicInteger counter = new AtomicInteger(new Random().nextInt());

public int partition(String topic, Object key, byte[] keyBytes, Object value, byte[] valueBytes, Cluster cluster) {
    List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
    int numPartitions = partitions.size();
    if (keyBytes == null) {
        int nextValue = counter.getAndIncrement();
        List<PartitionInfo> availablePartitions = cluster.availablePartitionsForTopic(topic);
        if (availablePartitions.size() > 0) {
            int part = DefaultPartitioner.toPositive(nextValue) % availablePartitions.size();
            return availablePartitions.get(part).partition();
        } else {
            // no partitions are available, give a non-available partition
            return DefaultPartitioner.toPositive(nextValue) % numPartitions;
        }
    } else {
        // hash the keyBytes to choose a partition
        return DefaultPartitioner.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}

Note that counter is declared as an AtomicInteger rather than a plain int.
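The arithmetic of the keyless branch can be tried on its own. RoundRobinSketch is a hypothetical helper, and its toPositive() assumes the usual sign-bit mask (number & 0x7fffffff). The mask matters because the counter starts at a random int and eventually wraps around to negative values, and Math.abs() would still return a negative number for Integer.MIN_VALUE. An AtomicInteger keeps getAndIncrement() safe when several threads share one producer.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper that reproduces only the arithmetic of the keyless branch.
final class RoundRobinSketch {
    private final AtomicInteger counter;

    RoundRobinSketch(int start) {
        counter = new AtomicInteger(start);
    }

    // Clears the sign bit, so the result is never negative.
    static int toPositive(int number) {
        return number & 0x7fffffff;
    }

    // Atomic read-and-increment, then map the value onto [0, numPartitions).
    int nextPartition(int numPartitions) {
        return toPositive(counter.getAndIncrement()) % numPartitions;
    }
}
```

Starting from 0 with three partitions, successive calls cycle through 0, 1, 2, 0. The index stays in range even when the counter overflows, although the cycle is not perfectly regular at the wrap-around point.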
