Working with State

Keyed State and Operator State

Flink has two basic kinds of state: Keyed State and Operator State.

Keyed State

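Keyed State is always relative to keys and can only be used in functions and operators on a KeyedStream.
You can think of Keyed State as Operator State that has been partitioned, or sharded, with exactly one state partition per key. Each keyed state is logically bound to a unique composite of <parallel-operator-instance, key>, and since each key "belongs" to exactly one parallel instance of a keyed operator, we can think of this simply as <operator, key>.
Keyed State is further organized into so-called Key Groups. Key Groups are the atomic unit by which Flink can redistribute Keyed State. There are exactly as many Key Groups as the defined maximum parallelism. During execution, each parallel instance of a keyed operator works with the keys of one or more Key Groups.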
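
The two-step mapping described above (key to Key Group, Key Group to parallel instance) can be sketched in plain Java. This is a minimal illustration, not Flink's implementation: the class and method names are hypothetical, the plain `hashCode()` modulo is an assumed stand-in for whatever hash the runtime uses internally, and the contiguous-range formula is an assumption chosen to show why rescaling only moves whole Key Groups.

```java
public class KeyGroupSketch {

    // Step 1: a key always maps to the same key group, independent of the
    // current parallelism. Assumption: floorMod of hashCode() stands in for
    // the runtime's internal hash function.
    static int keyGroupFor(Object key, int maxParallelism) {
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    // Step 2: each parallel instance owns a contiguous range of key groups.
    // Changing the parallelism only changes this step, so state moves
    // between instances one whole key group at a time.
    static int operatorIndexFor(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // number of key groups
        for (int parallelism : new int[] {2, 4}) {
            for (String key : new String[] {"a", "b", "c"}) {
                int group = keyGroupFor(key, maxParallelism);
                int index = operatorIndexFor(group, maxParallelism, parallelism);
                System.out.println("parallelism=" + parallelism
                        + " key=" + key
                        + " keyGroup=" + group
                        + " instance=" + index);
            }
        }
    }
}
```

Note that the key group of a key stays fixed while the owning instance changes with the parallelism, which is what makes the Key Group the unit of redistribution.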

Operator State

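With Operator State (or non-keyed state), each operator state is bound to one parallel operator instance. The Kafka Connector is a good example of the use of Operator State in Flink: each parallel instance of the Kafka consumer maintains a map of topic partitions and offsets as its Operator State.
The Operator State interfaces support redistributing state among parallel operator instances when the parallelism is changed. There can be different schemes for doing this redistribution.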

Raw and Managed State

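Keyed State and Operator State exist in two forms: managed and raw.
Managed State is represented in data structures controlled by the Flink runtime, such as internal hash tables or RocksDB. Examples are "ValueState", "ListState", etc. Flink's runtime encodes the state and writes it into the checkpoints.
Raw State is state that operators keep in their own data structures. When checkpointed, they only write a sequence of bytes into the checkpoint. Flink knows nothing about the state's data structures and sees only the raw bytes.
All datastream functions can use managed state, but the raw state interfaces can only be used when implementing operators. Using managed state (rather than raw state) is recommended, because with managed state Flink can automatically redistribute state when the parallelism changes and can also do better memory management.
Note: if your managed state needs custom serialization logic, see the guide on custom serialization for managed state to ensure future compatibility. Flink's default serializers need no special treatment.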

Using Managed Keyed State

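The managed keyed state interface provides access to different types of state that are all scoped to the key of the current input element. This means that this type of state can only be used on a KeyedStream, which can be created via stream.keyBy(...).
We will first look at the different types of state available and then see how they can be used in a program. The available state primitives are: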

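ValueState<T>: keeps a value that can be updated and retrieved (scoped to the key of the input element as mentioned above, so there will possibly be one value for each key that the operation sees). The value is set using update(T) and retrieved using T value().
ListState<T>: keeps a list of elements. You can append elements and retrieve an Iterable over all currently stored elements. Elements are added using add(T); the Iterable is retrieved using Iterable<T> get().
ReducingState<T>: keeps a single value that represents the aggregation of all values added to the state. The interface is the same as for ListState, but elements added using add(T) are reduced to an aggregate using a specified ReduceFunction.
AggregatingState<IN, OUT>: keeps a single value that represents the aggregation of all values added to the state. Contrary to ReducingState, the aggregate type may be different from the type of the elements that are added to the state. The interface is the same as for ListState, but elements added using add(IN) are aggregated using a specified AggregateFunction.
FoldingState<T, ACC>: keeps a single value that represents the aggregation of all values added to the state. Contrary to ReducingState, the aggregate type may be different from the type of the elements that are added to the state. The interface is the same as for ListState, but elements added using add(T) are folded into an aggregate using a specified FoldFunction.
MapState<UK, UV>: keeps a list of mappings. You can put key-value pairs into the state and retrieve an Iterable over all currently stored mappings. Mappings are added using put(UK, UV) or putAll(Map<UK, UV>). The value associated with a user key is retrieved using get(UK). The iterable views for mappings, keys and values are retrieved using entries(), keys() and values() respectively.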
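All types of state also have a clear() method that clears the state for the currently active key, i.e. the key of the input element.
Note: FoldingState and FoldingStateDescriptor have been deprecated in Flink 1.4 and may be removed completely in the future. Please use AggregatingState and AggregatingStateDescriptor instead.
The first thing to keep in mind is that these state objects are only used for interfacing with state. The state is not necessarily stored in memory; it might reside on disk or somewhere else. The second thing to keep in mind is that the value you get from the state depends on the key of the input element. So the value you get in one invocation of your user function can differ from the value in another invocation if the keys involved are different.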
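
To make the key scoping concrete, here is a toy model in plain Java. It is only a sketch: `ScopedValue` and `setCurrentKey` are hypothetical names, not Flink classes or methods, and a `HashMap` stands in for whatever backend actually holds the state. The point is that the same handle returns different values depending on which key is currently active, and that clear() affects only that key.

```java
import java.util.HashMap;
import java.util.Map;

public class KeyScopedValueSketch {

    /** Toy stand-in for a key-scoped value handle; this is not Flink's ValueState. */
    static class ScopedValue<K, V> {
        private final Map<K, V> valuesByKey = new HashMap<>();
        private K currentKey;

        // Assumption: in a real job the runtime sets the active key before
        // each invocation of the user function; here we do it by hand.
        void setCurrentKey(K key) {
            currentKey = key;
        }

        V value() {
            return valuesByKey.get(currentKey);
        }

        void update(V newValue) {
            valuesByKey.put(currentKey, newValue);
        }

        // Clears only the entry of the currently active key.
        void clear() {
            valuesByKey.remove(currentKey);
        }
    }

    public static void main(String[] args) {
        ScopedValue<String, Integer> count = new ScopedValue<>();

        count.setCurrentKey("a");
        count.update(1);
        count.setCurrentKey("b");
        count.update(10);

        count.setCurrentKey("a");
        System.out.println(count.value()); // prints 1
        count.clear();
        System.out.println(count.value()); // prints null

        count.setCurrentKey("b");
        System.out.println(count.value()); // prints 10, untouched by the clear() above
    }
}
```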
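To get a state handle, you have to create a StateDescriptor. It holds the name of the state (as we will see later, you can create several states, and they must have unique names so that you can reference them), the type of the values that the state holds, and possibly a user-specified function, such as a ReduceFunction. Depending on what type of state you want to retrieve, you create a ValueStateDescriptor, a ListStateDescriptor, a ReducingStateDescriptor, an AggregatingStateDescriptor, a FoldingStateDescriptor or a MapStateDescriptor.
State is accessed using the RuntimeContext, so it is only available in rich functions. See the documentation on rich functions for details; an example follows shortly. The RuntimeContext that is available in a RichFunction has these methods for accessing state: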
ValueState<T> getState(ValueStateDescriptor<T>)
ReducingState<T> getReducingState(ReducingStateDescriptor<T>)
ListState<T> getListState(ListStateDescriptor<T>)
AggregatingState<IN, OUT> getAggregatingState(AggregatingStateDescriptor<IN, ACC, OUT>)
FoldingState<T, ACC> getFoldingState(FoldingStateDescriptor<T, ACC>)
MapState<UK, UV> getMapState(MapStateDescriptor<UK, UV>)

This is an example FlatMapFunction that shows how all of the parts fit together:

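import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class CountWindowAverage
        extends RichFlatMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {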

    /**
     * The ValueState handle. The first field is the count, the second field a running sum.
     */
    private transient ValueState<Tuple2<Long, Long>> sum;

    @Override
    public void flatMap(Tuple2<Long, Long> input, Collector<Tuple2<Long, Long>> out) throws Exception {

        // access the state value
        Tuple2<Long, Long> currentSum = sum.value();

        // update the count
        currentSum.f0 += 1;

        // add the second field of the input value
        currentSum.f1 += input.f1;

        // update the state
        sum.update(currentSum);

        // if the count reaches 2, emit the average and clear the state
        if (currentSum.f0 >= 2) {
            out.collect(new Tuple2<>(input.f0, currentSum.f1 / currentSum.f0));
            sum.clear();
        }
    }

    @Override
    public void open(Configuration config) {
        ValueStateDescriptor<Tuple2<Long, Long>> descriptor =
                new ValueStateDescriptor<>(
                        "average", // the state name
                        TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {}), // type information
                        Tuple2.of(0L, 0L)); // default value of the state, if nothing was set
        sum = getRuntimeContext().getState(descriptor);
    }
}

// this can be used in a streaming program like this (assuming we have a StreamExecutionEnvironment env)
env.fromElements(Tuple2.of(1L, 3L), Tuple2.of(1L, 5L), Tuple2.of(1L, 7L), Tuple2.of(1L, 4L), Tuple2.of(1L, 2L))
    .keyBy(0)
    .flatMap(new CountWindowAverage())
    .print();

// the printed output will be (1,4) and (1,5)

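This example implements a simple counting window. We key the tuples by the first field (in the example all have the same key 1). The function stores the count and a running sum in a ValueState. Once the count reaches 2, it emits the average and clears the state so that we start over from 0. Note that this would keep a different state value for each different input key if we had tuples with different values in the first field.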
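
To see where the outputs (1,4) and (1,5) come from, the same logic can be traced without Flink. The sketch below is an illustration only: the class name and the `run` helper are hypothetical, and a `HashMap` plays the role of the per-key ValueState. It mirrors the steps of flatMap() above, including the integer division of the sum by the count.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CountWindowAverageTrace {

    // Replays the flatMap() logic over (key, value) pairs and returns the
    // emitted tuples formatted as "(key,average)".
    static List<String> run(long[][] inputs) {
        Map<Long, long[]> stateByKey = new HashMap<>(); // key -> {count, sum}
        List<String> emitted = new ArrayList<>();

        for (long[] input : inputs) {
            long key = input[0];
            // Missing state behaves like the descriptor's default value (0, 0).
            long[] acc = stateByKey.computeIfAbsent(key, k -> new long[] {0L, 0L});
            acc[0] += 1;        // update the count
            acc[1] += input[1]; // add the value to the running sum

            if (acc[0] >= 2) {
                // Long division truncates, e.g. 11 / 2 gives 5.
                emitted.add("(" + key + "," + (acc[1] / acc[0]) + ")");
                stateByKey.remove(key); // corresponds to sum.clear()
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        long[][] inputs = {{1, 3}, {1, 5}, {1, 7}, {1, 4}, {1, 2}};
        System.out.println(run(inputs)); // prints [(1,4), (1,5)]
    }
}
```

The first pair of elements sums to 8 and yields 4; the second pair sums to 11 and yields 5 after truncation. The fifth element (1,2) only brings the count to 1, so it stays in state and nothing is emitted for it.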

State in the Scala DataStream API

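In addition to the interface described above, the Scala API has shortcuts for stateful map() or flatMap() functions with a single ValueState on a KeyedStream. The user function gets the current value of the ValueState in an Option and must return an updated value that will be used to update the state.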

val stream: DataStream[(String, Int)] = ...

val counts: DataStream[(String, Int)] = stream
  .keyBy(_._1)
  .mapWithState((in: (String, Int), count: Option[Int]) =>
    count match {
      case Some(c) => ( (in._1, c), Some(c + in._2) )
      case None => ( (in._1, 0), Some(in._2) )
    })

Using Managed Operator State

To use managed operator state, a stateful function can implement either the more general CheckpointedFunction interface or the ListCheckpointed<T extends Serializable> interface.

CheckpointedFunction

The CheckpointedFunction interface provides access to non-keyed state with different redistribution schemes. It requires the implementation of two methods:

void snapshotState(FunctionSnapshotContext context) throws Exception;

void initializeState(FunctionInitializationContext context) throws Exception;

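Whenever a checkpoint has to be performed, snapshotState() is called. The counterpart, initializeState(), is called every time the user-defined function is initialized, either when the function is first initialized or when it is recovering from an earlier checkpoint. Given this, initializeState() is not only the place where the different types of state are initialized, but also where the state recovery logic belongs.
Currently, list-style managed operator state is supported. The state is expected to be a List of serializable objects that are independent of each other and therefore eligible for redistribution upon rescaling. In other words, these objects are the finest granularity at which non-keyed state can be redistributed. Depending on the state access method, the following redistribution schemes are defined: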

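Even-split redistribution: Each operator returns a List of state elements. The whole state is logically the concatenation of all lists. On restore/redistribution, the list is evenly divided into as many sublists as there are parallel operators. Each operator gets a sublist, which can be empty or contain one or more elements. For example, if with parallelism 1 the checkpointed state of an operator contains the elements element1 and element2, then after increasing the parallelism to 2, element1 may end up in operator instance 0 while element2 goes to operator instance 1.

Union redistribution: Each operator returns a List of state elements. The whole state is logically the concatenation of all lists. On restore/redistribution, each operator gets the complete list of state elements.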
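
The difference between the two schemes can be sketched in plain Java. This is a model under stated assumptions, not Flink code: the class and method names are hypothetical, and the round-robin dealing in `evenSplit` is one possible way to divide the list evenly; which element ends up on which instance is up to the runtime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RedistributionSketch {

    // The whole state is logically the concatenation of all per-instance lists.
    private static <T> List<T> concat(List<List<T>> checkpointed) {
        List<T> all = new ArrayList<>();
        for (List<T> part : checkpointed) {
            all.addAll(part);
        }
        return all;
    }

    // Even-split: divide the concatenated list into newParallelism sublists.
    // Assumption: elements are dealt out round-robin; some sublists may be empty.
    static <T> List<List<T>> evenSplit(List<List<T>> checkpointed, int newParallelism) {
        List<T> all = concat(checkpointed);
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < all.size(); i++) {
            result.get(i % newParallelism).add(all.get(i));
        }
        return result;
    }

    // Union: every instance receives the complete concatenated list.
    static <T> List<List<T>> union(List<List<T>> checkpointed, int newParallelism) {
        List<T> all = concat(checkpointed);
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(all));
        }
        return result;
    }

    public static void main(String[] args) {
        // Checkpoint taken with parallelism 1, restored with parallelism 2.
        List<List<String>> checkpointed =
                Arrays.asList(Arrays.asList("element1", "element2"));

        System.out.println(evenSplit(checkpointed, 2)); // prints [[element1], [element2]]
        System.out.println(union(checkpointed, 2));     // prints [[element1, element2], [element1, element2]]
    }
}
```

With union redistribution each instance sees everything, so the function itself has to decide which elements it keeps; with even-split the elements are already partitioned on restore.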
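Below is an example of a stateful SinkFunction that uses CheckpointedFunction to buffer elements before sending them to the outside world. It demonstrates the basic even-split redistribution list state: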

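import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

import java.util.ArrayList;
import java.util.List;

public class BufferingSink
      implements SinkFunction<Tuple2<String, Integer>>,
             CheckpointedFunction {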

    private final int threshold;

    private transient ListState<Tuple2<String, Integer>> checkpointedState;

    private List<Tuple2<String, Integer>> bufferedElements;

    public BufferingSink(int threshold) {
        this.threshold = threshold;
        this.bufferedElements = new ArrayList<>();
    }

    @Override
    public void invoke(Tuple2<String, Integer> value) throws Exception {
        bufferedElements.add(value);
        // use >= so that a buffer restored with more elements than the threshold still flushes
        if (bufferedElements.size() >= threshold) {
            for (Tuple2<String, Integer> element: bufferedElements) {
                // send it to the sink
            }
            bufferedElements.clear();
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        checkpointedState.clear();
        for (Tuple2<String, Integer> element : bufferedElements) {
            checkpointedState.add(element);
        }
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        ListStateDescriptor<Tuple2<String, Integer>> descriptor =
            new ListStateDescriptor<>(
                "buffered-elements",
                TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {}));

        checkpointedState = context.getOperatorStateStore().getListState(descriptor);

        if (context.isRestored()) {
            for (Tuple2<String, Integer> element : checkpointedState.get()) {
                bufferedElements.add(element);
            }
        }
    }
}

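The initializeState method takes a FunctionInitializationContext as its argument. It is used to initialize the non-keyed state "containers". Here the container is of type ListState, in which the non-keyed state objects are stored upon checkpointing.
Note how the state is initialized, similar to keyed state, with a StateDescriptor that contains the state name and information about the type of the value that the state holds: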

ListStateDescriptor<Tuple2<String, Integer>> descriptor =
    new ListStateDescriptor<>(
        "buffered-elements",
        TypeInformation.of(new TypeHint<Tuple2<String, Integer>>() {}));

checkpointedState = context.getOperatorStateStore().getListState(descriptor);

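The naming convention of the state access methods contains the redistribution pattern followed by the state structure. For example, to use list state with the union redistribution scheme on restore, access the state using getUnionListState(descriptor). If the method name does not contain a redistribution pattern, e.g. getListState(descriptor), the basic even-split redistribution scheme is used.
After initializing the container, we use the isRestored() method of the context to check whether we are recovering after a failure. If it returns true, i.e. we are recovering, the restore logic is applied.
As shown in the BufferingSink code, the ListState recovered during state initialization is kept in a class variable for later use in snapshotState(). There the ListState is cleared of all objects included in the previous checkpoint and is then filled with the new objects we want to checkpoint.
As a side note, keyed state can also be initialized in the initializeState() method. This is done using the provided FunctionInitializationContext.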

ListCheckpointed

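The ListCheckpointed interface is a more limited variant of CheckpointedFunction, which only supports list-style state with the even-split redistribution scheme on restore. It also requires the implementation of two methods: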

List<T> snapshotState(long checkpointId, long timestamp) throws Exception;

void restoreState(List<T> state) throws Exception;

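In snapshotState() the operator should return a list of objects to checkpoint, and restoreState() has to handle such a list upon recovery. If the state cannot be re-partitioned, you can always return Collections.singletonList(MY_STATE) from snapshotState().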

Stateful Source Functions

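Stateful sources require a bit more care than other operators. To make the updates to the state and the output collection atomic (required for exactly-once semantics on failure/recovery), the user is required to get a lock from the source's context.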

public static class CounterSource
        extends RichParallelSourceFunction<Long>
        implements ListCheckpointed<Long> {

    /**  current offset for exactly once semantics */
    private Long offset = 0L;

    /** flag for job cancellation */
    private volatile boolean isRunning = true;

    @Override
    public void run(SourceContext<Long> ctx) {
        final Object lock = ctx.getCheckpointLock();

        while (isRunning) {
            // output and state update are atomic
            synchronized (lock) {
                ctx.collect(offset);
                offset += 1;
            }
        }
    }

    @Override
    public void cancel() {
        isRunning = false;
    }

    @Override
    public List<Long> snapshotState(long checkpointId, long checkpointTimestamp) {
        return Collections.singletonList(offset);
    }

    @Override
    public void restoreState(List<Long> state) {
        for (Long s : state)
            offset = s;
    }
}

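Some operators may need to know when a checkpoint has been fully acknowledged by Flink in order to communicate that to the outside world. In this case, see the org.apache.flink.runtime.state.CheckpointListener interface.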
