In the previous post we introduced the allreduce methods in mpi4py. Here we turn to the reduce-scatter operation.
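For a reduce-scatter on an intra-communicator, a reduction is first applied to the input vectors held by the individual processes, and the resulting vector is then scattered across the processes. This is equivalent to a reduce rooted at some process, followed by a scatter from that process.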
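The reduce-then-scatter view can be sketched without MPI. The following is a minimal NumPy simulation, not mpi4py code: `simulate_reduce_scatter_block` is a hypothetical helper introduced only for illustration, and each row of `inputs` stands for the send buffer of one simulated process.

```python
import numpy as np

def simulate_reduce_scatter_block(inputs, op=np.add):
    """Simulate a block reduce-scatter on an intra-communicator (no MPI).

    inputs: 2-D array, one row per simulated process (hypothetical layout).
    Returns a list whose i-th entry is what process i would receive.
    """
    inputs = np.asarray(inputs)
    nprocs, n = inputs.shape
    if n % nprocs != 0:
        raise ValueError("vector length must be a multiple of the process count")
    # Step 1: reduce element-wise across all simulated processes.
    reduced = op.reduce(inputs, axis=0)
    # Step 2: scatter equal-sized blocks, one block per process.
    return np.split(reduced, nprocs)

# Four simulated processes, each holding [0, 1, ..., 7].
inputs = np.tile(np.arange(8, dtype='i'), (4, 1))
blocks = simulate_reduce_scatter_block(inputs)
for rank, block in enumerate(blocks):
    print('rank %d receives %s' % (rank, block))
```

With four simulated processes and a sum reduction, rank 0 ends up with the first two elements of the summed vector, rank 1 with the next two, and so on.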
For a reduce-scatter on an inter-communicator with associated groups A and B, the reduction of the data provided by all processes in A is scattered across the processes of B, and vice versa.
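The two-group exchange can also be illustrated with a small NumPy simulation. This is only a sketch under stated assumptions: it runs no MPI, `simulate_intercomm_reduce_scatter` is a hypothetical helper, equal-sized blocks are assumed, and both groups are assumed to contribute vectors of the same length.

```python
import numpy as np

def simulate_intercomm_reduce_scatter(data_a, data_b, op=np.add):
    """Simulate a block reduce-scatter across two groups (no MPI involved).

    data_a, data_b: 2-D arrays, one row per simulated process in group A / B.
    Returns (received_by_a, received_by_b) as lists of arrays.
    """
    data_a, data_b = np.asarray(data_a), np.asarray(data_b)
    # Reduce the contributions of each group separately.
    reduced_a = op.reduce(data_a, axis=0)
    reduced_b = op.reduce(data_b, axis=0)
    # A's result is scattered over B's processes, and B's result over A's.
    received_by_b = np.split(reduced_a, data_b.shape[0])
    received_by_a = np.split(reduced_b, data_a.shape[0])
    return received_by_a, received_by_b

# Group A: 2 simulated processes, each holding [1, 1, 1, 1, 1, 1].
# Group B: 3 simulated processes, each holding [0, 10, 20, 30, 40, 50].
data_a = np.ones((2, 6), dtype='i')
data_b = np.tile(np.arange(0, 60, 10, dtype='i'), (3, 1))
recv_a, recv_b = simulate_intercomm_reduce_scatter(data_a, data_b)
print('group A receives', [r.tolist() for r in recv_a])
print('group B receives', [r.tolist() for r in recv_b])
```

Note that each group receives pieces of the other group's reduction only; its own contributions do not appear in what it gets back.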
Method interface
The reduce-scatter methods in mpi4py (methods of the MPI.Comm class) have the following interface:
Reduce_scatter_block(self, sendbuf, recvbuf, Op op=SUM)
Reduce_scatter(self, sendbuf, recvbuf, recvcounts=None, Op op=SUM)
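Note: there are no corresponding methods whose names begin with a lowercase letter. Compared with Reduce_scatter_block, Reduce_scatter takes an additional recvcounts argument that specifies how many elements each process receives. Because this number may differ from process to process, the scatter step of Reduce_scatter is effectively a Scatterv, whereas the scatter step of Reduce_scatter_block is a Scatter, in which every process receives the same amount of data.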
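How recvcounts partitions the reduced vector can be sketched in plain NumPy. The helper `split_by_recvcounts` below is hypothetical and only mimics the bookkeeping of the scatter step; it does not call MPI.

```python
import numpy as np

def split_by_recvcounts(reduced, recvcounts):
    """Cut a reduced vector into per-process pieces of the given lengths."""
    reduced = np.asarray(reduced)
    if sum(recvcounts) != reduced.size:
        raise ValueError("recvcounts must sum to the length of the reduced vector")
    # The displacement of each piece is the sum of all preceding counts.
    displs = np.cumsum([0] + list(recvcounts[:-1]))
    return [reduced[d:d + c] for d, c in zip(displs, recvcounts)]

# Sum over 4 simulated processes that each hold [0, 1, ..., 7].
reduced = 4 * np.arange(8)
uneven = split_by_recvcounts(reduced, [2, 3, 1, 2])  # Reduce_scatter-like
even = split_by_recvcounts(reduced, [2, 2, 2, 2])    # Reduce_scatter_block-like
print([p.tolist() for p in uneven])
print([p.tolist() for p in even])
```

Passing equal counts reproduces the block variant, which is why Reduce_scatter_block needs no recvcounts argument.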
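For Reduce_scatter_block and Reduce_scatter on an intra-communicator, the sendbuf argument can be set to MPI.IN_PLACE. In that case recvbuf serves as both the send buffer and the receive buffer: each process takes its input data from recvbuf, and the reduced result is written back into recvbuf. When the result is shorter than the capacity of recvbuf, only the leading portion of recvbuf is overwritten.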
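The in-place behaviour, where only the leading part of the buffer is overwritten, can be mimicked with a short NumPy simulation. This sketch does not use MPI; `simulate_in_place_block` is a hypothetical helper, and a sum reduction with equal-sized blocks is assumed.

```python
import numpy as np

def simulate_in_place_block(recv_bufs):
    """Simulate an in-place block reduce-scatter by SUM (no MPI involved).

    recv_bufs: list of 1-D arrays, one per simulated process; modified in place.
    """
    nprocs = len(recv_bufs)
    # Each buffer is first read as the input vector of its process.
    reduced = np.sum(recv_bufs, axis=0)
    block = reduced.size // nprocs
    for rank, buf in enumerate(recv_bufs):
        # Only the leading `block` elements are overwritten with the result.
        buf[:block] = reduced[rank * block:(rank + 1) * block]
    return recv_bufs

# Four simulated processes, each starting with [0, 1, ..., 7].
bufs = [np.arange(8, dtype='i') for _ in range(4)]
simulate_in_place_block(bufs)
for rank, buf in enumerate(bufs):
    print('rank %d has %s' % (rank, buf))
```

The trailing six elements of every buffer keep their original input values, since the scattered block occupies only the first two slots.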
Example
The following example shows how to use the reduce-scatter operations.
# reduce_scatter.py
"""
Demonstrates the usage of Reduce_scatter_block, Reduce_scatter.
Run this with 4 processes like:
$ mpiexec -n 4 python reduce_scatter.py
"""
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
# ------------------------------------------------------------------------------
# reduce scatter a numpy array by using Reduce_scatter_block
send_buf = np.arange(8, dtype='i')
recv_buf = np.empty(2, dtype='i')
# first step: reduce
# rank 0 | 0 1 2 3 4 5 6 7
# rank 1 | 0 1 2 3 4 5 6 7
# rank 2 | 0 1 2 3 4 5 6 7
# rank 3 | 0 1 2 3 4 5 6 7
# --------+-------------------------
# SUM | 0 4 8 12 16 20 24 28
# second step: scatter
# rank 0 | rank 1 | rank 2 | rank 3
# ---------+----------+---------+---------
# 0 4 | 8 12 | 16 20 | 24 28
comm.Reduce_scatter_block(send_buf, recv_buf, op=MPI.SUM)
print('Reduce_scatter_block by SUM: rank %d has %s' % (rank, recv_buf))
# ------------------------------------------------------------------------------
# reduce scatter a numpy array by using Reduce_scatter_block with MPI.IN_PLACE
recv_buf = np.arange(8, dtype='i')
# with MPI.IN_PLACE, recv_buf is used as both send buffer and receive buffer
# the first two elements of recv_buf will be filled with the scattered results
comm.Reduce_scatter_block(MPI.IN_PLACE, recv_buf, op=MPI.SUM)
print('Reduce_scatter_block by SUM with MPI.IN_PLACE: rank %d has %s' % (rank, recv_buf))
# ------------------------------------------------------------------------------
# reduce scatter a numpy array by using Reduce_scatter
send_buf = np.arange(8, dtype='i')
recvcounts = [2, 3, 1, 2]
recv_buf = np.empty(recvcounts[rank], dtype='i')
# first step: reduce
# rank 0 | 0 1 2 3 4 5 6 7
# rank 1 | 0 1 2 3 4 5 6 7
# rank 2 | 0 1 2 3 4 5 6 7
# rank 3 | 0 1 2 3 4 5 6 7
# --------+-------------------------
# SUM | 0 4 8 12 16 20 24 28
# second step: scatterv with [2, 3, 1, 2]
# rank 0 | rank 1 | rank 2 | rank 3
# ---------+----------+---------+---------
# 0 4 | 8 12 16 | 20 | 24 28
comm.Reduce_scatter(send_buf, recv_buf, recvcounts=recvcounts, op=MPI.SUM)
print('Reduce_scatter by SUM: rank %d has %s' % (rank, recv_buf))
The output is as follows:
$ mpiexec -n 4 python reduce_scatter.py
Reduce_scatter_block by SUM: rank 0 has [0 4]
Reduce_scatter_block by SUM: rank 1 has [ 8 12]
Reduce_scatter_block by SUM: rank 3 has [24 28]
Reduce_scatter_block by SUM with MPI.IN_PLACE: rank 0 has [0 4 2 3 4 5 6 7]
Reduce_scatter by SUM: rank 0 has [0 4]
Reduce_scatter_block by SUM: rank 2 has [16 20]
Reduce_scatter_block by SUM with MPI.IN_PLACE: rank 2 has [16 20 2 3 4 5 6 7]
Reduce_scatter by SUM: rank 2 has [20]
Reduce_scatter_block by SUM with MPI.IN_PLACE: rank 1 has [ 8 12 2 3 4 5 6 7]
Reduce_scatter by SUM: rank 1 has [ 8 12 16]
Reduce_scatter_block by SUM with MPI.IN_PLACE: rank 3 has [24 28 2 3 4 5 6 7]
Reduce_scatter by SUM: rank 3 has [24 28]
This post covered the reduce-scatter methods in mpi4py. The next post will introduce the all-to-all scatter operation.