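Basic concepts of processes
A process is a single execution of a program. Each process has its own address space, memory, data stack, and other auxiliary data that records its execution state. Multiprocessing means running several tasks within one program, which improves a script's ability to run work in parallel. It is most often used for CPU-bound workloads such as scientific computing.
Creating a process with fork
fork() is called once but returns twice: the operating system copies the current process (the parent) to create a new one (the child), and then returns in both. In the child, fork() always returns 0; in the parent, it returns the child's process ID.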
import os
# os.fork() is only available on Unix and Linux
print('Proccess {} is start'.format(os.getpid()))
subprocess = os.fork()
source_num = 9
if subprocess == 0:
    print('I am in child process, my pid is {0}, and my father pid is {1}'.format(os.getpid(), os.getppid()))
    source_num = source_num * 2
    print('The source_num in ***child*** process is {}'.format(source_num))
else:
    print('I am in father proccess, my child process is {}'.format(subprocess))
    source_num = source_num ** 2
    print('The source_num in ---father--- process is {}'.format(source_num))
print('The source_num is {}'.format(source_num))
Proccess 16600 is start
I am in father proccess, my child process is 19193
The source_num in ---father--- process is 81
The source_num is 81
Proccess 16600 is start
I am in child process, my pid is 19193, and my father pid is 16600
The source_num in ***child*** process is 18
The source_num is 18
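Clearly, the two processes do not affect each other's data: each one works on its own copy of source_num.
The multiprocessing module
multiprocessing is a Python module that spawns processes using an API similar to that of the threading module. It offers both local and remote concurrency, and it sidesteps the GIL by using subprocesses instead of threads. As a result, the multiprocessing module lets programmers make full use of multiple processors on a given machine.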
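Components for creating and managing processes:
- Process (creates processes): create a Process object and call its start() method to spawn a process. Process follows the API of threading.Thread.
- Pool (creates a pool of worker processes): a Pool object runs the tasks submitted to it in a pool of worker processes; use it when there are many child processes to manage.
- Queue (inter-process communication and resource sharing): passes data between processes in a process-safe way.
- Value, Array (inter-process communication and resource sharing): hold a single value or an array in shared memory.
- Pipe (pipe communication): a connection between two endpoints.
- Manager (resource sharing): creates data that can be shared between processes, including sharing over a network between processes running on different machines.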
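Value and Array are the only entries in this list that the rest of the article does not demonstrate. As a minimal sketch (the names update_shared and run_shared_demo are made up for illustration), two processes update a shared integer and a shared array, using get_lock() to guard each read-modify-write:

```python
from multiprocessing import Array, Process, Value


def update_shared(counter, values):
    # counter.value += 1 is a read-modify-write, so hold the lock around it
    with counter.get_lock():
        counter.value += 1
    # Double every element while holding the array's own lock
    with values.get_lock():
        for i in range(len(values)):
            values[i] *= 2


def run_shared_demo():
    counter = Value('i', 0)          # 'i' = signed int stored in shared memory
    values = Array('i', [1, 2, 3])   # shared array of signed ints
    procs = [Process(target=update_shared, args=(counter, values))
             for _ in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value, values[:]


if __name__ == '__main__':
    print(run_shared_demo())  # (2, [4, 8, 12])
```

Unlike the plain variable in the fork example, the changes made by the child processes are visible to the parent, because the data lives in shared memory rather than in each process's private copy.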
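Synchronization primitives:
- Condition
- Event: used for synchronized signaling between processes.
- Lock: when several processes need to access a shared resource, a Lock prevents conflicting access.
- RLock
- Semaphore: limits how many processes may access a shared resource at once, for example the maximum number of connections in a pool.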
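Event has no example later in the article, so here is a small sketch of how it might be used. The function names wait_for_signal and run_event_demo are hypothetical: the child blocks on one Event until the parent sets it, then reports back through a second Event.

```python
from multiprocessing import Event, Process


def wait_for_signal(ready, done):
    ready.wait()   # block until the parent calls ready.set()
    done.set()     # report back that the signal was received


def run_event_demo():
    ready, done = Event(), Event()
    p = Process(target=wait_for_signal, args=(ready, done))
    p.start()
    before = done.is_set()          # child is still blocked on ready
    ready.set()                     # release the child
    after = done.wait(timeout=5)    # True once the child has set done
    p.join()
    return before, after


if __name__ == '__main__':
    print(run_event_demo())  # (False, True)
```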
1.Process
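The class for creating a process: Process(group=None, target=None, name=None, args=(), kwargs={}, *, daemon=None)
group should always be None; it exists only for compatibility with threading.Thread
target is the callable object invoked by run()
name is the process name
args is the tuple of positional arguments passed to target
kwargs is the dictionary of keyword arguments passed to target
daemon sets whether the process is a daemon process
Methods and attributes:
run(): the method representing the process's activity
start(): starts the process
join(): blocks the calling process until the process whose join() was called has finished
name: the process name
is_alive(): returns whether the process is alive
daemon: the daemon flag of the process
pid: the process ID
terminate(): forcibly terminates the process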
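The daemon flag, is_alive(), terminate(), and join() can be seen working together in a short sketch. The names sleep_forever and run_terminate_demo are invented for this illustration, and the example assumes the child never exits on its own:

```python
import time
from multiprocessing import Process


def sleep_forever():
    while True:
        time.sleep(1)


def run_terminate_demo():
    # daemon=True means the child is killed automatically if the parent exits
    p = Process(target=sleep_forever, name='sleeper', daemon=True)
    p.start()
    alive_before = p.is_alive()   # True: the child is running its loop
    p.terminate()                 # forcibly stop the child
    p.join()                      # wait until it has actually exited
    alive_after = p.is_alive()    # False after join() returns
    return p.name, alive_before, alive_after


if __name__ == '__main__':
    print(run_terminate_demo())  # ('sleeper', True, False)
```

Calling join() after terminate() matters here: terminate() only requests termination, and join() waits until the process has really gone.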
Creating a single process
import os
from multiprocessing import Process

def hello_pro(name):
    print('I am in process {0}, It\'s PID is {1}'.format(name, os.getpid()))

if __name__ == '__main__':
    print('Parent Process PID is {}'.format(os.getpid()))
    p = Process(target=hello_pro, args=('test',), name='test_proc')
    # Start the process
    p.start()
    print('Process\'s ID is {}'.format(p.pid))
    print('The Process is alive? {}'.format(p.is_alive()))
    print('Process\' name is {}'.format(p.name))
    # join() blocks the current process until the process represented by p has finished
    p.join()
Parent Process PID is 16600
I am in process test, It's PID is 19925
Process's ID is 19925
The Process is alive? True
Process' name is test_proc
Creating multiple processes
from multiprocessing import Process, current_process

def doubler(number):
    """
    A doubling function that can be used by a process
    """
    result = number * 2
    proc_name = current_process().name
    print('{0} doubled to {1} by: {2}'.format(
        number, result, proc_name))

if __name__ == '__main__':
    numbers = [5, 10, 15, 20, 25]
    procs = []
    # Start one process per number
    for number in numbers:
        proc = Process(target=doubler, args=(number,))
        procs.append(proc)
        proc.start()
    # Start one more process with an explicit name
    proc = Process(target=doubler, name='Test', args=(2,))
    proc.start()
    procs.append(proc)
    # Wait for every process to finish
    for proc in procs:
        proc.join()
5 doubled to 10 by: Process-8
20 doubled to 40 by: Process-11
10 doubled to 20 by: Process-9
15 doubled to 30 by: Process-10
25 doubled to 50 by: Process-12
2 doubled to 4 by: Test
Defining a process as a class
from multiprocessing import Process, current_process

class DoublerProcess(Process):
    def __init__(self, numbers):
        Process.__init__(self)
        self.numbers = numbers

    # Override run()
    def run(self):
        for number in self.numbers:
            result = number * 2
            proc_name = current_process().name
            print('{0} doubled to {1} by: {2}'.format(number, result, proc_name))

if __name__ == '__main__':
    dp = DoublerProcess([5, 20, 10, 15, 25])
    dp.start()
    dp.join()
5 doubled to 10 by: DoublerProcess-16
20 doubled to 40 by: DoublerProcess-16
10 doubled to 20 by: DoublerProcess-16
15 doubled to 30 by: DoublerProcess-16
25 doubled to 50 by: DoublerProcess-16
2.Lock
The code below is taken from the article "Python多进程编程" (Python multiprocess programming).
import multiprocessing

def worker_with(lock, f):
    # Lock supports the context manager protocol, so it can be used in a with statement
    with lock:
        fs = open(f, 'a+')
        n = 10
        while n > 1:
            print('Lockd acquired via with')
            fs.write("Lockd acquired via with\n")
            n -= 1
        fs.close()

def worker_no_with(lock, f):
    # Acquire the lock
    lock.acquire()
    try:
        fs = open(f, 'a+')
        n = 10
        while n > 1:
            print('Lock acquired directly')
            fs.write("Lock acquired directly\n")
            n -= 1
        fs.close()
    finally:
        # Release the lock
        lock.release()

if __name__ == "__main__":
    lock = multiprocessing.Lock()
    f = "file.txt"
    w = multiprocessing.Process(target=worker_with, args=(lock, f))
    nw = multiprocessing.Process(target=worker_no_with, args=(lock, f))
    w.start()
    nw.start()
    w.join()
    nw.join()
    print('END!')
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lockd acquired via with
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
Lock acquired directly
END!
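The same idea applies to shared data, not just files. The following is a sketch under stated assumptions rather than part of the quoted article: add_many and run_counter_demo are hypothetical names, and the counter is created with lock=False so that the explicit Lock is the only protection in play.

```python
from multiprocessing import Lock, Process, Value


def add_many(lock, counter, times):
    for _ in range(times):
        # Without the lock, concurrent += on counter.value could lose updates
        with lock:
            counter.value += 1


def run_counter_demo(workers=4, times=1000):
    lock = Lock()
    # Raw shared int; lock=False so all locking is done explicitly above
    counter = Value('i', 0, lock=False)
    procs = [Process(target=add_many, args=(lock, counter, times))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return counter.value


if __name__ == '__main__':
    print(run_counter_demo())  # 4000
```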
3.Pool
Pool provides a fixed number of worker processes for the caller to use. When a new task is submitted to the pool, it runs right away if a worker is free. If every worker is already busy, the task waits until one of them finishes its current task and picks the new one up.
import time
import os
from multiprocessing import Pool, cpu_count

def f(msg):
    print('Starting: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    time.sleep(3)
    print('Ending: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))

if __name__ == '__main__':
    print('Starting Main Function')
    print('This Computer has {} CPU'.format(cpu_count()))
    # Create a pool of 4 worker processes
    p = Pool(4)
    for i in range(5):
        msg = 'Process {}'.format(i)
        # Submit the function and its arguments to the pool
        p.apply_async(f, (msg, ))
    # Stop accepting new tasks
    p.close()
    # Block the current process until the workers have finished
    p.join()
    print('All Done!!!')
Starting Main Function
This Computer has 4 CPU
Starting: Process 2, PID: 8332, Time: Fri Sep 1 08:53:12 2017
Starting: Process 1, PID: 8331, Time: Fri Sep 1 08:53:12 2017
Starting: Process 0, PID: 8330, Time: Fri Sep 1 08:53:12 2017
Starting: Process 3, PID: 8333, Time: Fri Sep 1 08:53:12 2017
Ending: Process 2, PID: 8332, Time: Fri Sep 1 08:53:15 2017
Ending: Process 3, PID: 8333, Time: Fri Sep 1 08:53:15 2017
Starting: Process 4, PID: 8332, Time: Fri Sep 1 08:53:15 2017
Ending: Process 1, PID: 8331, Time: Fri Sep 1 08:53:15 2017
Ending: Process 0, PID: 8330, Time: Fri Sep 1 08:53:15 2017
Ending: Process 4, PID: 8332, Time: Fri Sep 1 08:53:18 2017
All Done!!!
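The pool has 4 workers (matching this machine's 4 CPUs), so tasks 0-3 start at the same time while task 4 waits. As soon as one of tasks 0-3 finishes, task 4 starts. Once every task in the pool has finished, the main process resumes and prints "All Done!!!". apply_async() is non-blocking, whereas apply() is blocking.
Replacing apply_async() with apply()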
import time
import os
from multiprocessing import Pool, cpu_count

def f(msg):
    print('Starting: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    time.sleep(3)
    print('Ending: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))

if __name__ == '__main__':
    print('Starting Main Function')
    print('This Computer has {} CPU'.format(cpu_count()))
    # Create a pool of 4 worker processes
    p = Pool(4)
    for i in range(5):
        msg = 'Process {}'.format(i)
        # Use apply() instead of apply_async()
        p.apply(f, (msg, ))
    # Stop accepting new tasks
    p.close()
    # Block the current process until the workers have finished
    p.join()
    print('All Done!!!')
Starting Main Function
This Computer has 4 CPU
Starting: Process 0, PID: 8281, Time: Fri Sep 1 08:51:18 2017
Ending: Process 0, PID: 8281, Time: Fri Sep 1 08:51:21 2017
Starting: Process 1, PID: 8282, Time: Fri Sep 1 08:51:21 2017
Ending: Process 1, PID: 8282, Time: Fri Sep 1 08:51:24 2017
Starting: Process 2, PID: 8283, Time: Fri Sep 1 08:51:24 2017
Ending: Process 2, PID: 8283, Time: Fri Sep 1 08:51:27 2017
Starting: Process 3, PID: 8284, Time: Fri Sep 1 08:51:27 2017
Ending: Process 3, PID: 8284, Time: Fri Sep 1 08:51:30 2017
Starting: Process 4, PID: 8281, Time: Fri Sep 1 08:51:30 2017
Ending: Process 4, PID: 8281, Time: Fri Sep 1 08:51:33 2017
All Done!!!
As the output shows, with the blocking call the tasks run one after another: each task starts only after the previous one has finished.
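The difference can also be measured directly. The sketch below is only an illustration: nap and time_apply_variants are hypothetical names, the sleep stands in for real work, and the exact timings will vary by machine.

```python
import time
from multiprocessing import Pool


def nap(seconds):
    time.sleep(seconds)
    return seconds


def time_apply_variants(tasks=4, seconds=0.3):
    with Pool(tasks) as pool:
        # apply() waits for each result before submitting the next task
        start = time.perf_counter()
        for _ in range(tasks):
            pool.apply(nap, (seconds,))
        blocking = time.perf_counter() - start

        # apply_async() submits everything first, then we wait for the results
        start = time.perf_counter()
        pending = [pool.apply_async(nap, (seconds,)) for _ in range(tasks)]
        for result in pending:
            result.get()
        non_blocking = time.perf_counter() - start
    return blocking, non_blocking


if __name__ == '__main__':
    blocking, non_blocking = time_apply_variants()
    print('apply: {:.2f}s, apply_async: {:.2f}s'.format(blocking, non_blocking))
```

With four workers and four tasks, the apply() loop should take roughly four times the sleep duration, while the apply_async() batch should take about one.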
Getting results with get()
import time
import os
from multiprocessing import Pool, cpu_count

def f(msg):
    print('Starting: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    time.sleep(3)
    print('Ending: {}, PID: {}, Time: {}'.format(msg, os.getpid(), time.ctime()))
    return 'Done {}'.format(msg)

if __name__ == '__main__':
    print('Starting Main Function')
    print('This Computer has {} CPU'.format(cpu_count()))
    # Create a pool of 4 worker processes
    p = Pool(4)
    results = []
    for i in range(5):
        msg = 'Process {}'.format(i)
        # Keep the AsyncResult objects so the return values can be read later
        results.append(p.apply_async(f, (msg, )))
    # Stop accepting new tasks
    p.close()
    # Block the current process until the workers have finished
    p.join()
    for result in results:
        print(result.get())
    print('All Done!!!')
Starting Main Function
This Computer has 4 CPU
Starting: Process 0, PID: 8526, Time: Fri Sep 1 09:00:04 2017
Starting: Process 1, PID: 8527, Time: Fri Sep 1 09:00:04 2017
Starting: Process 2, PID: 8528, Time: Fri Sep 1 09:00:04 2017
Starting: Process 3, PID: 8529, Time: Fri Sep 1 09:00:04 2017
Ending: Process 1, PID: 8527, Time: Fri Sep 1 09:00:07 2017
Starting: Process 4, PID: 8527, Time: Fri Sep 1 09:00:07 2017
Ending: Process 3, PID: 8529, Time: Fri Sep 1 09:00:07 2017
Ending: Process 0, PID: 8526, Time: Fri Sep 1 09:00:07 2017
Ending: Process 2, PID: 8528, Time: Fri Sep 1 09:00:07 2017
Ending: Process 4, PID: 8527, Time: Fri Sep 1 09:00:10 2017
Done Process 0
Done Process 1
Done Process 2
Done Process 3
Done Process 4
All Done!!!
4.Queue
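Queue is a process-safe queue that can be used to pass data between processes.
put inserts an item into the queue. It takes two optional parameters, block and timeout. If block is True (the default) and timeout is a positive number, put blocks for at most timeout seconds waiting for a free slot, and raises queue.Full if none becomes available in that time. If block is False and the queue is full, queue.Full is raised immediately.
get reads one item from the queue and removes it. It takes the same two optional parameters, block and timeout. If block is True (the default) and timeout is a positive number, get raises queue.Empty if no item arrives within that time. If block is False, there are two cases: if an item is available it is returned immediately; otherwise, if the queue is empty, queue.Empty is raised immediately.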
import time
from multiprocessing import Queue, Process

def write_queue(q):
    for i in ['first', 'two', 'three', 'four', 'five']:
        print('Write "{}" to Queue'.format(i))
        q.put(i)
        time.sleep(3)
    print('Write Done!')

def read_queue(q):
    print('Start to read!')
    while True:
        data = q.get()
        print('Read "{}" from Queue!'.format(data))

if __name__ == '__main__':
    q = Queue()
    wq = Process(target=write_queue, args=(q,))
    rq = Process(target=read_queue, args=(q,))
    wq.start()
    rq.start()
    # Start both processes before calling join(): calling wq.join() right after
    # wq.start() would block until the writer finished, so the reader could not
    # consume items as they arrive
    wq.join()
    # The reader runs an infinite loop, so it has to be stopped forcibly
    rq.terminate()
Write "first" to Queue
Start to read!
Read "first" from Queue!
Write "two" to Queue
Read "two" from Queue!
Write "three" to Queue
Read "three" from Queue!
Write "four" to Queue
Read "four" from Queue!
Write "five" to Queue
Read "five" from Queue!
Write Done!
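The block and timeout behaviour of put and get described in this section can be checked in a single process. This is a minimal sketch: demo_queue_limits is a made-up helper, and maxsize=1 is chosen only so the queue fills up after one item.

```python
import queue
from multiprocessing import Queue


def demo_queue_limits():
    events = []
    q = Queue(maxsize=1)
    q.put('first')                      # the queue is now full

    try:
        q.put('second', block=False)    # full queue, non-blocking put
    except queue.Full:
        events.append('full')

    try:
        q.put('second', timeout=0.1)    # waits 0.1 s for a slot, then gives up
    except queue.Full:
        events.append('full after timeout')

    events.append(q.get(timeout=5))     # removes 'first'

    try:
        q.get(block=False)              # empty queue, non-blocking get
    except queue.Empty:
        events.append('empty')

    try:
        q.get(timeout=0.1)              # waits 0.1 s for an item, then gives up
    except queue.Empty:
        events.append('empty after timeout')
    return events


if __name__ == '__main__':
    print(demo_queue_limits())
```

Note that the exceptions live in the standard queue module (queue.Full and queue.Empty), not in multiprocessing itself.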
5.Pipe
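Pipe() returns a pair (conn1, conn2) representing the two ends of a pipe.
Pipe() takes a duplex parameter. If duplex is True (the default), the pipe is bidirectional, so both conn1 and conn2 can send and receive. If duplex is False, conn1 can only receive messages and conn2 can only send them.
send and recv are the methods for sending and receiving messages. In duplex mode, for example, you can call conn1.send to send a message and conn1.recv to receive one. recv blocks until there is something to receive. If nothing is left to receive and the other end has been closed, recv raises EOFError.
See also the article "使用pipe管道使python fork多进程之间通信" (using a pipe for communication between forked Python processes).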
import time
from multiprocessing import Pipe, Process

def send_pipe(p):
    for i in ['first', 'two', 'three', 'four', 'five']:
        print('Send "{}" to Pipe'.format(i))
        p.send(i)
        time.sleep(3)
    print('Send Done!')

def receive_pipe(p):
    print('Start to receive!')
    while True:
        data = p.recv()
        print('Read "{}" from Pipe!'.format(data))

if __name__ == '__main__':
    sp_pipe, rp_pipe = Pipe()
    sp = Process(target=send_pipe, args=(sp_pipe,))
    rp = Process(target=receive_pipe, args=(rp_pipe,))
    sp.start()
    rp.start()
    # Wait for the sender to finish
    sp.join()
    # The receiver runs an infinite loop, so it has to be stopped forcibly
    rp.terminate()
Start to receive!
Send "first" to Pipe
Read "first" from Pipe!
Send "two" to Pipe
Read "two" from Pipe!
Send "three" to Pipe
Read "three" from Pipe!
Send "four" to Pipe
Read "four" from Pipe!
Send "five" to Pipe
Read "five" from Pipe!
Send Done!
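The duplex=False mode and the EOFError behaviour described in this section offer another way to stop a receiver: read until the sending end is closed. The sketch below is one possible arrangement, with hypothetical names producer and collect_until_eof; here the parent is the receiver and the child is the sender.

```python
from multiprocessing import Pipe, Process


def producer(conn, items):
    for item in items:
        conn.send(item)
    conn.close()   # closing the sending end lets the reader see EOF


def collect_until_eof(items):
    # With duplex=False the first connection only receives, the second only sends
    recv_conn, send_conn = Pipe(duplex=False)
    p = Process(target=producer, args=(send_conn, items))
    p.start()
    # The parent must close its own copy of the sending end,
    # otherwise recv() would never raise EOFError
    send_conn.close()
    received = []
    while True:
        try:
            received.append(recv_conn.recv())
        except EOFError:
            break
    p.join()
    return received


if __name__ == '__main__':
    print(collect_until_eof(['first', 'two', 'three']))
```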
6.Semaphore
Semaphore limits how many processes may access a shared resource at the same time, for example the maximum number of connections in a pool.
import multiprocessing
import time

def worker(s, i):
    s.acquire()
    print(multiprocessing.current_process().name + "acquire")
    time.sleep(i)
    print(multiprocessing.current_process().name + "release\n")
    s.release()

if __name__ == "__main__":
    s = multiprocessing.Semaphore(3)
    for i in range(5):
        p = multiprocessing.Process(target=worker, args=(s, i*2))
        p.start()
Process-170acquire
Process-168acquire
Process-168release
Process-169acquire
Process-171acquire
Process-169release
Process-172acquire
Process-170release
Process-171release
Process-172release
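To confirm that the semaphore really caps concurrency, one option is to record the peak number of workers inside the guarded region. This is a sketch under assumptions: limited_worker and run_semaphore_demo are invented names, and the shared Value counters are bookkeeping added for the illustration only.

```python
import time
from multiprocessing import Process, Semaphore, Value


def limited_worker(sem, active, peak):
    with sem:  # at most `limit` workers are inside this block at once
        with active.get_lock():
            active.value += 1
            # peak is only updated while holding active's lock
            peak.value = max(peak.value, active.value)
        time.sleep(0.2)
        with active.get_lock():
            active.value -= 1


def run_semaphore_demo(limit=2, workers=5):
    sem = Semaphore(limit)
    active = Value('i', 0)   # workers currently holding the semaphore
    peak = Value('i', 0)     # highest value active has reached
    procs = [Process(target=limited_worker, args=(sem, active, peak))
             for _ in range(workers)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return peak.value


if __name__ == '__main__':
    print('Peak concurrent workers:', run_semaphore_demo())
```

However the workers are scheduled, the reported peak should never exceed the limit passed to Semaphore.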