First published at: https://zhuanlan.zhihu.com/p/22297799?refer=lihang
Concurrency or parallelism
Concurrency is about dealing with lots of things at once. Parallelism is about doing lots of things at once.[1]
The goal of concurrency is to get the highest possible utilization out of a single CPU. Parallelism, by contrast, requires a multi-core CPU.
CSP
Communicating Sequential Processes (CSP) is a formal language for describing patterns of interaction in concurrent systems.
For example:
COPY = *[c:character; west?c → east!c]
the process repeatedly receives a character from the process named west, and then sends that character to process named east. The parallel composition
[west::DISASSEMBLE || X::COPY || east::ASSEMBLE]
assigns the names west to the DISASSEMBLE process, X to the COPY process, and east to the ASSEMBLE process, and executes these three processes concurrently.[3]
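CSP treats input/output and communication between concurrent processes as its basic methods and structures. It defines its own set of primitives, including parallel execution, input/output, repetition, and conditionals. It then uses that notation to solve a range of familiar problems, such as coroutines, number-theoretic problems, and the classic synchronization problems (producer-consumer, dining philosophers).
Building on CSP lets concurrent programs be written at a higher level of abstraction. The programmer does not have to deal with low-level details such as resource sharing, locking, and scheduling, which makes concurrent programs simpler to write.[6]
Go's concurrency model is built on the ideas of CSP.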
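To make the link between the CSP notation and Go concrete, here is a minimal sketch of the COPY example above written with goroutines and channels. The function names (disassemble, copyProc, assemble, runPipeline) are made up for this illustration and do not come from Hoare's paper or from any library; the real DISASSEMBLE and ASSEMBLE processes in the paper do more than this.

```go
package main

import (
	"fmt"
	"strings"
)

// disassemble plays the role of the "west" process: it emits the input
// one character at a time and closes the channel when it is done.
func disassemble(s string, west chan<- rune) {
	for _, c := range s {
		west <- c
	}
	close(west)
}

// copyProc mirrors COPY = *[c:character; west?c → east!c].
func copyProc(west <-chan rune, east chan<- rune) {
	for c := range west { // west?c
		east <- c // east!c
	}
	close(east)
}

// assemble plays the role of the "east" process: it collects the
// characters back into a string.
func assemble(east <-chan rune) string {
	var b strings.Builder
	for c := range east {
		b.WriteRune(c)
	}
	return b.String()
}

// runPipeline wires the three processes together, in the spirit of
// [west::DISASSEMBLE || X::COPY || east::ASSEMBLE].
func runPipeline(s string) string {
	west := make(chan rune)
	east := make(chan rune)
	go disassemble(s, west)
	go copyProc(west, east)
	return assemble(east)
}

func main() {
	fmt.Println(runPipeline("hello, CSP"))
}
```

Both channels are unbuffered, so every send waits for the matching receive, which is close to the synchronous communication that the CSP notation describes.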
Go runtime scheduler
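Why does Go need its own scheduler?
There are two main reasons:
- With many threads, the overhead becomes significant.
- The program cannot control how the OS schedules threads, yet Go's GC needs every thread stopped so that memory reaches a consistent state.[7]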
Struct
M represents an OS thread, G represents a goroutine, and P represents a scheduling context.
M:N
There are 3 usual models for threading. One is N:1 where several userspace threads are run on one OS thread. This has the advantage of being very quick to context switch but cannot take advantage of multi-core systems. Another is 1:1 where one thread of execution matches one OS thread. It takes advantage of all of the cores on the machine, but context switching is slow because it has to trap through the OS.[7]
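M:N combines the advantages of the other two models (N:1 and 1:1). Many goroutines are run across multiple OS threads, which gives both fast context switches and the ability to use multiple cores.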
Context switch
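Any call the program makes to a system API is intercepted by the runtime layer, which makes scheduling easier.
A goroutine may block on either a system call or a channel call[8], but the two kinds of blocking are handled differently.
- When the program makes a system call, the M blocks, and another M is woken up (or created) to keep running the other Gs.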
If the Go code requires the M to block, for instance by invoking a system call, then another M will be woken up from the global queue of idle M’s. This is done to ensure that goroutines, still capable of running, are not blocked from running by the lack of an available M.[11]
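- When a goroutine makes a channel call, the goroutine may block, but the M does not. The G's status is set to waiting and the M goes on to run other Gs. Once the channel communication completes, the G becomes runnable again and is run by whichever M is available.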
If a goroutine makes a channel call, it may need to block, but there is no reason that the M running that G should be forced to block as well. In a case such as this, the G’s status is set to waiting and the M that was previously running it continues running other G’s until the channel communication is complete. At that point the G’s status is set back to runnable and will be run as soon as there is an M capable of running it.[11]
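What P is for:
- Each P has its own queue of runnable Gs, which avoids contention on the global scheduler lock.
- Every running M needs an MCache structure. The M pool usually holds many Ms, but only a few of them are running at any time, so allocating an MCache for every M in the pool would be wasteful. Moving the cache from M to P fixes this: every running M has an associated P, so only running Ms end up with an MCache.[11]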
Goroutine vs OS thread
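In effect, goroutines rely on a thread-pool technique: when a goroutine needs to run, an available M is picked from the thread pool, or a new M is created. How threads are selected from the pool, added to it, and reclaimed is encapsulated by the Go scheduler and transparent to the program, which only has to make the call. This simplifies working with a thread pool.[12]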
Goroutine vs Python yield
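- Creation cost: Go supports coroutines natively, and a go func() statement is all it takes to create a goroutine. In Python, a coroutine can be created with gevent.spawn, which requires a third-party library.
- Communication between goroutines is simpler: it is done through channel calls, and context switching is transparent (only in a few cases do you need to call Gosched yourself). Python relies on yield to pass data and switch context (although libraries such as gevent and tornado wrap this so that it is also transparent to the caller).
- Python coroutines run on a single thread, so they can use only one core. Goroutines can be scheduled across multiple threads, so they can use multiple cores.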
References:
- [1] Concurrency is not parallelism
- [2] The implementation behind concurrent programming frameworks: the systems knowledge behind goroutines
- [3] CSP, Wikipedia
- [4] Hoare's CSP paper
- [5] Go implementations of the examples in Hoare's CSP paper
- [6] Why build concurrency on the ideas of CSP?
- [7] The Go scheduler
- [8] How goroutines work
- [9] Goroutines vs OS threads
- [10] Why goroutines instead of threads?
- [11] Analysis of the Go runtime scheduler
- [12] How are golang's goroutines implemented? - Zhihu