These are my notes on "Fallacies of Distributed Computing Explained".
The eight fallacies of distributed computing:
- The network is reliable.
- Latency is zero.
- Bandwidth is infinite.
- The network is secure.
- Topology doesn't change.
- There is one administrator.
- Transport cost is zero.
- The network is homogeneous.
Let's start with the first one.
1. The network is reliable
The network is unreliable, so when designing software we need to consider:
- retry
- acknowledge important messages
- identify/ignore duplicates (deduplicate), or make operations idempotent
- reorder messages
- verify message integrity
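The checklist above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from the article: `Message`, `Receiver`, and `send_with_retry` are hypothetical names, the "network" is simulated with a random number generator, and the sequence number doubles as the message id used for deduplication.

```python
import hashlib
import random
from dataclasses import dataclass


def checksum(payload: bytes) -> str:
    """Digest used to verify message integrity."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class Message:          # hypothetical message shape
    seq: int            # sequence number, also used as the message id
    payload: bytes
    digest: str         # checksum computed by the sender


class Receiver:
    """Acknowledges messages, drops duplicates and corrupt ones, delivers in order."""

    def __init__(self):
        self.next_seq = 0    # next sequence number to deliver
        self.pending = {}    # out-of-order messages waiting for their turn
        self.delivered = []  # payloads handed to the application, in order

    def receive(self, msg: Message) -> bool:
        """Return True as an acknowledgement, False if the message is rejected."""
        if checksum(msg.payload) != msg.digest:
            return False     # corrupt: no ack, so the sender retries
        if msg.seq < self.next_seq or msg.seq in self.pending:
            return True      # duplicate: ack again, but do not redeliver
        self.pending[msg.seq] = msg.payload
        while self.next_seq in self.pending:  # release the in-order run
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return True


def send_with_retry(msg, receiver, rng, loss_rate=0.2, max_attempts=20):
    """Resend until acknowledged; the message or its ack may be lost."""
    for attempt in range(1, max_attempts + 1):
        if rng.random() < loss_rate:
            continue         # message lost in transit
        acked = receiver.receive(msg)
        if acked and rng.random() >= loss_rate:
            return attempt   # ack made it back
        # otherwise the ack was lost or the message was rejected: retry
    raise TimeoutError(f"no ack for seq {msg.seq} after {max_attempts} attempts")


rng = random.Random(42)
receiver = Receiver()
messages = [
    Message(i, f"event-{i}".encode(), checksum(f"event-{i}".encode()))
    for i in range(5)
]
for msg in reversed(messages):  # deliberately send out of order
    send_with_retry(msg, receiver, rng)
print(receiver.delivered)       # payloads come out in sequence order
```

Because a lost acknowledgement makes the sender resend a message the receiver already has, retries only stay safe when the receiver deduplicates (or the operation is idempotent), which is why those two items belong together.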
2. Latency is zero
Latency is how much time it takes for data to move from one place to another; bandwidth is how much data we can transfer during that time.
Latency is harder to improve than bandwidth. It is bounded by how fast information can travel, and since the speed of light is constant, latency has a fixed lower bound.
"But I think that it’s really interesting to see that the end-to-end bandwidth increased by 1468 times within the last 11 years while the latency (the time a single ping takes) has only been improved tenfold. If this wouldn’t be enough, there is even a natural cap on latency. The minimum round-trip time between two points of this earth is determined by the maximum speed of information transmission: the speed of light. At roughly 300,000 kilometers per second (3.6 * 10E12 teraangstrom per fortnight), it will always take at least 30 milliseconds to send a ping from Europe to the US and back, even if the processing would be done in real time."
Since latency cannot be avoided, all we can do is make as few calls as possible.
Taking latency into consideration means you should strive to make as few calls as possible and, assuming you have enough bandwidth (discussed under the next fallacy), move as much data as possible in each of these calls.
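To make the arithmetic concrete, here is a small sketch that reproduces the 30 ms round-trip floor from the quote and shows what it means for chatty versus batched calls. The 4,500 km one-way distance is my assumption, chosen because it yields the quoted 30 ms; `in_batches` and `latency_floor_ms` are hypothetical helpers, and the model assumes calls are made one after another.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")
SPEED_OF_LIGHT_KM_PER_S = 300_000  # rounded figure used in the quote


def min_round_trip_ms(one_way_km: float) -> float:
    """Physical lower bound on one round trip: there and back at light speed."""
    return 2 * one_way_km * 1000 / SPEED_OF_LIGHT_KM_PER_S


def in_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Group items so that each remote call carries up to batch_size of them."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # last, partially filled batch


def latency_floor_ms(num_calls: int, one_way_km: float) -> float:
    """Sequential calls each pay at least one round trip."""
    return num_calls * min_round_trip_ms(one_way_km)


records = list(range(1000))
ONE_WAY_KM = 4_500  # assumed distance; gives the 30 ms round trip from the quote

chatty_calls = len(records)                          # one call per record
batched_calls = len(list(in_batches(records, 100)))  # 100 records per call

print(min_round_trip_ms(ONE_WAY_KM))                # 30.0
print(latency_floor_ms(chatty_calls, ONE_WAY_KM))   # 30000.0
print(latency_floor_ms(batched_calls, ONE_WAY_KM))  # 300.0
```

Under these assumptions, sending 1,000 records one by one can never take less than 30 seconds, while 10 batched calls have a floor of 0.3 seconds, no matter how fast the servers are.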
3. Bandwidth is infinite
The belief that bandwidth is infinite fails for two main reasons:
- as bandwidth grows, so does the amount of data we transmit;
- packet loss.
One is that while the bandwidth grows, so does the amount of information we try to squeeze through it. VoIP, videos, and IPTV are some of the newer applications that take up bandwidth
The other force at work to lower bandwidth is packet loss (along with frame size).
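Because bandwidth is finite, we want to transfer less data; because latency is unavoidable, we want to pack as much data as possible into each call. The two pull in opposite directions, so all we can do is make a trade-off.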
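The trade-off can be illustrated with a deliberately rough cost model: every call pays one round-trip latency, and every byte pays for its share of the bandwidth. All numbers below are made up for illustration; `COARSE`, `FINE`, and `LINKS` are hypothetical designs and links, not measurements.

```python
def total_time_s(calls, total_bytes, latency_s, bandwidth_bytes_per_s):
    """Rough model: per-call latency plus time spent pushing bytes on the wire."""
    return calls * latency_s + total_bytes / bandwidth_bytes_per_s


# Two hypothetical API designs for the same feature.
COARSE = {"calls": 1, "total_bytes": 500_000}  # one big response with everything
FINE = {"calls": 5, "total_bytes": 100_000}    # five small responses, needed fields only

# Two hypothetical links.
LINKS = {
    "high latency, high bandwidth": {
        "latency_s": 0.100, "bandwidth_bytes_per_s": 10_000_000},
    "low latency, low bandwidth": {
        "latency_s": 0.005, "bandwidth_bytes_per_s": 100_000},
}


def faster_design(link):
    """Return which design finishes sooner on the given link."""
    coarse = total_time_s(**COARSE, **link)
    fine = total_time_s(**FINE, **link)
    return "coarse" if coarse < fine else "fine"


for name, link in LINKS.items():
    print(name, "->", faster_design(link))
# high latency, high bandwidth -> coarse
# low latency, low bandwidth -> fine
```

Neither design wins everywhere: the coarse-grained call is better when latency dominates, and the fine-grained calls are better when bandwidth is the scarce resource.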
4. The Network is Secure
As an architect you don't have to be a security expert, but you do need to understand security and know how to address it.
Security is usually a multi-layered solution that is handled on the network, infrastructure, and application levels.
5. Topology doesn’t change
This fallacy probably arises because the topology only stays unchanged in a lab environment.
"Topology doesn't change." That's right, it doesn’t--as long as it stays in the test lab.
Takeaways:
- don't depend on specific routes or endpoints
- provide either location transparency or a discovery service
Try not to depend on specific endpoints or routes, if you can't be prepared to renegotiate endpoints.
You would want to either provide location transparency (e.g. using an ESB or multicast) or provide discovery services (e.g. Active Directory, JNDI, or LDAP).
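As a minimal sketch of the discovery idea, the caller below asks for a logical service name at call time instead of hard-coding an address, and fails over when an endpoint is gone. `ServiceRegistry` and `call_service` are hypothetical names, and the in-memory dictionary merely stands in for a real directory such as LDAP or JNDI.

```python
from typing import Callable, Dict, List


class ServiceRegistry:
    """Hypothetical in-memory stand-in for a real discovery service."""

    def __init__(self) -> None:
        self._endpoints: Dict[str, List[str]] = {}

    def register(self, service: str, endpoint: str) -> None:
        self._endpoints.setdefault(service, []).append(endpoint)

    def deregister(self, service: str, endpoint: str) -> None:
        endpoints = self._endpoints.get(service, [])
        if endpoint in endpoints:
            endpoints.remove(endpoint)

    def resolve(self, service: str) -> List[str]:
        """Return a copy of the endpoints currently known for the service."""
        return list(self._endpoints.get(service, []))


def call_service(registry: ServiceRegistry, service: str,
                 send: Callable[[str], str]) -> str:
    """Resolve endpoints at call time and fail over to the next one."""
    last_error = None
    for endpoint in registry.resolve(service):
        try:
            return send(endpoint)
        except ConnectionError as exc:
            last_error = exc  # endpoint unreachable: try the next one
    raise ConnectionError(f"no reachable endpoint for {service!r}") from last_error


registry = ServiceRegistry()
registry.register("orders", "10.0.0.1:8080")
registry.register("orders", "10.0.0.2:8080")


def send(endpoint: str) -> str:
    if endpoint == "10.0.0.1:8080":  # simulate a node that has gone away
        raise ConnectionError(endpoint)
    return f"handled by {endpoint}"


print(call_service(registry, "orders", send))  # handled by 10.0.0.2:8080
```

The caller only knows the name "orders", so nodes can be added, moved, or removed by updating the registry rather than the client code.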
6. There is one administrator
As long as nothing goes wrong, we don't care whether there is more than one administrator; once a problem occurs, it becomes a real headache.
"Okay, there is more than one administrator. But why should I care?" Well, as long as everything works, maybe you don't care. You do care, however, when things go astray and there is a need to pinpoint a problem (and solve it).
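To guard against the problems that multiple administrators bring, we should:
- provide tools for monitoring the system's operations from the start, while the system is still small
A proactive approach is to also include tools for monitoring ongoing operations.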
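One lightweight way to build monitoring in from day one is to wrap operations so that call counts, error counts, and elapsed time are always recorded. The sketch below is an assumption about what such a tool could look like; `OperationStats` and `place_order` are hypothetical, and a real system would export these numbers to whatever monitoring stack the administrators use.

```python
import time
from collections import defaultdict
from functools import wraps


class OperationStats:
    """Hypothetical in-process collector of per-operation counters."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.errors = defaultdict(int)
        self.total_seconds = defaultdict(float)

    def monitored(self, name):
        """Decorator that records calls, errors, and elapsed time under `name`."""
        def decorator(func):
            @wraps(func)
            def wrapper(*args, **kwargs):
                start = time.perf_counter()
                try:
                    return func(*args, **kwargs)
                except Exception:
                    self.errors[name] += 1
                    raise  # monitoring must not swallow the failure
                finally:
                    self.calls[name] += 1
                    self.total_seconds[name] += time.perf_counter() - start
            return wrapper
        return decorator


stats = OperationStats()


@stats.monitored("place_order")
def place_order(quantity):
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    return quantity


place_order(3)
try:
    place_order(0)
except ValueError:
    pass

print(stats.calls["place_order"], stats.errors["place_order"])  # 2 1
```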
In short, when there are multiple administrators, they will inevitably constrain what we can do; what we can do in return is help them manage our applications.
To sum up, when there is more than one administrator (unless we are talking about a simple system, and even that can evolve later if it is successful), you need to remember that administrators can constrain your options (administrators who set disk quotas, limited privileges, limited ports and protocols, and so on), and that you need to help them manage your applications.
7. Transport cost is zero
There are two ways to see why this statement is a fallacy.
The first concerns moving data from the application layer to the transport layer: the data has to be encoded (marshaled), which costs time and resources.
One way is that going from the application level to the transport level is free. This is a fallacy since we have to do marshaling (serialize information into bits) to get data onto the wire, which takes both computer resources and adds to the latency.
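The marshaling cost is easy to observe. The sketch below times JSON serialization of a made-up record; JSON is just my choice for illustration, the `record` contents are arbitrary, and the absolute numbers depend entirely on the machine.

```python
import json
import time


def marshal(record: dict) -> bytes:
    """Serialize a record into bytes that can be put on the wire."""
    return json.dumps(record).encode("utf-8")


def unmarshal(data: bytes) -> dict:
    """Rebuild the record from bytes received off the wire."""
    return json.loads(data.decode("utf-8"))


# Arbitrary sample payload, for illustration only.
record = {"id": 1, "name": "sample", "tags": ["a", "b"] * 1000}

ROUNDS = 1000
start = time.perf_counter()
for _ in range(ROUNDS):
    wire = marshal(record)
elapsed = time.perf_counter() - start

print(f"{len(wire)} bytes per message, "
      f"{elapsed / ROUNDS * 1e6:.1f} microseconds per marshal")
```

Even before a single byte is sent, each message costs CPU time to encode and adds bytes beyond the raw data, and the receiver pays again to decode it.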
The second is that setting up and running a network costs money, and a lot of it: there is plenty to buy.
The second way to interpret the statement is that the costs (as in cash money) for setting up and running the network are free. This is also far from being true. There are costs--costs for buying the routers, costs for securing the network, costs for leasing the bandwidth for internet connections, and costs for operating and maintaining the network. Someone, somewhere will have to pick up the tab and pay these costs.
8. The network is homogeneous
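"The network is homogeneous" is the last fallacy. We should take care not to rely on proprietary protocols; otherwise integration becomes a major headache later on.
It is worthwhile to pay attention to the fact that the network is not homogeneous at the application level.
Do not rely on proprietary protocols--it would be harder to integrate them later.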
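As one possible reading of this advice, the sketch below puts a message on the wire as plain JSON with an explicit type, version, and ISO 8601 timestamp, so that a consumer written in any language can parse it. This is my illustration rather than something the article prescribes; `OrderPlaced`, `encode`, and `decode` are hypothetical names. A language-specific format such as Python's `pickle` would be the opposite choice, since only Python peers can read it.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class OrderPlaced:  # hypothetical event type
    order_id: int
    placed_at: datetime


def encode(event: OrderPlaced) -> bytes:
    """Encode the event using only language-neutral JSON types."""
    doc = {
        "type": "OrderPlaced",
        "version": 1,  # explicit version so the format can evolve
        "order_id": event.order_id,
        "placed_at": event.placed_at.isoformat(),  # ISO 8601 text, not a native object
    }
    return json.dumps(doc).encode("utf-8")


def decode(data: bytes) -> OrderPlaced:
    """Decode the event, rejecting message types or versions we do not know."""
    doc = json.loads(data.decode("utf-8"))
    if doc.get("type") != "OrderPlaced" or doc.get("version") != 1:
        raise ValueError("unsupported message")
    return OrderPlaced(doc["order_id"], datetime.fromisoformat(doc["placed_at"]))


event = OrderPlaced(42, datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc))
assert decode(encode(event)) == event
print(encode(event).decode("utf-8"))
```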
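Summary
Distributed systems have been around for many years, yet the problems they face have not gone away. What is worrying is that many architects still overlook some of them at design time. Hopefully the fallacies listed above will help architects avoid some of these problems in their designs.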