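In the previous article we finished walking through Peer's start() method. This article digs into the methods that start() calls in order to analyze how a Peer sends and receives messages. The first step in start() is the exchange of Version messages, so let us look at negotiateInboundProtocol():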
//btcd/peer/peer.go
// negotiateInboundProtocol waits to receive a version message from the peer
// then sends our version message. If the events do not occur in that order then
// it returns an error.
func (p *Peer) negotiateInboundProtocol() error {
if err := p.readRemoteVersionMsg(); err != nil {
return err
}
return p.writeLocalVersionMsg()
}
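Its main steps are:
- wait for, read, and process the Version message sent by the Peer;
- send our own Version message to the Peer.
negotiateOutboundProtocol() is similar to negotiateInboundProtocol(), except that the two steps run in the opposite order. readRemoteVersionMsg() first reads and parses the Version message, then calls the Peer's handleRemoteVersionMsg() to process it, and finally invokes the OnVersion callback in MessageListeners for further processing. Here we focus on handleRemoteVersionMsg():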
//btcd/peer/peer.go
// handleRemoteVersionMsg is invoked when a version bitcoin message is received
// from the remote peer. It will return an error if the remote peer's version
// is not compatible with ours.
func (p *Peer) handleRemoteVersionMsg(msg *wire.MsgVersion) error {
// Detect self connections.
if !allowSelfConns && sentNonces.Exists(msg.Nonce) {
return errors.New("disconnecting peer connected to self")
}
// Notify and disconnect clients that have a protocol version that is
// too old.
if msg.ProtocolVersion < int32(wire.MultipleAddressVersion) {
// Send a reject message indicating the protocol version is
// obsolete and wait for the message to be sent before
// disconnecting.
reason := fmt.Sprintf("protocol version must be %d or greater",
wire.MultipleAddressVersion)
rejectMsg := wire.NewMsgReject(msg.Command(), wire.RejectObsolete,
reason)
return p.writeMessage(rejectMsg)
}
// Updating a bunch of stats.
p.statsMtx.Lock()
p.lastBlock = msg.LastBlock
p.startingHeight = msg.LastBlock
// Set the peer's time offset.
p.timeOffset = msg.Timestamp.Unix() - time.Now().Unix()
p.statsMtx.Unlock()
// Negotiate the protocol version.
p.flagsMtx.Lock()
p.advertisedProtoVer = uint32(msg.ProtocolVersion)
p.protocolVersion = minUint32(p.protocolVersion, p.advertisedProtoVer)
p.versionKnown = true
log.Debugf("Negotiated protocol version %d for peer %s",
p.protocolVersion, p)
// Set the peer's ID.
p.id = atomic.AddInt32(&nodeCount, 1)
// Set the supported services for the peer to what the remote peer
// advertised.
p.services = msg.Services
// Set the remote peer's user agent.
p.userAgent = msg.UserAgent
p.flagsMtx.Unlock()
return nil
}
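As we can see, when processing a Version message the Peer mainly does the following:
- It checks whether the Nonce in the Version message is one of the nonce values it has cached. If so, the Version message was sent by the node to itself. On a real network a node is not allowed to become its own Peer, so an error is returned in this case;
- It checks the ProtocolVersion in the Version message. If the Peer's version is lower than 209 (wire.MultipleAddressVersion), the connection is refused;
- Once the Nonce and ProtocolVersion checks pass, it updates the Peer's information, such as the Peer's latest block height and the time offset between the Peer and the local node;
- Finally, it updates the Peer's protocol version (the smaller of the local and advertised versions), supported services, UserAgent, and so on, and assigns the Peer an id.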
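The Peer's block height, supported services, and other details are later used by the local node to decide whether to sync blocks from that Peer, which we cover in a later article. In short, Version messages are exchanged so that subsequent message exchange and block synchronization can proceed smoothly. When sending a Version message to a Peer, the main work is filling in and encapsulating the message; we will explain that in detail when we cover btcd/wire and do not expand on it here. Next we turn to the Handlers that run in the goroutines spawned by start(). Because these goroutines do a great deal of channel handling, we first give a diagram of how each goroutine relates to its channels, to make the code that follows easier to understand:
In the diagram, each large black circle with an arrow represents a goroutine, and the blue and red "pipes" each represent a channel; all channels here are bidirectional. When analyzing or designing concurrent Go code, you may find it helpful to use the same approach of drawing "circles" with "pipes" plugged into them to visualize the goroutines and how they synchronize.
Let us first look at inHandler():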
//btcd/peer/peer.go
// inHandler handles all incoming messages for the peer. It must be run as a
// goroutine.
func (p *Peer) inHandler() {
// Peers must complete the initial version negotiation within a shorter
// timeframe than a general idle timeout. The timer is then reset below
// to idleTimeout for all future messages.
idleTimer := time.AfterFunc(idleTimeout, func() {
log.Warnf("Peer %s no answer for %s -- disconnecting", p, idleTimeout)
p.Disconnect()
})
out:
for atomic.LoadInt32(&p.disconnect) == 0 {
// Read a message and stop the idle timer as soon as the read
// is done. The timer is reset below for the next iteration if
// needed.
rmsg, buf, err := p.readMessage()
idleTimer.Stop()
......
atomic.StoreInt64(&p.lastRecv, time.Now().Unix())
p.stallControl <- stallControlMsg{sccReceiveMessage, rmsg}
// Handle each supported message type.
p.stallControl <- stallControlMsg{sccHandlerStart, rmsg}
switch msg := rmsg.(type) {
case *wire.MsgVersion:
p.PushRejectMsg(msg.Command(), wire.RejectDuplicate,
"duplicate version message", nil, true)
break out
......
case *wire.MsgGetAddr:
if p.cfg.Listeners.OnGetAddr != nil {
p.cfg.Listeners.OnGetAddr(p, msg)
}
case *wire.MsgAddr:
if p.cfg.Listeners.OnAddr != nil {
p.cfg.Listeners.OnAddr(p, msg)
}
case *wire.MsgPing:
p.handlePingMsg(msg)
if p.cfg.Listeners.OnPing != nil {
p.cfg.Listeners.OnPing(p, msg)
}
case *wire.MsgPong:
p.handlePongMsg(msg)
if p.cfg.Listeners.OnPong != nil {
p.cfg.Listeners.OnPong(p, msg)
}
......
case *wire.MsgBlock:
if p.cfg.Listeners.OnBlock != nil {
p.cfg.Listeners.OnBlock(p, msg, buf)
}
......
default:
log.Debugf("Received unhandled message of type %v "+
"from %v", rmsg.Command(), p)
}
p.stallControl <- stallControlMsg{sccHandlerDone, rmsg}
// A message was received so reset the idle timer.
idleTimer.Reset(idleTimeout)
}
// Ensure the idle timer is stopped to avoid leaking the resource.
idleTimer.Stop()
// Ensure connection is closed.
p.Disconnect()
close(p.inQuit)
log.Tracef("Peer input handler done for %s", p)
}
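Its main steps are:
- It sets up an idleTimer with a timeout of 5 minutes. If no message is received from the Peer within 5 minutes, the node disconnects from that Peer. As we will see when analyzing pingHandler, Ping messages are sent to the Peer every 2 minutes, so a Pong reply should arrive at most a little over 2 minutes after the previous message (2 min + RTT + the time the Peer takes to process the Ping, where RTT is usually on the order of milliseconds). If nothing has been received for 5 minutes, the Peer can be considered lost;
- It loops, reading and processing messages from the Peer. When a message arrives within the 5 minutes, idleTimer is stopped for the time being. Note that once a message has been read, inHandler sends sccReceiveMessage to stallHandler over the stallControl channel, followed by sccHandlerStart. stallHandler uses these messages to work out how long the node spends receiving and processing messages, as described in detail in the stallHandler analysis below;
- When handling a message from the Peer, inHandler may process it itself first, as with MsgPing and MsgPong, or do nothing with it, as with MsgBlock, before invoking the corresponding MessageListeners callback;
- After handling a message, inHandler sends sccHandlerDone to stallHandler to signal that processing has finished. It also resets idleTimer so that timing starts again, and waits to read the next message;
- After Disconnect() has been called to drop the Peer, the read-and-process loop exits and the inHandler goroutine prepares to exit. Before exiting it stops idleTimer, calls Disconnect() again to make sure the connection is closed, and finally notifies stallHandler of its exit through the inQuit channel.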
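The inHandler goroutine mainly receives messages and invokes the handlers in MessageListeners to process them. Note that these callbacks must not take too long, otherwise they will cause a timeout and a disconnect. outHandler is mainly responsible for sending messages. Here is its code: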
//btcd/peer/peer.go
// outHandler handles all outgoing messages for the peer. It must be run as a
// goroutine. It uses a buffered channel to serialize output messages while
// allowing the sender to continue running asynchronously.
func (p *Peer) outHandler() {
out:
for {
select {
case msg := <-p.sendQueue:
switch m := msg.msg.(type) {
case *wire.MsgPing:
// Only expects a pong message in later protocol
// versions. Also set up statistics.
if p.ProtocolVersion() > wire.BIP0031Version {
p.statsMtx.Lock()
p.lastPingNonce = m.Nonce
p.lastPingTime = time.Now()
p.statsMtx.Unlock()
}
}
p.stallControl <- stallControlMsg{sccSendMessage, msg.msg}
if err := p.writeMessage(msg.msg); err != nil {
p.Disconnect()
......
continue
}
......
p.sendDoneQueue <- struct{}{}
case <-p.quit:
break out
}
}
<-p.queueQuit
// Drain any wait channels before we go away so we don't leave something
// waiting for us. We have waited on queueQuit and thus we can be sure
// that we will not miss anything sent on sendQueue.
cleanup:
for {
select {
case msg := <-p.sendQueue:
if msg.doneChan != nil {
msg.doneChan <- struct{}{}
}
// no need to send on sendDoneQueue since queueHandler
// has been waited on and already exited.
default:
break cleanup
}
}
close(p.outQuit)
log.Tracef("Peer output handler done for %s", p)
}
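As we can see, outHandler loops, taking messages off sendQueue and calling writeMessage() to send them to the Peer. Before sending a message, it sends sccSendMessage to stallHandler so that stallHandler starts tracking whether the response to that message times out. After the message has been sent successfully, it notifies queueHandler through the sendDoneQueue channel to send the next one. Note that sendQueue is a channel with a buffer size of 1; together with sendDoneQueue it ensures that the messages queued through outputQueue are sent one at a time, in order. When the Peer disconnects, the receive on p.quit fires and the loop exits. Through queueQuit, outHandler waits for queueHandler to exit before exiting itself, so that queueHandler can empty the pending send buffer. Finally, it notifies stallHandler of its exit through the outQuit channel.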
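To see why a one-slot sendQueue plus a sendDoneQueue acknowledgement keeps messages in order, here is a miniature version of the pair. It is only a sketch under assumptions: relayInOrder is a hypothetical helper, the channel names are borrowed from the excerpts, a slice stands in for the pending list, and appending to the written channel stands in for the network write.

```go
package main

import "fmt"

// relayInOrder pushes msgs through a miniature queue/sender pair and returns
// them in the order the sender goroutine "wrote" them.
func relayInOrder(msgs []string) []string {
	outputQueue := make(chan string, len(msgs))
	sendQueue := make(chan string, 1)       // at most one message in flight
	sendDoneQueue := make(chan struct{}, 1) // sender -> queue: "send the next one"
	written := make(chan string, len(msgs))
	quit := make(chan struct{})

	// Sender: plays the role of outHandler.
	go func() {
		for m := range sendQueue {
			written <- m // stands in for writing to the connection
			sendDoneQueue <- struct{}{}
		}
	}()

	// Queue: plays the role of queueHandler.
	go func() {
		defer close(sendQueue)
		var pending []string
		waiting := false // true while a message is in flight
		for {
			select {
			case m := <-outputQueue:
				if waiting {
					pending = append(pending, m)
				} else {
					sendQueue <- m
					waiting = true
				}
			case <-sendDoneQueue:
				if len(pending) == 0 {
					waiting = false
					continue
				}
				sendQueue <- pending[0]
				pending = pending[1:]
			case <-quit:
				return
			}
		}
	}()

	for _, m := range msgs {
		outputQueue <- m
	}
	out := make([]string, 0, len(msgs))
	for range msgs {
		out = append(out, <-written)
	}
	close(quit)
	return out
}

func main() {
	fmt.Println(relayInOrder([]string{"version", "verack", "ping"}))
	// prints [version verack ping]
}
```

Because the queue goroutine only hands over a new message after the previous one has been acknowledged, it never blocks on sendQueue and can keep accepting new requests while a write is in progress.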
The send queue is maintained by queueHandler, which passes queued messages to outHandler over sendQueue for delivery to the Peer. queueHandler also handles the sending of Inventory specially. Here is its code:
//btcd/peer/peer.go
// queueHandler handles the queuing of outgoing data for the peer. This runs as
// a muxer for various sources of input so we can ensure that server and peer
// handlers will not block on us sending a message. That data is then passed on
// to outHandler to be actually written.
func (p *Peer) queueHandler() {
pendingMsgs := list.New()
invSendQueue := list.New()
trickleTicker := time.NewTicker(trickleTimeout)
defer trickleTicker.Stop()
// We keep the waiting flag so that we know if we have a message queued
// to the outHandler or not. We could use the presence of a head of
// the list for this but then we have rather racy concerns about whether
// it has gotten it at cleanup time - and thus who sends on the
// message's done channel. To avoid such confusion we keep a different
// flag and pendingMsgs only contains messages that we have not yet
// passed to outHandler.
waiting := false
// To avoid duplication below.
queuePacket := func(msg outMsg, list *list.List, waiting bool) bool { (1)
if !waiting {
p.sendQueue <- msg
} else {
list.PushBack(msg)
}
// we are always waiting now.
return true
}
out:
for { (2)
select {
case msg := <-p.outputQueue: (3)
waiting = queuePacket(msg, pendingMsgs, waiting)
// This channel is notified when a message has been sent across
// the network socket.
case <-p.sendDoneQueue: (4)
// No longer waiting if there are no more messages
// in the pending messages queue.
next := pendingMsgs.Front()
if next == nil {
waiting = false
continue
}
// Notify the outHandler about the next item to
// asynchronously send.
val := pendingMsgs.Remove(next)
p.sendQueue <- val.(outMsg)
case iv := <-p.outputInvChan: (5)
// No handshake? They'll find out soon enough.
if p.VersionKnown() {
invSendQueue.PushBack(iv)
}
case <-trickleTicker.C: (6)
// Don't send anything if we're disconnecting or there
// is no queued inventory.
// version is known if send queue has any entries.
if atomic.LoadInt32(&p.disconnect) != 0 ||
invSendQueue.Len() == 0 {
continue
}
// Create and send as many inv messages as needed to
// drain the inventory send queue.
invMsg := wire.NewMsgInvSizeHint(uint(invSendQueue.Len()))
for e := invSendQueue.Front(); e != nil; e = invSendQueue.Front() {
iv := invSendQueue.Remove(e).(*wire.InvVect)
// Don't send inventory that became known after
// the initial check.
if p.knownInventory.Exists(iv) { (7)
continue
}
invMsg.AddInvVect(iv)
if len(invMsg.InvList) >= maxInvTrickleSize {
waiting = queuePacket( (8)
outMsg{msg: invMsg},
pendingMsgs, waiting)
invMsg = wire.NewMsgInvSizeHint(uint(invSendQueue.Len()))
}
// Add the inventory that is being relayed to
// the known inventory for the peer.
p.AddKnownInventory(iv) (9)
}
if len(invMsg.InvList) > 0 {
waiting = queuePacket(outMsg{msg: invMsg}, (10)
pendingMsgs, waiting)
}
case <-p.quit:
break out
}
}
// Drain any wait channels before we go away so we don't leave something
// waiting for us.
for e := pendingMsgs.Front(); e != nil; e = pendingMsgs.Front() { (11)
val := pendingMsgs.Remove(e)
msg := val.(outMsg)
if msg.doneChan != nil {
msg.doneChan <- struct{}{}
}
}
cleanup:
for { (12)
select {
case msg := <-p.outputQueue:
if msg.doneChan != nil {
msg.doneChan <- struct{}{}
}
case <-p.outputInvChan:
// Just drain channel
// sendDoneQueue is buffered so doesn't need draining.
default:
break cleanup
}
}
close(p.queueQuit) (13)
log.Tracef("Peer queue handler done for %s", p)
}
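The main steps in queueHandler() are:
- The code at (1) defines a function value whose logic is: if no message is currently in flight through outHandler, the message is handed to outHandler over sendQueue right away; otherwise it is appended to the list passed in, which is pendingMsgs at every call site shown here;
- The code at (2) starts the loop that handles channel messages. Note that the select statement has no default branch, so the loop blocks at the select when none of the channels has data;
- When there is a request to send a message, the sender writes to outputQueue, which triggers the receive at (3). queuePacket() is then called, and the message is either handed to outHandler immediately or queued to be sent later;
- When outHandler finishes sending a message, it writes to sendDoneQueue, which triggers the receive at (4). queueHandler then takes the next message waiting in pendingMsgs and passes it to outHandler;
- When an Inventory is to be sent, the sender writes to outputInvChan, which triggers the receive at (5), and the Inventory is queued in invSendQueue;
- At (6), trickleTicker fires every 10s. Each firing takes the Inventory entries out of invSendQueue one at a time and checks whether each is already known to the Peer, as shown at (7). New entries are collected as Inventory Vectors into an inv message and sent to the Peer. Note that the code at (8) caps the number of Inventory Vectors in one inv message at 1000; once that limit is reached, the remaining entries in invSendQueue are split across additional inv messages. The code at (9) records each Inventory that has been sent, to prevent sending it again later, and the code at (10) sends whatever is left in the last, partially filled inv message;
- When the Peer's Disconnect() is called, the receive on p.quit fires and the loop exits. The code at (11) then empties the messages waiting in pendingMsgs, the code at (12) drains the channels, and the code at (13) notifies outHandler through the queueQuit channel that it may exit.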
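The trickle step at (6) to (10) boils down to "skip what the Peer already knows, batch the rest". A minimal sketch of that logic follows, assuming string identifiers instead of *wire.InvVect and a plain map instead of the Peer's known-inventory cache; batchInventory is a hypothetical helper.

```go
package main

import "fmt"

// batchInventory groups queued inventory identifiers into batches of at most
// maxBatch entries, skipping anything the peer is already known to have.
// Every identifier that is placed in a batch is recorded in known.
func batchInventory(queue []string, known map[string]bool, maxBatch int) [][]string {
	var batches [][]string
	var current []string
	for _, iv := range queue {
		if known[iv] {
			continue // already announced or received from this peer
		}
		current = append(current, iv)
		known[iv] = true
		if len(current) >= maxBatch {
			// The batch is full: emit it and start a new one.
			batches = append(batches, current)
			current = nil
		}
	}
	if len(current) > 0 {
		batches = append(batches, current) // final, partially filled batch
	}
	return batches
}

func main() {
	known := map[string]bool{"tx-b": true}
	queue := []string{"tx-a", "tx-b", "tx-c", "tx-d", "tx-a"}
	fmt.Println(batchInventory(queue, known, 2))
	// prints [[tx-a tx-c] [tx-d]]
}
```

The second "tx-a" is dropped because the first one was recorded as known during the same pass, which matches the purpose of the bookkeeping at (9).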
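With the two buffered channels outputQueue and outputInvChan and the two lists pendingMsgs and invSendQueue, queueHandler() implements the outgoing message queue. In addition, the sendQueue channel with a buffer size of 1 ensures that queued messages are sent serially and in order. inHandler, outHandler, and queueHandler run in separate goroutines, which makes sending and receiving asynchronous. However, as we saw in inHandler, received messages are also processed serially, one at a time. Without timeout control, if at some point the send queue held a large number of messages and inHandler spent so long processing certain messages that later ones could not be read, message exchange between Peers would become badly "congested". To prevent this, stallHandler applies timeouts: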
//btcd/peer/peer.go
// stallHandler handles stall detection for the peer. This entails keeping
// track of expected responses and assigning them deadlines while accounting for
// the time spent in callbacks. It must be run as a goroutine.
func (p *Peer) stallHandler() {
// These variables are used to adjust the deadline times forward by the
// time it takes callbacks to execute. This is done because new
// messages aren't read until the previous one is finished processing
// (which includes callbacks), so the deadline for receiving a response
// for a given message must account for the processing time as well.
var handlerActive bool
var handlersStartTime time.Time
var deadlineOffset time.Duration
// pendingResponses tracks the expected response deadline times.
pendingResponses := make(map[string]time.Time)
// stallTicker is used to periodically check pending responses that have
// exceeded the expected deadline and disconnect the peer due to
// stalling.
stallTicker := time.NewTicker(stallTickInterval)
defer stallTicker.Stop()
// ioStopped is used to detect when both the input and output handler
// goroutines are done.
var ioStopped bool
out:
for {
select {
case msg := <-p.stallControl:
switch msg.command {
case sccSendMessage: (1)
// Add a deadline for the expected response
// message if needed.
p.maybeAddDeadline(pendingResponses,
msg.message.Command())
case sccReceiveMessage: (2)
// Remove received messages from the expected
// response map. Since certain commands expect
// one of a group of responses, remove
// everything in the expected group accordingly.
switch msgCmd := msg.message.Command(); msgCmd {
case wire.CmdBlock:
fallthrough
case wire.CmdMerkleBlock:
fallthrough
case wire.CmdTx:
fallthrough
case wire.CmdNotFound:
delete(pendingResponses, wire.CmdBlock)
delete(pendingResponses, wire.CmdMerkleBlock)
delete(pendingResponses, wire.CmdTx)
delete(pendingResponses, wire.CmdNotFound)
default:
delete(pendingResponses, msgCmd) (3)
}
case sccHandlerStart: (4)
// Warn on unbalanced callback signalling.
if handlerActive {
log.Warn("Received handler start " +
"control command while a " +
"handler is already active")
continue
}
handlerActive = true
handlersStartTime = time.Now()
case sccHandlerDone: (5)
// Warn on unbalanced callback signalling.
if !handlerActive {
log.Warn("Received handler done " +
"control command when a " +
"handler is not already active")
continue
}
// Extend active deadlines by the time it took
// to execute the callback.
duration := time.Since(handlersStartTime)
deadlineOffset += duration
handlerActive = false
default:
log.Warnf("Unsupported message command %v",
msg.command)
}
case <-stallTicker.C: (6)
// Calculate the offset to apply to the deadline based
// on how long the handlers have taken to execute since
// the last tick.
now := time.Now()
offset := deadlineOffset
if handlerActive {
offset += now.Sub(handlersStartTime) (7)
}
// Disconnect the peer if any of the pending responses
// don't arrive by their adjusted deadline.
for command, deadline := range pendingResponses {
if now.Before(deadline.Add(offset)) { (8)
continue
}
log.Debugf("Peer %s appears to be stalled or "+
"misbehaving, %s timeout -- "+
"disconnecting", p, command)
p.Disconnect()
break
}
// Reset the deadline offset for the next tick.
deadlineOffset = 0
case <-p.inQuit: (9)
// The stall handler can exit once both the input and
// output handler goroutines are done.
if ioStopped {
break out
}
ioStopped = true
case <-p.outQuit: (10)
// The stall handler can exit once both the input and
// output handler goroutines are done.
if ioStopped {
break out
}
ioStopped = true
}
}
// Drain any wait channels before going away so there is nothing left
// waiting on this goroutine.
cleanup:
for { (11)
select {
case <-p.stallControl:
default:
break cleanup
}
}
log.Tracef("Peer stall handler done for %s", p)
}
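Its main logic is:
- When it receives sccSendMessage from outHandler, it sets, where a response is expected, a deadline by which the response to the outgoing message must arrive and stores it in pendingResponses, as shown at (1);
- When it receives sccReceiveMessage from inHandler and the message is a response, it removes the corresponding command and its deadline from pendingResponses, since that response no longer needs to be tracked for a timeout, as shown at (2) and (3). Note that requests and responses are matched only by message command or type, not strictly by sequence number or request ID. One reason is that the node serializes both sending and receiving; another is that once the node has synced to the latest block, Peers do not exchange messages very frequently;
- When it receives sccHandlerStart from inHandler at (4), inHandler has started processing a received message. To keep the next response from timing out because the current message takes too long to process, stallHandler measures the processing time between sccHandlerStart and sccHandlerDone and takes it into account when checking whether the next response has timed out;
- When it receives sccHandlerDone from inHandler at (5), the current message has been fully processed. Subtracting the time at which processing started from the current time gives the time spent processing the message, which is added to deadlineOffset; this offset is used to adjust the timeout threshold for the next response;
- At (6), stallTicker fires every 15s to check periodically whether any response has timed out. If one has, the node disconnects from the Peer. If a received message is being processed, or has just finished being processed, between the current check and the previous one, the timeout threshold is extended by that message's processing time, as shown at (7) and (8), so that a slow handler does not cause the next response to time out. However, if processing one message takes so long that more than one response is read and processed late, the responses after the next one will probably still time out, so time-consuming work should be avoided in the callbacks that handle received messages. High network latency that makes inHandler wait too long for the next response will also cause a timeout;
- The code at (9) and (10) ensures that stallHandler leaves its processing loop and prepares to exit only after both inHandler and outHandler have exited;
- At (11), stallHandler drains the stallControl channel and then exits.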
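stallHandler tracks sent messages and their corresponding responses, checks every 15s whether a response has timed out, and compensates for the effect that processing the current message has on the timeout check for the next response. When a timeout occurs it disconnects from the Peer so that another Peer can be chosen for syncing, which keeps network latency or slow processing on one Peer from degrading block synchronization. Of course, to keep the connection alive, the local node and the Peer exchange Ping/Pong heartbeats at regular intervals. Sending Ping messages is handled by pingHandler, and the Peer replies with a Pong when it receives one.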
//btcd/peer/peer.go
// pingHandler periodically pings the peer. It must be run as a goroutine.
func (p *Peer) pingHandler() {
pingTicker := time.NewTicker(pingInterval)
defer pingTicker.Stop()
out:
for {
select {
case <-pingTicker.C:
nonce, err := wire.RandomUint64()
if err != nil {
log.Errorf("Not sending ping to %s: %v", p, err)
continue
}
p.QueueMessage(wire.NewMsgPing(nonce), nil)
case <-p.quit:
break out
}
}
}
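pingHandler's logic is relatively simple: it sends a Ping message to the Peer every 2 minutes, and it exits when p.quit is closed.
At this point we have gone through the execution of all 5 Handlers (goroutines), which form the framework for exchanging messages between Peers. We have not yet covered who actually sends a message out or where it is read from. To find out, let us look at Peer's readMessage and writeMessage methods: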
//btcd/peer/peer.go
// readMessage reads the next bitcoin message from the peer with logging.
func (p *Peer) readMessage() (wire.Message, []byte, error) {
n, msg, buf, err := wire.ReadMessageN(p.conn, p.ProtocolVersion(),
p.cfg.ChainParams.Net)
atomic.AddUint64(&p.bytesReceived, uint64(n))
if p.cfg.Listeners.OnRead != nil {
p.cfg.Listeners.OnRead(p, n, msg, err)
}
if err != nil {
return nil, nil, err
}
......
return msg, buf, nil
}
// writeMessage sends a bitcoin message to the peer with logging.
func (p *Peer) writeMessage(msg wire.Message) error {
// Don't do anything if we're disconnecting.
if atomic.LoadInt32(&p.disconnect) != 0 {
return nil
}
......
// Write the message to the peer.
n, err := wire.WriteMessageN(p.conn, msg, p.ProtocolVersion(),
p.cfg.ChainParams.Net)
atomic.AddUint64(&p.bytesSent, uint64(n))
if p.cfg.Listeners.OnWrite != nil {
p.cfg.Listeners.OnWrite(p, n, msg, err)
}
return err
}
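As we can see, the actual sending and receiving is done by wire's ReadMessageN() and WriteMessageN(). We do not analyze them here and will cover them in a later article on btcd/wire. Ultimately, messages are read from or written to p.conn, which is a net.Conn; in other words, all message traffic goes over the network connection between the Peers. p.conn is initialized in the Peer's AssociateConnection() method, which is called after connMgr has successfully established the TCP connection between the Peers.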
// AssociateConnection associates the given conn to the peer. Calling this
// function when the peer is already connected will have no effect.
func (p *Peer) AssociateConnection(conn net.Conn) {
// Already connected?
if !atomic.CompareAndSwapInt32(&p.connected, 0, 1) {
return
}
p.conn = conn
p.timeConnected = time.Now()
......
go func() {
if err := p.start(); err != nil {
log.Debugf("Cannot start peer %v: %v", p, err)
p.Disconnect()
}
}()
}
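To make the point concrete that everything ends up as reads and writes on a net.Conn, here is a small sketch that sends one frame across an in-memory connection pair from net.Pipe while counting bytes with atomic counters. The names countingConn, writeFrame, readFrame, and roundTrip are hypothetical, and the 4-byte length prefix is made up for the example; it is not the Bitcoin wire format that btcd/wire implements.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
	"net"
	"sync/atomic"
)

// countingConn wraps a net.Conn and tracks traffic in the same spirit as the
// bytesSent and bytesReceived counters in the excerpts above.
type countingConn struct {
	conn          net.Conn
	bytesSent     uint64
	bytesReceived uint64
}

// writeFrame sends a 4-byte big-endian length followed by the payload.
func (c *countingConn) writeFrame(payload []byte) error {
	frame := make([]byte, 4+len(payload))
	binary.BigEndian.PutUint32(frame, uint32(len(payload)))
	copy(frame[4:], payload)
	n, err := c.conn.Write(frame)
	atomic.AddUint64(&c.bytesSent, uint64(n))
	return err
}

// readFrame reads one frame written by writeFrame.
func (c *countingConn) readFrame() ([]byte, error) {
	header := make([]byte, 4)
	n, err := io.ReadFull(c.conn, header)
	atomic.AddUint64(&c.bytesReceived, uint64(n))
	if err != nil {
		return nil, err
	}
	payload := make([]byte, binary.BigEndian.Uint32(header))
	n, err = io.ReadFull(c.conn, payload)
	atomic.AddUint64(&c.bytesReceived, uint64(n))
	if err != nil {
		return nil, err
	}
	return payload, nil
}

// roundTrip sends payload across an in-memory connection pair and returns
// what the reader received together with both byte counters.
func roundTrip(payload []byte) (got []byte, sent, received uint64, err error) {
	a, b := net.Pipe()
	defer a.Close()
	defer b.Close()
	writer, reader := &countingConn{conn: a}, &countingConn{conn: b}

	// net.Pipe is unbuffered, so the write must run concurrently with the read.
	errCh := make(chan error, 1)
	go func() { errCh <- writer.writeFrame(payload) }()

	if got, err = reader.readFrame(); err != nil {
		return nil, 0, 0, err
	}
	if err = <-errCh; err != nil {
		return nil, 0, 0, err
	}
	return got, atomic.LoadUint64(&writer.bytesSent),
		atomic.LoadUint64(&reader.bytesReceived), nil
}

func main() {
	got, sent, received, err := roundTrip([]byte("ping"))
	fmt.Println(string(got), sent, received, err)
	// prints ping 8 8 <nil>
}
```

Swapping net.Pipe for a real TCP connection would not change writeFrame or readFrame, which is the same property that lets the Peer work with whatever net.Conn is handed to AssociateConnection().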
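We now have the full picture of how a Peer sends and receives messages. Its basic mechanism is shown in the diagram below:
As we can see, Peers can exchange messages only after a network connection has been established. So how do Peers establish and maintain the TCP connections between them? We will cover that in the next article, "Propagation of Btcd Blocks over the P2P Network: ConnMgr".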