Having covered BoltDB, we now return to the btcd/database source code. With BoltDB's implementation understood, the interface definitions in btcd/database and the way they are called become easy to follow. Note, however, that the database package does not implement a database itself: it is btcd's storage framework, which lets btcd support multiple database backends, and ffldb is the default database the package provides. After cloning the code, you will find that the database package mainly contains:
- cmd/dbtool: implements a tool for reading and writing blocks in db files;
- ffldb: implements the default database driver, modeling its DB, Bucket, Tx, and so on after BoltDB;
- internal/treap: a treap implementation, used to cache metadata;
- testdata: db files used for testing;
- driver.go: defines the Driver type and the methods for registering a driver and opening a database;
- interface.go: defines the DB, Bucket, Tx, Cursor, and other interfaces, nearly identical to their definitions in BoltDB;
- error.go: defines the error codes in the database package and their corresponding message strings;
- doc.go: the documentation for the database package;
- driver_test.go, error_test.go, example_test.go, export_test.go: the corresponding test files.
It should be noted that ffldb is not a database in the strict sense: it uses leveldb to store metadata and flat files to store blocks. For metadata storage, ffldb follows BoltDB's design, supporting Buckets and nested child Buckets; for reading and writing blocks and metadata, it also implements a similar Transaction. Notably, when storing metadata through leveldb, ffldb adds a caching layer to improve read/write efficiency. Its basic architecture is shown in the figure below:
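Before the interfaces, here is a minimal sketch of creating and using an ffldb-backed database through the database package; the import paths follow btcd's layout and the on-disk path is illustrative:

//example: opening an ffldb-backed database (illustrative, not part of btcd)
package main

import (
    "fmt"

    "github.com/btcsuite/btcd/database"
    _ "github.com/btcsuite/btcd/database/ffldb" // registers the "ffldb" driver
    "github.com/btcsuite/btcd/wire"
)

func main() {
    // ffldb takes the directory for its flat files and metadata, plus the
    // Bitcoin network the blocks belong to.
    db, err := database.Create("ffldb", "/tmp/exampledb", wire.MainNet)
    if err != nil {
        fmt.Println(err)
        return
    }
    defer db.Close()

    fmt.Println("database type:", db.Type()) // prints "ffldb"
}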
Let's first look at the definition of the DB interface in the database package:
//btcd/database/interface.go
type DB interface {
// Type returns the database driver type the current database instance
// was created with.
Type() string
......
Begin(writable bool) (Tx, error)
......
View(fn func(tx Tx) error) error
......
Update(fn func(tx Tx) error) error
......
Close() error
}
As you can see, these definitions are almost identical to BoltDB's. In fact, the Bucket and Cursor interfaces are likewise similar to BoltDB's; the Tx interface differs somewhat because it adds operations on metadata and blocks:
//btcd/database/interface.go
// Tx represents a database transaction. It can either be read-only or
// read-write. The transaction provides a metadata bucket against which all
// read and writes occur.
//
// As would be expected with a transaction, no changes will be saved to the
// database until it has been committed. The transaction will only provide a
// view of the database at the time it was created. Transactions should not be
// long running operations.
type Tx interface {
// Metadata returns the top-most bucket for all metadata storage.
Metadata() Bucket
......
StoreBlock(block *btcutil.Block) error
......
HasBlock(hash *chainhash.Hash) (bool, error)
......
HasBlocks(hashes []chainhash.Hash) ([]bool, error)
......
FetchBlockHeader(hash *chainhash.Hash) ([]byte, error)
......
FetchBlockHeaders(hashes []chainhash.Hash) ([][]byte, error)
......
FetchBlock(hash *chainhash.Hash) ([]byte, error)
......
FetchBlocks(hashes []chainhash.Hash) ([][]byte, error)
......
FetchBlockRegion(region *BlockRegion) ([]byte, error)
......
FetchBlockRegions(regions []BlockRegion) ([][]byte, error)
// ******************************************************************
// Methods related to both atomic metadata storage and block storage.
// ******************************************************************
......
Commit() error
......
Rollback() error
}
For reasons of space we have omitted the comments on each method; readers can consult the source file. From its definition, Tx provides three groups of methods:
- Metadata(), which returns the root Bucket. All metadata belongs to some Bucket, and Buckets and the K/V pairs inside them are ultimately stored in leveldb. Within a Transaction, metadata is always manipulated by first obtaining a Bucket via Metadata() and then operating on that Bucket;
- the XxxBlockXxx methods, which relate to block operations and read or write blocks mainly through flat files;
- Commit() and Rollback(). After writing metadata or blocks in a writable Tx, you must either call Commit() to commit the changes and close the Tx, or call Rollback() to discard them; Rollback() also closes a read-only Tx. Their roles are the same as in BoltDB (see the sketch after this list).
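The following sketch shows the three groups of methods in use; db is a database.DB and block a *btcutil.Block obtained elsewhere, and the bucket name is illustrative:

//example: using a managed read-write Tx (illustrative, not part of btcd)
func storeExample(db database.DB, block *btcutil.Block) error {
    return db.Update(func(tx database.Tx) error {
        // 1) Metadata: create a bucket under the root and write a key into it.
        bkt, err := tx.Metadata().CreateBucketIfNotExists([]byte("mybucket"))
        if err != nil {
            return err
        }
        if err := bkt.Put([]byte("k"), []byte("v")); err != nil {
            return err
        }
        // 2) Block storage: queue the block to be written on commit.
        return tx.StoreBlock(block)
    })
    // 3) Update() commits when the callback returns nil and rolls back on
    // error; Commit()/Rollback() are only called explicitly with Begin().
}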
ffldb implements all of the interfaces above, and we now focus on its code, starting with its db type definition:
//btcd/database/ffldb/db.go
// db represents a collection of namespaces which are persisted and implements
// the database.DB interface. All database access is performed through
// transactions which are obtained through the specific Namespace.
type db struct {
writeLock sync.Mutex // Limit to one write transaction at a time.
closeLock sync.RWMutex // Make database close block while txns active.
closed bool // Is the database closed?
store *blockStore // Handles read/writing blocks to flat files.
cache *dbCache // Cache layer which wraps underlying leveldb DB.
}
The fields are:
- writeLock: a mutex ensuring there is at most one writable transaction at a time;
- closeLock: ensures that all open transactions have finished before the database closes;
- closed: indicates whether the database has been closed;
- store: the blockStore, used for reading and writing blocks;
- cache: the dbCache, used for reading and writing metadata.
db implements the database.DB interface, and its methods work much as in BoltDB: the callback of View() or Update() receives the Tx object (or a reference to it), through whose interface the database operations are then performed. We therefore skip the implementations of db's methods and concentrate on Tx. ffldb's transaction type, which implements database.Tx, is defined as follows:
//btcd/database/ffldb/db.go
// transaction represents a database transaction. It can either be read-only or
// read-write and implements the database.Tx interface. The transaction
// provides a root bucket against which all read and writes occur.
type transaction struct {
managed bool // Is the transaction managed?
closed bool // Is the transaction closed?
writable bool // Is the transaction writable?
db *db // DB instance the tx was created from.
snapshot *dbCacheSnapshot // Underlying snapshot for txns.
metaBucket *bucket // The root metadata bucket.
blockIdxBucket *bucket // The block index bucket.
// Blocks that need to be stored on commit. The pendingBlocks map is
// kept to allow quick lookups of pending data by block hash.
pendingBlocks map[chainhash.Hash]int
pendingBlockData []pendingBlock
// Keys that need to be stored or deleted on commit.
pendingKeys *treap.Mutable
pendingRemove *treap.Mutable
// Active iterators that need to be notified when the pending keys have
// been updated so the cursors can properly handle updates to the
// transaction state.
activeIterLock sync.RWMutex
activeIters []*treap.Iterator
}
The fields are:
- managed: whether the transaction is managed by db; a managed transaction must not call Commit() or Rollback() itself;
- closed: indicates whether the transaction has finished;
- writable: indicates whether the transaction is writable;
- db: the db object the transaction is bound to;
- snapshot: the snapshot of the metadata cache seen by this transaction, taken from the dbCache when the transaction is opened; it is part of the MVCC mechanism for metadata storage, comparable to reading the meta page in BoltDB;
- metaBucket: the root Bucket for metadata;
- blockIdxBucket: the Bucket that maps block hashes to block numbers; it is the first child Bucket of metaBucket and is used only inside ffldb;
- pendingBlocks: maps each pending block's hash to its position in pendingBlockData;
- pendingBlockData: the serialized bytes of all pending blocks, in order;
- pendingKeys: the set of metadata keys to add or update; note that it points to a treap;
- pendingRemove: the set of metadata keys to delete; it also points to a treap, and like pendingKeys it is pushed to leveldb through the dbCache;
- activeIterLock: protects activeIters;
- activeIters: the iterators currently traversing the dbCache in this transaction; when keys are updated in the dbCache, treap rotations change the relationships between nodes, so all active iterators must be reset.
We said the transaction offers three main groups of methods; let's start with Metadata():
//btcd/database/ffldb/db.go
// Metadata returns the top-most bucket for all metadata storage.
//
// This function is part of the database.Tx interface implementation.
func (tx *transaction) Metadata() database.Bucket {
return tx.metaBucket
}
As you can see, it simply returns the root Bucket, through which all further operations proceed. Let's look at the bucket type, which implements database.Bucket:
//btcd/database/ffldb/db.go
// bucket is an internal type used to represent a collection of key/value pairs
// and implements the database.Bucket interface.
type bucket struct {
tx *transaction
id [4]byte
}
Note that although ffldb's bucket and BoltDB's Bucket share the same interface, the underlying data structures that actually store the K/V pairs differ, and so do the bucket's definition and lookup methods. ffldb stores K/V pairs in leveldb, whose underlying data structure is an LSM tree (log-structured merge-tree), whereas BoltDB uses a B+Tree. ffldb reads and writes K/V pairs through the interfaces leveldb provides, yet leveldb has no notion of a Bucket, nor any way to organize keys hierarchically. How, then, does ffldb implement buckets? CreateBucket() shows the approach:
//btcd/database/ffldb/db.go
// CreateBucket creates and returns a new nested bucket with the given key.
//
// Returns the following errors as required by the interface contract:
// - ErrBucketExists if the bucket already exists
// - ErrBucketNameRequired if the key is empty
// - ErrIncompatibleValue if the key is otherwise invalid for the particular
// implementation
// - ErrTxNotWritable if attempted against a read-only transaction
// - ErrTxClosed if the transaction has already been closed
//
// This function is part of the database.Bucket interface implementation.
func (b *bucket) CreateBucket(key []byte) (database.Bucket, error) {
......
// Ensure bucket does not already exist.
bidxKey := bucketIndexKey(b.id, key)
......
// Find the appropriate next bucket ID to use for the new bucket. In
// the case of the special internal block index, keep the fixed ID.
var childID [4]byte
if b.id == metadataBucketID && bytes.Equal(key, blockIdxBucketName) {
childID = blockIdxBucketID
} else {
var err error
childID, err = b.tx.nextBucketID()
if err != nil {
return nil, err
}
}
// Add the new bucket to the bucket index.
if err := b.tx.putKey(bidxKey, childID[:]); err != nil {
str := fmt.Sprintf("failed to create bucket with key %q", key)
return nil, convertErr(str, err)
}
return &bucket{tx: b.tx, id: childID}, nil
}
The code above mainly does three things:
- builds the child Bucket's key via bucketIndexKey();
- assigns or selects an id for the child Bucket;
- stores the child Bucket's key and id as a K/V record in the parent Bucket, much as BoltDB does.
Where BoltDB marks a Bucket with a flag on the K/V pair, ffldb marks a Bucket through the format of the key itself:
//btcd/database/ffldb/db.go
// bucketIndexKey returns the actual key to use for storing and retrieving a
// child bucket in the bucket index. This is required because additional
// information is needed to distinguish nested buckets with the same name.
func bucketIndexKey(parentID [4]byte, key []byte) []byte {
// The serialized bucket index key format is:
// <bucketindexprefix><parentbucketid><bucketname>
indexKey := make([]byte, len(bucketIndexPrefix)+4+len(key))
copy(indexKey, bucketIndexPrefix)
copy(indexKey[len(bucketIndexPrefix):], parentID[:])
copy(indexKey[len(bucketIndexPrefix)+4:], key)
return indexKey
}
As you can see, a child Bucket's key always has the form "<bucketindexprefix><parentbucketid><bucketname>"; conversely, any key of this form corresponds to a child Bucket, and its value records the child Bucket's id. In other words, ffldb encodes the parent-child relationship in the hierarchical form of Bucket keys. In BoltDB, by contrast, a child Bucket corresponds to its own B+Tree, and adding a K/V pair to a child Bucket means inserting a record into that tree. How does ffldb add K/V pairs to a child Bucket, or, put the other way, how does it determine which Bucket a K/V pair belongs to? Consider bucket's Put() method:
//btcd/database/ffldb/db.go
// Put saves the specified key/value pair to the bucket. Keys that do not
// already exist are added and keys that already exist are overwritten.
//
// Returns the following errors as required by the interface contract:
// - ErrKeyRequired if the key is empty
// - ErrIncompatibleValue if the key is the same as an existing bucket
// - ErrTxNotWritable if attempted against a read-only transaction
// - ErrTxClosed if the transaction has already been closed
//
// This function is part of the database.Bucket interface implementation.
func (b *bucket) Put(key, value []byte) error {
......
return b.tx.putKey(bucketizedKey(b.id, key), value)
}
The key is, again, the key itself: when a record is added to a Bucket, the key is first transformed by bucketizedKey():
//btcd/database/ffldb/db.go
// bucketizedKey returns the actual key to use for storing and retrieving a key
// for the provided bucket ID. This is required because bucketizing is handled
// through the use of a unique prefix per bucket.
func bucketizedKey(bucketID [4]byte, key []byte) []byte {
// The serialized block index key format is:
// <bucketid><key>
bKey := make([]byte, 4+len(key))
copy(bKey, bucketID[:])
copy(bKey[4:], key)
return bKey
}
That is, when a K/V pair is added to a bucket, its key is converted to the form "<bucketid><key>", marking the record as belonging to the bucket whose id is "<bucketid>". These two hierarchical key formats are how ffldb marks child Buckets and the K/V pairs inside them; by the time K/V pairs are written to leveldb there is no bucket concept at all, and all keys live in one flat keyspace. The cursor bound to a bucket is likewise implemented on top of leveldb's Iterator; we will not analyze it separately, and interested readers can do so on their own. The following standalone example prints both key forms for a hypothetical bucket hierarchy; the bucketIndexPrefix value and the child bucket id are illustrative assumptions:
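//example: the two key layouts for a hypothetical hierarchy (illustrative)
package main

import "fmt"

var bucketIndexPrefix = []byte("bidx") // assumed prefix of bucket index keys

func bucketIndexKey(parentID [4]byte, key []byte) []byte {
    indexKey := make([]byte, len(bucketIndexPrefix)+4+len(key))
    copy(indexKey, bucketIndexPrefix)
    copy(indexKey[len(bucketIndexPrefix):], parentID[:])
    copy(indexKey[len(bucketIndexPrefix)+4:], key)
    return indexKey
}

func bucketizedKey(bucketID [4]byte, key []byte) []byte {
    bKey := make([]byte, 4+len(key))
    copy(bKey, bucketID[:])
    copy(bKey[4:], key)
    return bKey
}

func main() {
    rootID := [4]byte{}            // the root metadata bucket id
    childID := [4]byte{0, 0, 0, 9} // an illustrative child bucket id

    // Creating child bucket "mybucket" under the root stores this record:
    fmt.Printf("%q -> child id\n", bucketIndexKey(rootID, []byte("mybucket")))
    // "bidx\x00\x00\x00\x00mybucket" -> child id

    // Put("k", "v") inside that child bucket stores this flat record:
    fmt.Printf("%q -> %q\n", bucketizedKey(childID, []byte("k")), "v")
    // "\x00\x00\x00\tk" -> "v"
}

Also, as bucket's Put() showed, an added K/V pair is handed to the transaction's putKey() method, which first places it into pendingKeys: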
//btcd/database/ffldb/db.go
// putKey adds the provided key to the list of keys to be updated in the
// database when the transaction is committed.
//
// NOTE: This function must only be called on a writable transaction. Since it
// is an internal helper function, it does not check.
func (tx *transaction) putKey(key, value []byte) error {
// Prevent the key from being deleted if it was previously scheduled
// to be deleted on transaction commit.
tx.pendingRemove.Delete(key)
// Add the key/value pair to the list to be written on transaction
// commit.
tx.pendingKeys.Put(key, value)
tx.notifyActiveIters()
return nil
}
Similarly, bucket's Delete() is implemented by calling the transaction's deleteKey() method, which adds the key to pendingRemove; when the transaction commits, the keys in pendingKeys are written to leveldb and the keys in pendingRemove are deleted from it. bucket's Get() ultimately calls the transaction's fetchKey(), which searches pendingRemove and pendingKeys first and, failing that, a snapshot of the dbCache.
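A minimal sketch of this lookup order (the actual fetchKey() also skips the pending treaps for read-only transactions, and details may differ):

//sketch: transaction.fetchKey()'s lookup order (simplified)
func (tx *transaction) fetchKeySketch(key []byte) []byte {
    if tx.writable {
        // A key deleted in this transaction no longer exists.
        if tx.pendingRemove.Has(key) {
            return nil
        }
        // A key written in this transaction shadows the snapshot.
        if value := tx.pendingKeys.Get(key); value != nil {
            return value
        }
    }
    // Fall back to the dbCache snapshot taken when the tx was opened.
    return tx.snapshot.Get(key)
}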
The second group of transaction methods deals with reading and writing blocks. We mainly analyze StoreBlock() and FetchBlock(), starting with StoreBlock():
//btcd/database/ffldb/db.go
// StoreBlock stores the provided block into the database. There are no checks
// to ensure the block connects to a previous block, contains double spends, or
// any additional functionality such as transaction indexing. It simply stores
// the block in the database.
//
// Returns the following errors as required by the interface contract:
// - ErrBlockExists when the block hash already exists
// - ErrTxNotWritable if attempted against a read-only transaction
// - ErrTxClosed if the transaction has already been closed
//
// This function is part of the database.Tx interface implementation.
func (tx *transaction) StoreBlock(block *btcutil.Block) error {
......
// Reject the block if it already exists.
blockHash := block.Hash()
......
blockBytes, err := block.Bytes()
......
// Add the block to be stored to the list of pending blocks to store
// when the transaction is committed. Also, add it to pending blocks
// map so it is easy to determine the block is pending based on the
// block hash.
if tx.pendingBlocks == nil {
tx.pendingBlocks = make(map[chainhash.Hash]int)
}
tx.pendingBlocks[*blockHash] = len(tx.pendingBlockData)
tx.pendingBlockData = append(tx.pendingBlockData, pendingBlock{
hash: blockHash,
bytes: blockBytes,
})
log.Tracef("Added block %s to pending blocks", blockHash)
return nil
}
As you can see, StoreBlock() essentially appends the block to pendingBlockData, to be written to a file on Commit. Now FetchBlock():
//btcd/database/ffldb/db.go
// FetchBlock returns the raw serialized bytes for the block identified by the
// given hash. The raw bytes are in the format returned by Serialize on a
// wire.MsgBlock.
//
// Returns the following errors as required by the interface contract:
// - ErrBlockNotFound if the requested block hash does not exist
// - ErrTxClosed if the transaction has already been closed
// - ErrCorruption if the database has somehow become corrupted
//
// In addition, returns ErrDriverSpecific if any failures occur when reading the
// block files.
//
// NOTE: The data returned by this function is only valid during a database
// transaction. Attempting to access it after a transaction has ended results
// in undefined behavior. This constraint prevents additional data copies and
// allows support for memory-mapped database implementations.
//
// This function is part of the database.Tx interface implementation.
func (tx *transaction) FetchBlock(hash *chainhash.Hash) ([]byte, error) {
......
// When the block is pending to be written on commit return the bytes
// from there.
if idx, exists := tx.pendingBlocks[*hash]; exists {
return tx.pendingBlockData[idx].bytes, nil
}
// Lookup the location of the block in the files from the block index.
blockRow, err := tx.fetchBlockRow(hash)
if err != nil {
return nil, err
}
location := deserializeBlockLoc(blockRow)
// Read the block from the appropriate location. The function also
// performs a checksum over the data to detect data corruption.
blockBytes, err := tx.db.store.readBlock(hash, location)
if err != nil {
return nil, err
}
return blockBytes, nil
}
When reading a block, pendingBlocks is consulted first; on a hit, the bytes are returned straight from pendingBlockData. Otherwise the block is read through the db's blockStore. We postpone blockStore until after the transaction's Commit. The key observation is that reading and writing metadata or blocks through a transaction always goes through pendingBlocks, or pendingKeys and pendingRemove, first; these act as the transaction's buffers and are synchronized to the files or to leveldb on Commit. Commit() ultimately calls writePendingAndCommit() to do the actual work:
//btcd/database/ffldb/db.go
// writePendingAndCommit writes pending block data to the flat block files,
// updates the metadata with their locations as well as the new current write
// location, and commits the metadata to the memory database cache. It also
// properly handles rollback in the case of failures.
//
// This function MUST only be called when there is pending data to be written.
func (tx *transaction) writePendingAndCommit() error {
......
// Loop through all of the pending blocks to store and write them.
for _, blockData := range tx.pendingBlockData {
log.Tracef("Storing block %s", blockData.hash)
location, err := tx.db.store.writeBlock(blockData.bytes)
if err != nil {
rollback()
return err
}
// Add a record in the block index for the block. The record
// includes the location information needed to locate the block
// on the filesystem as well as the block header since they are
// so commonly needed.
blockHdr := blockData.bytes[0:blockHdrSize]
blockRow := serializeBlockRow(location, blockHdr)
err = tx.blockIdxBucket.Put(blockData.hash[:], blockRow)
if err != nil {
rollback()
return err
}
}
// Update the metadata for the current write file and offset.
writeRow := serializeWriteRow(wc.curFileNum, wc.curOffset)
if err := tx.metaBucket.Put(writeLocKeyName, writeRow); err != nil {
rollback()
return convertErr("failed to store write cursor", err)
}
// Atomically update the database cache. The cache automatically
// handles flushing to the underlying persistent storage database.
return tx.db.cache.commitTx(tx)
}
writePendingAndCommit() mainly:
- writes the blocks in pendingBlockData to files through the blockStore, and records each block's hash together with its location in the files in blockIdxBucket for later lookup (a sketch of this index record follows the list);
- updates the K/V pair in metaBucket that records the current write file and offset;
- hands the pending K/V pairs to dbCache's commitTx(), which writes them into the treap cache and, when necessary, into leveldb.
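The index record written to blockIdxBucket is the serialized block location followed by the block header; a sketch of the layout, assuming "encoding/binary" is imported and the 12-byte little-endian location format used in blockio.go:

//sketch: the blockIdxBucket record layout (illustrative)
func serializeBlockRowSketch(fileNum, offset, blockLen uint32, blockHdr []byte) []byte {
    // <blockFileNum(4)><fileOffset(4)><blockLen(4)><blockHeader>
    row := make([]byte, 12+len(blockHdr))
    binary.LittleEndian.PutUint32(row[0:4], fileNum)   // which flat file
    binary.LittleEndian.PutUint32(row[4:8], offset)    // offset inside that file
    binary.LittleEndian.PutUint32(row[8:12], blockLen) // length of the framed record
    copy(row[12:], blockHdr)                           // 80-byte header, kept for fast access
    return row
}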
blockStore
When a transaction reads or writes metadata or blocks, the work is ultimately done either by the blockStore reading and writing files or by the dbCache reading and writing the treaps and leveldb. So next we analyze blockStore and dbCache, starting with blockStore's definition:
//btcd/database/ffldb/blockio.go
// blockStore houses information used to handle reading and writing blocks (and
// part of blocks) into flat files with support for multiple concurrent readers.
type blockStore struct {
// network is the specific network to use in the flat files for each
// block.
network wire.BitcoinNet
// basePath is the base path used for the flat block files and metadata.
basePath string
// maxBlockFileSize is the maximum size for each file used to store
// blocks. It is defined on the store so the whitebox tests can
// override the value.
maxBlockFileSize uint32
// The following fields are related to the flat files which hold the
// actual blocks. The number of open files is limited by maxOpenFiles.
//
// obfMutex protects concurrent access to the openBlockFiles map. It is
// a RWMutex so multiple readers can simultaneously access open files.
//
// openBlockFiles houses the open file handles for existing block files
// which have been opened read-only along with an individual RWMutex.
// This scheme allows multiple concurrent readers to the same file while
// preventing the file from being closed out from under them.
//
// lruMutex protects concurrent access to the least recently used list
// and lookup map.
//
// openBlocksLRU tracks how the open files are referenced by pushing the
// most recently used files to the front of the list thereby trickling
// the least recently used files to the end of the list. When a file
// needs to be closed due to exceeding the max number of allowed open
// files, the one at the end of the list is closed.
//
// fileNumToLRUElem is a mapping between a specific block file number
// and the associated list element on the least recently used list.
//
// Thus, with the combination of these fields, the database supports
// concurrent non-blocking reads across multiple and individual files
// along with intelligently limiting the number of open file handles by
// closing the least recently used files as needed.
//
// NOTE: The locking order used throughout is well-defined and MUST be
// followed. Failure to do so could lead to deadlocks. In particular,
// the locking order is as follows:
// 1) obfMutex
// 2) lruMutex
// 3) writeCursor mutex
// 4) specific file mutexes
//
// None of the mutexes are required to be locked at the same time, and
// often aren't. However, if they are to be locked simultaneously, they
// MUST be locked in the order previously specified.
//
// Due to the high performance and multi-read concurrency requirements,
// write locks should only be held for the minimum time necessary.
obfMutex sync.RWMutex
lruMutex sync.Mutex
openBlocksLRU *list.List // Contains uint32 block file numbers.
fileNumToLRUElem map[uint32]*list.Element
openBlockFiles map[uint32]*lockableFile
// writeCursor houses the state for the current file and location that
// new blocks are written to.
writeCursor *writeCursor
// These functions are set to openFile, openWriteFile, and deleteFile by
// default, but are exposed here to allow the whitebox tests to replace
// them when working with mock files.
openFileFunc func(fileNum uint32) (*lockableFile, error)
openWriteFileFunc func(fileNum uint32) (filer, error)
deleteFileFunc func(fileNum uint32) error
}
The fields are:
- network: the network the blocks belong to, such as MainNet, TestNet, or SimNet; it is recorded with every block written to a file;
- basePath: the on-disk path where the block files are stored;
- maxBlockFileSize: the maximum size of each block file;
- obfMutex: a read-write lock protecting openBlockFiles;
- lruMutex: a mutex protecting openBlocksLRU and fileNumToLRUElem;
- openBlocksLRU: an LRU list of the numbers of the open files; the default maximum number of open files is 25;
- fileNumToLRUElem: maps file numbers to their elements in openBlocksLRU;
- openBlockFiles: maps the numbers of all open read-only files to their file handles;
- writeCursor: points at the file currently being written, recording its file number and write offset;
- openFileFunc, openWriteFileFunc, and deleteFileFunc: hooks for openFile, openWriteFile, and deleteFile, used mainly by tests; by default they are blockStore's corresponding methods. A sketch of the helper types appears after this list.
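For reference, the two helper types referred to above look roughly as follows (abridged from blockio.go; treat the details as approximate):

//btcd/database/ffldb/blockio.go (abridged)
// lockableFile pairs a file handle with a RWMutex so many readers can share
// a file while a writer can safely close or replace it.
type lockableFile struct {
    sync.RWMutex
    file filer // filer is a small interface over *os.File, used for tests
}

// writeCursor tracks the flat file currently being appended to and the
// offset at which the next block will be written.
type writeCursor struct {
    sync.RWMutex
    curFile    *lockableFile
    curFileNum uint32
    curOffset  uint32
}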
Let's use blockStore's readBlock() and writeBlock() methods to understand how it works, starting with readBlock():
//btcd/database/ffldb/blockio.go
// readBlock reads the specified block record and returns the serialized block.
// It ensures the integrity of the block data by checking that the serialized
// network matches the current network associated with the block store and
// comparing the calculated checksum against the one stored in the flat file.
// This function also automatically handles all file management such as opening
// and closing files as necessary to stay within the maximum allowed open files
// limit.
//
// Returns ErrDriverSpecific if the data fails to read for any reason and
// ErrCorruption if the checksum of the read data doesn't match the checksum
// read from the file.
//
// Format: <network><block length><serialized block><checksum>
func (s *blockStore) readBlock(hash *chainhash.Hash, loc blockLocation) ([]byte, error) {
// Get the referenced block file handle opening the file as needed. The
// function also handles closing files as needed to avoid going over the
// max allowed open files.
blockFile, err := s.blockFile(loc.blockFileNum)
if err != nil {
return nil, err
}
serializedData := make([]byte, loc.blockLen)
n, err := blockFile.file.ReadAt(serializedData, int64(loc.fileOffset))
blockFile.RUnlock()
if err != nil {
str := fmt.Sprintf("failed to read block %s from file %d, "+
"offset %d: %v", hash, loc.blockFileNum, loc.fileOffset,
err)
return nil, makeDbErr(database.ErrDriverSpecific, str, err)
}
// Calculate the checksum of the read data and ensure it matches the
// serialized checksum. This will detect any data corruption in the
// flat file without having to do much more expensive merkle root
// calculations on the loaded block.
serializedChecksum := binary.BigEndian.Uint32(serializedData[n-4:])
calculatedChecksum := crc32.Checksum(serializedData[:n-4], castagnoli)
if serializedChecksum != calculatedChecksum {
str := fmt.Sprintf("block data for block %s checksum "+
"does not match - got %x, want %x", hash,
calculatedChecksum, serializedChecksum)
return nil, makeDbErr(database.ErrCorruption, str, nil)
}
// The network associated with the block must match the current active
// network, otherwise somebody probably put the block files for the
// wrong network in the directory.
serializedNet := byteOrder.Uint32(serializedData[:4])
if serializedNet != uint32(s.network) {
str := fmt.Sprintf("block data for block %s is for the "+
"wrong network - got %d, want %d", hash, serializedNet,
uint32(s.network))
return nil, makeDbErr(database.ErrDriverSpecific, str, nil)
}
// The raw block excludes the network, length of the block, and
// checksum.
return serializedData[8 : n-4], nil
}
Its main steps are:
- obtain an already open file, or newly open one, via blockFile();
- read the block record at offset loc.fileOffset via file.ReadAt(); its format is "<network><block length><serialized block><checksum>";
- extract the block's byte stream from the record.
The most interesting part is obtaining the file handle through blockFile(), whose implementation is:
//btcd/database/ffldb/blockio.go
// blockFile attempts to return an existing file handle for the passed flat file
// number if it is already open as well as marking it as most recently used. It
// will also open the file when it's not already open subject to the rules
// described in openFile.
//
// NOTE: The returned block file will already have the read lock acquired and
// the caller MUST call .RUnlock() to release it once it has finished all read
// operations. This is necessary because otherwise it would be possible for a
// separate goroutine to close the file after it is returned from here, but
// before the caller has acquired a read lock.
func (s *blockStore) blockFile(fileNum uint32) (*lockableFile, error) {
// When the requested block file is open for writes, return it.
wc := s.writeCursor
wc.RLock()
if fileNum == wc.curFileNum && wc.curFile.file != nil {
obf := wc.curFile
obf.RLock()
wc.RUnlock()
return obf, nil
}
wc.RUnlock()
// Try to return an open file under the overall files read lock.
s.obfMutex.RLock()
if obf, ok := s.openBlockFiles[fileNum]; ok {
s.lruMutex.Lock()
s.openBlocksLRU.MoveToFront(s.fileNumToLRUElem[fileNum])
s.lruMutex.Unlock()
obf.RLock()
s.obfMutex.RUnlock()
return obf, nil
}
s.obfMutex.RUnlock()
// Since the file isn't open already, need to check the open block files
// map again under write lock in case multiple readers got here and a
// separate one is already opening the file.
s.obfMutex.Lock() (1)
if obf, ok := s.openBlockFiles[fileNum]; ok {
obf.RLock()
s.obfMutex.Unlock()
return obf, nil
}
// The file isn't open, so open it while potentially closing the least
// recently used one as needed.
obf, err := s.openFileFunc(fileNum)
if err != nil {
s.obfMutex.Unlock()
return nil, err
}
obf.RLock()
s.obfMutex.Unlock()
return obf, nil
}
Its main steps are:
- Check whether the requested file is the one writeCursor points to, and if so return it directly. Note that writeCursor is accessed under its read lock, and the lockableFile returned by blockFile() has already had its read lock acquired; the caller is responsible for releasing it. If the writeCursor file is returned, blocks are currently being appended to it; it will be closed once full, and the read lock guarantees the close must wait for reads to finish.
- Otherwise, look the file up in openBlockFiles; on a hit, move it to the front of the LRU list, acquire its read lock, and return it.
- At (1), the s.obfMutex write lock is acquired and openBlockFiles is searched again. This guards against the target file having been opened and added to openBlockFiles by another goroutine right after the first lookup; without this re-check, missing in openBlockFiles and then opening the file could open the same file more than once. Why not hold the s.obfMutex write lock for the first lookup too? Again, for concurrency: openBlockFiles holds recently opened files, so the first lookup is quite likely to hit, and protecting it with only the read lock lets many readers search openBlockFiles simultaneously. A distilled sketch of this double-checked pattern follows the list.
- If the target file is not in openBlockFiles, open it anew via openFile(); note that the entire openFile() call runs under the s.obfMutex write lock.
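Stripped of the LRU bookkeeping, this is the classic double-checked locking pattern under a RWMutex; a generic sketch (assumes import "os" and "sync"):

//sketch: check-then-recheck under a RWMutex (illustrative, not btcd code)
type fileCache struct {
    mu    sync.RWMutex
    files map[uint32]*os.File
}

func (c *fileCache) get(num uint32, open func(uint32) (*os.File, error)) (*os.File, error) {
    // Fast path: most lookups hit under the shared read lock, so many
    // readers can search concurrently.
    c.mu.RLock()
    if f, ok := c.files[num]; ok {
        c.mu.RUnlock()
        return f, nil
    }
    c.mu.RUnlock()
    // Slow path: re-check under the exclusive lock, because another
    // goroutine may have opened the file between RUnlock and Lock.
    c.mu.Lock()
    defer c.mu.Unlock()
    if f, ok := c.files[num]; ok {
        return f, nil
    }
    f, err := open(num)
    if err != nil {
        return nil, err
    }
    c.files[num] = f
    return f, nil
}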
//btcd/database/ffldb/blockio.go
// openFile returns a read-only file handle for the passed flat file number.
// The function also keeps track of the open files, performs least recently
// used tracking, and limits the number of open files to maxOpenFiles by closing
// the least recently used file as needed.
//
// This function MUST be called with the overall files mutex (s.obfMutex) locked
// for WRITES.
func (s *blockStore) openFile(fileNum uint32) (*lockableFile, error) {
// Open the appropriate file as read-only.
filePath := blockFilePath(s.basePath, fileNum)
file, err := os.Open(filePath)
if err != nil {
return nil, makeDbErr(database.ErrDriverSpecific, err.Error(),
err)
}
blockFile := &lockableFile{file: file}
// Close the least recently used file if the file exceeds the max
// allowed open files. This is not done until after the file open in
// case the file fails to open, there is no need to close any files.
//
// A write lock is required on the LRU list here to protect against
// modifications happening as already open files are read from and
// shuffled to the front of the list.
//
// Also, add the file that was just opened to the front of the least
// recently used list to indicate it is the most recently used file and
// therefore should be closed last.
s.lruMutex.Lock()
lruList := s.openBlocksLRU
if lruList.Len() >= maxOpenFiles {
lruFileNum := lruList.Remove(lruList.Back()).(uint32)
oldBlockFile := s.openBlockFiles[lruFileNum]
// Close the old file under the write lock for the file in case
// any readers are currently reading from it so it's not closed
// out from under them.
oldBlockFile.Lock()
_ = oldBlockFile.file.Close()
oldBlockFile.Unlock()
delete(s.openBlockFiles, lruFileNum)
delete(s.fileNumToLRUElem, lruFileNum)
}
s.fileNumToLRUElem[fileNum] = lruList.PushFront(fileNum)
s.lruMutex.Unlock()
// Store a reference to it in the open block files map.
s.openBlockFiles[fileNum] = blockFile
return blockFile, nil
}
openFile() mainly:
- opens the target file read-only with a plain os.Open() call;
- checks whether openBlocksLRU is full; if so, it removes the element at the tail of the list, closes the corresponding file and removes it from openBlockFiles, then pushes the newly opened file to the front of the list; all access to openBlocksLRU and fileNumToLRUElem happens under s.lruMutex;
- records the newly opened file in openBlockFiles.
As openFile() shows, blockStore maintains, via openBlockFiles together with openBlocksLRU and fileNumToLRUElem, an LRU cache of open read-only files, which speeds up reading blocks from files. Next, writeBlock():
//btcd/database/ffldb/blockio.go
// writeBlock appends the specified raw block bytes to the store's write cursor
// location and increments it accordingly. When the block would exceed the max
// file size for the current flat file, this function will close the current
// file, create the next file, update the write cursor, and write the block to
// the new file.
//
// The write cursor will also be advanced the number of bytes actually written
// in the event of failure.
//
// Format: <network><block length><serialized block><checksum>
func (s *blockStore) writeBlock(rawBlock []byte) (blockLocation, error) {
// Compute how many bytes will be written.
// 4 bytes each for block network + 4 bytes for block length +
// length of raw block + 4 bytes for checksum.
blockLen := uint32(len(rawBlock))
fullLen := blockLen + 12
// Move to the next block file if adding the new block would exceed the
// max allowed size for the current block file. Also detect overflow
// to be paranoid, even though it isn't possible currently, numbers
// might change in the future to make it possible.
//
// NOTE: The writeCursor.offset field isn't protected by the mutex
// since it's only read/changed during this function which can only be
// called during a write transaction, of which there can be only one at
// a time.
wc := s.writeCursor
finalOffset := wc.curOffset + fullLen
if finalOffset < wc.curOffset || finalOffset > s.maxBlockFileSize {
// This is done under the write cursor lock since the curFileNum
// field is accessed elsewhere by readers.
//
// Close the current write file to force a read-only reopen
// with LRU tracking. The close is done under the write lock
// for the file to prevent it from being closed out from under
// any readers currently reading from it.
wc.Lock()
wc.curFile.Lock() (1)
if wc.curFile.file != nil {
_ = wc.curFile.file.Close()
wc.curFile.file = nil
}
wc.curFile.Unlock()
// Start writes into next file.
wc.curFileNum++ (2)
wc.curOffset = 0 (3)
wc.Unlock()
}
// All writes are done under the write lock for the file to ensure any
// readers are finished and blocked first.
wc.curFile.Lock()
defer wc.curFile.Unlock()
// Open the current file if needed. This will typically only be the
// case when moving to the next file to write to or on initial database
// load. However, it might also be the case if rollbacks happened after
// file writes started during a transaction commit.
if wc.curFile.file == nil {
file, err := s.openWriteFileFunc(wc.curFileNum) (4)
if err != nil {
return blockLocation{}, err
}
wc.curFile.file = file
}
// Bitcoin network.
origOffset := wc.curOffset (5)
hasher := crc32.New(castagnoli)
var scratch [4]byte
byteOrder.PutUint32(scratch[:], uint32(s.network))
if err := s.writeData(scratch[:], "network"); err != nil {
return blockLocation{}, err
}
_, _ = hasher.Write(scratch[:])
// Block length.
byteOrder.PutUint32(scratch[:], blockLen)
if err := s.writeData(scratch[:], "block length"); err != nil {
return blockLocation{}, err
}
_, _ = hasher.Write(scratch[:])
// Serialized block.
if err := s.writeData(rawBlock[:], "block"); err != nil {
return blockLocation{}, err
}
_, _ = hasher.Write(rawBlock)
// Castagnoli CRC-32 as a checksum of all the previous.
if err := s.writeData(hasher.Sum(nil), "checksum"); err != nil {
return blockLocation{}, err
}
loc := blockLocation{ (6)
blockFileNum: wc.curFileNum,
fileOffset: origOffset,
blockLen: fullLen,
}
return loc, nil
}
Its main steps are:
- Check whether writing the block would exceed the file size limit; if so, close the current file and start a new one; otherwise, write the block at offset wc.curOffset of the current file.
- At (1), the file writeCursor points to is closed; the lockableFile's write lock is taken before Close() in case other goroutines are still reading the file.
- (2) advances writeCursor to the next file and (3) resets the in-file offset.
- At (4), openWriteFile() opens or creates a file in read-write mode and writeCursor is pointed at it.
- At (5), the block's starting offset within the file is recorded, and the block data is then written out.
- The network magic, the block length, the block data, and a crc32 checksum over the preceding three are written in turn, so the on-disk record format is "<network><block length><serialized block><checksum>"; a small framing sketch follows the list.
- At (6), the blockLocation for the block is built from the number of the file holding it, the record's starting offset within that file, and the framed record's length, and is returned.
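The same framing can be reproduced standalone; the Castagnoli table matches the castagnoli variable in blockio.go, the little-endian/big-endian split matches what readBlock() expects, and the network value is illustrative:

//example: reproducing writeBlock()'s record framing (illustrative)
package main

import (
    "encoding/binary"
    "fmt"
    "hash/crc32"
)

var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// frameBlock builds <network(4)><block length(4)><serialized block><checksum(4)>.
func frameBlock(network uint32, rawBlock []byte) []byte {
    record := make([]byte, 0, len(rawBlock)+12)
    var scratch [4]byte
    binary.LittleEndian.PutUint32(scratch[:], network)
    record = append(record, scratch[:]...)
    binary.LittleEndian.PutUint32(scratch[:], uint32(len(rawBlock)))
    record = append(record, scratch[:]...)
    record = append(record, rawBlock...)
    // The checksum covers the three fields written so far; crc32's digest is
    // appended big-endian, which is how readBlock() parses it.
    var sum [4]byte
    binary.BigEndian.PutUint32(sum[:], crc32.Checksum(record, castagnoli))
    return append(record, sum[:]...)
}

func main() {
    rec := frameBlock(0xD9B4BEF9, []byte("raw block bytes")) // mainnet magic
    fmt.Printf("record length: %d (payload + 12 bytes of framing)\n", len(rec))
}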
dbCache
readBlock() and writeBlock() give a good picture of how blockStore works: it manages the already open read-only files with an LRU list, and tracks the file currently being written, and the offset within it, with writeCursor; when writing a block would exceed the configured maximum file size, it rolls over to a new file. With that understood, the rest of blockStore's code is straightforward. We now turn to dbCache, starting with its definition:
//btcd/database/ffldb/dbcache.go
// dbCache provides a database cache layer backed by an underlying database. It
// allows a maximum cache size and flush interval to be specified such that the
// cache is flushed to the database when the cache size exceeds the maximum
// configured value or it has been longer than the configured interval since the
// last flush. This effectively provides transaction batching so that callers
// can commit transactions at will without incurring large performance hits due
// to frequent disk syncs.
type dbCache struct {
// ldb is the underlying leveldb DB for metadata.
ldb *leveldb.DB
// store is used to sync blocks to flat files.
store *blockStore
// The following fields are related to flushing the cache to persistent
// storage. Note that all flushing is performed in an opportunistic
// fashion. This means that it is only flushed during a transaction or
// when the database cache is closed.
//
// maxSize is the maximum size threshold the cache can grow to before
// it is flushed.
//
// flushInterval is the threshold interval of time that is allowed to
// pass before the cache is flushed.
//
// lastFlush is the time the cache was last flushed. It is used in
// conjunction with the current time and the flush interval.
//
// NOTE: These flush related fields are protected by the database write
// lock.
maxSize uint64
flushInterval time.Duration
lastFlush time.Time
// The following fields hold the keys that need to be stored or deleted
// from the underlying database once the cache is full, enough time has
// passed, or when the database is shutting down. Note that these are
// stored using immutable treaps to support O(1) MVCC snapshots against
// the cached data. The cacheLock is used to protect concurrent access
// for cache updates and snapshots.
cacheLock sync.RWMutex
cachedKeys *treap.Immutable
cachedRemove *treap.Immutable
}
The fields are:
- ldb: the leveldb DB object, used to store and retrieve K/V pairs in leveldb;
- store: the blockStore of the current db, used to force cached blocks out to disk before metadata is written to leveldb;
- maxSize: put simply, the size limit on the cached metadata waiting to be added and removed; the default is 100MB;
- flushInterval: the interval between flushes to leveldb;
- lastFlush: the time of the last flush to leveldb;
- cacheLock: protects reads and writes of cachedKeys and cachedRemove, which are updated when the dbCache flushes to leveldb and read when the dbCache is snapshotted;
- cachedKeys: caches the keys waiting to be added; it points to a treap;
- cachedRemove: caches the keys waiting to be deleted; it also points to a treap. Note the difference from the transaction's pendingKeys and pendingRemove: those are mutable treaps (*treap.Mutable), while cachedKeys and cachedRemove are immutable treaps (*treap.Immutable). In the usual case (when needsFlush() is false), pendingKeys and pendingRemove are first merged into cachedKeys and cachedRemove and only later pushed to leveldb; we will see this clearly in dbCache's commitTx(). treap.Mutable and treap.Immutable are covered at the end of this article.
We saw in the transaction's writePendingAndCommit() that the final step of a Commit is to call dbCache's commitTx() to commit the metadata updates, so let's examine commitTx() first:
//btcd/database/ffldb/dbcache.go
// commitTx atomically adds all of the pending keys to add and remove into the
// database cache. When adding the pending keys would cause the size of the
// cache to exceed the max cache size, or the time since the last flush exceeds
// the configured flush interval, the cache will be flushed to the underlying
// persistent database.
//
// This is an atomic operation with respect to the cache in that either all of
// the pending keys to add and remove in the transaction will be applied or none
// of them will.
//
// The database cache itself might be flushed to the underlying persistent
// database even if the transaction fails to apply, but it will only be the
// state of the cache without the transaction applied.
//
// This function MUST be called during a database write transaction which in
// turn implies the database write lock will be held.
func (c *dbCache) commitTx(tx *transaction) error {
// Flush the cache and write the current transaction directly to the
// database if a flush is needed.
if c.needsFlush(tx) { (1)
if err := c.flush(); err != nil { (2)
return err
}
// Perform all leveldb updates using an atomic transaction.
err := c.commitTreaps(tx.pendingKeys, tx.pendingRemove) (3)
if err != nil {
return err
}
// Clear the transaction entries since they have been committed.
tx.pendingKeys = nil
tx.pendingRemove = nil
return nil
}
// At this point a database flush is not needed, so atomically commit
// the transaction to the cache.
// Since the cached keys to be added and removed use an immutable treap,
// a snapshot is simply obtaining the root of the tree under the lock
// which is used to atomically swap the root.
c.cacheLock.RLock()
newCachedKeys := c.cachedKeys
newCachedRemove := c.cachedRemove
c.cacheLock.RUnlock()
// Apply every key to add in the database transaction to the cache.
tx.pendingKeys.ForEach(func(k, v []byte) bool { (5)
newCachedRemove = newCachedRemove.Delete(k)
newCachedKeys = newCachedKeys.Put(k, v)
return true
})
tx.pendingKeys = nil
// Apply every key to remove in the database transaction to the cache.
tx.pendingRemove.ForEach(func(k, v []byte) bool { (6)
newCachedKeys = newCachedKeys.Delete(k)
newCachedRemove = newCachedRemove.Put(k, nil)
return true
})
tx.pendingRemove = nil
// Atomically replace the immutable treaps which hold the cached keys to
// add and delete.
c.cacheLock.Lock()
c.cachedKeys = newCachedKeys (7)
c.cachedRemove = newCachedRemove
c.cacheLock.Unlock()
return nil
}
The main steps are:
- If more than one flush interval has passed since the last flush, or the dbCache has outgrown its size limit, as tested at (1), flush() at (2) writes the cached treaps to leveldb; the transaction's keys to add and remove are then written directly to leveldb via commitTreaps() at (3), after which pendingKeys and pendingRemove are cleared. A sketch of this flush decision follows the list.
- If no flush is needed, (5) and (6) merge the transaction's pendingKeys into newCachedKeys and its pendingRemove into newCachedRemove; that is, the keys the tx adds and deletes are written into the dbCache. Two points deserve attention. 1) When adding a key from pendingKeys to newCachedKeys, the same key must first be removed from newCachedRemove, lest the key be deleted when the cache is written to leveldb; likewise, when adding a key to newCachedRemove, the same key must be removed from newCachedKeys, lest a key meant for deletion be written to leveldb. 2) cachedKeys and cachedRemove are treap.Immutable pointers, and so are newCachedKeys and newCachedRemove. An immutable treap uses a copy-on-write (COW)-like scheme to increase read/write concurrency: when nodes are updated through Put() or Delete(), the nodes that change are copied, and together with the unchanged old nodes they form a new treap, which is returned. So at (5) and (6), newCachedKeys and newCachedRemove are re-pointed at the values returned by Delete() or Put(), i.e. at new treaps, while c.cachedKeys and c.cachedRemove still point at the treaps as they were before the changes. A Snapshot() of the dbCache taken at this moment therefore contains neither the transaction's pendingKeys nor its pendingRemove. This can be seen as the dbCache's MVCC implementation.
- Finally, (7) updates cachedKeys and cachedRemove in the dbCache, under the c.cacheLock write lock. From then on, a snapshot obtained via Snapshot() does include the pendingKeys and pendingRemove the transaction committed.
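The flush decision itself is a simple time-or-size test; a sketch consistent with the comments above (the actual needsFlush() in btcd also adds headroom for flush overhead, so treat the factor as approximate):

//sketch: the dbCache flush decision (approximate)
func (c *dbCache) needsFlushSketch(tx *transaction) bool {
    // Flush when more than flushInterval has passed since the last flush.
    if time.Since(c.lastFlush) > c.flushInterval {
        return true
    }
    // Flush when the cached data, as seen through the tx snapshot, has
    // grown past the configured maximum (with headroom for the flush).
    snap := tx.snapshot
    size := snap.pendingKeys.Size() + snap.pendingRemove.Size()
    return uint64(float64(size)*1.5) > c.maxSize
}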
Next, the implementation of flush():
//btcd/database/ffldb/dbcache.go
// flush flushes the database cache to persistent storage. This involves syncing
// the block store and replaying all transactions that have been applied to the
// cache to the underlying database.
//
// This function MUST be called with the database write lock held.
func (c *dbCache) flush() error {
c.lastFlush = time.Now()
// Sync the current write file associated with the block store. This is
// necessary before writing the metadata to prevent the case where the
// metadata contains information about a block which actually hasn't
// been written yet in unexpected shutdown scenarios.
if err := c.store.syncBlocks(); err != nil { (1)
return err
}
// Since the cached keys to be added and removed use an immutable treap,
// a snapshot is simply obtaining the root of the tree under the lock
// which is used to atomically swap the root.
c.cacheLock.RLock()
cachedKeys := c.cachedKeys
cachedRemove := c.cachedRemove
c.cacheLock.RUnlock()
// Nothing to do if there is no data to flush.
if cachedKeys.Len() == 0 && cachedRemove.Len() == 0 {
return nil
}
// Perform all leveldb updates using an atomic transaction.
if err := c.commitTreaps(cachedKeys, cachedRemove); err != nil { (2)
return err
}
// Clear the cache since it has been flushed.
c.cacheLock.Lock()
c.cachedKeys = treap.NewImmutable() (3)
c.cachedRemove = treap.NewImmutable()
c.cacheLock.Unlock()
return nil
}
Its main steps are:
- call blockStore's syncBlocks() to force the file buffers out to the disk files, preventing the metadata from describing blocks that have not actually been written in the event of an unexpected shutdown;
- write the dbCache's cached keys to leveldb via commitTreaps();
- point cachedKeys and cachedRemove at fresh empty treaps, effectively emptying the dbCache.
dbCache's commitTreaps() is straightforward: it calls leveldb's Put and Delete to apply cachedKeys and cachedRemove in turn, so we will not analyze it here; readers can consult its source. Let's look at dbCache's Snapshot():
//btcd/database/ffldb/dbcache.go
// Snapshot returns a snapshot of the database cache and underlying database at
// a particular point in time.
//
// The snapshot must be released after use by calling Release.
func (c *dbCache) Snapshot() (*dbCacheSnapshot, error) {
dbSnapshot, err := c.ldb.GetSnapshot()
if err != nil {
str := "failed to open transaction"
return nil, convertErr(str, err)
}
// Since the cached keys to be added and removed use an immutable treap,
// a snapshot is simply obtaining the root of the tree under the lock
// which is used to atomically swap the root.
c.cacheLock.RLock()
cacheSnapshot := &dbCacheSnapshot{
dbSnapshot: dbSnapshot,
pendingKeys: c.cachedKeys,
pendingRemove: c.cachedRemove,
}
c.cacheLock.RUnlock()
return cacheSnapshot, nil
}
As you can see, it simply builds a dbCacheSnapshot from a leveldb Snapshot, c.cachedKeys, and c.cachedRemove. When a key is looked up in a dbCacheSnapshot, cachedKeys and cachedRemove are searched before leveldb's Snapshot. The transaction's snapshot field points to exactly such an object.
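A sketch of that lookup order, close to dbCacheSnapshot's Get() in dbcache.go with error handling simplified:

//sketch: dbCacheSnapshot lookup order (simplified)
func (snap *dbCacheSnapshot) getSketch(key []byte) []byte {
    // A key scheduled for removal in the cache no longer exists.
    if snap.pendingRemove.Has(key) {
        return nil
    }
    // A key cached for addition shadows the underlying database.
    if value := snap.pendingKeys.Get(key); value != nil {
        return value
    }
    // Fall back to the point-in-time leveldb snapshot.
    value, err := snap.dbSnapshot.Get(key, nil)
    if err != nil {
        return nil
    }
    return value
}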
treap
Having analyzed these methods, we now understand how the dbCache caches keys, flushes them, and serves reads. The data structure doing the actual caching is treap.Immutable, the core of dbCache. Btcd's treap package provides both an Immutable and a Mutable version, and we now turn to its implementation. Readers unfamiliar with treaps can consult BYVoid's article 《隨機平衡二叉查找樹Treap的分析與應用》 (an analysis and application of the randomized balanced binary search tree, the treap). Briefly, a treap combines a binary search tree with a heap: to stay balanced dynamically, each node of the binary search tree carries a random value by which the nodes are heap-ordered, so the tree simultaneously forms a max-heap or min-heap, which keeps it balanced. Treap lookups take O(log N). For reasons of space we will not walk through all of the treap code; we focus on the Put() methods of Mutable and Immutable to see how the treap is built, how it rotates after an insertion, and how Immutable implements copy-on-write.
Let's first look at the definitions of Immutable and Mutable:
//btcd/database/internal/treap/mutable.go
// Mutable represents a treap data structure which is used to hold ordered
// key/value pairs using a combination of binary search tree and heap semantics.
// It is a self-organizing and randomized data structure that doesn't require
// complex operations to maintain balance. Search, insert, and delete
// operations are all O(log n).
type Mutable struct {
root *treapNode
count int
// totalSize is the best estimate of the total size of all data in
// the treap including the keys, values, and node sizes.
totalSize uint64
}
//btcd/database/internal/treap/immutable.go
// Immutable represents a treap data structure which is used to hold ordered
// key/value pairs using a combination of binary search tree and heap semantics.
// It is a self-organizing and randomized data structure that doesn't require
// complex operations to maintain balance. Search, insert, and delete
// operations are all O(log n). In addition, it provides O(1) snapshots for
// multi-version concurrency control (MVCC).
//
// All operations which result in modifying the treap return a new version of
// the treap with only the modified nodes updated. All unmodified nodes are
// shared with the previous version. This is extremely useful in concurrent
// applications since the caller only has to atomically replace the treap
// pointer with the newly returned version after performing any mutations. All
// readers can simply use their existing pointer as a snapshot since the treap
// it points to is immutable. This effectively provides O(1) snapshot
// capability with efficient memory usage characteristics since the old nodes
// only remain allocated until there are no longer any references to them.
type Immutable struct {
root *treapNode
count int
// totalSize is the best estimate of the total size of all data in
// the treap including the keys, values, and node sizes.
totalSize uint64
}
Immutable and Mutable are defined identically; the difference is that Immutable provides copy-on-write, as we will see in their Put() methods. The root field points to the treap's root node, which is defined as:
//btcd/database/internal/treap/common.go
// treapNode represents a node in the treap.
type treapNode struct {
key []byte
value []byte
priority int
left *treapNode
right *treapNode
}
A treapNode's key and value carry the node's payload; priority is the random value used for heap ordering, also called the node's priority; left and right point to the roots of the left and right subtrees. Let's look at Mutable's Put() first to understand how the treap is built and rotated after an insertion:
//btcd/database/internal/treap/mutable.go
// Put inserts the passed key/value pair.
func (t *Mutable) Put(key, value []byte) {
// Use an empty byte slice for the value when none was provided. This
// ultimately allows key existence to be determined from the value since
// an empty byte slice is distinguishable from nil.
if value == nil {
value = emptySlice
}
// The node is the root of the tree if there isn't already one.
if t.root == nil { (1)
node := newTreapNode(key, value, rand.Int())
t.count = 1
t.totalSize = nodeSize(node)
t.root = node
return
}
// Find the binary tree insertion point and construct a list of parents
// while doing so. When the key matches an entry already in the treap,
// just update its value and return.
var parents parentStack
var compareResult int
for node := t.root; node != nil; {
parents.Push(node)
compareResult = bytes.Compare(key, node.key)
if compareResult < 0 {
node = node.left (2)
continue
}
if compareResult > 0 {
node = node.right (3)
continue
}
// The key already exists, so update its value.
t.totalSize -= uint64(len(node.value))
t.totalSize += uint64(len(value))
node.value = value (4)
return
}
// Link the new node into the binary tree in the correct position.
node := newTreapNode(key, value, rand.Int()) (5)
t.count++
t.totalSize += nodeSize(node)
parent := parents.At(0)
if compareResult < 0 {
parent.left = node (6)
} else {
parent.right = node (7)
}
// Perform any rotations needed to maintain the min-heap.
for parents.Len() > 0 {
// There is nothing left to do when the node's priority is
// greater than or equal to its parent's priority.
parent = parents.Pop()
if node.priority >= parent.priority { (8)
break
}
// Perform a right rotation if the node is on the left side or
// a left rotation if the node is on the right side.
if parent.left == node {
node.right, parent.left = parent, node.right (9)
} else {
node.left, parent.right = parent, node.left (10)
}
t.relinkGrandparent(node, parent, parents.At(0))
}
}
......
// relinkGrandparent relinks the node into the treap after it has been rotated
// by changing the passed grandparent's left or right pointer, depending on
// where the old parent was, to point at the passed node. Otherwise, when there
// is no grandparent, it means the node is now the root of the tree, so update
// it accordingly.
func (t *Mutable) relinkGrandparent(node, parent, grandparent *treapNode) {
// The node is now the root of the tree when there is no grandparent.
if grandparent == nil {
t.root = node (11)
return
}
// Relink the grandparent's left or right pointer based on which side
// the old parent was.
if grandparent.left == parent {
grandparent.left = node (12)
} else {
grandparent.right = node (13)
}
}
The main steps are:
- For an empty tree, the first node added becomes the root directly, as at (1); note that a node's priority is a random integer produced by rand.Int().
- For a non-empty tree, the insertion point is located by key, recording the search path in a parentStack. Starting from the root: if the key to insert is less than the current node's key, descend into the left subtree, as at (2); if greater, descend into the right subtree, as at (3); if it equals the current node's key, simply update that node's value, as at (4).
- If the key is not found, a new node must be inserted, and the last node in parents is its parent; note that parents.At(0) is the last node on the search path. If the new key is less than the parent's key, the new node becomes the parent's left child, as at (6); otherwise it becomes the right child, as at (7).
- Because the new node's priority is random, inserting it may break the min-heap property, so rotations follow, recursing upward until the whole tree is in min-heap order again. At (8), if the new node's priority is greater than or equal to its parent's, no rotation is needed and the tree already satisfies min-heap order; if it is less, a rotation must make the parent a child of the new node. If the new node is its parent's left child, a right rotation is performed, as at (9); if it is the right child, a left rotation, as at (10).
- After the rotation, the old parent has become the new node's child, but the grandparent (the old parent's parent) still points at the old parent; relinkGrandparent() finishes the job. If the grandparent is nil, the old parent was the root, so the new node simply becomes the root of the tree, as at (11); (12) and (13) put the new node in the old parent's place as the grandparent's left or right child.
- Once the new node, the old parent, and the grandparent are relinked, the new node has become the parent and the old parent a child, while the grandparent is unchanged; the new node's priority may still be less than the grandparent's, in which case the new parent, the grandparent, and the grandparent's parent must rotate again. This recurses up toward the root until every node on the search path satisfies min-heap order, completing the rotation and the insertion. A short usage sketch follows.
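A short usage sketch of the Mutable treap; since the package lives under btcd/database/internal/treap it can only be imported from within the database package, so treat this as illustrative:

//example: treap.Mutable in use (illustrative; treap is an internal package)
func mutableDemo() {
    t := treap.NewMutable()
    t.Put([]byte("b"), []byte("2"))
    t.Put([]byte("a"), []byte("1"))
    t.Put([]byte("a"), []byte("1-updated")) // existing key: value updated in place

    fmt.Println(t.Len())                   // 2
    fmt.Printf("%s\n", t.Get([]byte("a"))) // "1-updated"

    // ForEach visits pairs in ascending key order, a property of the
    // underlying binary search tree.
    t.ForEach(func(k, v []byte) bool {
        fmt.Printf("%s=%s\n", k, v)
        return true // keep iterating
    })
}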
Mutable's Put() gives us the full picture of how a treap is built and how insertion and the accompanying subtree rotations work. Immutable's Put() follows roughly the same steps, except that instead of modifying the original nodes or rotating the original tree, it copies every node on the search path; the copies, together with the untouched nodes of the original tree, form a new tree, on which the update or rotation is performed before the new tree is returned. Its implementation:
//btcd/database/internal/treap/immutable.go
// Put inserts the passed key/value pair.
func (t *Immutable) Put(key, value []byte) *Immutable {
// Use an empty byte slice for the value when none was provided. This
// ultimately allows key existence to be determined from the value since
// an empty byte slice is distinguishable from nil.
if value == nil {
value = emptySlice
}
// The node is the root of the tree if there isn't already one.
if t.root == nil {
root := newTreapNode(key, value, rand.Int())
return newImmutable(root, 1, nodeSize(root)) (1)
}
// Find the binary tree insertion point and construct a replaced list of
// parents while doing so. This is done because this is an immutable
// data structure so regardless of where in the treap the new key/value
// pair ends up, all ancestors up to and including the root need to be
// replaced.
//
// When the key matches an entry already in the treap, replace the node
// with a new one that has the new value set and return.
var parents parentStack
var compareResult int
for node := t.root; node != nil; {
// Clone the node and link its parent to it if needed.
nodeCopy := cloneTreapNode(node)
if oldParent := parents.At(0); oldParent != nil {
if oldParent.left == node {
oldParent.left = nodeCopy (2)
} else {
oldParent.right = nodeCopy (3)
}
}
parents.Push(nodeCopy) (4)
// Traverse left or right depending on the result of comparing
// the keys.
compareResult = bytes.Compare(key, node.key)
if compareResult < 0 {
node = node.left
continue
}
if compareResult > 0 {
node = node.right
continue
}
// The key already exists, so update its value.
nodeCopy.value = value (5)
// Return new immutable treap with the replaced node and
// ancestors up to and including the root of the tree.
newRoot := parents.At(parents.Len() - 1) (6)
newTotalSize := t.totalSize - uint64(len(node.value)) + (7)
uint64(len(value))
return newImmutable(newRoot, t.count, newTotalSize) (8)
}
// Link the new node into the binary tree in the correct position.
node := newTreapNode(key, value, rand.Int())
parent := parents.At(0)
if compareResult < 0 {
parent.left = node
} else {
parent.right = node
}
// Perform any rotations needed to maintain the min-heap and replace
// the ancestors up to and including the tree root.
newRoot := parents.At(parents.Len() - 1)
for parents.Len() > 0 {
// There is nothing left to do when the node's priority is
// greater than or equal to its parent's priority.
parent = parents.Pop()
if node.priority >= parent.priority {
break
}
// Perform a right rotation if the node is on the left side or
// a left rotation if the node is on the right side.
if parent.left == node {
node.right, parent.left = parent, node.right
} else {
node.left, parent.right = parent, node.left
}
// Either set the new root of the tree when there is no
// grandparent or relink the grandparent to the node based on
// which side the old parent the node is replacing was on.
grandparent := parents.At(0)
if grandparent == nil {
newRoot = node
} else if grandparent.left == parent {
grandparent.left = node
} else {
grandparent.right = node
}
}
return newImmutable(newRoot, t.count+1, t.totalSize+nodeSize(node)) (9)
}
The main differences from Mutable's Put() are:
- Inserting into an empty tree does not make the new node the root of the original tree; instead, a new treap is created with the new node as its root and returned, as at (1).
- While searching for the insertion point, every node on the search path is copied, as at (2), (3), and (4). If the key is found, the value is updated on the copied node rather than the original, as at (5); a new tree is then created from the copied root and returned, as at (6), (7), and (8).
- Otherwise the key is not in the tree and a new node is created; it is linked into the copied parent, the rotations are performed on the copied tree, and the new tree is returned, as at (9). Note that none of the original tree's nodes are modified: the original and new trees share every node off the search path, as the sketch below demonstrates.
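The effect is easy to demonstrate: after Put() returns a new version, the old Immutable pointer still sees the old contents, which is exactly the O(1) snapshot property the dbCache relies on (illustrative, same internal-package caveat as above):

//example: Immutable's copy-on-write behavior (illustrative)
func immutableDemo() {
    t1 := treap.NewImmutable()
    t1 = t1.Put([]byte("k"), []byte("old"))

    // Put returns a new treap sharing unmodified nodes; t1 itself never changes.
    t2 := t1.Put([]byte("k"), []byte("new"))

    fmt.Printf("%s\n", t1.Get([]byte("k"))) // "old": t1 acts as an O(1) snapshot
    fmt.Printf("%s\n", t2.Get([]byte("k"))) // "new"
}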
By copying the nodes on the search path and returning a new root, Immutable's Put() implements copy-on-write, which in turn underpins the dbCache's MVCC. This completes our tour of how ffldb works, including a detailed look at blockStore, dbCache, and treap, the data structure dbCache builds on; you should now have a complete and clear picture of how a Bitcoin node looks up blocks and stores them on disk. In the next article, we will look at the implementation of the network protocol in btcd and how blocks travel across the P2P network.