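On a whim I wanted to look into loop closure detection, and then discovered I had no idea how the bag-of-words model is even produced (a sad story indeed). So I read through its code carefully and also asked my roommate, who works on natural language processing. My roommate said the method is already quite old (cue the tears; all I can say is that the classics are classics for a reason), but at least I got a few things straight. Everything below is my own understanding, so if you spot something wrong, please point it out.

Generating the bag-of-words model

In short, to build a vocabulary (dictionary) with DBoW3 you first extract features from each image and compute their descriptors. ORB-SLAM uses ORB descriptors, which are based on BRIEF, so each image can then be represented by its descriptors. Training a vocabulary then comes down to the following calls: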

DBoW3::Vocabulary vocab;
vocab.create(descriptors);
vocab.save("vocabulary.yml.gz");

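In the code above, descriptors holds the descriptors extracted from multiple images; its type can be vector<cv::Mat> or vector<vector<cv::Mat>>. create is the function that actually builds the bag-of-words model, so let's look at it more closely. The source overloads create several times, but the core is void Vocabulary::create (const std::vector<std::vector<cv::Mat> > &training_features ). Its main steps are listed below, and the following sections go through them one by one.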

    // create root
    m_nodes.push_back ( Node ( 0 ) ); // root

    // create the tree
    HKmeansStep ( 0, features, 1 );

    // create the words
    createWords();

    // and set the weight of each node of the tree
    setNodeWeights ( training_features );

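The form of a K-ary tree node

To speed up lookup, DBoW stores and searches descriptors in a K-ary tree. For a tree structure the most important things to keep are each node's parent and children; the node's id can be stored as well. In DBoW3 each node therefore looks like the code below, and the whole K-ary tree is shown in the figure: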

 /// Tree node
  struct Node 
  {
    /// Node id
    NodeId id;   //unsigned int
    /// Weight if the node is a word
    WordValue weight;  //double
    /// Children 
    std::vector<NodeId> children;
    /// Parent node (undefined in case of root)
    NodeId parent;
    /// Node descriptor
    cv::Mat descriptor;

    /// Word id if the node is a word
    WordId word_id;

    /**
     * Empty constructor
     */
    Node(): id(0), weight(0), parent(0), word_id(0){}
    
    /**
     * Constructor
     * @param _id node id
     */
    Node(NodeId _id): id(_id), weight(0), parent(0), word_id(0){}

    /**
     * Returns whether the node is a leaf node
     * @return true iff the node is a leaf
     */
    inline bool isLeaf() const { return children.empty(); }
  };

[Figure: K-ary tree]

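With branching factor $k$ and depth $L$, a full tree has $\frac{k^{L+1}-1}{k-1}$ nodes in total (root included), and node ids start from 0 and run up to that count. The leaf nodes at the bottom store the descriptors that serve as words, while the nodes in every level above them are cluster centers. Descending through these centers makes finding a word far more efficient than comparing against every word. word_id is the id of a word; it starts from 0, there are at most $k^L$ of them, and only leaf nodes have a word id.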
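As a quick check of these counts, here is a minimal sketch that sums the levels of a full tree directly. The function names are my own and are not part of DBoW3; a real vocabulary can have fewer nodes, because small clusters are not split further.

```cpp
#include <cstdint>

// Number of nodes in a full k-ary tree of depth L, root included:
// 1 + k + k^2 + ... + k^L, which equals (k^(L+1) - 1) / (k - 1) for k > 1.
// (Hypothetical helper, not a DBoW3 function.)
std::uint64_t fullTreeNodeCount(std::uint64_t k, unsigned L) {
    std::uint64_t total = 0;
    std::uint64_t levelSize = 1;  // k^level
    for (unsigned level = 0; level <= L; ++level) {
        total += levelSize;
        levelSize *= k;
    }
    return total;
}

// Number of leaves (words) in the same full tree: k^L.
std::uint64_t fullTreeWordCount(std::uint64_t k, unsigned L) {
    std::uint64_t words = 1;
    for (unsigned level = 0; level < L; ++level) words *= k;
    return words;
}
```

For example, k = 10 and L = 5 give 111111 nodes and 100000 words.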
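Clustering mainly uses the KMeans++ algorithm, which compared with KMeans adds a step that chooses the initial cluster centers by itself. Both algorithms are described below.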

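Kmeans++

This algorithm is mainly about picking suitable initial cluster centers. In plain KMeans the initial centers are chosen at random, which may not reflect the structure of the data well, so KMeans++ is used to obtain better ones. Its main steps are:

1. Pick one data point uniformly at random as the first center.
2. For every data point x, compute D(x), the distance from x to the nearest center chosen so far.
3. The larger D(x) is, the more likely x is to be chosen as a center; use roulette-wheel selection to pick the next center.
4. Repeat steps 2 and 3 until k centers have been chosen.
5. This gives the initial cluster centers.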
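The steps above can be sketched roughly as follows. This is a toy version under my own assumptions: descriptors are 32-bit integers compared by Hamming distance, the names Descriptor, hamming and seedKMeansPP are made up for illustration, and the selection probability is proportional to D(x) as in the steps (the textbook KMeans++ uses D(x) squared). It is not the DBoW3 implementation.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

using Descriptor = std::uint32_t;  // toy 32-bit binary descriptor (assumption)

// Hamming distance: number of differing bits.
double hamming(Descriptor a, Descriptor b) {
    return static_cast<double>(std::bitset<32>(a ^ b).count());
}

// Pick up to k initial centers from data, following steps 1-4.
std::vector<Descriptor> seedKMeansPP(const std::vector<Descriptor>& data,
                                     std::size_t k, std::mt19937& rng) {
    std::vector<Descriptor> centers;
    if (data.empty() || k == 0) return centers;

    // Step 1: first center uniformly at random.
    std::uniform_int_distribution<std::size_t> pick(0, data.size() - 1);
    centers.push_back(data[pick(rng)]);

    // D(x): distance from each point to its nearest center so far.
    std::vector<double> minDist(data.size(), std::numeric_limits<double>::max());

    while (centers.size() < k) {
        // Step 2: update D(x) using the most recently added center.
        double sum = 0.0;
        for (std::size_t i = 0; i < data.size(); ++i) {
            const double d = hamming(data[i], centers.back());
            if (d < minDist[i]) minDist[i] = d;
            sum += minDist[i];
        }
        if (sum == 0.0) break;  // every point coincides with some center

        // Step 3: roulette wheel, probability proportional to D(x).
        std::discrete_distribution<std::size_t> wheel(minDist.begin(), minDist.end());
        centers.push_back(data[wheel(rng)]);
    }
    return centers;
}
```

Points that already coincide with a center have D(x) = 0 and can never be drawn again, which is why the sketch stops early when all distances are zero.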

The code that initializes the cluster centers is in Vocabulary::initiateClustersKMpp. I think its core is the roulette-wheel step that picks the next center:

double cut_d;
do
{
    cut_d = ( double ( rand() ) / double ( RAND_MAX ) ) * dist_sum;   //randomly choose one value between the sum of the distance
}
while ( cut_d == 0.0 );

double d_up_now = 0;
for ( dit = min_dists.begin(); dit != min_dists.end(); ++dit )
{
    d_up_now += *dit;
    if ( d_up_now >= cut_d ) break;   //choose the value 
}

if ( dit == min_dists.end() )  //choose the center index
    ifeature = pfeatures.size()-1;
else
    ifeature = dit - min_dists.begin();

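The idea of this snippet is to draw a random value between 0 and the sum of all distances. A point with a larger distance takes up a larger share of that sum, so the random value is more likely to land in its interval. In other words, points far from the current centers are more likely to be picked. If this is still unclear, see "K-means and K-means++". The distance computation in this code uses bit operations; for details see Bit Twiddling Hacks, a very good site on bit manipulation.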
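To make the bit-operation part concrete, here is a small sketch of a Hamming distance built on the parallel bit count trick described on Bit Twiddling Hacks. The 256-bit layout as eight 32-bit words and the function names are assumptions for illustration; DBoW3's own routine may differ in its details.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Parallel bit count for one 32-bit word: sums bits in groups of 2,
// then 4, then 8, and finally adds the four byte counts together.
inline std::uint32_t popcount32(std::uint32_t v) {
    v = v - ((v >> 1) & 0x55555555u);
    v = (v & 0x33333333u) + ((v >> 2) & 0x33333333u);
    return (((v + (v >> 4)) & 0x0F0F0F0Fu) * 0x01010101u) >> 24;
}

// A 256-bit descriptor stored as eight 32-bit words (hypothetical layout).
using Descriptor256 = std::array<std::uint32_t, 8>;

// Hamming distance = number of set bits in the XOR of the two descriptors.
inline unsigned hammingDistance(const Descriptor256& a, const Descriptor256& b) {
    unsigned dist = 0;
    for (std::size_t i = 0; i < a.size(); ++i) dist += popcount32(a[i] ^ b[i]);
    return dist;
}
```

The XOR leaves a 1 wherever the two descriptors disagree, so counting set bits gives the distance without looping over individual bits.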

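KMeans

The main steps of KMeans are:

1. Pick k samples as cluster centers $c_1, c_2, ..., c_k$ (this step has already been done by KMeans++);
2. For each sample, compute its distance to every center and assign it to the nearest one;
3. Recompute the center of each cluster;
4. If no sample changes its cluster, the algorithm has converged and can stop; otherwise go back to step 2.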
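A minimal sketch of steps 2 to 4 on toy binary descriptors is given below. I am assuming 32-bit descriptors, Hamming distance, and a bitwise majority vote as the "mean" of a cluster (ties set the bit); all names are made up for illustration and this is not the code in HKmeansStep.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

using Descriptor = std::uint32_t;  // toy 32-bit binary descriptor (assumption)

inline unsigned hamming(Descriptor a, Descriptor b) {
    return static_cast<unsigned>(std::bitset<32>(a ^ b).count());
}

// Step 2: index of the nearest center for every sample.
std::vector<std::size_t> assignToCenters(const std::vector<Descriptor>& data,
                                         const std::vector<Descriptor>& centers) {
    std::vector<std::size_t> label(data.size(), 0);
    for (std::size_t i = 0; i < data.size(); ++i)
        for (std::size_t c = 1; c < centers.size(); ++c)
            if (hamming(data[i], centers[c]) < hamming(data[i], centers[label[i]]))
                label[i] = c;
    return label;
}

// Step 3: new center of a non-empty cluster by bitwise majority vote.
Descriptor majorityMean(const std::vector<Descriptor>& members) {
    Descriptor mean = 0;
    for (unsigned bit = 0; bit < 32; ++bit) {
        std::size_t ones = 0;
        for (Descriptor d : members) ones += (d >> bit) & 1u;
        if (2 * ones >= members.size()) mean |= (Descriptor{1} << bit);
    }
    return mean;
}

// Steps 2-4: iterate until no label changes (or maxIter is reached).
std::vector<std::size_t> kmeans(const std::vector<Descriptor>& data,
                                std::vector<Descriptor>& centers,
                                int maxIter = 100) {
    std::vector<std::size_t> label;
    for (int it = 0; it < maxIter; ++it) {
        std::vector<std::size_t> next = assignToCenters(data, centers);
        if (next == label) break;  // step 4: converged
        label = next;
        for (std::size_t c = 0; c < centers.size(); ++c) {
            std::vector<Descriptor> members;
            for (std::size_t i = 0; i < data.size(); ++i)
                if (label[i] == c) members.push_back(data[i]);
            if (!members.empty()) centers[c] = majorityMean(members);
        }
    }
    return label;
}
```

The centers vector is updated in place, so after the call it holds the final cluster centers and the return value holds each sample's cluster index.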

The main code of this algorithm is in Vocabulary::HKmeansStep. See the source for the details; I won't go through it here.

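Building the tree

For example, once the k cluster centers of level 1 and the set of features belonging to each center are available, they have to be turned into tree nodes. Each node is created as follows: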

    // create nodes
    for ( unsigned int i = 0; i < clusters.size(); ++i )
    {
        NodeId id = m_nodes.size();
        m_nodes.push_back ( Node ( id ) );   //m_nodes represents the tree,
        m_nodes.back().descriptor = clusters[i];  //represent the cluster
        m_nodes.back().parent = parent_id; 
        m_nodes[parent_id].children.push_back ( id );   //save the children's information
    }

If the depth has not yet reached L, each node is clustered again:

    // go on with the next level
    if ( current_level < m_L )
    {
        // iterate again with the resulting clusters
        const std::vector<NodeId> &children_ids = m_nodes[parent_id].children;
        for ( unsigned int i = 0; i < clusters.size(); ++i )
        {
            NodeId id = children_ids[i];

            std::vector<cv::Mat> child_features;
            child_features.reserve ( groups[i].size() );
            // groups[i] holds the indices of the descriptors assigned to cluster i
            std::vector<unsigned int>::const_iterator vit;
            for ( vit = groups[i].begin(); vit != groups[i].end(); ++vit )
            {
                child_features.push_back ( descriptors[*vit] );
            }

            if ( child_features.size() > 1 )
            {
                HKmeansStep ( id, child_features, current_level + 1 );
            }
        }
    }

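Creating the words

The function that creates the words is shown below. Its main purpose is to assign a word_id to each leaf node and to register that leaf, together with its descriptor, as a word.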

void Vocabulary::createWords()
{
    m_words.resize ( 0 );

    if ( !m_nodes.empty() )
    {
        m_words.reserve ( ( int ) pow ( ( double ) m_k, ( double ) m_L ) );


        auto  nit = m_nodes.begin(); // ignore root
        for ( ++nit; nit != m_nodes.end(); ++nit )
        {
            if ( nit->isLeaf() )
            {
                nit->word_id = m_words.size();
                m_words.push_back ( & ( *nit ) );
            }
        }
    }
}
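To see how word ids end up on the leaves and how a descriptor could later be mapped to one, here is a toy sketch. ToyNode, assignWordIds and lookupWord are hypothetical stand-ins with 32-bit descriptors and Hamming distance, not DBoW3's Node or transform; it also assumes the node vector is non-empty with the root at index 0.

```cpp
#include <bitset>
#include <cstddef>
#include <cstdint>
#include <vector>

struct ToyNode {                        // simplified stand-in for Node
    std::uint32_t descriptor = 0;       // cluster center (toy 32-bit descriptor)
    std::vector<std::size_t> children;  // indices into the node vector
    std::size_t wordId = 0;             // only meaningful for leaves
    bool isLeaf() const { return children.empty(); }
};

// Number the leaves in node order, skipping the root at index 0.
std::size_t assignWordIds(std::vector<ToyNode>& nodes) {
    std::size_t nextId = 0;
    for (std::size_t i = 1; i < nodes.size(); ++i)
        if (nodes[i].isLeaf()) nodes[i].wordId = nextId++;
    return nextId;  // total number of words
}

// Descend from the root, always following the child whose descriptor
// has the smallest Hamming distance to the query.
std::size_t lookupWord(const std::vector<ToyNode>& nodes, std::uint32_t query) {
    std::size_t current = 0;
    while (!nodes[current].isLeaf()) {
        std::size_t best = nodes[current].children.front();
        for (std::size_t child : nodes[current].children) {
            const auto d = std::bitset<32>(query ^ nodes[child].descriptor).count();
            const auto dBest = std::bitset<32>(query ^ nodes[best].descriptor).count();
            if (d < dBest) best = child;
        }
        current = best;
    }
    return nodes[current].wordId;
}
```

Each lookup compares the query with only k children per level instead of with every word, which is where the speed-up of the tree comes from.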

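Setting the node weights

In text processing, words are not all equally important. Common words such as "the" and "is" occur very often but do little to tell sentences apart, so they carry little importance. Nouns such as "bee" or "salt" do not appear in every sentence, so they are more distinctive and matter more. A common approach in document retrieval is therefore TF-IDF (Term Frequency-Inverse Document Frequency). TF says that a word occurring often within one image is distinctive for that image, while IDF says that the less frequently a word occurs in the dictionary, the more discriminative it is when classifying images. I never understood what this was for until I asked my roommate: the weight can be added as one extra dimension on top of the original features to express importance, and this extra dimension makes matching more accurate. An image can therefore be represented as:
$I = \{(w_1,\eta_1), ..., (w_N,\eta_N)\} = \mathbf{v}_I$
where $w_i$ is the TF-IDF weight and $\eta_i$ is the descriptor extracted from the image. In DBoW3 the weights are set by the following code: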

void Vocabulary::setNodeWeights
( const std::vector<std::vector<cv::Mat> > &training_features )
{
    const unsigned int NWords = m_words.size();
    const unsigned int NDocs = training_features.size();

    if ( m_weighting == TF || m_weighting == BINARY )
    {
        // idf part must be 1 always
        for ( unsigned int i = 0; i < NWords; i++ )
            m_words[i]->weight = 1;
    }
    else if ( m_weighting == IDF || m_weighting == TF_IDF )
    {
        // IDF and TF-IDF: we calculate the idf part now

        // Note: this actually calculates the idf part of the tf-idf score.
        // The complete tf-idf score is calculated in ::transform

        std::vector<unsigned int> Ni ( NWords, 0 );
        std::vector<bool> counted ( NWords, false );


        for ( auto mit = training_features.begin(); mit != training_features.end(); ++mit )
        {
            fill ( counted.begin(), counted.end(), false );

            for ( auto fit = mit->begin(); fit < mit->end(); ++fit )
            {
                WordId word_id;
                transform ( *fit, word_id );

                if ( !counted[word_id] )
                {
                    Ni[word_id]++;
                    counted[word_id] = true;
                }
            }
        }

        // set ln(N/Ni)
        for ( unsigned int i = 0; i < NWords; i++ )
        {
            if ( Ni[i] > 0 )
            {
                m_words[i]->weight = log ( ( double ) NDocs / ( double ) Ni[i] );
            }// else // This cannot occur if using kmeans++
        }

    }

}
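The IDF part of the function above boils down to counting, for each word, how many training images contain it at least once, then taking ln(N/Ni). A stripped-down sketch, assuming each image has already been converted to a list of word ids (the function name and input layout are my own, not DBoW3's):

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// documents[d] lists the word ids seen in training image d (repeats allowed).
// Returns ln(N / N_i) per word; words that never occur keep weight 0.
std::vector<double> idfWeights(const std::vector<std::vector<std::size_t>>& documents,
                               std::size_t numWords) {
    std::vector<std::size_t> docsWithWord(numWords, 0);
    for (const auto& doc : documents) {
        std::vector<bool> counted(numWords, false);  // count each word once per image
        for (std::size_t w : doc) {
            if (w < numWords && !counted[w]) {
                counted[w] = true;
                ++docsWithWord[w];
            }
        }
    }

    std::vector<double> weight(numWords, 0.0);
    const double n = static_cast<double>(documents.size());
    for (std::size_t w = 0; w < numWords; ++w)
        if (docsWithWord[w] > 0) weight[w] = std::log(n / docsWithWord[w]);
    return weight;
}
```

A word that appears in every image gets ln(1) = 0, so it contributes nothing to the score, which matches the intuition about very common words.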

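At this point the vocabulary is complete: the words, with their descriptors and weights, are accessible through m_words, while m_nodes stores every node of the tree.

Saving the vocabulary

Saving is done in void Vocabulary::save ( cv::FileStorage &f,const std::string &name ) const; I won't go into the details here.

References

DBoW3 source code
視覺SLAM十四講 (14 Lectures on Visual SLAM)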
