RNN Name Generator (3): Implementation

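In the previous two posts we obtained the training data and the character vectors, and went through the principle and code of an RNN cell.
This post continues by showing how to implement an RNN name generator (using LSTM).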

1. Network structure

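First, look at the basic structures commonly used in RNN networks; the figure is from karpathy:

Explanation:

(1) Plain one-to-one. (Strictly speaking, this is not an RNN.)
(2) Sequence output (for example, take an image as input and output a sentence).
(3) Sequence input (for example, take a sentence as input and classify its sentiment).
(4) Sequence input and sequence output (typical example: Machine Translation).
(5) Synchronized sequence input and output (for example, video classification, where every frame of the video gets a label).

Our name generator uses the last one: synchronized sequence input and output.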
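To make the synchronized layout concrete, here is a minimal sketch of how one name could be turned into per-step inputs and targets. It assumes, as is common for character-level models and consistent with the sampling code later in this post, that the target at each step is simply the next character. The helper `make_pairs` and the example name are hypothetical; they are not taken from the original code.

```python
def make_pairs(name):
    """Return (inputs, targets) where targets[t] is the character after inputs[t]."""
    # Assumption: no start/end markers are used; the real data may add them.
    return list(name[:-1]), list(name[1:])


# Hypothetical three-character name, used only for illustration.
inputs, targets = make_pairs("陳子軒")
print(inputs)
print(targets)
```

Each time step consumes one input character and emits one prediction, which is what makes the input and output sequences synchronized.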

2. Final LSTM implementation

The previous post introduced a basic LSTM implementation.

Next, here is our final implementation:

with self.graph.as_default():
    # Parameters:
    # Embedding layer
    with tf.name_scope("embedding"):
        self.Vector = tf.Variable(initial_value=self.W_value, name="Vector")
    # input to all gates
    U = tf.Variable(tf.truncated_normal([self.embedding_dim, self.hidden_dim * 4], -0.1, 0.1), name='x')
    # memory of all gates
    W = tf.Variable(tf.truncated_normal([self.hidden_dim, self.hidden_dim * 4], -0.1, 0.1), name='m')
    # biases all gates
    biases = tf.Variable(tf.zeros([1, self.hidden_dim * 4]))
    # Variables saving state across unrollings.
    saved_output = tf.Variable(tf.zeros([self.batch_size, self.hidden_dim]), trainable=False)
    saved_state = tf.Variable(tf.zeros([self.batch_size, self.hidden_dim]), trainable=False)
    # Classifier weights and biases.
    w = tf.Variable(tf.truncated_normal([self.hidden_dim, self.vocabulary_size], -0.1, 0.1))
    b = tf.Variable(tf.zeros([self.vocabulary_size]))
    self.keep_prob = tf.placeholder(tf.float32, name="kb")

    # Definition of the cell computation.
    def lstm_cell(i, o, state):
        i = tf.nn.dropout(x=i, keep_prob=self.keep_prob)
        mult = tf.matmul(i, U) + tf.matmul(o, W) + biases
        input_gate = tf.sigmoid(mult[:, :self.hidden_dim])
        forget_gate = tf.sigmoid(mult[:, self.hidden_dim:self.hidden_dim * 2])
        update = mult[:, self.hidden_dim * 2:self.hidden_dim * 3]
        state = forget_gate * state + input_gate * tf.tanh(update)
        output_gate = tf.sigmoid(mult[:, self.hidden_dim * 3:])
        output = tf.nn.dropout(output_gate * tf.tanh(state), self.keep_prob)
        return output, state

The code above stacks iU, fU, cU and oU into U, and iW, fW, cW and oW into W.

With that, the four matrix multiplications:

tf.matmul(i, iU) + tf.matmul(o, iW) + ib
tf.matmul(i, fU) + tf.matmul(o, fW) + fb
tf.matmul(i, cU) + tf.matmul(o, cW) + cb
tf.matmul(i, oU) + tf.matmul(o, oW) + ob

can be merged into the single step below:

mult = tf.matmul(i, U) + tf.matmul(o, W) + biases
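The equivalence can be checked numerically. The sketch below uses NumPy rather than TensorFlow and small made-up dimensions; the per-gate matrices are random placeholders, assumed to be stacked in the order input, forget, cell (update), output.

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, embedding_dim, hidden_dim = 2, 3, 4  # toy sizes, not the post's

# Placeholder per-gate parameters in the order: input, forget, cell, output.
Us = [rng.normal(size=(embedding_dim, hidden_dim)) for _ in range(4)]
Ws = [rng.normal(size=(hidden_dim, hidden_dim)) for _ in range(4)]
bs = [rng.normal(size=(1, hidden_dim)) for _ in range(4)]

i = rng.normal(size=(batch_size, embedding_dim))  # current input
o = rng.normal(size=(batch_size, hidden_dim))     # previous output

# Four separate products, one per gate.
separate = [i @ U_g + o @ W_g + b_g for U_g, W_g, b_g in zip(Us, Ws, bs)]

# Stack along the column axis, then do a single product.
U = np.concatenate(Us, axis=1)
W = np.concatenate(Ws, axis=1)
biases = np.concatenate(bs, axis=1)
mult = i @ U + o @ W + biases

# Column slice k of mult should equal the separate product for gate k.
stacked_matches = all(
    np.allclose(mult[:, k * hidden_dim:(k + 1) * hidden_dim], separate[k])
    for k in range(4)
)
print(mult.shape, stacked_matches)
```

Because each gate owns one contiguous block of `hidden_dim` columns, the slices taken inside `lstm_cell` must follow the same order as the stacking.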

3. mini-batch

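Without mini-batches, training one sample at a time is very slow.

To speed up training, RNNs are usually trained with mini-batches as well.

This raises a problem: what if the training sentences differ in length? The usual approach is to fix the sequence length per batch, zero-pad sequences that are too short, and split sequences that are too long into several pieces.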
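The pad-or-split rule can be sketched in a few lines of plain Python. The helper `pad_or_split` is hypothetical and only illustrates the idea; it assumes index 0 is reserved for padding.

```python
def pad_or_split(seq, length, pad_value=0):
    """Cut seq into chunks of `length` and pad the last chunk with pad_value."""
    chunks = [seq[k:k + length] for k in range(0, len(seq), length)] or [[]]
    return [chunk + [pad_value] * (length - len(chunk)) for chunk in chunks]


short = pad_or_split([7, 8, 9], 5)              # shorter than 5: padded
long = pad_or_split([1, 2, 3, 4, 5, 6, 7], 5)   # longer than 5: split, then padded
print(short)
print(long)
```

Every returned chunk has exactly `length` entries, so the chunks can be stacked into a rectangular batch.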

The code below generates the batch data:

class BatchGenerator(object):
    """Batch generator"""
    def __init__(self, X_value, Y_value, batch_size,
                 num_unrollings, vocabulary_size, char_to_index):
        self.X_value = X_value
        self.Y_value = Y_value
        self.data_len = len(X_value)
        self.batch_size = batch_size
        self.num_unrollings = num_unrollings
        self.vocabulary_size = vocabulary_size
        self.char_to_index = char_to_index
        self.start = 0
        self.end = batch_size - 1

        print("data length: %d" % len(X_value))

    def next(self):
        X_all = self.X_value[[i % self.data_len for i in range(self.start, self.end + 1)]]
        Y_all = self.Y_value[[i % self.data_len for i in range(self.start, self.end + 1)]]
        # Zero-pad short sequences; full-length ones are kept unchanged instead of being dropped.
        X_all = [list(x) + [0] * (self.num_unrollings - len(x)) for x in X_all]
        Y_all = [list(y) + [0] * (self.num_unrollings - len(y)) for y in Y_all]
        X_batchs = list()
        Y_batchs = list()
        for step in range(self.num_unrollings):
            X_batch = list()
            Y_batch = np.zeros(shape=(self.batch_size, self.vocabulary_size), dtype=np.float64)
            for b in range(self.batch_size):
                X_batch.append(X_all[b][step])
                Y_batch[b, Y_all[b][step]] = 1.0
            X_batchs.append(np.array(X_batch))
            Y_batchs.append(Y_batch)
        self.start = self.end + 1
        self.end += self.batch_size
        return X_batchs, Y_batchs

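Because character vectors are used, the X_batch data holds character indices (each index is used to look up the char embedding), while the Y_batch data holds one-hot vectors.

So X_batchs has size (5, 50), i.e. num_unrollings×batch_size;

Y_batchs has size (5, 50, 5273), i.e. num_unrollings×batch_size×num_chars.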
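The shapes quoted above can be reproduced with a small NumPy sketch. It uses random indices in place of real names and assumes every sequence is already padded to num_unrollings; it is a simplified stand-in for BatchGenerator, not a replacement for it.

```python
import numpy as np

num_unrollings, batch_size, vocabulary_size = 5, 50, 5273

rng = np.random.default_rng(0)
# Toy data: one row of character indices per sequence in the batch.
X_all = rng.integers(0, vocabulary_size, size=(batch_size, num_unrollings))
Y_all = rng.integers(0, vocabulary_size, size=(batch_size, num_unrollings))

X_batchs, Y_batchs = [], []
for step in range(num_unrollings):
    # Inputs stay as indices: one index per sequence for this time step.
    X_batchs.append(X_all[:, step])
    # Labels become one-hot rows.
    Y_batch = np.zeros((batch_size, vocabulary_size))
    Y_batch[np.arange(batch_size), Y_all[:, step]] = 1.0
    Y_batchs.append(Y_batch)

x_shape = np.array(X_batchs).shape
y_shape = np.array(Y_batchs).shape
print(x_shape, y_shape)
```

The outer list is indexed by time step, so the data is laid out time-major: each list element feeds one unrolled LSTM step.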

4. Loss function and model evaluation

Loss function: cross-entropy computed from the softmax output and the labels

logits = tf.nn.xw_plus_b(tf.concat(0, outputs), w, b)
self.loss = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(
        logits, tf.concat(0, self.train_labels)))

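Evaluation metric: perplexity

Perplexity measures how well a probability model predicts a sample; the lower the value, the better the model (it is commonly used for language models).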

def logprob(predictions, labels):
    """
    Used when computing perplexity.
    Log-probability of the true labels in a predicted batch.
    """
    predictions[predictions < 1e-10] = 1e-10
    return np.sum(np.multiply(labels, -np.log(predictions))) / labels.shape[0]

print('Minibatch perplexity: %.2f' % float(
    np.exp(logprob(predictions, np.concatenate(Y_batchs)))))
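As a sanity check on the perplexity computation above, two extreme cases are easy to work out: a model that predicts uniformly over V characters should score a perplexity of V, and a model that always puts probability 1 on the true label should score 1. The sketch below re-declares `logprob` with `np.clip` so the input array is not modified in place; the 4-character vocabulary is made up for the example.

```python
import numpy as np


def logprob(predictions, labels):
    """Average negative log-probability of the true labels."""
    predictions = np.clip(predictions, 1e-10, None)  # returns a clipped copy
    return np.sum(labels * -np.log(predictions)) / labels.shape[0]


vocab = 4
labels = np.eye(vocab)                          # four one-hot rows, one per class
uniform = np.full((vocab, vocab), 1.0 / vocab)  # model that knows nothing
perfect = labels.copy()                         # model that is always right

uniform_ppl = float(np.exp(logprob(uniform, labels)))
perfect_ppl = float(np.exp(logprob(perfect, labels)))
print(uniform_ppl, perfect_ppl)
```

Read this way, perplexity is roughly the number of equally likely characters the model is still choosing between at each step.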

5. Generating names (sample)

def sample_distribution(distribution):
    """Sample one element from a distribution assumed to be an array of normalized
    probabilities.
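    Draws an index according to the given probability distribution. This scheme is for discrete distributions; it is the discrete analogue of sampling through the CDF of a continuous distribution.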
    """
    r = random.uniform(0, 1)
    s = 0
    for i in range(len(distribution)):
        s += distribution[i]
        if s >= r:
            return i
    return len(distribution) - 1


def sample(prediction, vocabulary_size):
    """Turn a (column) prediction into 1-hot encoded samples.
    Converts the index drawn by sample_distribution into a 1-hot sample.
    """
    p = np.zeros(shape=[1, vocabulary_size], dtype=np.float64)
    p[0, sample_distribution(prediction[0])] = 1.0
    return p

def sample_name(self, first_name, ckpt_file=MODEL_PRE):
    """Sample names from an existing trained model."""
    with tf.Session(graph=self.graph) as session:
        saver = tf.train.Saver()
        saver.restore(session, ckpt_file)
        for _ in range(NAME_NUM):
            name = first_name
            sample_input = self.char_to_index[first_name[-1]]
            self.reset_sample_state.run()
            for _ in range(NAME_LEN-1):
                prediction = self.sample_prediction.eval({self.sample_input: [sample_input], self.keep_prob: 1.0})
                one_hot = sample(prediction, self.vocabulary_size)
                sample_input = self.char_to_index[prob_to_char(one_hot, self.index_to_char)[0]]
                name += prob_to_char(one_hot, self.index_to_char)[0]
            print(name)

Names are generated from the given surname and the name length.
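The sampling step in sample_distribution walks the cumulative sum until it reaches the random draw. The same rule can be written with the standard library as a binary search over the cumulative distribution. This is an alternative sketch, not the post's code; `sample_index` and its optional `r` argument (added so the draw can be fixed for testing) are hypothetical.

```python
import bisect
import itertools
import random


def sample_index(distribution, r=None):
    """Return the first index whose cumulative probability reaches r."""
    if r is None:
        r = random.uniform(0, 1)
    cdf = list(itertools.accumulate(distribution))
    # Clamp in case rounding leaves the last cumulative value slightly below r.
    return min(bisect.bisect_left(cdf, r), len(distribution) - 1)


dist = [0.25, 0.5, 0.25]
picks = [sample_index(dist, r) for r in (0.1, 0.5, 0.9)]
print(picks)
```

Because the draw is random, running the generator repeatedly with the same surname can yield different names, with more probable characters chosen more often.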

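6. Results

Full code

To train the model, run train_all() in the main function.

To generate names, run namer_lstm_c2v() in the main function.

Generated names for boys with the surname 陳 (Chen):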
