End-to-End OCR with TensorFlow: Recognizing Second-Generation ID Card Numbers

I have recently been looking into OCR. The end goal is to recognize all the Chinese characters and digits on an ID card, but this post sets a smaller goal first: recognizing the fixed-length, 18-character ID number. The same approach can be reused for fixed-length CAPTCHA recognition.
The approach mainly follows Xlvector's blog and implements end-to-end OCR with a CNN. The two deep-learning-based OCR methods described in that post are quoted below:

1. Treat OCR as a multi-label learning problem. A CAPTCHA made of 4 digits is then an image recognition problem with 4 labels (here the labels are ordered), solved with a CNN.

2. Treat OCR as a speech recognition problem. Speech recognition converts continuous audio into text, and CAPTCHA recognition converts a continuous image into text, solved with CNN+LSTM+CTC.

Method 1 mainly handles images with fixed-length labels, while method 2 handles images with variable-length labels. This post implements method 1 to recognize ID numbers made of exactly 18 digit characters.

Environment

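This post is built on the TensorFlow framework and requires a TensorFlow environment. The code as listed targets Python 2 and the TensorFlow 1.x API (tf.Session, tf.train.AdamOptimizer). Anaconda is recommended for Python package and environment management.
freetype-py is used to generate the training images on the fly, and it can later be extended to generate training images of Chinese characters. Installing it with pip is recommended: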

  pip install freetype-py

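The post also depends on common libraries such as numpy and OpenCV. The cv2 module is provided by the opencv-python package on PyPI:

  pip install numpy opencv-python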

Background knowledge

This post does not explain how a CNN (convolutional neural network) works in detail. Readers unfamiliar with CNNs may want to read the Jizhi blog post "Convolution: How to Become a Very Capable Neural Network", which is well written.
The idea is easy to follow: an image of 18 ordered digits is treated as a multi-label learning problem. The label length can be changed freely; as long as it is fixed, this training method applies. In practice many cases require variable-length labels, and those call for method 2 (CNN+LSTM+CTC).
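
To make the fixed-length, multi-label idea concrete, here is a minimal sketch in plain Python of how a flat vector of 18 * 10 scores could be turned back into an 18-digit string: split it into one group of 10 per position and take the largest score in each group. The names NUM_POSITIONS, CHARSET, decode_scores and demo_scores are illustrative and do not come from the original code.

```python
# Sketch: decode a flat score vector into a fixed-length digit string.
# NUM_POSITIONS, CHARSET and both function names are illustrative assumptions.
NUM_POSITIONS = 18
CHARSET = "0123456789"


def decode_scores(scores):
    """Split `scores` into NUM_POSITIONS groups and take the argmax of each."""
    n = len(CHARSET)
    if len(scores) != NUM_POSITIONS * n:
        raise ValueError("expected %d scores, got %d" % (NUM_POSITIONS * n, len(scores)))
    digits = []
    for pos in range(NUM_POSITIONS):
        group = list(scores[pos * n:(pos + 1) * n])
        best = group.index(max(group))  # index of the highest score in this group
        digits.append(CHARSET[best])
    return "".join(digits)


def demo_scores(text):
    """Build a toy score vector whose largest entry per group spells `text`."""
    scores = []
    for ch in text:
        scores.extend(0.9 if c == ch else 0.1 for c in CHARSET)
    return scores
```

Because every position always yields exactly one character, this decoding only fits labels of a known, fixed length, which is why variable-length text needs a different method.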

Main text

Generating the training dataset

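The first step is to generate the training images, relying mainly on the freetype-py library to render images of digits or Chinese characters. One thing to watch is the image size: after several attempts, this post settled on 32 x 256 images. If the image is too large, training may fail to converge.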

A generated sample image looks like this:

[Image: sample of a generated ID number image]

The gen_image() method returns:
image_data: the image pixel data, shape (32, 256)
label: the image label, an 18-digit string such as 477081933151463759
vec: the label converted to a vector of shape (180,), marking the column of each digit; total length 18 * 10
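
The label vector described above can be illustrated with a small round trip. This is a simplified sketch in plain Python lists rather than the numpy arrays used in the listing that follows; text_to_vec and vec_to_text are illustrative names that mirror what char2vec and vec2text do.

```python
# Sketch: one-hot label encoding for a fixed-length digit string.
# Function names are illustrative; the listing below uses numpy arrays instead.
CHARSET = "0123456789"


def text_to_vec(text):
    """Return a flat list with one block of len(CHARSET) entries per character."""
    n = len(CHARSET)
    vec = [0.0] * (len(text) * n)
    for i, ch in enumerate(text):
        vec[i * n + CHARSET.index(ch)] = 1.0  # mark the column of this digit
    return vec


def vec_to_text(vec):
    """Invert text_to_vec by reading the column of every entry set to 1."""
    n = len(CHARSET)
    return "".join(CHARSET[i % n] for i, v in enumerate(vec) if v == 1.0)
```

For the label 477081933151463759 the first block of ten entries has its 1 at index 4, the second block at index 7, and so on.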

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
"""
Generator for ID card text and digit images

@author: pengyuanjie
"""
import numpy as np
import freetype
import copy
import random
import cv2

class put_chinese_text(object):
    def __init__(self, ttf):
        self._face = freetype.Face(ttf)

    def draw_text(self, image, pos, text, text_size, text_color):
        '''
        draw chinese(or not) text with ttf
        :param image:     image(numpy.ndarray) to draw text
        :param pos:       where to draw text
        :param text:      the content, for chinese should be unicode type
        :param text_size: text size
        :param text_color:text color
        :return:          image
        '''
        self._face.set_char_size(text_size * 64)
        metrics = self._face.size
        ascender = metrics.ascender/64.0

        #descender = metrics.descender/64.0
        #height = metrics.height/64.0
        #linegap = height - ascender + descender
        ypos = int(ascender)

        if not isinstance(text, unicode):
            text = text.decode('utf-8')
        img = self.draw_string(image, pos[0], pos[1]+ypos, text, text_color)
        return img

    def draw_string(self, img, x_pos, y_pos, text, color):
        '''
        draw string
        :param x_pos: text x-position on img
        :param y_pos: text y-position on img
        :param text:  text (unicode)
        :param color: text color
        :return:      image
        '''
        prev_char = 0
        pen = freetype.Vector()
        pen.x = x_pos << 6   # pixels to 26.6 fixed point (multiply by 64)
        pen.y = y_pos << 6

        hscale = 1.0
        matrix = freetype.Matrix(int(hscale)*0x10000L, int(0.2*0x10000L),\
                                 int(0.0*0x10000L), int(1.1*0x10000L))
        cur_pen = freetype.Vector()
        pen_translate = freetype.Vector()

        image = copy.deepcopy(img)
        for cur_char in text:
            self._face.set_transform(matrix, pen_translate)

            self._face.load_char(cur_char)
            kerning = self._face.get_kerning(prev_char, cur_char)
            pen.x += kerning.x
            slot = self._face.glyph
            bitmap = slot.bitmap

            cur_pen.x = pen.x
            cur_pen.y = pen.y - slot.bitmap_top * 64
            self.draw_ft_bitmap(image, bitmap, cur_pen, color)

            pen.x += slot.advance.x
            prev_char = cur_char

        return image

    def draw_ft_bitmap(self, img, bitmap, pen, color):
        '''
        draw each char
        :param bitmap: bitmap
        :param pen:    pen
        :param color:  pen color e.g.(0,0,255) - red
        :return:       image
        '''
        x_pos = pen.x >> 6
        y_pos = pen.y >> 6
        cols = bitmap.width
        rows = bitmap.rows

        glyph_pixels = bitmap.buffer

        for row in range(rows):
            for col in range(cols):
                if glyph_pixels[row*cols + col] != 0:
                    img[y_pos + row][x_pos + col][0] = color[0]
                    img[y_pos + row][x_pos + col][1] = color[1]
                    img[y_pos + row][x_pos + col][2] = color[2]


class gen_id_card(object):
    def __init__(self):
       #self.words = open('AllWords.txt', 'r').read().split(' ')
       self.number = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
       self.char_set = self.number
       #self.char_set = self.words + self.number
       self.len = len(self.char_set)
       
       self.max_size = 18
       self.ft = put_chinese_text('fonts/OCR-B.ttf')
       
    # Generate a random string of fixed length
    # Returns the text and its corresponding vector
    def random_text(self):
        text = ''
        vecs = np.zeros((self.max_size * self.len))
        #size = random.randint(1, self.max_size)
        size = self.max_size
        for i in range(size):
            c = random.choice(self.char_set)
            vec = self.char2vec(c)
            text = text + c
            vecs[i*self.len:(i+1)*self.len] = np.copy(vec)
        return text,vecs
    
    # Render the generated text into an image; returns the pixel data, the label and its vector
    def gen_image(self):
        text,vec = self.random_text()
        img = np.zeros([32,256,3])
        color_ = (255,255,255) # white
        pos = (0, 0)
        text_size = 21
        image = self.ft.draw_text(img, pos, text, text_size, color_)
        # Return a single channel only; color carries no information for character recognition
        return image[:,:,2],text,vec

    # Convert a single character to a one-hot vector
    def char2vec(self, c):
        vec = np.zeros((self.len))
        for j in range(self.len):
            if self.char_set[j] == c:
                vec[j] = 1
        return vec
        
    # Convert a vector back to text
    def vec2text(self, vecs):
        text = ''
        v_len = len(vecs)
        for i in range(v_len):
            if(vecs[i] == 1):
                text = text + self.char_set[i % self.len]
        return text

if __name__ == '__main__':
    genObj = gen_id_card()
    image_data,label,vec = genObj.gen_image()
    cv2.imshow('image', image_data)
    cv2.waitKey(0)
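
The shifts by 6 and the 0x10000 factors in the listing above come from FreeType's fixed-point formats: pen positions and character sizes use 26.6 fixed point (1 pixel = 64 units), and transform matrix coefficients use 16.16 fixed point (1.0 = 0x10000). The helpers below are an illustrative sketch of those conversions in plain Python; their names are not part of freetype-py.

```python
# Sketch: the fixed-point conversions used by the drawing code.
# Helper names are illustrative assumptions, not freetype-py functions.


def to_26_6(pixels):
    """Pixels to 26.6 fixed point, the same as `pixels << 6` for integers."""
    return int(round(pixels * 64))


def from_26_6(value):
    """26.6 fixed point back to whole pixels, the same as `value >> 6`."""
    return value >> 6


def to_16_16(coefficient):
    """A float matrix coefficient to 16.16 fixed point (1.0 becomes 0x10000)."""
    return int(coefficient * 0x10000)


# The matrix in draw_string is (xx=1.0, xy=0.2, yx=0.0, yy=1.1): the xy term
# slants the glyphs and the yy term stretches them vertically by 10%.
SLANT_MATRIX = tuple(to_16_16(c) for c in (1.0, 0.2, 0.0, 1.1))
```

Seen this way, `set_char_size(text_size * 64)` requests a size of text_size in 26.6 units, and `pen.x >> 6` in draw_ft_bitmap converts the pen position back to a pixel column.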

Building the network and training

First, define a method that generates one batch:

# Generate one training batch
def get_next_batch(batch_size=128):
    obj = gen_id_card()
    batch_x = np.zeros([batch_size, IMAGE_HEIGHT*IMAGE_WIDTH])
    batch_y = np.zeros([batch_size, MAX_CAPTCHA*CHAR_SET_LEN])
 
 
    for i in range(batch_size):
        image, text, vec = obj.gen_image()
        batch_x[i,:] = image.reshape((IMAGE_HEIGHT*IMAGE_WIDTH))
        batch_y[i,:] = vec
    return batch_x, batch_y
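
The batch function and the network code refer to several names that the post never defines (IMAGE_HEIGHT, IMAGE_WIDTH, MAX_CAPTCHA, CHAR_SET_LEN, X, Y, keep_prob, train_phase). The sketch below shows values that would be consistent with the sizes quoted in the text (32 x 256 images, 18 digits, 10 classes). These are assumptions inferred from the post, not the author's actual definitions, and batch_shapes is a hypothetical helper.

```python
# Sketch: assumed constants, inferred from the sizes quoted in the post.
IMAGE_HEIGHT = 32    # image rows
IMAGE_WIDTH = 256    # image columns
MAX_CAPTCHA = 18     # characters per ID number
CHAR_SET_LEN = 10    # digits 0-9

INPUT_SIZE = IMAGE_HEIGHT * IMAGE_WIDTH    # length of one flattened image
LABEL_SIZE = MAX_CAPTCHA * CHAR_SET_LEN    # length of one label vector


def batch_shapes(batch_size):
    """Shapes of the (batch_x, batch_y) arrays that get_next_batch would return."""
    return (batch_size, INPUT_SIZE), (batch_size, LABEL_SIZE)


# With TensorFlow 1.x the matching placeholders would presumably be declared
# along these lines (assumption, not executed here):
#   X = tf.placeholder(tf.float32, [None, IMAGE_HEIGHT * IMAGE_WIDTH])
#   Y = tf.placeholder(tf.float32, [None, MAX_CAPTCHA * CHAR_SET_LEN])
#   keep_prob = tf.placeholder(tf.float32)
#   train_phase = tf.placeholder(tf.bool)
```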

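Batch Normalization is used here. I do not fully understand it yet, so readers may want to look it up themselves. The code comes from the reference blog post.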

# Batch Normalization; to be understood later. tflearn and slim both provide wrappers
## http://stackoverflow.com/a/34634291/2267819
def batch_norm(x, beta, gamma, phase_train, scope='bn', decay=0.9, eps=1e-5):
    with tf.variable_scope(scope):
        #beta = tf.get_variable(name='beta', shape=[n_out], initializer=tf.constant_initializer(0.0), trainable=True)
        #gamma = tf.get_variable(name='gamma', shape=[n_out], initializer=tf.random_normal_initializer(1.0, stddev), trainable=True)
        batch_mean, batch_var = tf.nn.moments(x, [0, 1, 2], name='moments')
        ema = tf.train.ExponentialMovingAverage(decay=decay)

        def mean_var_with_update():
            ema_apply_op = ema.apply([batch_mean, batch_var])
            with tf.control_dependencies([ema_apply_op]):
                return tf.identity(batch_mean), tf.identity(batch_var)

        mean, var = tf.cond(phase_train, mean_var_with_update, lambda: (ema.average(batch_mean), ema.average(batch_var)))
        normed = tf.nn.batch_normalization(x, mean, var, beta, gamma, eps)
    return normed
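
As a rough illustration of what the function above computes, here is a simplified sketch in plain Python. It assumes the activations have been flattened into a list of positions (every batch item, row and column), each holding one value per channel, so that per-channel statistics match taking moments over axes [0, 1, 2]. The function names are illustrative, and the moving-average handling is reduced to a single update rule.

```python
# Sketch: per-channel batch normalization on plain Python lists.
# `rows` holds one entry per (batch, row, column) position, each a list of channel values.


def channel_moments(rows):
    """Mean and variance of every channel across all positions."""
    n = float(len(rows))
    channels = len(rows[0])
    mean = [sum(r[c] for r in rows) / n for c in range(channels)]
    var = [sum((r[c] - mean[c]) ** 2 for r in rows) / n for c in range(channels)]
    return mean, var


def normalize(rows, mean, var, beta, gamma, eps=1e-5):
    """Apply gamma * (x - mean) / sqrt(var + eps) + beta channel by channel."""
    return [[gamma[c] * (r[c] - mean[c]) / (var[c] + eps) ** 0.5 + beta[c]
             for c in range(len(r))] for r in rows]


def ema_update(average, value, decay=0.9):
    """One exponential-moving-average step, as kept for use at test time."""
    return [decay * a + (1.0 - decay) * v for a, v in zip(average, value)]
```

During training the statistics of the current batch are used and the moving averages are updated; with phase_train set to False the stored averages are used instead, so a prediction does not depend on the other images in the batch.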

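Define 4 convolutional layers and one fully connected layer. The kernels are 5x5 in the first two layers and 3x3 in the last two; every layer applies the tf.nn.relu nonlinearity followed by max_pool. Readers can tune the network structure themselves.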

# Define the CNN
def crack_captcha_cnn(w_alpha=0.01, b_alpha=0.1):
    x = tf.reshape(X, shape=[-1, IMAGE_HEIGHT, IMAGE_WIDTH, 1])
 
    # 4 conv layer
    w_c1 = tf.Variable(w_alpha*tf.random_normal([5, 5, 1, 32]))
    b_c1 = tf.Variable(b_alpha*tf.random_normal([32]))
    conv1 = tf.nn.bias_add(tf.nn.conv2d(x, w_c1, strides=[1, 1, 1, 1], padding='SAME'), b_c1)
    conv1 = batch_norm(conv1, tf.constant(0.0, shape=[32]), tf.random_normal(shape=[32], mean=1.0, stddev=0.02), train_phase, scope='bn_1')
    conv1 = tf.nn.relu(conv1)
    conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv1 = tf.nn.dropout(conv1, keep_prob)
 
    w_c2 = tf.Variable(w_alpha*tf.random_normal([5, 5, 32, 64]))
    b_c2 = tf.Variable(b_alpha*tf.random_normal([64]))
    conv2 = tf.nn.bias_add(tf.nn.conv2d(conv1, w_c2, strides=[1, 1, 1, 1], padding='SAME'), b_c2)
    conv2 = batch_norm(conv2, tf.constant(0.0, shape=[64]), tf.random_normal(shape=[64], mean=1.0, stddev=0.02), train_phase, scope='bn_2')
    conv2 = tf.nn.relu(conv2)
    conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv2 = tf.nn.dropout(conv2, keep_prob)
 
    w_c3 = tf.Variable(w_alpha*tf.random_normal([3, 3, 64, 64]))
    b_c3 = tf.Variable(b_alpha*tf.random_normal([64]))
    conv3 = tf.nn.bias_add(tf.nn.conv2d(conv2, w_c3, strides=[1, 1, 1, 1], padding='SAME'), b_c3)
    conv3 = batch_norm(conv3, tf.constant(0.0, shape=[64]), tf.random_normal(shape=[64], mean=1.0, stddev=0.02), train_phase, scope='bn_3')
    conv3 = tf.nn.relu(conv3)
    conv3 = tf.nn.max_pool(conv3, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv3 = tf.nn.dropout(conv3, keep_prob)

    w_c4 = tf.Variable(w_alpha*tf.random_normal([3, 3, 64, 64]))
    b_c4 = tf.Variable(b_alpha*tf.random_normal([64]))
    conv4 = tf.nn.bias_add(tf.nn.conv2d(conv3, w_c4, strides=[1, 1, 1, 1], padding='SAME'), b_c4)
    conv4 = batch_norm(conv4, tf.constant(0.0, shape=[64]), tf.random_normal(shape=[64], mean=1.0, stddev=0.02), train_phase, scope='bn_4')
    conv4 = tf.nn.relu(conv4)
    conv4 = tf.nn.max_pool(conv4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')
    conv4 = tf.nn.dropout(conv4, keep_prob)
     
    # Fully connected layer
    w_d = tf.Variable(w_alpha*tf.random_normal([2*16*64, 1024]))
    b_d = tf.Variable(b_alpha*tf.random_normal([1024]))
    dense = tf.reshape(conv4, [-1, w_d.get_shape().as_list()[0]])
    dense = tf.nn.relu(tf.add(tf.matmul(dense, w_d), b_d))
    dense = tf.nn.dropout(dense, keep_prob)
 
    w_out = tf.Variable(w_alpha*tf.random_normal([1024, MAX_CAPTCHA*CHAR_SET_LEN]))
    b_out = tf.Variable(b_alpha*tf.random_normal([MAX_CAPTCHA*CHAR_SET_LEN]))
    out = tf.add(tf.matmul(dense, w_out), b_out)
    return out
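
The 2*16*64 input size of the fully connected layer follows from the image size: each of the four max-pool layers uses stride 2 with SAME padding, which halves height and width (rounding up), while the stride-1 SAME convolutions leave them unchanged. The sketch below recomputes that size in plain Python; the function names are illustrative.

```python
# Sketch: flattened feature size after the four conv + max-pool stages.
import math


def same_pool_output(size, stride=2):
    """Output length of a SAME-padded pooling layer: ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))


def flattened_size(height, width, num_pools=4, channels=64):
    """Return (height, width, flat length) of the last feature map."""
    for _ in range(num_pools):
        height = same_pool_output(height)
        width = same_pool_output(width)
    return height, width, height * width * channels
```

For a 32 x 256 input this gives a 2 x 16 map with 64 channels, so 2048 values. A 64 x 512 input would give 4 x 32 x 64 = 8192 values, so the shape of w_d would have to change along with the image size.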

Finally, run the training. The loss is a sigmoid cross-entropy. Accuracy is computed every 100 steps, and once it exceeds 80% the model is saved and training stops.

# Training
def train_crack_captcha_cnn():
    output = crack_captcha_cnn()
    # loss
    #loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=output, labels=Y))
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=output, labels=Y))
    # How do softmax and sigmoid differ when used for the final classification layer?
    # optimizer: to speed up training, the learning rate should start large and then decay slowly
    optimizer = tf.train.AdamOptimizer(learning_rate=0.002).minimize(loss)
 
    predict = tf.reshape(output, [-1, MAX_CAPTCHA, CHAR_SET_LEN])
    max_idx_p = tf.argmax(predict, 2)
    max_idx_l = tf.argmax(tf.reshape(Y, [-1, MAX_CAPTCHA, CHAR_SET_LEN]), 2)
    correct_pred = tf.equal(max_idx_p, max_idx_l)
    accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
 
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
 
        step = 0
        while True:
            batch_x, batch_y = get_next_batch(64)
            _, loss_ = sess.run([optimizer, loss], feed_dict={X: batch_x, Y: batch_y, keep_prob: 0.75, train_phase:True})
            print(step, loss_)
            
            # Compute accuracy every 100 steps
            if step % 100 == 0 and step != 0:
                batch_x_test, batch_y_test = get_next_batch(100)
                acc = sess.run(accuracy, feed_dict={X: batch_x_test, Y: batch_y_test, keep_prob: 1., train_phase:False})
                print("Step %s, training accuracy: %s" % (step, acc))
                # If accuracy exceeds 80%, save the model and finish training
                if acc > 0.8:
                    saver.save(sess, "crack_capcha.model", global_step=step)
                    break
            step += 1
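
On the question raised in the code comment about softmax versus sigmoid: the sigmoid loss treats each of the 180 outputs as an independent yes/no decision, which suits a label vector containing 18 ones. A single softmax over all 180 outputs would instead assume exactly one correct class. The sketch below writes out both losses in plain Python, using the numerically stable form that TensorFlow documents for the sigmoid case; the function names are illustrative.

```python
# Sketch: the two cross-entropy losses written out for plain Python floats.
import math


def sigmoid_cross_entropy(logit, label):
    """Per-output binary loss: max(x, 0) - x * z + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def softmax_cross_entropy(logits, labels):
    """Loss for one group of mutually exclusive classes (labels sum to 1)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return -sum(z * (x - log_sum) for x, z in zip(logits, labels))
```

An alternative that this post does not try would be to apply softmax_cross_entropy separately to each group of 10 outputs, one group per digit position.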

Results: after roughly 500 training steps, I reached an accuracy of 84.3%.

[Image: training log showing 84.3% accuracy]

After roughly 1500 to 2200 training steps the accuracy reaches 98%. Printing the first 5 test samples shows that the outputs essentially match the labels.

[Image: predictions and labels for the first 5 test samples]
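
Note that the accuracy in the training code is averaged over individual characters (the comparison is made per position before the mean is taken), not over whole ID numbers. The sketch below contrasts the two measures on decoded strings in plain Python; the function names are illustrative.

```python
# Sketch: per-character accuracy versus whole-string accuracy.


def char_accuracy(predictions, labels):
    """Fraction of individual characters predicted correctly."""
    total = 0
    correct = 0
    for pred, label in zip(predictions, labels):
        for p, l in zip(pred, label):
            total += 1
            correct += 1 if p == l else 0
    return correct / float(total)


def exact_match_accuracy(predictions, labels):
    """Fraction of strings in which every character is correct."""
    hits = sum(1 for p, l in zip(predictions, labels) if p == l)
    return hits / float(len(labels))
```

If character errors were independent, a per-character accuracy of 98% would correspond to roughly 0.98 ** 18, or about 70%, of 18-digit numbers being entirely correct, so the whole-string figure is worth tracking separately.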

Postscript

All the code and the font resource files are hosted on my GitHub.

At first I trained with 64 x 512 images. Training was very slow and the loss did not converge, staying at around 0.33. Shrinking the images to 32 x 256 fixed it. I do not know why; my guess is that either the network is not deep enough or it does not have enough feature maps.

With the small goal done, the next step toward the final goal may be to try method 2 and recognize variable-length images of Chinese characters, but I first need to understand LSTM networks and the CTC model.

References

TensorFlow Exercise 20: Cracking Character CAPTCHAs with Deep Learning
Using freetype for Chinese Text Output in OpenCV 2.x on Python 2.x
End-to-End OCR: A CNN-Based Implementation
