
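As the name suggests, a TFRecord is mainly used to record data.
Benefits of storing data in TFRecord:
- It makes building the graph more convenient. With a placeholder you have to pass feed_dict on every run; with TFRecord + Dataset the data-reading operation is itself a node in the graph, so nothing has to be fed each time.
- It connects easily to Estimator.
A TFRecord builds each data record as a dictionary.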

Writing data to a TFRecord file

Creating a writer

import tensorflow as tf  # the TensorFlow 1.x API is used throughout
writer = tf.python_io.TFRecordWriter('%s.tfrecord' %'data')

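Creating the stored type tf_feature

Before writing data into a .tfrecord file, you first define the type of each data item (feature) to be written.

int64: tf.train.Feature(int64_list = tf.train.Int64List(value=input))
float32: tf.train.Feature(float_list = tf.train.FloatList(value=input))
string: tf.train.Feature(bytes_list=tf.train.BytesList(value=input))
Note: the input must be a list (a vector), because a TensorFlow feature only accepts list data. If the data is a matrix or a higher-rank tensor, there are two ways to handle it:
- Convert to a list: flatten the tensor into a list (that is, a vector) and write it the same way as a list.
- Convert to a string: turn the tensor into bytes with .tostring() (named .tobytes() in current NumPy) and store it with tf.train.Feature(bytes_list=tf.train.BytesList(value=[input.tostring()])).
- Shape information: either way loses the shape of the data, so when writing the feature into the example you should add the shape as an extra feature. The shape is of int type; here I name the shape feature by appending '_shape' to the original feature name.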
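
The shape loss described above can be seen without TensorFlow. The following is a minimal plain-Python sketch, assuming the standard-library array module as a stand-in for NumPy's .tostring(); the helper names flatten and restore are hypothetical. A 2x3 matrix is stored once as a flat list and once as raw bytes, and in both cases only the separately stored shape lets it be rebuilt.

```python
from array import array

def flatten(matrix):
    """Flatten a 2-D nested list row by row and return (values, shape)."""
    shape = (len(matrix), len(matrix[0]))
    values = [x for row in matrix for x in row]
    return values, shape

def restore(values, shape):
    """Rebuild the 2-D nested list from flat values plus the saved shape."""
    rows, cols = shape
    return [list(values[r * cols:(r + 1) * cols]) for r in range(rows)]

matrix = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]

# Way 1: store as a flat list of floats plus a separate shape entry.
values, shape = flatten(matrix)
record = {'matrix': values, 'matrix_shape': list(shape)}

# Way 2: store as raw float32 bytes plus the same shape entry.
raw = array('f', values).tobytes()
record_bytes = {'matrix': raw, 'matrix_shape': list(shape)}

# Reading back: the flat data alone could be 2x3, 3x2, 1x6 or 6x1;
# only the stored shape says which one it was.
from_list = restore(record['matrix'], record['matrix_shape'])
decoded = array('f')
decoded.frombytes(record_bytes['matrix'])
from_bytes = restore(list(decoded), record_bytes['matrix_shape'])

print(from_list)
print(from_bytes)
```

Both print statements show the original 2x3 nested list, because the shape entry travelled alongside the flattened data.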

# Here we write 3 examples, each with 4 features: a scalar, a vector, a matrix and a tensor
for i in range(3):
    # Create the dictionary
    features={}
    # Write the scalar as Int64; it is a scalar, so "value=[scalars[i]]" wraps it in a list
    features['scalar'] = tf.train.Feature(int64_list=tf.train.Int64List(value=[scalars[i]]))

    # Write the vector as float; it is already a list, so "value=vectors[i]" needs no brackets
    features['vector'] = tf.train.Feature(float_list = tf.train.FloatList(value=vectors[i]))

    # Write the matrix as float; one way is to flatten the matrix into a list
    features['matrix'] = tf.train.Feature(float_list = tf.train.FloatList(value=matrices[i].reshape(-1)))
    # The shape (2,3) is lost this way, so store it as well; it is used later to restore the shape
    features['matrix_shape'] = tf.train.Feature(int64_list = tf.train.Int64List(value=matrices[i].shape))

    # Write the tensor (decoded later as uint8); it is a 3-D tensor, and the other way is to store it as bytes and convert it back later
    features['tensor']         = tf.train.Feature(bytes_list=tf.train.BytesList(value=[tensors[i].tostring()]))
    # Store the lost shape (806,806,3)
    features['tensor_shape'] = tf.train.Feature(int64_list = tf.train.Int64List(value=tensors[i].shape))

Converting tf_feature to tf_example and serializing it

    # (still inside the for loop) Pass the dictionary holding all features to tf.train.Features
    tf_features = tf.train.Features(feature= features)
    # Turn it into one example
    tf_example = tf.train.Example(features = tf_features)
    # Serialize the example
    tf_serialized = tf_example.SerializeToString()

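Writing the example and closing the file

    # (still inside the for loop) Write one serialized example
    writer.write(tf_serialized)
# The loop above runs 3 times, so 3 examples have been written by this point
# Close the file
writer.close()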

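Reading data with Dataset

An earlier post introduced the basic usage of Dataset; the following shows how to use it together with TFRecord.

dataset = tf.data.TFRecordDataset(filenames)
# Listing the same file twice reads the data twice, so the dataset is twice as large
dataset = tf.data.TFRecordDataset(["test.tfrecord","test.tfrecord"])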

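Parsing the feature information

Parsing is the inverse of writing, so it needs the information used at write time. The pandas library is used here to record it.

isbyte: records whether the feature was converted to bytes.
default: the value used to fill the feature when it is missing from the example being read; np.NaN is the usual choice.
length_type: indicates whether the vector is read as fixed-length or variable-length.

import pandas as pd

data_info = pd.DataFrame({'name':['scalar','vector','matrix','matrix_shape','tensor','tensor_shape'],
                         'type':[scalars[0].dtype,vectors[0].dtype,matrices[0].dtype,tf.int64, tensors[0].dtype,tf.int64],
                         'shape':[scalars[0].shape,(3,),matrices[0].shape,(len(matrices[0].shape),),tensors[0].shape,(len(tensors[0].shape),)],
                         # only 'tensor' was written with .tostring()
                         'isbyte':[False,False,False,False,True,False],
                         'length_type':['fixed','fixed','var','fixed','fixed','fixed']},
                         columns=['name','type','shape','isbyte','length_type','default'])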

Creating the parsing function

example_proto is the serialized data, that is, the data read from the TFRecord.

def parse_function(example_proto):
    # Takes a single input: example_proto, the serialized example tf_serialized

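There are two ways to parse a feature:

Fixed-length parsing: tf.FixedLenFeature(shape, dtype, default_value)
- shape: can be used like a reshape, e.g. the shape of vector is changed from (3,) to (1,3).
  Note: if the feature was written with .tostring(), its shape is ().
- dtype: must be one of tf.float32, tf.int64 or tf.string.
- default_value: the value to use when the feature is missing.
Variable-length parsing: tf.VarLenFeature(dtype)
Note: the shape does not need to be given explicitly, but the resulting tensor is a SparseTensor.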

    # (inside parse_function)
    dics = {# no default_value is used here; the entries below leave it as None
            'scalar': tf.FixedLenFeature(shape=(), dtype=tf.int64, default_value=None),

            # the shape of vector is deliberately given as (1,3) instead of the original (3,)
            'vector': tf.FixedLenFeature(shape=(1,3), dtype=tf.float32),

            # parsed with VarLenFeature
            'matrix': tf.VarLenFeature(dtype=tf.float32),
            'matrix_shape': tf.FixedLenFeature(shape=(2,), dtype=tf.int64),
            # tensor was written with .tostring(), so its shape is ()
            # the dtype here is not the tensor's original type but tf.string, the type after conversion to bytes; it is decoded back to tf.uint8 afterwards
            'tensor': tf.FixedLenFeature(shape=(), dtype=tf.string),
            'tensor_shape': tf.FixedLenFeature(shape=(3,), dtype=tf.int64)}

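Parsing

The resulting parsed_example is also a dictionary: each key is a feature name and each value is the parsed value of that feature. In the two cases below the values need a further conversion; otherwise they can be used as they are.
string type: decode with tf.decode_raw(parsed_feature, type).
Note: the type here must match the type the data had before .tostring(). If the tensor was tf.uint8 before conversion, use tf.uint8 here; if it was tf.float32, use tf.float32.
VarLen parsing: the result is a SparseTensor, so depending on the situation tf.sparse_tensor_to_dense(SparseTensor) is needed to convert it to a dense tensor.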
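
Why the decode type has to match the type used before .tostring() can be shown with the standard-library array module, used here as a stand-in for tf.decode_raw (an assumption made for illustration; the variable names are hypothetical, and a 4-byte float is assumed). Raw bytes carry no type information, so the same 12 bytes decode to 12 uint8 values or to 3 float32 values.

```python
from array import array

# Twelve uint8 values, e.g. a 2x2 RGB image flattened.
pixels = array('B', [0, 64, 128, 255, 10, 20, 30, 40, 50, 60, 70, 80])
raw = pixels.tobytes()  # what ends up in the bytes feature

# Decoding with the original type recovers the values.
as_uint8 = array('B')
as_uint8.frombytes(raw)

# Decoding with a different type silently yields different data:
# 12 bytes are read as 3 four-byte floats.
as_float32 = array('f')
as_float32.frombytes(raw)

print(len(raw), len(as_uint8), len(as_float32))
print(as_uint8 == pixels)
```

The wrong type raises no error here; the mismatch only shows up as a different element count and meaningless values, which is why the original type has to be recorded at write time.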

    # (inside parse_function) Pass the serialized example and the parsing dictionary to get the parsed example
    parsed_example = tf.parse_single_example(example_proto, dics)
    # Decode the bytes
    parsed_example['tensor'] = tf.decode_raw(parsed_example['tensor'], tf.uint8)
    # Convert the sparse representation to a dense one
    parsed_example['matrix'] = tf.sparse_tensor_to_dense(parsed_example['matrix'])
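
The sparse-to-dense step can be pictured in plain Python. A SparseTensor carries three parts: indices, values and dense_shape. The function below is a hypothetical stand-in for the densifying call, restricted to the 1-D case, and is not TensorFlow's implementation; the sample values are made up for illustration.

```python
def sparse_to_dense(indices, values, dense_shape, default_value=0.0):
    """Densify a 1-D sparse representation (hypothetical helper)."""
    dense = [default_value] * dense_shape[0]
    for (position,), value in zip(indices, values):
        dense[position] = value
    return dense

# A flattened 2x3 matrix as it might look after variable-length parsing:
# every stored value is listed together with its position.
indices = [[0], [1], [2], [3], [4], [5]]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dense_shape = [6]

dense = sparse_to_dense(indices, values, dense_shape)
print(dense)

# Positions that are not listed fall back to the default value.
partial = sparse_to_dense([[0], [3]], [7.0, 9.0], [5])
print(partial)
```

After this step the data is an ordinary flat vector again, so it can be reshaped with the stored shape feature.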

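Restoring the shapes

    # (inside parse_function) Restore the shape of matrix
    parsed_example['matrix'] = tf.reshape(parsed_example['matrix'], parsed_example['matrix_shape'])
    # Restore the shape of tensor
    parsed_example['tensor'] = tf.reshape(parsed_example['tensor'], parsed_example['tensor_shape'])
    # Return the parsed features so that dataset.map receives them
    return parsed_example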

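Applying the parsing function

new_dataset = dataset.map(parse_function)

Creating an iterator

With the parsed dataset in hand, the next step is to fetch its examples.
make_one_shot_iterator(): the iterator goes through the data once and cannot be reused afterwards.

# Create an iterator that fetches examples from the dataset
iterator = new_dataset.make_one_shot_iterator()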

Fetching examples

import matplotlib.pyplot as plt  # needed for plt.imshow at the end

# Get the next example
next_element = iterator.get_next()
# Create the Session
sess = tf.InteractiveSession()

# Fetch
i = 1
while True:
    # Keep fetching the next example
    try:
        # The fetched values are part of the graph, so feed_dict is no longer needed
        scalar,vector,matrix,tensor = sess.run([next_element['scalar'],
                                                next_element['vector'],
                                                next_element['matrix'],
                                                next_element['tensor']])
    # Once the whole dataset has been traversed, an error is raised
    except tf.errors.OutOfRangeError:
        print("End of dataset")
        break
    else:
        # Show information about every feature of each example; only the value of scalar is printed
        print('==============example %s ==============' %i)
        print('scalar: value: %s | shape: %s | type: %s' %(scalar, scalar.shape, scalar.dtype))
        print('vector shape: %s | type: %s' %(vector.shape, vector.dtype))
        print('matrix shape: %s | type: %s' %(matrix.shape, matrix.dtype))
        print('tensor shape: %s | type: %s' %(tensor.shape, tensor.dtype))
    i+=1
# Display the tensor of the last example as an image
plt.imshow(tensor)

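Shuffling

buffer_size=10000 means that a buffer holding 10000 elements is created first and the shuffling happens inside this buffer. A larger buffer shuffles better but uses more memory; a smaller buffer shuffles poorly. As a rule of thumb it can be set to 10 times the batch_size.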

shuffle_dataset = new_dataset.shuffle(buffer_size=10000)
iterator = shuffle_dataset.make_one_shot_iterator()
next_element = iterator.get_next()
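
The effect of the buffer size can be seen in a plain-Python simulation of a buffer-based shuffle. This is a sketch of the general idea under simplifying assumptions, not TensorFlow's implementation; buffered_shuffle and its seed parameter are hypothetical.

```python
import random

def buffered_shuffle(items, buffer_size, seed=0):
    """Simulate shuffling through a fixed-size buffer (illustrative only)."""
    rng = random.Random(seed)
    source = iter(items)
    buffer = []
    output = []
    # Fill the buffer with the first buffer_size elements.
    for item in source:
        buffer.append(item)
        if len(buffer) == buffer_size:
            break
    # Emit a random buffered element, then refill the free slot from the input.
    for item in source:
        index = rng.randrange(len(buffer))
        output.append(buffer[index])
        buffer[index] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    output.extend(buffer)
    return output

data = list(range(20))
no_shuffle = buffered_shuffle(data, buffer_size=1)
weak_shuffle = buffered_shuffle(data, buffer_size=3)
full_shuffle = buffered_shuffle(data, buffer_size=len(data))

print(no_shuffle)
print(weak_shuffle)
print(full_shuffle)
```

With buffer_size=1 the order is unchanged, and with buffer_size=3 no element can appear more than two positions earlier than where it started, which is why a small buffer mixes poorly. Only a buffer as large as the dataset allows any element to land anywhere.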

Setting the batch

batch_dataset = shuffle_dataset.batch(4)
iterator = batch_dataset.make_one_shot_iterator()
next_element = iterator.get_next()

Batch_padding

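Padding can be applied within each batch.
padded_shapes specifies how each piece of data is padded.
The rank must match that of the original data.
Any dimension set to None or -1 is padded to the maximum length of that dimension within the batch.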

batch_padding_dataset = new_dataset.padded_batch(4, 
                        padded_shapes={'scalar': [],
                                       'vector': [-1,5],
                                       'matrix': [None,None],
                                       'matrix_shape': [None],
                                       'tensor': [None,None,None],
                                       'tensor_shape': [None]})
iterator = batch_padding_dataset.make_one_shot_iterator()
next_element = iterator.get_next()
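
What "padded to the maximum length within the batch" means can be sketched in plain Python for 1-D data. The padded_batch function below is a hypothetical helper, not the Dataset method, and padding with 0 is an assumption made for the illustration.

```python
def padded_batch(sequences, batch_size, pad_value=0):
    """Group 1-D sequences into batches, padding each batch to its own maximum length."""
    batches = []
    for start in range(0, len(sequences), batch_size):
        batch = sequences[start:start + batch_size]
        longest = max(len(seq) for seq in batch)
        batches.append([seq + [pad_value] * (longest - len(seq)) for seq in batch])
    return batches

sequences = [[1], [2, 3], [4, 5, 6], [7], [8, 9], [10]]
batches = padded_batch(sequences, batch_size=3)
for batch in batches:
    print(batch)
```

The first batch is padded to length 3 and the second only to length 2, because the target length is chosen per batch rather than for the whole dataset.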

Setting the epoch

Use .repeat(num_epochs) to specify how many times the whole dataset is traversed.

num_epochs = 2
epoch_dataset = new_dataset.repeat(num_epochs)
iterator = epoch_dataset.make_one_shot_iterator()
next_element = iterator.get_next()
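
The behaviour of repeating for a fixed number of epochs can be mimicked with itertools. This is a plain-Python sketch; repeat_dataset is a hypothetical helper, and StopIteration plays the role that tf.errors.OutOfRangeError has in the fetching loop shown earlier.

```python
from itertools import chain, repeat

def repeat_dataset(samples, num_epochs):
    """Yield every sample num_epochs times, one full pass after another."""
    return chain.from_iterable(repeat(samples, num_epochs))

samples = ['a', 'b', 'c']
iterator = repeat_dataset(samples, num_epochs=2)

seen = []
while True:
    try:
        seen.append(next(iterator))
    except StopIteration:
        # The iterator is exhausted after num_epochs full passes.
        print("End of dataset")
        break

print(seen)
```

Three samples and two epochs give six fetches before the iterator is exhausted.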
