TORCH01-01: Tensor and Storage Constructors

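This article looks at PyTorch's most fundamental data units, Tensor and Storage, and focuses on building these objects through their constructors. Because Torch is implemented as a C/C++ extension, many of the C-level interfaces cannot be found in the official PyTorch documentation (documentation and performance being Python's two weakest points).
  1. Building a Tensor
  2. The Storage class and Tensor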


Tensor types

  • The torch.Tensor class
import torch

torch.Tensor
torch.Tensor
# help(torch.Tensor)
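  • A Tensor is a multi-dimensional array whose elements all share one data type. Tensors fall into several types according to data type and device.
    • The default torch.Tensor is in fact an alias for torch.FloatTensor.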
| Type | dtype | CPU tensor | GPU tensor |
|---|---|---|---|
| 16-bit floating point | torch.half / torch.float16 | torch.HalfTensor | torch.cuda.HalfTensor |
| 32-bit floating point | torch.float / torch.float32 | torch.FloatTensor | torch.cuda.FloatTensor |
| 64-bit floating point | torch.double / torch.float64 | torch.DoubleTensor | torch.cuda.DoubleTensor |
| 8-bit integer (signed) | torch.int8 | torch.CharTensor | torch.cuda.CharTensor |
| 16-bit integer | torch.int16 / torch.short | torch.ShortTensor | torch.cuda.ShortTensor |
| 32-bit integer | torch.int32 / torch.int | torch.IntTensor | torch.cuda.IntTensor |
| 64-bit integer | torch.int64 / torch.long | torch.LongTensor | torch.cuda.LongTensor |
| 8-bit unsigned integer | torch.uint8 | torch.ByteTensor | torch.cuda.ByteTensor |
| Boolean | torch.bool | torch.BoolTensor | torch.cuda.BoolTensor |
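The mapping between dtype and legacy tensor type name can be checked directly. The following is a minimal sketch, assuming a CPU-only session in which the default dtype has not been changed; the variable names are illustrative only.

```python
import torch

# torch.Tensor(...) produces the default floating-point type.
t_default = torch.Tensor([1, 2, 3])
print(t_default.dtype)    # torch.float32 unless the default dtype was changed
print(t_default.type())   # legacy type name as a string

# The dtype aliases refer to the same dtype object.
assert torch.float is torch.float32
assert torch.long is torch.int64

# The legacy type name follows from the dtype (and the device).
t_long = torch.tensor([1, 2, 3], dtype=torch.int64)
t_bool = torch.tensor([True, False])
print(t_long.type())
print(t_bool.type())
```

On a CUDA device the same tensors would report the torch.cuda.* names instead.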

Building a Tensor

  • There are two ways to build a Tensor
    1. The constructor
      • the torch.Tensor class
    2. The factory function
      • the torch.tensor function

The Tensor constructor

help(torch.Tensor.__init__)
Help on wrapper_descriptor:

__init__(self, /, *args, **kwargs)
    Initialize self.  See help(type(self)) for accurate signature.
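  • The help output leaves one thing unclear: the constructor above is inherited from the _TensorBase class (shown in _TensorBase.py), which is the parent class of Tensor and is implemented in C++. A few points explain the call path:

    • The _TensorBase.py file can be located with a file search, or by following the definition in PyCharm.
    • Early versions of Torch were implemented in Lua, a language that interoperates very directly with C.
    • Torch was later extended from C to C++.
    • The Python package is built as a C/C++ extension module (through the CPython C API and pybind11 rather than Cython), and most performance-critical work is delegated to C/C++. Many things the Python documentation leaves unclear, such as the Tensor constructor, therefore have a prototype in the C/C++ headers.
  • The prebuilt C++ library can be downloaded directly from the official site. That package contains only headers and binaries; the C++ implementation sources are in the PyTorch GitHub repository:

    • Download page for the Torch C++ library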

Tensor constructor functions in C

  • After downloading the C library, the following file can be found under the include directory at TH\generic\THTensor.h:
#ifndef TH_GENERIC_FILE
#define TH_GENERIC_FILE "TH/generic/THTensor.h"
#else

/* a la lua? dim, storageoffset, ...  et les methodes ? */

#include <c10/core/TensorImpl.h>

#define THTensor at::TensorImpl

// These used to be distinct types; for some measure of backwards compatibility and documentation
// alias these to the single THTensor type.
#define THFloatTensor THTensor
#define THDoubleTensor THTensor
#define THHalfTensor THTensor
#define THByteTensor THTensor
#define THCharTensor THTensor
#define THShortTensor THTensor
#define THIntTensor THTensor
#define THLongTensor THTensor
#define THBoolTensor THTensor
#define THBFloat16Tensor THTensor

/**** access methods ****/
TH_API THStorage* THTensor_(storage)(const THTensor *self);
TH_API ptrdiff_t THTensor_(storageOffset)(const THTensor *self);

// See [NOTE: nDimension vs nDimensionLegacyNoScalars vs nDimensionLegacyAll]
TH_API int THTensor_(nDimension)(const THTensor *self);
TH_API int THTensor_(nDimensionLegacyNoScalars)(const THTensor *self);
TH_API int THTensor_(nDimensionLegacyAll)(const THTensor *self);
TH_API int64_t THTensor_(size)(const THTensor *self, int dim);
TH_API int64_t THTensor_(stride)(const THTensor *self, int dim);
TH_API scalar_t *THTensor_(data)(const THTensor *self);


/**** creation methods ****/
TH_API THTensor *THTensor_(new)(void);
TH_API THTensor *THTensor_(newWithTensor)(THTensor *tensor);
TH_API THTensor *THTensor_(newWithStorage1d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_);
TH_API THTensor *THTensor_(newWithStorage2d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_,
                                int64_t size1_, int64_t stride1_);
TH_API THTensor *THTensor_(newWithStorage3d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_,
                                int64_t size1_, int64_t stride1_,
                                int64_t size2_, int64_t stride2_);
TH_API THTensor *THTensor_(newWithStorage4d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_,
                                int64_t size1_, int64_t stride1_,
                                int64_t size2_, int64_t stride2_,
                                int64_t size3_, int64_t stride3_);

/* stride might be NULL */
TH_API THTensor *THTensor_(newWithSize1d)(int64_t size0_);
TH_API THTensor *THTensor_(newWithSize2d)(int64_t size0_, int64_t size1_);
TH_API THTensor *THTensor_(newWithSize3d)(int64_t size0_, int64_t size1_, int64_t size2_);
TH_API THTensor *THTensor_(newWithSize4d)(int64_t size0_, int64_t size1_, int64_t size2_, int64_t size3_);

TH_API THTensor *THTensor_(newClone)(THTensor *self);
TH_API THTensor *THTensor_(newContiguous)(THTensor *tensor);
TH_API THTensor *THTensor_(newSelect)(THTensor *tensor, int dimension_, int64_t sliceIndex_);
TH_API THTensor *THTensor_(newNarrow)(THTensor *tensor, int dimension_, int64_t firstIndex_, int64_t size_);
TH_API THTensor *THTensor_(newTranspose)(THTensor *tensor, int dimension1_, int dimension2_);

// resize* methods simply resize the storage. So they may not retain the current data at current indices.
// This is especially likely to happen when the tensor is not contiguous. In general, if you still need the
// values, unless you are doing some size and stride tricks, do not use resize*.
TH_API void THTensor_(resizeNd)(THTensor *tensor, int nDimension, const int64_t *size, const int64_t *stride);
TH_API void THTensor_(resizeAs)(THTensor *tensor, THTensor *src);
TH_API void THTensor_(resize0d)(THTensor *tensor);
TH_API void THTensor_(resize1d)(THTensor *tensor, int64_t size0_);
TH_API void THTensor_(resize2d)(THTensor *tensor, int64_t size0_, int64_t size1_);
TH_API void THTensor_(resize3d)(THTensor *tensor, int64_t size0_, int64_t size1_, int64_t size2_);
TH_API void THTensor_(resize4d)(THTensor *tensor, int64_t size0_, int64_t size1_, int64_t size2_, int64_t size3_);
TH_API void THTensor_(resize5d)(THTensor *tensor, int64_t size0_, int64_t size1_, int64_t size2_, int64_t size3_, int64_t size4_);
// Note: these are legacy resize functions that treat sizes as size->size() == 0 and size->data<int64_t>() as being 0-terminated.

TH_API void THTensor_(set)(THTensor *self, THTensor *src);
TH_API void THTensor_(setStorageNd)(THTensor *self, THStorage *storage_, ptrdiff_t storageOffset_, int nDimension, const int64_t *size, const int64_t *stride);
TH_API void THTensor_(setStorage1d)(THTensor *self, THStorage *storage_, ptrdiff_t storageOffset_,
                                    int64_t size0_, int64_t stride0_);
TH_API void THTensor_(setStorage2d)(THTensor *self, THStorage *storage_, ptrdiff_t storageOffset_,
                                    int64_t size0_, int64_t stride0_,
                                    int64_t size1_, int64_t stride1_);
TH_API void THTensor_(setStorage3d)(THTensor *self, THStorage *storage_, ptrdiff_t storageOffset_,
                                    int64_t size0_, int64_t stride0_,
                                    int64_t size1_, int64_t stride1_,
                                    int64_t size2_, int64_t stride2_);
TH_API void THTensor_(setStorage4d)(THTensor *self, THStorage *storage_, ptrdiff_t storageOffset_,
                                    int64_t size0_, int64_t stride0_,
                                    int64_t size1_, int64_t stride1_,
                                    int64_t size2_, int64_t stride2_,
                                    int64_t size3_, int64_t stride3_);

TH_API void THTensor_(narrow)(THTensor *self, THTensor *src, int dimension_, int64_t firstIndex_, int64_t size_);
TH_API void THTensor_(select)(THTensor *self, THTensor *src, int dimension_, int64_t sliceIndex_);
TH_API void THTensor_(transpose)(THTensor *self, THTensor *src, int dimension1_, int dimension2_);
TH_API int THTensor_(isTransposed)(const THTensor *self);
TH_API void THTensor_(unfold)(THTensor *self, THTensor *src, int dimension_, int64_t size_, int64_t step_);

TH_API void THTensor_(squeeze)(THTensor *self, THTensor *src);
TH_API void THTensor_(squeeze1d)(THTensor *self, THTensor *src, int dimension_);
TH_API void THTensor_(unsqueeze1d)(THTensor *self, THTensor *src, int dimension_);

TH_API int THTensor_(isContiguous)(const THTensor *self);
TH_API int THTensor_(isSameSizeAs)(const THTensor *self, const THTensor *src);
TH_API int THTensor_(isSetTo)(const THTensor *self, const THTensor *src);
TH_API ptrdiff_t THTensor_(nElement)(const THTensor *self);

TH_API void THTensor_(retain)(THTensor *self);
TH_API void THTensor_(free)(THTensor *self);
TH_API void THTensor_(freeCopyTo)(THTensor *self, THTensor *dst);

/* Slow access methods [check everything] */
TH_API void THTensor_(set0d)(THTensor *tensor, scalar_t value);
TH_API void THTensor_(set1d)(THTensor *tensor, int64_t x0, scalar_t value);
TH_API void THTensor_(set2d)(THTensor *tensor, int64_t x0, int64_t x1, scalar_t value);
TH_API void THTensor_(set3d)(THTensor *tensor, int64_t x0, int64_t x1, int64_t x2, scalar_t value);
TH_API void THTensor_(set4d)(THTensor *tensor, int64_t x0, int64_t x1, int64_t x2, int64_t x3, scalar_t value);

TH_API scalar_t THTensor_(get0d)(const THTensor *tensor);
TH_API scalar_t THTensor_(get1d)(const THTensor *tensor, int64_t x0);
TH_API scalar_t THTensor_(get2d)(const THTensor *tensor, int64_t x0, int64_t x1);
TH_API scalar_t THTensor_(get3d)(const THTensor *tensor, int64_t x0, int64_t x1, int64_t x2);
TH_API scalar_t THTensor_(get4d)(const THTensor *tensor, int64_t x0, int64_t x1, int64_t x2, int64_t x3);

/* Shape manipulation methods */
TH_API void THTensor_(cat)(THTensor *r_, THTensor *ta, THTensor *tb, int dimension);
TH_API void THTensor_(catArray)(THTensor *result, THTensor **inputs, int numInputs, int dimension);

/* Debug methods */
TH_API THDescBuff THTensor_(desc)(const THTensor *tensor);
TH_API THDescBuff THTensor_(sizeDesc)(const THTensor *tensor);

#endif

Construction in C++

  • From the file TH\generic\THTensor.hpp in the C++ library (the same library as the C one):
#ifndef TH_GENERIC_FILE
#define TH_GENERIC_FILE "TH/generic/THTensor.hpp"
#else

// STOP!!! Thinking of including this header directly?  Please
// read Note [TH abstraction violation]

// NOTE: functions exist here only to support dispatch via Declarations.cwrap.  You probably don't want to put
// new functions in here, they should probably be un-genericized.

TH_CPP_API void THTensor_(setStorage)(THTensor *self, THStorage *storage_, ptrdiff_t storageOffset_,
                                      at::IntArrayRef size_, at::IntArrayRef stride_);
/* strides.data() might be NULL */
TH_CPP_API THTensor *THTensor_(newWithStorage)(THStorage *storage, ptrdiff_t storageOffset,
                                               at::IntArrayRef sizes, at::IntArrayRef strides);

TH_CPP_API void THTensor_(resize)(THTensor *self, at::IntArrayRef size, at::IntArrayRef stride);
TH_CPP_API THTensor *THTensor_(newWithSize)(at::IntArrayRef size, at::IntArrayRef stride);

#endif

The Storage class behind Tensor

#ifndef TH_GENERIC_FILE
#define TH_GENERIC_FILE "TH/generic/THStorage.h"
#else

#include <c10/core/Allocator.h>
#include <c10/core/StorageImpl.h>

/* on pourrait avoir un liste chainee
   qui initialise math, lab structures (or more).
   mouais -- complique.

   Pb: THMapStorage is kind of a class
   THLab_()... comment je m'en sors?

   en template, faudrait que je les instancie toutes!!! oh boy!
   Et comment je sais que c'est pour Cuda? Le type float est le meme dans les <>

   au bout du compte, ca serait sur des pointeurs float/double... etc... = facile.
   primitives??
 */

// Struct definition is moved to THStorage.hpp (so this file stays C compatible)

#define THStorage at::StorageImpl

// These used to be distinct types; for some measure of backwards compatibility and documentation
// alias these to the single THStorage type.
#define THFloatStorage THStorage
#define THDoubleStorage THStorage
#define THHalfStorage THStorage
#define THByteStorage THStorage
#define THCharStorage THStorage
#define THShortStorage THStorage
#define THIntStorage THStorage
#define THLongStorage THStorage
#define THBoolStorage THStorage
#define THBFloat16Storage THStorage

TH_API scalar_t* THStorage_(data)(const THStorage*);
TH_API ptrdiff_t THStorage_(size)(const THStorage*);
TH_API size_t THStorage_(elementSize)(void);

/* slow access -- checks everything */
TH_API void THStorage_(set)(THStorage*, ptrdiff_t, scalar_t);
TH_API scalar_t THStorage_(get)(const THStorage*, ptrdiff_t);

TH_API THStorage* THStorage_(new)(void);
TH_API THStorage* THStorage_(newWithSize)(ptrdiff_t size);
TH_API THStorage* THStorage_(newWithSize1)(scalar_t);
TH_API THStorage* THStorage_(newWithSize2)(scalar_t, scalar_t);
TH_API THStorage* THStorage_(newWithSize3)(scalar_t, scalar_t, scalar_t);
TH_API THStorage* THStorage_(newWithSize4)(scalar_t, scalar_t, scalar_t, scalar_t);
TH_API THStorage* THStorage_(newWithMapping)(const char *filename, ptrdiff_t size, int flags);

TH_API THStorage* THStorage_(newWithAllocator)(ptrdiff_t size,
                                               c10::Allocator* allocator);
TH_API THStorage* THStorage_(newWithDataAndAllocator)(
    at::DataPtr&& data, ptrdiff_t size, at::Allocator* allocator);

/* should not differ with API */
TH_API void THStorage_(setFlag)(THStorage *storage, const char flag);
TH_API void THStorage_(clearFlag)(THStorage *storage, const char flag);
TH_API void THStorage_(retain)(THStorage *storage);
TH_API void THStorage_(swap)(THStorage *storage1, THStorage *storage2);

/* might differ with other API (like CUDA) */
TH_API void THStorage_(free)(THStorage *storage);
TH_API void THStorage_(resize)(THStorage *storage, ptrdiff_t size);
TH_API void THStorage_(fill)(THStorage *storage, scalar_t value);

#endif

Functions in Python

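  • The C and C++ functions all have wrappers in Python. Their interfaces are declared in the __init__.pyi file in Python's site-packages directory.
    • In fact, the Tensor constructor shares the same argument format as tensor, *_like, new_* and similar functions.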

Officially recommended ways to create a Tensor

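- Use the torch.tensor function
- Use the torch.*_like functions
- Use the tensor.new_* methods
- Other special-purpose creation functions (random tensors, conversion from other formats, loading from a file, and so on)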
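The recommended creation routes listed above can be sketched as follows. This is only an illustration; the tensor `base` and the shapes are arbitrary choices, not taken from the original article.

```python
import torch

# A reference tensor built from existing data.
base = torch.tensor([[1.0, 2.0], [3.0, 4.0]], dtype=torch.float64)

# *_like: same shape, dtype and device as the reference tensor.
like = torch.zeros_like(base)

# new_*: same dtype and device as the reference tensor, but a new shape.
new = base.new_ones((3,))

# A special-purpose creation function: uniform random values in [0, 1).
rand = torch.rand(2, 2)

print(like, new, rand, sep="\n")
```

The *_like and new_* families are convenient because dtype and device are inherited instead of being repeated at every call site.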

Tensor creation examples

Creating with the tensor function

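  • The tensor function always makes a deep copy; its defining feature is that it builds a Tensor directly from existing data. Accepted data formats include:
    • list
    • tuple
    • NumPy ndarray
    • scalar
    • other types
    torch.tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) → Tensor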
import torch
print(help(torch.tensor))
Help on built-in function tensor:

tensor(...)
    tensor(data, dtype=None, device=None, requires_grad=False, pin_memory=False) -> Tensor
    
    Constructs a tensor with :attr:`data`.
    
    .. warning::
    
        :func:`torch.tensor` always copies :attr:`data`. If you have a Tensor
        ``data`` and want to avoid a copy, use :func:`torch.Tensor.requires_grad_`
        or :func:`torch.Tensor.detach`.
        If you have a NumPy ``ndarray`` and want to avoid a copy, use
        :func:`torch.as_tensor`.
    
    .. warning::
    
        When data is a tensor `x`, :func:`torch.tensor` reads out 'the data' from whatever it is passed,
        and constructs a leaf variable. Therefore ``torch.tensor(x)`` is equivalent to ``x.clone().detach()``
        and ``torch.tensor(x, requires_grad=True)`` is equivalent to ``x.clone().detach().requires_grad_(True)``.
        The equivalents using ``clone()`` and ``detach()`` are recommended.
    
    Args:
        data (array_like): Initial data for the tensor. Can be a list, tuple,
            NumPy ``ndarray``, scalar, and other types.
        dtype (:class:`torch.dtype`, optional): the desired data type of returned tensor.
            Default: if ``None``, infers data type from :attr:`data`.
        device (:class:`torch.device`, optional): the desired device of returned tensor.
            Default: if ``None``, uses the current device for the default tensor type
            (see :func:`torch.set_default_tensor_type`). :attr:`device` will be the CPU
            for CPU tensor types and the current CUDA device for CUDA tensor types.
        requires_grad (bool, optional): If autograd should record operations on the
            returned tensor. Default: ``False``.
        pin_memory (bool, optional): If set, returned tensor would be allocated in
            the pinned memory. Works only for CPU tensors. Default: ``False``.
    
    
    Example::
    
        >>> torch.tensor([[0.1, 1.2], [2.2, 3.1], [4.9, 5.2]])
        tensor([[ 0.1000,  1.2000],
                [ 2.2000,  3.1000],
                [ 4.9000,  5.2000]])
    
        >>> torch.tensor([0, 1])  # Type inference on data
        tensor([ 0,  1])
    
        >>> torch.tensor([[0.11111, 0.222222, 0.3333333]],
                         dtype=torch.float64,
                         device=torch.device('cuda:0'))  # creates a torch.cuda.DoubleTensor
        tensor([[ 0.1111,  0.2222,  0.3333]], dtype=torch.float64, device='cuda:0')
    
        >>> torch.tensor(3.14159)  # Create a scalar (zero-dimensional tensor)
        tensor(3.1416)
    
        >>> torch.tensor([])  # Create an empty tensor (of size (0,))
        tensor([])

None
  1. list and tuple
import torch

t_list = torch.tensor([1, 2, 3])
t_tuple = torch.tensor(((4, 5, 6), (7, 8, 9)))
print(t_list, t_tuple)

tensor([1, 2, 3]) tensor([[4, 5, 6],
        [7, 8, 9]])
  2. scalar
t_scalar = torch.tensor(88)
print(t_scalar)
tensor(88)
  3. numpy.ndarray
import numpy as np
n_arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
t_ndarray = torch.tensor(n_arr)
print(t_ndarray)
tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
  4. Others

    • A test with a pandas DataFrame: the data still has to be converted to a NumPy array first.
import pandas as  pd
pd_data = pd.DataFrame([[1,2,3], [4,5,6]])
print(pd_data)
print(type(pd_data.values))
t_pandas = torch.tensor(pd_data.values)
print(t_pandas)
   0  1  2
0  1  2  3
1  4  5  6
<class 'numpy.ndarray'>
tensor([[1, 2, 3],
        [4, 5, 6]])
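The copy semantics described in the help text above can be observed directly. The sketch below contrasts torch.tensor (which copies) with torch.as_tensor (which reuses the ndarray's memory when dtype and device allow), and shows the clone().detach() spelling that the help text recommends for tensor input. Variable names are illustrative.

```python
import numpy as np
import torch

arr = np.array([1, 2, 3], dtype=np.int64)

copied = torch.tensor(arr)      # always copies the data
shared = torch.as_tensor(arr)   # shares memory with the ndarray here

# Modify the source array after both tensors exist.
arr[0] = 100
print(copied)   # unaffected by the write
print(shared)   # sees the write

# For an existing tensor, clone().detach() is the recommended way to copy.
x = torch.tensor([1.0, 2.0], requires_grad=True)
y = x.clone().detach()
print(y.requires_grad, y.data_ptr() == x.data_ptr())
```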

Using the Tensor constructor

  • Use it according to the C function definitions and the C++ classes.

Empty initialization

/* Empty init */
THTensor *THTensor_(new)(void)
{
  return c10::make_intrusive<at::TensorImpl, at::UndefinedTensorImpl>(
    c10::intrusive_ptr<at::StorageImpl>::reclaim(THStorage_(new)()),
    at::CPUTensorId()
  ).release();
}
import torch
t1 = torch.Tensor()
print(t1)
tensor([])
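A quick look at what the empty constructor returns may help here. This is a small sketch, assuming the default dtype is unchanged; `t_empty` and `same` are illustrative names.

```python
import torch

t_empty = torch.Tensor()

# The result is a one-dimensional tensor with zero elements.
print(t_empty.dim())     # number of dimensions
print(t_empty.shape)     # size of each dimension
print(t_empty.numel())   # total number of elements
print(t_empty.dtype)     # default floating-point dtype

# torch.empty(0) builds a tensor of the same shape through the function API.
same = torch.empty(0)
print(same.shape == t_empty.shape)
```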

Pointer copy

- Reference copy
/* Pointer-copy init */
THTensor *THTensor_(newWithTensor)(THTensor *tensor)
{
  return at::native::alias(THTensor_wrap(tensor)).unsafeReleaseTensorImpl();
}

arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]],  np.float32)  # remember to set the dtype
t_arr = torch.tensor(arr)

t2 = torch.Tensor(t_arr)       # t_arr must be float32, the default Tensor type;
                                        # the Tensor constructor cannot take a dtype, the tensor function can
print(t2)
tensor([[1., 2., 3., 4.],
        [5., 6., 7., 8.]])
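# For integer input, the matching integer Tensor type must be used
arr_i = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])  # integer literals default to int64 (Long) on most platforms
t_arr_i = torch.tensor(arr_i)

t2_i = torch.LongTensor(t_arr_i)       # t_arr_i must be int64 to match LongTensor;
                                        # the typed constructor cannot take a dtype, the tensor function can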
print(t2_i)
tensor([[1, 2, 3, 4],
        [5, 6, 7, 8]])
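"Pointer copy" means the new tensor aliases the memory of the original instead of duplicating it. The sketch below checks this by writing through the alias; it reflects the behavior in the PyTorch versions I am aware of, and the names `src`, `alias` and `copy` are illustrative.

```python
import torch

src = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # float32
alias = torch.Tensor(src)    # constructor: shares the underlying memory
copy = src.clone()           # explicit deep copy, for comparison

# Write through the alias and inspect the other two tensors.
alias[0, 0] = 100.0
print(src[0, 0].item())      # the write is visible in the source
print(copy[0, 0].item())     # the clone keeps the old value

# Both tensors start at the same memory address.
print(alias.data_ptr() == src.data_ptr())
```

This is the opposite of torch.tensor(src), which always copies.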

Constructing from a Storage

- The official documentation only describes the torch.Storage class; in fact every Tensor type has a Storage of the corresponding type. Python's doc tools show the following hierarchy:
    torch.storage._StorageBase(builtins.object)
        |- BoolStorage(torch._C.BoolStorageBase, torch.storage._StorageBase)
        |- ByteStorage(torch._C.ByteStorageBase, torch.storage._StorageBase)
        |- CharStorage(torch._C.CharStorageBase, torch.storage._StorageBase)
        |- DoubleStorage(torch._C.DoubleStorageBase, torch.storage._StorageBase)
        |- FloatStorage(torch._C.FloatStorageBase, torch.storage._StorageBase)
        |- IntStorage(torch._C.IntStorageBase, torch.storage._StorageBase)
        |- LongStorage(torch._C.LongStorageBase, torch.storage._StorageBase)
        |- ShortStorage(torch._C.ShortStorageBase, torch.storage._StorageBase)
  • The Storage constructors are not documented in detail on the Python side either; they can be looked up in the C/C++ headers:
TH_API THStorage* THStorage_(new)(void);
TH_API THStorage* THStorage_(newWithSize)(ptrdiff_t size);
TH_API THStorage* THStorage_(newWithSize1)(scalar_t);
TH_API THStorage* THStorage_(newWithSize2)(scalar_t, scalar_t);
TH_API THStorage* THStorage_(newWithSize3)(scalar_t, scalar_t, scalar_t);
TH_API THStorage* THStorage_(newWithSize4)(scalar_t, scalar_t, scalar_t, scalar_t);
TH_API THStorage* THStorage_(newWithMapping)(const char *filename, ptrdiff_t size, int flags);

TH_API THStorage* THStorage_(newWithAllocator)(ptrdiff_t size,
                                               c10::Allocator* allocator);
TH_API THStorage* THStorage_(newWithDataAndAllocator)(
    at::DataPtr&& data, ptrdiff_t size, at::Allocator* allocator);
  • Tensor constructors that take a Storage argument:
TH_API THTensor *THTensor_(newWithStorage1d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_);
TH_API THTensor *THTensor_(newWithStorage2d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_,
                                int64_t size1_, int64_t stride1_);
TH_API THTensor *THTensor_(newWithStorage3d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_,
                                int64_t size1_, int64_t stride1_,
                                int64_t size2_, int64_t stride2_);
TH_API THTensor *THTensor_(newWithStorage4d)(THStorage *storage_, ptrdiff_t storageOffset_,
                                int64_t size0_, int64_t stride0_,
                                int64_t size1_, int64_t stride1_,
                                int64_t size2_, int64_t stride2_,
                                int64_t size3_, int64_t stride3_);
s1 = torch.Storage(5)   # storage with room for 5 elements (uninitialized: it holds whatever was in
                                # memory, so the values change from run to run as the allocation changes)
ts1 = torch.Tensor(s1)
print(s1, ts1)
 8.407790785948902e-45
 0.0
 1.817253113425142e-24
 1.401298464324817e-45
 0.0
[torch.FloatStorage of size 5] tensor([8.4078e-45, 0.0000e+00, 1.8173e-24, 1.4013e-45, 0.0000e+00])
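The newWithStorage2d declaration above takes a storage, a storage offset, and a size/stride pair per dimension. The same addressing rule can be sketched from Python with torch.as_strided, which is a swapped-in public function used for illustration here, not the constructor itself. The values of `offset`, `size` and `stride` are arbitrary.

```python
import torch

base = torch.arange(6, dtype=torch.float32)   # underlying memory holds 0..5

offset, size, stride = 1, (2, 2), (3, 1)
view = torch.as_strided(base, size, stride, offset)

def element(i, j):
    # Element (i, j) lives at: offset + i * stride[0] + j * stride[1]
    return base[offset + i * stride[0] + j * stride[1]].item()

manual = [[element(i, j) for j in range(size[1])] for i in range(size[0])]

print(view)
print(view.tolist() == manual)   # the view follows the addressing formula
```

No data is copied: the view starts `offset` elements into the same memory as `base`.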
  • The following tries to create a Storage from data:
TH_API THStorage* THStorage_(newWithDataAndAllocator)(
    at::DataPtr&& data, ptrdiff_t size, at::Allocator* allocator);
s2 = torch.Storage([1,2,3,4], 6)   # (data, size) is not an accepted argument
                                # combination, so this raises TypeError
ts2 = torch.Tensor(s2)
print(s2, ts2)
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-49-f0affd699614> in <module>()
----> 1 s2 = torch.Storage([1,2,3,4], 6)   # (data, size) is not an accepted argument
      2                                 # combination, so this raises TypeError
      3 ts2 = torch.Tensor(s2)
      4 print(s2, ts2)


TypeError: torch.FloatStorage constructor received an invalid combination of arguments - got (list, int), but expected one of:
 * no arguments
 * (int size)
 * (Sequence data)
 * (torch.FloatStorage view_source)
 * (torch.FloatStorage view_source, int offset)
      didn't match because some of the arguments have invalid types: (list, int)
 * (torch.FloatStorage view_source, int offset, int size)
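  • Note:
    • Making a deliberate mistake prints the Python constructor signatures of Storage, which cannot be found in the documentation:
      • For example, change the statement above to s2 = torch.Storage([1,2,3,4], 3), that is, pass one extra argument.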
    TypeError: torch.FloatStorage constructor received an invalid combination of arguments - got (list, int), but expected one of:
             * no arguments
             * (int size)
             * (Sequence data)
             * (torch.FloatStorage view_source)
             * (torch.FloatStorage view_source, int offset)
                  didn't match because some of the arguments have invalid types: (list, int)
             * (torch.FloatStorage view_source, int offset, int size)
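The trick of provoking an error to read the accepted signatures can be wrapped in a small helper. This is only a sketch: `constructor_hint` is a hypothetical helper introduced here, and the exact wording of the message depends on the installed PyTorch version.

```python
import torch

def constructor_hint(factory, *bad_args):
    """Call factory with the given arguments and return the TypeError text.

    Returns None when the call succeeds, i.e. the arguments were valid.
    """
    try:
        factory(*bad_args)
    except TypeError as err:
        return str(err)
    return None

# Two strings match none of the constructor overloads, so an error is raised.
hint = constructor_hint(torch.Tensor, "not", "valid")
print(hint)

# A valid call (two sizes) produces no hint.
print(constructor_hint(torch.Tensor, 2, 3))
```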
  • In the same way, the Tensor constructor signatures can be obtained from an error message:
        TypeError: new() received an invalid combination of arguments - got (torch.FloatStorage, int, int), but expected one of:
               |-  * (torch.device device)
               |-  * (torch.Storage storage)
               |-  * (Tensor other)
               |-  * (tuple of ints size, torch.device device)
               |-  * (object data, torch.device device)
s3 = torch.Storage([1,2,3,4])   # storage initialized from a sequence of 4 values
                                # (unlike torch.Storage(5), the contents are defined)
ts3 = torch.Tensor(s3, 2, 2)
print(s3, ts3)
---------------------------------------------------------------------------

TypeError                                 Traceback (most recent call last)

<ipython-input-40-116a7869da0b> in <module>()
      1 s3 = torch.Storage([1,2,3,4])   # storage initialized from a sequence of 4 values
      2                                 # (unlike torch.Storage(5), the contents are defined)
----> 3 ts3 = torch.Tensor(s3, 2, 2)
      4 print(s3, ts3)


TypeError: new() received an invalid combination of arguments - got (torch.FloatStorage, int, int), but expected one of:
 * (torch.device device)
 * (torch.Storage storage)
 * (Tensor other)
 * (tuple of ints size, torch.device device)
 * (object data, torch.device device)

Constructing a Tensor of a given size

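- `* (tuple of ints size, torch.device device)` 
    - The "tuple of ints" form means passing the sizes directly as separate arguments. Do not wrap them in (), otherwise the tuple is treated as data.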
t4 = torch.Tensor(3, 2, 3)
print(t4)
tensor([[[1.5414e-44, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 0.0000e+00, 0.0000e+00]],

        [[0.0000e+00, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 0.0000e+00, 0.0000e+00]],

        [[0.0000e+00, 0.0000e+00, 0.0000e+00],
         [0.0000e+00, 0.0000e+00, 0.0000e+00]]])
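As the output above suggests, a tensor built from sizes alone is not initialized. With the function API the same allocation is usually written with torch.empty, or with torch.zeros when defined contents are needed. A brief sketch with illustrative names:

```python
import torch

raw = torch.empty(3, 2, 3)     # allocated but uninitialized, like the output above
zeros = torch.zeros(3, 2, 3)   # allocated and filled with 0

print(raw.shape, zeros.shape)
print(zeros.sum().item())      # defined contents: the sum is 0
```

Values read from `raw` are whatever happened to be in memory, so they should be overwritten before use.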

Constructing a Tensor from data

t5 = torch.Tensor((3, 2, 3))    # the tuple is treated as data and converted to float
print(t5)
tensor([3., 2., 3.])
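The difference between sizes and data is easy to get wrong, so here is a side-by-side sketch. It assumes the default dtype is unchanged; the variable names are illustrative.

```python
import torch

by_size = torch.Tensor(3, 2, 3)     # three separate ints: interpreted as sizes
by_data = torch.Tensor((3, 2, 3))   # one tuple: interpreted as data, stored as float32
by_func = torch.tensor((3, 2, 3))   # function API: data, dtype inferred from the values

print(by_size.shape)                  # a 3 x 2 x 3 tensor
print(by_data, by_data.dtype)         # three float values
print(by_func, by_func.dtype)         # three integer values
```

The constructor always yields the default floating-point type, while torch.tensor infers the dtype from the data.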

Summary

The Python constructor signatures of Tensor are as follows

    Tensor.__init__(torch.device device)
    Tensor.__init__(torch.Storage storage)
    Tensor.__init__(Tensor other)
    Tensor.__init__(tuple of ints size, torch.device device)
    Tensor.__init__(object data, torch.device device)

The Python constructor signatures of Storage are as follows

    FloatStorage.__init__() no arguments
    FloatStorage.__init__(int size)
    FloatStorage.__init__(Sequence data)
    FloatStorage.__init__(torch.FloatStorage view_source)
    FloatStorage.__init__(torch.FloatStorage view_source, int offset)
    FloatStorage.__init__(torch.FloatStorage view_source, int offset, int size)

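  • With these two sets of constructors, creating a Tensor is straightforward. Why does the official documentation not describe them in detail? Probably because building tensors this way is verbose and not recommended. Still, approaching Torch through conventional programming habits makes it easier to understand.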