Tensor is the primary object that we deal with in PyTorch (Variable is just a thin wrapper class for Tensor). In this post, I will give a summary of pitfalls that we should avoid when using Tensors. Since FloatTensor and LongTensor are the most popular Tensor types in PyTorch, I will focus on these two data types.
Tensor and Tensor operation
For operations between Tensors, the rule is strict: both Tensors in the operation must have the same data type, or you will see an error message like
```
TypeError: sub received an invalid combination of arguments - got (float), but expected one of:
 * (int value)
      didn't match because some of the arguments have invalid types: (!float!)
 * (torch.LongTensor other)
      didn't match because some of the arguments have invalid types: (!float!)
 * (int value, torch.LongTensor other)
```
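To make the rule concrete, here is a minimal sketch of the kind of mismatch that triggers such an error, written against the pre-0.4 torch.FloatTensor/torch.LongTensor API used throughout this post (recent PyTorch versions promote mixed types automatically in many cases):

```python
import torch

x = torch.rand(3)                # torch.FloatTensor
y = torch.LongTensor([1, 2, 3])

# x - y                          # mixing FloatTensor and LongTensor raises a TypeError here
z = x - y.float()                # cast one operand explicitly so both sides have the same type
```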
As another example, several loss functions like CrossEntropyLoss require the target to be a torch.LongTensor. So before doing operations, make sure that your input Tensor types match the function definitions.
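As an illustration (the shapes and class indices below are made up), a type-correct call to CrossEntropyLoss with the Variable wrapper mentioned earlier looks like this:

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

logits = Variable(torch.randn(4, 10))               # FloatTensor: scores for 4 samples, 10 classes
target = Variable(torch.LongTensor([1, 0, 3, 9]))   # class indices must be a LongTensor

loss = nn.CrossEntropyLoss()(logits, target)
# A FloatTensor target, e.g. torch.Tensor([1, 0, 3, 9]), would fail the type check
```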
It is easy to convert a Tensor to a different type. You can use x.type_as(y) to convert x to the type of y.
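For example, a small sketch of the conversion methods (the variable names are just placeholders):

```python
import torch

x = torch.LongTensor([1, 2, 3])
y = torch.rand(3)            # FloatTensor

x_as_y = x.type_as(y)        # x converted to y's type (FloatTensor here)
x_float = x.float()          # equivalent explicit cast
y_long = y.long()            # conversion in the other direction
```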
Tensor and scalar operation
For a FloatTensor, you can do math operations (multiplication, addition, division, etc.) with a scalar of type float. But for a LongTensor, you can only do math operations with an int scalar, not a float.
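A minimal sketch of this rule, again assuming the pre-0.4 API used in this post (newer versions handle the float-scalar case via type promotion):

```python
import torch

float_t = torch.rand(3)                # FloatTensor
long_t = torch.LongTensor([1, 2, 3])

a = float_t * 2.5                      # fine: float scalar with a FloatTensor
b = long_t * 2                         # fine: int scalar with a LongTensor
# c = long_t * 2.5                     # TypeError under the old API: float scalar with a LongTensor
```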
Why do some losses require target to be LongTensor?
According to PyTorch developers, some use cases require the target to be of LongTensor type, and an int just cannot hold the target value.
FloatTensor or DoubleTensor
For deep learning, precision is not a very important issue. Plus, GPUs cannot process double precision very well. So FloatTensor is enough, and it is also the default type for model parameters.
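A quick way to confirm the default, sketched with a small nn.Linear layer chosen just for illustration:

```python
import torch
import torch.nn as nn

layer = nn.Linear(4, 2)
print(layer.weight.data.type())   # torch.FloatTensor -- single precision by default

layer.double()                    # cast the whole module if you really need double precision
print(layer.weight.data.type())   # torch.DoubleTensor
```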
NumPy array and torch Tensor
Shared memory or not?
You can use the torch.from_numpy() method to convert a NumPy array to the corresponding torch Tensor, which will share the underlying memory with the NumPy array. To go the other way, use x.numpy() to convert a Tensor x to a NumPy array, which also shares the memory with the original Tensor. But do a Tensor and its corresponding NumPy array always share the underlying memory? The short answer is no. If their underlying data types are not compatible, a copy of the original data will be made. For example, if you try to save a torch FloatTensor as a NumPy array of type np.float64, it will trigger a deep copy.
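A small sketch that demonstrates the sharing, and one way a dtype change breaks it:

```python
import torch
import numpy as np

a = np.ones(3, dtype=np.float32)
t = torch.from_numpy(a)              # shares memory with a
a[0] = 100.0
print(t)                             # the change shows up in the Tensor as well

b = t.numpy()                        # shares memory with t
t[1] = -1.0
print(b)                             # ... and in the other direction

c = t.numpy().astype(np.float64)     # dtype change forces a copy
t[2] = 7.0
print(c)                             # c is unaffected, it no longer shares memory
```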
Correspondence between NumPy and torch data types
It should be noted that not all NumPy arrays can be converted to a torch Tensor. Below is a table showing the NumPy data types which are convertible to torch Tensors and their corresponding Tensor types:

| NumPy data type | Tensor data type |
| --- | --- |
| np.float64 | torch.DoubleTensor |
| np.float32 | torch.FloatTensor |
| np.int64 | torch.LongTensor |
| np.int32 | torch.IntTensor |
| np.int16 | torch.ShortTensor |
| np.uint8 | torch.ByteTensor |
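If you want to check the mapping on your own installation (the exact set of supported dtypes has changed between PyTorch versions), a loop like this prints the correspondence:

```python
import torch
import numpy as np

for np_dtype in (np.float64, np.float32, np.int64, np.int32, np.uint8):
    arr = np.zeros(3, dtype=np_dtype)
    print(np_dtype, '->', torch.from_numpy(arr).type())
```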
Speed comparison between NumPy and torch operations
I am curious about the speed difference between torch Tensor operations and the equivalent NumPy ndarray operations. I measured it in jupyter-console using the built-in magic command %timeit:
```python
import torch
import numpy as np

# torch Tensor on CPU
x = torch.rand(1, 64)
y = torch.rand(5000, 64)
%timeit z = (x*y).sum(dim=1)

# torch Tensor on GPU
x, y = x.cuda(), y.cuda()
%timeit z = (x*y).sum(dim=1)

# numpy ndarray on CPU
x = np.random.random((1, 64))
y = np.random.random((5000, 64))
%timeit z = (x*y).sum(axis=1)
```
The results are listed in the following table:
| Data type and device | Average operation time |
| --- | --- |
| Tensor on CPU | 938 $\mu s$ |
| Tensor on GPU | 38.9 $\mu s$ |
| NumPy ndarray (on CPU) | 623 $\mu s$ |
It is pretty clear that Tensor operations on the GPU run more than an order of magnitude faster than operations on the CPU. NumPy, thanks to the excellent implementation of its core in C, runs a little bit faster than Tensor operations on the CPU.
Convert scalar to torch Tensor
You might try to convert a scalar to a Tensor by providing the scalar directly to the Tensor constructor, but this will not achieve what you want. For example, torch.Tensor(1) will not give you a Tensor which contains the float 1. Instead, the produced Tensor is uninitialized and looks something like
```
1.00000e-20 *
  5.4514
[torch.FloatTensor of size 1]
```
To achieve what you want, you have to provide a list with the single element 1 to the Tensor constructor: torch.Tensor([1]).
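Side by side, the two constructor calls behave very differently (in PyTorch 0.4 and later, torch.tensor(1.0) is another way to get the value-carrying version):

```python
import torch

a = torch.Tensor(1)       # size-1 FloatTensor with uninitialized memory
b = torch.Tensor([1])     # size-1 FloatTensor that actually contains 1.0
print(a, b)
```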
License CC BY-NC-ND 4.0