
Notes on PyTorch Tensor Data Types


In PyTorch, Tensor is the primary object that we deal with (Variable is just a thin wrapper class for Tensor). This post summarizes the pitfalls to avoid when working with Tensor data types. Since FloatTensor and LongTensor are the most commonly used Tensor types in PyTorch, I will focus on these two.

Tensor operations

Tensor and Tensor operation

For operations between Tensors, the rule is strict: both Tensors in the operation must have the same data type, or you will see error messages like

TypeError: sub received an invalid combination of arguments - got (float), but expected one of:
 * (int value)
      didn't match because some of the arguments have invalid types: (!float!)
 * (torch.LongTensor other)
      didn't match because some of the arguments have invalid types: (!float!)
 * (int value, torch.LongTensor other)

As another example, several loss functions like CrossEntropyLoss require the target to be a torch LongTensor. So before applying an operation, make sure that your input Tensor types match what the function expects.
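For instance, here is a minimal sketch (shapes and values are arbitrary, and it assumes a PyTorch version whose loss functions accept plain Tensors):

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()
logits = torch.randn(4, 10)              # FloatTensor: raw scores for 4 samples, 10 classes
target = torch.LongTensor([1, 0, 3, 9])  # class indices must be a LongTensor
loss = criterion(logits, target)         # a FloatTensor target here would raise an error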

It is easy to convert the type of one Tensor to that of another. Suppose x and y are Tensors of different types: you can use x.type(y.type()) or x.type_as(y) to convert x to the type of y.
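A minimal example of both idioms:

import torch

x = torch.LongTensor([1, 2, 3])
y = torch.FloatTensor([0.5, 1.5, 2.5])

x1 = x.type(y.type())  # x converted to torch.FloatTensor
x2 = x.type_as(y)      # same result, slightly more concise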

Tensor and scalar operation

With a FloatTensor, you can do math operations (multiplication, addition, division, etc.) with a scalar of type int or float. With a LongTensor, however, you can only do math operations with an int scalar, not a float.
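To make this concrete (a sketch; the exact behavior depends on the PyTorch version, and newer releases promote the result to float instead of raising):

import torch

f = torch.FloatTensor([1.0, 2.0])
l = torch.LongTensor([1, 2])

z1 = f * 2    # fine: FloatTensor with an int scalar
z2 = f * 2.5  # fine: FloatTensor with a float scalar
z3 = l * 2    # fine: LongTensor with an int scalar
z4 = l * 2.5  # TypeError in older PyTorch: LongTensor with a float scalar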

Why do some losses require the target to be LongTensor?

According to PyTorch developers, some use cases require the target to be of LongTensor type, and an int simply cannot hold all possible target values.

FloatTensor or DoubleTensor

For deep learning, numerical precision is not a critical issue. Moreover, GPUs handle double precision poorly, with much lower throughput than single precision. So FloatTensor is enough, and it is also the default type for model parameters.
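You can verify the default parameter type with a quick check:

import torch.nn as nn

layer = nn.Linear(4, 2)
print(layer.weight.type())  # torch.FloatTensor, i.e., single precision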

NumPy array and torch Tensor

Shared memory or not?

You can use the torch.from_numpy() method to convert a NumPy array to the corresponding torch Tensor, which shares its underlying memory with the NumPy array. In the other direction, x.numpy() converts Tensor x to a NumPy array that also shares memory with the original Tensor.

Do a torch Tensor and its corresponding NumPy array always share the underlying memory? The short answer is no. If their underlying data types are not compatible, a copy of the original data is made. For example, saving a torch FloatTensor as a NumPy array of type np.float64 triggers a deep copy.
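The following sketch shows both cases: the shared buffer created by torch.from_numpy(), and the copy forced by an incompatible dtype:

import numpy as np
import torch

a = np.ones(3, dtype=np.float32)
t = torch.from_numpy(a)           # shares memory with a
a[0] = 5.0
print(t[0])                       # 5.0: the change shows up in the Tensor

c = t.numpy().astype(np.float64)  # dtype change forces a copy
t[1] = -1.0
print(c[1])                       # still 1.0: c is a separate buffer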

Correspondence between NumPy and torch data types

Note that not all NumPy arrays can be converted to a torch Tensor. Below is a table of the NumPy data types that are convertible to torch Tensor types.

| NumPy data type | Tensor data type   |
|-----------------|--------------------|
| numpy.uint8     | torch.ByteTensor   |
| numpy.int16     | torch.ShortTensor  |
| numpy.int32     | torch.IntTensor    |
| numpy.int       | torch.LongTensor   |
| numpy.int64     | torch.LongTensor   |
| numpy.float32   | torch.FloatTensor  |
| numpy.float     | torch.DoubleTensor |
| numpy.float64   | torch.DoubleTensor |
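A quick way to verify one row of the table:

import numpy as np
import torch

arr = np.zeros(3, dtype=np.int32)
t = torch.from_numpy(arr)
print(t.type())  # torch.IntTensor, matching the table above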

Speed comparison between NumPy and torch operations

I am curious about the speed difference between torch Tensor operations and the equivalent NumPy ndarray operations, so I timed them in the Jupyter console using the built-in magic %timeit.

import torch
import numpy as np

# torch Tensor on CPU
x = torch.rand(1, 64)
y = torch.rand(5000, 64)
%timeit z = (x * y).sum(dim=1)

# torch Tensor on GPU
x, y = x.cuda(), y.cuda()
%timeit z = (x * y).sum(dim=1)

# NumPy ndarray on CPU
x = np.random.random((1, 64))
y = np.random.random((5000, 64))
%timeit z = (x * y).sum(axis=1)

The results are listed in the following table:

| Data type and device   | Average operation time |
|------------------------|------------------------|
| Tensor on CPU          | 938 $\mu s$            |
| Tensor on GPU          | 38.9 $\mu s$           |
| NumPy ndarray (on CPU) | 623 $\mu s$            |

It is pretty clear that Tensor operations on GPU run more than an order of magnitude faster than the same operations on CPU. NumPy, whose core is implemented in C, runs a little faster than Tensor operations on CPU.

Convert scalar to torch Tensor

It is tempting to convert a scalar to a Tensor by passing the scalar directly to the Tensor constructor, but that does not do what you want. For example, torch.Tensor(1) does not give you a Tensor containing the float 1; the argument is interpreted as a size, so you get a size-1 Tensor with uninitialized memory, something like

1.00000e-20 * 5.4514 [torch.FloatTensor of size 1]

To achieve what you want, you have to provide a list with the single element 1 to the Tensor constructor, i.e., torch.Tensor([1]).
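Side by side:

import torch

a = torch.Tensor(1)    # interpreted as a size: a size-1 Tensor with uninitialized memory
b = torch.Tensor([1])  # a Tensor actually containing the float 1

In PyTorch 0.4 and later, torch.tensor(1.0) (lowercase t) is the recommended way to build a Tensor from a scalar value.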

