Liu Yihua edited this page Dec 16, 2023 · 5 revisions

Introduction

Boost.uBLAS tensor extension started in 2018 as a GSoC project. The initial implementation and documentation can be found here.

Rationale

Tensors provide a natural and compact representation for massive multidimensional data of high dimensionality, which occurs in disciplines such as computational neuroscience, neuroinformatics, pattern/image recognition, signal processing, and machine learning. Common numerical methods are, for instance, the higher-order singular value decomposition (HOSVD) or the higher-order orthogonal iteration (HOOI), which place no restrictions on the number of tensor dimensions.

The tensor extension primarily targets scientific computing and supports basic multilinear algebra operations with tensors, matrices, and vectors, such as modal tensor contractions and tensor transpositions, where the mode can be a runtime variable. Tensor objects are able to communicate with Boost’s matrix and vector objects through expression templates and free functions. The extension primarily supports dense tensors that can be projected onto subtensors with ranges or slices. The primary design goal of this extension is to support tensors efficiently with almost no abstraction penalty, to remain functional and compatible with the existing uBLAS interface, and to provide a convenient mathematical notation using language features of the C++14 and C++17 standards.

Notation

In the realm of numerical multilinear algebra, a tensor (or hypermatrix) refers to a multiway array. The number of dimensions of a tensor is referred to as its order (sometimes also rank). Therefore, a 0th-order tensor is a scalar, a 1st-order tensor a vector, and a 2nd-order tensor a matrix. An element of a p-order tensor is accessible with a multi-index of length p.

Preliminaries and Limitations

Following the presumption of the existing uBLAS implementation, we store tensor elements within a contiguously allocated memory space. We limit ourselves to the implementation of dense tensors and subtensors. Tuning will be based on dense tensors without using machine-dependent assembly instructions or intrinsics. All types shall be in the namespace boost::numeric. All extensions and descriptions are based on the terminology and definitions that are given here. Following these definitions, our proposal of tensor template types for uBLAS follows the uBLAS generalization from vectors to matrices.

In order to use all features of this work, only tensor.hpp needs to be included.

#include <boost/numeric/ublas/tensor.hpp>

For the following examples, we also assume that the using-directive below is in effect.

using namespace boost::numeric::ublas;

Tensor Data Type

The implementation of a tensor is adjusted from the definition provided by Mathworld:

"Tensors are generalizations of scalars (that have no indices), vectors (that have exactly one index), and matrices (that have exactly two indices) to an arbitrary number of indices."

Therefore, the templated class tensor is the base container for dense tensors and also a proxy of a one-dimensional sequence container with random access. Every element A(i1,i2,…,ip) of a p-order (n1×n2×...×np)-dimensional tensor A is mapped to the j-th element of a one-dimensional sequence container. The mapping to and usage of a one-dimensional sequence container simplifies the management of memory. However, it is not necessary, and other implementations are conceivable.

This implementation of tensor supports the first-order and the last-order storage formats, in which the elements along the first or last dimension, respectively, are stored contiguously. In the case of two dimensions, first-order and last-order correspond to the column-major and row-major formats. A detailed description of the template parameters, member types, and member functions is given in tensor.html.

template<class T, class F=first_order, class A=std::vector<T,std::allocator<T>>>
class tensor;

Executable examples are provided in access_tensor.cpp.

Constructing

There are several constructors available in order to generate an instance of the tensor template class. The simplest ones are exemplified below.

// Constructs a 3-order tensor with shape extents (3,4,2). 
// Elements are stored in the first-order layout
auto A = tensor<float>{3,4,2}; 
// Constructs a 4-order tensor with shape extents (5,4,3,2). 
// Elements are stored in the last-order layout
auto B = tensor<std::complex<double>,last_order>(shape{5,4,3,2});

The shape object is an auxiliary class with which dimension extents are stored. Further documentation is given in extents.hpp. The user can copy or move construct a tensor object from tensor, matrix and vector expressions. The resulting tensor extents are automatically extracted from the expression.

// Constructs a 2-order tensor from a matrix expression.
tensor<float> A = matrix<float>(3,4) + 4;
// Constructs a 2-order tensor from a tensor expression.
tensor<float> B = matrix<float>(3,4) + tensor<float>{3,4};

Accessing Elements

Elements of a tensor object can be accessed either with a single index addressing the underlying one-dimensional container or with a multi-index that is translated into a single index.

The user can read from and write to the tensor with a single-index using the operator[] or at() member functions.

for(auto i = 0u; i < A.size(); ++i)
  A[i] = i; // or A.at(i) = i;

The user can read from and write to the tensor with a multi-index using the at() member function only.

// C is a tensor whose extents are those of A in reverse order
for(auto i = 0u; i < A.size(2); ++i)
  for(auto j = 0u; j < A.size(1); ++j)
    for(auto k = 0u; k < A.size(0); ++k)
      C.at(k,j,i) = A.at(i,j,k); // transposing A

The user can also call the low-level data() member function that returns a pointer to the first element of the one-dimensional storage array.

Reshaping

The user is able to dynamically change the order and the dimension extents of a tensor instance with the reshape() function, which internally calls the resize() function of the underlying sequence container.

auto A = tensor<std::complex<double>,last_order>(shape{5,4,3,2});
A.reshape(shape{2,3,4,5}); // only reordering of the shape elements
A.reshape(shape{2,3});  // A is reduced to 6 elements
A.reshape(shape{4,3});  // A grows to 12 elements; memory may be reallocated

If the current number of elements is greater than the requested one, the container is reduced to its first count elements. If it is less than count, additional copies of the specified value are appended.

Formatted Output

The tensor library supports formatted output for convenience by overloading the stream operator operator<< of the std::ostream class for the tensor template class. The formatted output can be copied into a MATLAB or Octave environment without any modification.

auto A = tensor<float>{3,4,2};
std::iota(A.begin(), A.end(), 0.0f);
std::cout << "A=" << A << ";" << std::endl;
/*
A=cat(3,...
[ ... 
0 3 6 9 ; 
1 4 7 10 ; 
2 5 8 11 ],...
[ ... 
12 15 18 21 ; 
13 16 19 22 ; 
14 17 20 23 ]);
*/

Using the Standard Library Algorithms

Similar to std::vector, the tensor template class provides (const and reverse) iterators in order to use algorithms from the C++ standard library. If the multi-index position of a tensor element is not required and a single-index position suffices, all standard algorithms applicable to sequence containers can be used.

For instance, the elementwise initialization from the previous example is accomplished using the std::iota function instead of a for loop.

std::iota(A.rbegin(), A.rend(), 1.0);

or even elementwise addition

std::transform(A.begin(), A.end(), B.begin(), C.begin(), std::plus<>{});

where elements are traversed according to the storage format. Moreover, range-based for-loops can be used.

for(auto& a: A) a = 0;

Tensor Operations

For the examples within the tables,

  • tensors shall be denoted by X,Y,Z
  • matrices shall be denoted by A,B,C
  • vectors shall be denoted by u,v,w

Entrywise Arithmetic Operations

Entrywise arithmetic tensor operations perform binary or unary operations on tensor, matrix, and vector elements with the same multi-index. The operators are overloaded and implemented in operators_arithmetic.hpp. Consequently, all objects must have the same shape. Note that none of the operators evaluates the expression; instead, each constructs an expression template with respect to the operation. Expression templates are evaluated when the assignment operator of the tensor is encountered.

| Operations | Operators | Example |
|------------|-----------|---------|
| Binary Operations with Tensors | +,-,/,* | Z = X*Y; |
| Binary Operations with Matrices | +,-,/,* | Z = X*A; |
| Binary Operations with Vectors | +,-,/,* | Z = X*v; |
| Binary Operations with Scalars | +,-,/,* | Z = 4*X; |
| Unary Operations | +,- | Z = -X; |

Computed Assignment

Computed assignments perform binary or unary operations on tensor, matrix, and vector elements with the same multi-index. The operators are overloaded and implemented in operators_arithmetic.hpp. Note that the implementation evaluates the expression immediately.

| Operations | Operators | Example |
|------------|-----------|---------|
| Operations with Tensors | +=,-=,/=,*= | Z += X;, Z *= Y; |
| Operations with Scalars | +=,-=,/=,*= | Z += 4; |

Comparison

Comparison operations compare tensor elements with the same multi-index. The operators are overloaded and implemented in operators_comparison.hpp. Consequently, all objects must have the same shape. Note that all comparisons are evaluated when the comparison operator is encountered; expression objects are not generated.

| Operations | Operators | Example |
|------------|-----------|---------|
| Tensors | ==,!=,<,>,<=,>= | X==Y && Y!=Z; |
| Tensors with Scalars | ==,!=,<,>,<=,>= | X==Y && 3*Z<=4; |

Special Functions

The following free tensor functions are implemented in functions.hpp. Internally they call generic functions, defined in multiplication.hpp and algorithms.hpp, that are implemented in terms of pointers, strides, and offsets. Note that the functions are evaluated immediately without generating expression objects, and that they can be combined with entrywise and comparison operations as well as with computed assignments.

| Operations | Functions | Example |
|------------|-----------|---------|
| Tensor Transposition | trans() | auto Z = trans(X,{4,3,2}); |
| k-mode Tensor-Times-Vector | prod() | auto Z = prod(X,v,2); |
| k-mode Tensor-Times-Matrix | prod() | auto Z = prod(X,A,2); |
| Tensor-Times-Tensor | prod() | auto Z = prod(X,Y,{1,3});, auto Z = prod(X,Y,{1,3},{4,2}); |
| Inner Product | inner_prod() | auto a = inner_prod(X,Y); |
| Outer Product | outer_prod() | auto Z = outer_prod(X,Y); |
| Frobenius Norm | norm() | auto a = norm(X); |
| Extract Real Values | real() | auto Y = real(X); |
| Extract Imaginary Values | imag() | auto Y = imag(X); |
| Compute Complex Conjugate | conj() | auto Y = conj(X); |

Examples of the usage of special functions are given in the example file multiply_tensors_product_function.cpp.

Einstein Notation

Repeating the description provided by Mathworld:

"Einstein summation is a notational convention for simplifying expressions including summations of vectors, matrices, and general tensors. There are essentially three rules of Einstein summation notation, namely:"

  1. Repeated indices are implicitly summed over.
  2. Each index can appear at most twice in any term.
  3. Each term must contain identical non-repeated indices.

This implementation provides a convenient mechanism to denote tensor contractions, and the corresponding tensor contraction operations are called; in other words, the prod() functions of the special functions section are encapsulated. The user can choose from 27 index instances, _,_a,_b,...,_y,_z, in the boost::numeric::ublas::indices namespace for this purpose. The implementation resides in operators_arithmetic.hpp. Contractions are denoted by combining these indices with the overloaded access member function operator() of the tensor template class. The positioning of indices, namely contravariant (upper) and covariant (lower), is neglected. For instance, a 1-mode tensor-times-vector multiplication can now be expressed as follows:

// equivalent to: tensor_t C1 = B1 + prod(A,v1,1);
tensor_t C1 = B1 + A(_i,_,_) * v1(_i,_);

Here the index _ tells the contraction operations to neglect this index position and not contract over that dimension. A 2-mode tensor-times-matrix multiplication is given by the following expression:

// equivalent to: tensor_t C2 = prod(A,B2) + 4;
tensor_t C2 =  A(_,_j,_) * B2(_,_j) + 4;

The usage of index types becomes very useful in case of the tensor-tensor-multiplication as it simplifies the tensor contraction expression.

//equivalent to: tensor_t C2 = T2 + prod(A,B,perm_t{1,2},perm_t{3,1}) + 5;
tensor_t C2 = T2 + A(_i,_j,_k)*B(_j,_l,_i,_m) + 5;

Further examples can be found in the example multiply_tensors_einstein_notation.cpp.

Expression Templates

Expression Template Types

There are three types of expression templates that are defined in expression.hpp. A more formal documentation of tensor expressions can be found in tensor_expression.

  1. The templated class tensor_expression is required to be a public base of the tensor, binary_tensor_expression, and unary_tensor_expression classes. It inherits from ublas_expression.
  2. The templated class binary_tensor_expression contains a constant reference to a left and right expression that can be evaluated by using the access operator. It inherits from tensor_expression.
  3. The templated class unary_tensor_expression contains a constant reference to an expression that can be evaluated by using the access operator. It inherits from tensor_expression.

Generation of Expression Template Instances

Instances of tensor expressions are generated when binary or unary entrywise operations are encountered.

| Operator Type | Operand | Tensor Expression Type | Example |
|---------------|---------|------------------------|---------|
| Binary Arithmetic Operators | Tensor Expressions | Binary Tensor Expression | X*Y, X*Y+X |
| Binary Arithmetic Operators | Tensor & Matrix Expressions | Binary Tensor Expression | B*Y-A |
| Binary Arithmetic Operators | Tensor & Vector Expressions | Binary Tensor Expression | Z*u-A/X |
| Binary Arithmetic Operators | Tensor Expression & Scalar | Unary Tensor Expression | Y-3 |
| Unary Arithmetic Operators | Tensor Expression | Unary Tensor Expression | -Z, -(X*v) |

Expression Evaluation

Expressions are evaluated at the latest when the assignment operator is encountered. The evaluation time depends on their types: entrywise arithmetic tensor operations generate expression objects that are evaluated when the assignment operator is encountered. The evaluation functions are located in expression_evaluation.hpp.

Other tensor operations, such as computed assignments, comparison operations, and special functions, are evaluated immediately without generating expression objects.

Smart Expression Evaluation

In addition, tensor shapes are retrieved and checked before evaluation. This is done using the if constexpr construct of the C++17 standard, with which an instance of an expression template is analyzed. The analysis traverses the expression tree and tests whether all tensor shape extents of unary or binary tensor operations are equal. The implementation is found in expression_evaluation.hpp.