Skip to content
Mendz edited this page Nov 9, 2017 · 2 revisions

CVS<T>

Represents a matrix in Compressed Value Storage (CVS) format.

About CVS

Compressed value storage (CVS) applies basic lossless compression to a matrix such that each distinct non-zero value is aligned with its list of matrix linear indexes. CVS supports compression in row-major order and in column-major order. The basic representation of CVS is a tuple of 4 values:

  • A - is an unordered list[NDNZ] of distinct non-zero values in the matrix.
  • LA - is a list[NDNZ] such that LAi is Ai's ordered list[Ng] of linear indexes in the matrix.
  • L - is an indicator of the matrix linear indexing mode: 0 for row-major order, and 1 for column-major order.
  • S - is the (int rows, int columns) tuple size of the matrix.

NDNZ means "number of distinct non-zeroes". NNZ means "number of non-zeroes". In LA, the total number of linear indexes ΣNg = NNZ.

(
    List<int> value, 
    List<List<int>> linearIndex, 
    MatrixLinearIndexMode linearIndexMode,
    (int rows, int columns) size
)

CVS stores the linear indexes in LAi instead of the actual coordinates to save memory. In C# for example, an int is a 32-bit structure, which uses approximately 4 bytes of memory (assuming 8 bits = 1 byte). Coordinates are a pair of two integers (int row, int column), thus using 8 bytes of memory. By converting coordinates to their linear index equivalent, which is an int, the memory use is cut down to 4 bytes instead.

The inclusion of L and S is relevant in the conversion of coordinates to/from their equivalent matrix linear indexes.

  • If L is 0 (row-major order):
    • The conversion from coordinates to linear index is columnIndex + (rowIndex * size.columns).
    • The conversion from linear index to coordinates is (linearIndex / size.columns, linearIndex % size.columns).
  • If L is 1 (column-major order):
    • The conversion from coordinates to linear index is rowIndex + (columnIndex * size.rows).
    • The conversion from linear index to coordinates is (linearIndex % size.rows, linearIndex / size.rows).

In order for CVS to be interpreted correctly, the same linear indexing mode used to create the CVS should be used in operations. Thus, CVS should clearly identify itself if it's in row-major order or in column-major order. It is possible for algorithms to be created in such a way that it can be used for CVS in either row-major order or in column-major order, for as long as the linear indexing mode is evaluated and used within the algorithm itself.

CVS compression can achieve higher compression ratio and better space savings than CRS and CCS. There are many apparent reasons for this -- given that CRS/CCS uses memory for (2 * NNZ) + Ni+1:

  • CVS uses memory for NDNZ + NNZ + scalar + (int,int) tuple + LA[NDNZ] overhead -- where (scalar + (int,int) tuple) may be approx. 12+ bytes. Assuming the scalar and tuple bytes are insignificant, CVS saves the memory used by CRS/CCS for NNZ + Ni+1 - (NDNZ + LA[NDNZ] overhead).
  • NNZ can be significantly > NDNZ. For example, in a (1, 0, 0)-adjacency matrix with 2500 NNZ, CVS NDNZ = 1.
  • Ni+1 can be significantly > NDNZ. For example, in a 2500 x 2500 matrix sourced (1, 0, 0)-adjacency matrix, Ni+1 = 2501, compared to CVS NDNZ = 1.
  • CVS is a lossless compression that exploits redundancy. For matrices with no redundant entries, such that NDNZ = NNZ, and assuming the scalar and tuple bytes are insignificant, CVS saves the memory used by CRS/CCS for Ni+1 - (NDNZ + LA[NDNZ] overhead).
Clone this wiki locally