Multi-Dimensional Arrays

An array with more than one dimension is called a multidimensional array.

int matrix[5][10];  /* array of 5 arrays of 10 int; a 5x10 array of int */

Building up multidimensional arrays:

int a;            /* int                                     */
int b[10];        /* array of 10 int                         */
int c[5][10];     /* array of 5 arrays of 10 int             */
int d[3][5][10];  /* array of 3 arrays of 5 arrays of 10 int */

int e[10][5][3];  /* array of 10 arrays of 5 arrays of 3 int */

Storage order: Arrays in C are stored in row major order. This means that the rightmost subscript varies the most rapidly.
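To see what "the rightmost subscript varies the most rapidly" means in practice, the sketch below turns a position in memory order (0, 1, 2, ...) back into subscripts. The names ROWS, COLS, position_to_subscripts and address_of_position are made up for this illustration.

```c
#include <stddef.h>

enum { ROWS = 3, COLS = 4 }; /* hypothetical dimensions for illustration */

/* Convert a position k (0, 1, 2, ...) in memory order into subscripts. */
/* In row-major order the column cycles fastest and the row slowest.    */
void position_to_subscripts(int k, int *row, int *col)
{
  *row = k / COLS;
  *col = k % COLS;
}

/* Address of the k-th double counting from the start of the array,     */
/* computed with byte arithmetic over the whole array object.           */
void *address_of_position(double m[ROWS][COLS], int k)
{
  return (char *)m + (size_t)k * sizeof(double);
}
```

For every k from 0 to 11, &m[row][col] (with row and col taken from position_to_subscripts) should be the same address as address_of_position(m, k): positions 0 through 3 are row 0, positions 4 through 7 are row 1, and so on.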

Given this declaration of points:

double points[3][4]; /* points[ROWS][COLUMNS] */

An array of 3 arrays of 4 doubles
A 3x4 array of doubles

[Diagrams not shown: the array; the array with details; the array drawn contiguously (as it really is in memory); the array drawn horizontally.]

Giving concrete values to the 2D array of doubles makes it easier to picture. Note how the initialization syntax reflects the "array of arrays" notion:

double points[3][4] = {{1.0, 2.0, 3.0, 4.0}, {5.0, 6.0, 7.0, 8.0}, {9.0, 10.0, 11.0, 12.0}};

or even formatted as a 3x4 matrix:

double points[3][4] = { 
                        {1.0,  2.0,  3.0,  4.0}, 
                        {5.0,  6.0,  7.0,  8.0}, 
                        {9.0, 10.0, 11.0, 12.0}
                      };

Some expressions involving points (on a 64-bit computer):

Addresses                                   Type
--------------------------------------------------------------------------------
points        = 0x7fffa50f0200    An array of 3 arrays of 4 doubles
&points       = 0x7fffa50f0200    A pointer to an array of 3 arrays of 4 doubles
points[0]     = 0x7fffa50f0200    An array of 4 doubles
&points[0]    = 0x7fffa50f0200    A pointer to an array of 4 doubles
*points       = 0x7fffa50f0200    An array of 4 doubles
&points[0][0] = 0x7fffa50f0200    A pointer to a double

Contents
------------------------
**points      = 1.000000
*points[0]    = 1.000000
points[0][0]  = 1.000000

Sizes
-------------------------
sizeof(points)       = 96
sizeof(*points)      = 32
sizeof(points[0])    = 32
sizeof(**points)     =  8
sizeof(points[0][0]) =  8
sizeof(&points)      =  8

C code to display above tables.
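A sketch of what such code might look like follows. It is not the original listing, and the function name show_points_tables is made up for illustration. The addresses will differ from run to run and from machine to machine, and the sizes shown in the tables assume 8-byte doubles and 8-byte pointers.

```c
#include <stdio.h>

/* Print the address, content, and size tables for a local 3x4 array. */
void show_points_tables(void)
{
  double points[3][4] = {
                          {1.0,  2.0,  3.0,  4.0},
                          {5.0,  6.0,  7.0,  8.0},
                          {9.0, 10.0, 11.0, 12.0}
                        };

    /* Addresses: %p expects a void pointer, hence the casts */
  printf("points        = %p\n", (void *)points);
  printf("&points       = %p\n", (void *)&points);
  printf("points[0]     = %p\n", (void *)points[0]);
  printf("&points[0]    = %p\n", (void *)&points[0]);
  printf("*points       = %p\n", (void *)*points);
  printf("&points[0][0] = %p\n", (void *)&points[0][0]);

    /* Contents */
  printf("**points      = %f\n", **points);
  printf("*points[0]    = %f\n", *points[0]);
  printf("points[0][0]  = %f\n", points[0][0]);

    /* Sizes: sizeof yields a size_t, cast for printing with %lu */
  printf("sizeof(points)       = %lu\n", (unsigned long)sizeof(points));
  printf("sizeof(*points)      = %lu\n", (unsigned long)sizeof(*points));
  printf("sizeof(points[0])    = %lu\n", (unsigned long)sizeof(points[0]));
  printf("sizeof(**points)     = %lu\n", (unsigned long)sizeof(**points));
  printf("sizeof(points[0][0]) = %lu\n", (unsigned long)sizeof(points[0][0]));
  printf("sizeof(&points)      = %lu\n", (unsigned long)sizeof(&points));
}
```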

Notes:

When using just the name points in most expressions, it is converted to the address of the first element (just like single-dimensional arrays). The exceptions, as the tables above show, are sizeof(points) and &points.
The first element is an array of 4 doubles (it's the first row).
Each element of a 2D array is itself a (1D) array.

Accessing Elements in a 2-D Array

short matrix[3][8]; /* 24 shorts, 3x8 array, 3 rows, 8 columns */

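Expression                            Refers to
-------------------------------------------------------------------------
matrix                                the entire 3x8 array
matrix[0], *(matrix + 0), *matrix     the first row
matrix[1], *(matrix + 1)              the second row
matrix[2], *(matrix + 2)              the third row
matrix[1][2], *(*(matrix + 1) + 2)    the third short in the second row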

Remember the rule:

array[i] == *(array + i)

where:

array is an array of any type
i is any integer expression

With multidimensional arrays, the rule becomes:

array[i][j] == *(*(array + i) + j)
array[i][j][k] == *(*(*(array + i) + j) + k)
etc...
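As a quick check of the rule, here is a small sketch that reads the same element both ways, for two and three dimensions. The function names are hypothetical, and the array shapes (3x8 shorts, 3x5x10 ints) are borrowed from the declarations used in these notes.

```c
/* Read an element of an N x 8 array of short using subscripts. */
short get_by_subscript(short m[][8], int i, int j)
{
  return m[i][j];
}

/* Read the same element using explicit pointer arithmetic. */
short get_by_pointer(short m[][8], int i, int j)
{
  return *(*(m + i) + j);
}

/* The same idea with three dimensions (N x 5 x 10 array of int). */
int get3_by_subscript(int a[][5][10], int i, int j, int k)
{
  return a[i][j][k];
}

int get3_by_pointer(int a[][5][10], int i, int j, int k)
{
  return *(*(*(a + i) + j) + k);
}
```

For any valid subscripts, each pair of functions should return the same value.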

Pointer arithmetic is used to locate each element (base address + offset) and is done by the compiler. If you look at the assembly code that is generated, you will see all of the pointer arithmetic that is being done for each access.

Given this declaration:

short matrix[3][8];

The value of sizeof varies with the argument:

Sizes
-------------------------
sizeof(matrix)       = 48   ; entire matrix
sizeof(matrix[0])    = 16   ; first row
sizeof(matrix[1])    = 16   ; second row
sizeof(matrix[0][0]) = 2    ; first short element
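These sizes can be combined to recover the number of rows and columns. The two macros below are a sketch (ROW_COUNT and COL_COUNT are names invented here); they only work where the array itself is in scope, not on an array that was passed to a function as a parameter.

```c
#include <stddef.h>

/* Number of rows: size of the whole array divided by the size of one row. */
#define ROW_COUNT(a) (sizeof(a) / sizeof((a)[0]))

/* Number of columns: size of one row divided by the size of one element.  */
#define COL_COUNT(a) (sizeof((a)[0]) / sizeof((a)[0][0]))
```

For short matrix[3][8], ROW_COUNT(matrix) is 48 / 16 = 3 and COL_COUNT(matrix) is 16 / 2 = 8 (assuming 2-byte shorts; the counts come out the same for any size of short).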

Dynamically Allocated 2D Arrays

Recall the 2D points static array and how a dynamically allocated array would look:

double points[3][4];

double *pd = (double *)malloc(3 * 4 * sizeof(double));

What is sizeof each element in points?

Given a row and column:

int row = 1, column = 2;
double value;

The static 2D array can be accessed using subscripts, but the dynamic "2D array" can only be indexed with a single subscript.
```
value = points[row][column]; /* OK      */
value = pd[row][column];     /* ILLEGAL */
```
We (the programmers) have to do all of the arithmetic to locate an element using two subscripts:
```
value = pd[row * 4 + column];
```

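The compiler is still doing some of the work for us by scaling the index by the size of a double. In terms of byte addresses (this is pseudocode, not C):

value = *(address-stored-in-pd + (row * 4 + column) * sizeof(double));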

What does the number 4 in the above calculations represent?

If we want to use two subscripts on a dynamic 2D array, we have to set things up a little differently.

Why can't we just cast pd to a two-dimensional array and have the compiler do the pointer arithmetic for us?

Using these definitions (pd is allocated as before, but with sizes chosen at runtime):

  /* Assume these values are chosen at runtime. */
int ROWS = 3;
int COLS = 4;

  /* Dynamically allocate the memory */
double *pd = malloc(ROWS * COLS * sizeof(double));

Create a variable that is a pointer to a pointer to a double:

double **ppd;

Allocate an array of 3 (ROWS) pointers to doubles and point ppd at it:

ppd = malloc(ROWS * sizeof(double *));

Point each element of ppd at an array of 4 doubles:

ppd[0] = pd;
ppd[1] = pd + 4;
ppd[2] = pd + 8;

Of course, for a large array, or an array whose size is not known at compile time, you would want to set these in a loop:

int row;
for (row = 0; row < ROWS; row++)
  ppd[row] = pd + (COLS * row);

[Diagram not shown: ppd, the array of row pointers, and the array of doubles, on a 32-bit computer.]

Given a row and column, we can access elements through the single pointer or double pointer variable:

int row = 1, column = 3;
double value;

  /* Access via double pointer (array of arrays) using subscripting */
value = ppd[row][column];            

  /* Access via single pointer using pointer arithmetic        */
  /* and/or subscripting. These statements are all equivalent. */
value = pd[row * COLS + column];
value = *(pd + row * COLS + column);
value = (pd + row * COLS)[column];

We could make the code easier by creating a helper function that will allocate a 2D array of doubles of any size:

/* Requires <stdlib.h>. Returns NULL if either allocation fails. */
double **allocate_2D(int rows, int cols)
{
  int i;             /* Loop variable               */
  double **pointers; /* The pointers for each row   */
  double *array;     /* The actual array of doubles */

    /* Allocate memory for the rows X cols array */
  array = malloc((size_t)rows * cols * sizeof(double));
  if (!array)
    return NULL;

    /* Allocate the array of pointers, one per row */
  pointers = malloc(rows * sizeof(double *));
  if (!pointers)
  {
    free(array); /* Don't leak the doubles if this allocation fails */
    return NULL;
  }

    /* Point each pointer at its corresponding row */
  for (i = 0; i < rows; i++)
    pointers[i] = array + (cols * i);

  return pointers;
}

void print2D(double **ppd, int rows, int cols)
{
  int i, j;
  for (i = 0; i < rows; i++)
  {
    for (j = 0; j < cols; j++)
      printf("%8.2f", ppd[i][j]);
    printf("\n");
  }
}

Using it:

int i, j;
int rows = 3, cols = 4;

double **ppd = allocate_2D(rows, cols);

  /* Do something with the array ... */
for (i = 0; i < rows; i++)
  for (j = 0; j < cols; j++)
    ppd[i][j] = i * cols + j + 1;

print2D(ppd, rows, cols);

free(*ppd); /* Free the array of 12 doubles (Must do this first!)  */
free(ppd);  /* Free the pointers to each row                       */

Output:

    1.00    2.00    3.00    4.00
    5.00    6.00    7.00    8.00
    9.00   10.00   11.00   12.00
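Since the two calls to free must always happen in that order, it can help to pair the allocation helper with a matching release helper. The following is only a sketch: the name free_2D is made up, and it assumes the layout built above (one block of doubles, with the first row pointer pointing at the start of that block) and at least one row.

```c
#include <stdlib.h>

/* Release an array built as one block of doubles plus one block of row */
/* pointers, where ppd[0] points at the start of the block of doubles.  */
void free_2D(double **ppd)
{
  if (ppd != NULL)
  {
    free(ppd[0]); /* The doubles (must be freed before the pointers) */
    free(ppd);    /* The pointers to each row                        */
  }
}
```

With this helper, the two free calls in the example above could be written as the single call free_2D(ppd).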

You can't just create a 1D array and then cast it to a 2D array. The first reason is that you can't cast to an array. The second (if you could cast to an array) is that casting is done at compile-time, so the compiler needs to know how many columns there are. If you don't know the size until run-time, you need to do it this way.

Doing this:

pd = (double [3][4]) points; /* Not legal */

gives this error:

error: cast specifies array type
   pd = (double [3][4])points;
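What C does accept is a cast to a pointer to an array, provided the number of columns is a compile-time constant, which is exactly the restriction described above. The sketch below assumes that; NCOLS and get_via_row_pointer are hypothetical names, and this is an illustration of the idea rather than a replacement for the pointer-to-pointer approach when the size is only known at run-time.

```c
enum { NCOLS = 4 }; /* Must be known at compile time */

/* Treat a flat block of doubles as rows of NCOLS doubles each. */
double get_via_row_pointer(double *pd, int row, int column)
{
    /* rows is a pointer to an array of NCOLS doubles */
  double (*rows)[NCOLS] = (double (*)[NCOLS])pd;

    /* Refers to the same element as pd[row * NCOLS + column] */
  return rows[row][column];
}
```

Because the compiler knows that each thing rows points at is NCOLS doubles wide, it can do the row arithmetic for us, just as it does for a declared 2D array.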

Passing 2D Arrays to Functions

Recall how we pass an array to a function.

Putting values in the matrix and printing it (from above):

short matrix[3][8]; /* 24 shorts, 3x8 array, 3 rows, 8 columns */

Fill3x8Matrix(matrix);  /* Put values in the matrix */
Print3x8Matrix(matrix); /* Print the matrix         */

Implementations:

void Fill3x8Matrix(short matrix[][8])
{
  int i, j;
  for (i = 0; i < 3; i++)
    for (j = 0; j < 8; j++)
      matrix[i][j] = i * 8 + j + 1; 
}

void Print3x8Matrix(short matrix[][8])
{
  int i, j;
  for (i = 0; i < 3; i++)
    for (j = 0; j < 8; j++)
      printf("%i ", matrix[i][j]);
  printf("\n");
}

These functions could have declared the parameter this way (the parentheses are needed because [] has higher precedence than *):

void Print3x8Matrix(short (*matrix)[8])

If you forget the parentheses, the declaration is incorrect and means something completely different:

void Print3x8Matrix(short *matrix[8])

The above function expects an array of pointers to short (really a pointer to the first of those pointers), which can also be written like this (the number is not needed):

void Print3x8Matrix(short *matrix[])

or this:

void Print3x8Matrix(short **matrix)

If you include the first number, it is ignored by the compiler just like all arrays that are passed as parameters:

void Print3x8Matrix(short matrix[3][8]); /* First number (3) is ignored */

Why are they not declared like this?

void Fill3x8Matrix(short matrix[][]);
void Print3x8Matrix(short matrix[][]);

This is the error from gcc:

error: array type has incomplete element type 'short int[]'
 void Fill3x8Matrix(short matrix[][])
                          ^~~~~~
note: declaration of 'matrix' as multidimensional array must have bounds for all dimensions except the first
In function 'Fill3x8Matrix':

Here is a generic print function that can print any size 2D array (using pointer arithmetic). You must provide the size of both dimensions:

void PrintMatrix(short *matrix, int rows, int columns)
{
  int i;
  for (i = 0; i < rows * columns; i++)
    printf("%i ", *(matrix + i));
  printf("\n");
}

void foo(void)
{
  short mat[3][8];

  Fill3x8Matrix(mat);  /* only works with 3x8 */
  Print3x8Matrix(mat); /* only works with 3x8 */

    /* works with any size 2D array */
  PrintMatrix(&mat[0][0], 3, 8);
}

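When you treat a 2D array as a 1D array like this, it is called flattening the array. The elements of a 2D array are guaranteed to be contiguous in memory (an array of arrays has no gaps between its rows), but, technically speaking, using a pointer into one row to reach elements of another row is not guaranteed to work, even though it commonly does in practice.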

From the C-FAQ:

It must be noted, however, that a program which performs multidimensional array subscripting ``by hand'' in this way is not in strict conformance with the ANSI C Standard; according to an official interpretation, the behavior of accessing (&array[0][0])[x] is not defined for x >= [number of columns].
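One way to stay within the bounds the FAQ describes is to hand a function one row at a time, so that no pointer ever indexes past the end of its own row. A minimal sketch (SumRow and Sum3x8 are hypothetical helpers, not part of the code above):

```c
/* Add up the elements of a single row; never indexes past columns - 1. */
long SumRow(const short *row, int columns)
{
  long sum = 0;
  int j;
  for (j = 0; j < columns; j++)
    sum += row[j];
  return sum;
}

/* Add up a 3x8 matrix by visiting each row separately. */
long Sum3x8(short matrix[][8])
{
  long sum = 0;
  int i;
  for (i = 0; i < 3; i++)
    sum += SumRow(matrix[i], 8); /* matrix[i] becomes a pointer to its first short */
  return sum;
}
```

SumRow works with rows of any length, and the caller supplies the loop over rows, so only the caller needs to know the shape of the matrix.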

Finally:

The compiler needs to know the size of each element. It doesn't need to (and can't) know the number of elements. The size of each element is determined by the type of the elements, and for 2D arrays, the type is determined by the size of all but the first dimension.

Here's an example of using pointer arithmetic on static arrays:

void Test(int a[], int b[][6], int c[][3][5])
{
  printf("a = %p, b = %p, c = %p\n", (void *)a, (void *)b, (void *)c);
  a++;
  b++;
  c++;
  printf("a = %p, b = %p, c = %p\n", (void *)a, (void *)b, (void *)c);
}

Output:
a = 0012FEE8, b = 0012FF38, c = 0012FEFC  
a = 0012FEEC, b = 0012FF50, c = 0012FF38

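In decimal:

a = 1244904, b = 1244984, c = 1244924
a = 1244908, b = 1245008, c = 1244984

Each pointer advanced by the size of the type it points to: a by 4 bytes (one int), b by 24 bytes (an array of 6 ints), and c by 60 bytes (3 arrays of 5 ints).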

[Diagram not shown: thick black lines indicate what the pointers are actually pointing at, and dotted gray elements represent zero or more elements.] Remember, the size of the arrays (i.e. the number of elements in the arrays) is unknown to the function since it's only receiving pointers.

The function Test is equivalent to this:

void Test(int *a, int (*b)[6], int (*c)[3][5])
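With the parameters written as pointers, it is easier to see why each increment moves a different distance. The sketch below measures the step in bytes for each pointer type; the function names step_a, step_b and step_c are invented for this example.

```c
#include <stddef.h>

/* Bytes moved by incrementing a pointer to int. */
size_t step_a(int *a)
{
  return (size_t)((char *)(a + 1) - (char *)a);
}

/* Bytes moved by incrementing a pointer to an array of 6 ints. */
size_t step_b(int (*b)[6])
{
  return (size_t)((char *)(b + 1) - (char *)b);
}

/* Bytes moved by incrementing a pointer to a 3x5 array of ints. */
size_t step_c(int (*c)[3][5])
{
  return (size_t)((char *)(c + 1) - (char *)c);
}
```

Where sizeof(int) is 4, these return 4, 24 and 60, matching the differences between the two lines of output above.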

Other methods for filling the matrix use explicit pointer arithmetic:

void Fill3x8Matrix(short matrix[][8])
{
  int i, j;
  for (i = 0; i < 3; i++)
    for (j = 0; j < 8; j++)
      *(*(matrix + i) + j) = i * 8 + j + 1; 
}

void Fill3x8Matrix(short matrix[][8])
{
  int i, j;
  for (i = 0; i < 3; i++)
  {
    short *pmat = *(matrix + i);
    for (j = 0; j < 8; j++)
      *pmat++ = i * 8 + j + 1;
  }
}

How does the compiler calculate the address (offset) for the element below?

matrix[1][2];

Using address offsets we get:

&matrix[1][2] ==> &*(*(matrix + 1) + 2) ==> *(matrix + 1) + 2

First dimension - Each element of matrix is an array of 8 shorts, so each element is 16 bytes.
Second dimension - Each element of each element of matrix is a short, so it's 2 bytes.
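Putting the two sizes together gives the byte offset of the element from the start of the array. As a rough sketch (both function names are made up here), the first function builds the offset from the sizes and the second measures it from the actual addresses:

```c
#include <stddef.h>

/* Byte offset of matrix[i][j] in an N x 8 array of short, built from   */
/* the size of a row (first dimension) and of a short (second).         */
size_t offset_by_sizes(int i, int j)
{
  return (size_t)i * sizeof(short[8]) + (size_t)j * sizeof(short);
}

/* The same offset measured from the actual addresses. */
size_t offset_by_address(short matrix[][8], int i, int j)
{
  return (size_t)((char *)&matrix[i][j] - (char *)matrix);
}
```

With 2-byte shorts, offset_by_sizes(1, 2) is 1 * 16 + 2 * 2 = 20 bytes, and offset_by_address should agree for any valid subscripts.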

Given these declarations:

short matrix[3][8];
short array[10];

We can calculate the size of any portion:

Expression             Meaning                 Size (bytes)
-----------------------------------------------------------
array                Entire array                  20
array[N]             Element in 1st dimension       2
matrix               Entire array                  48
matrix[N]            Element in 1st dimension      16
matrix[N][M]         Element in 2nd dimension       2

Recap:

The compiler needs to know the size of each of the elements, in each dimension.
The size of an element in each dimension ultimately depends on the fundamental type (int, double, etc.) of the array, so the declaration implicitly specifies those sizes.
In a two-dimensional array, knowing the size of the second dimension (number of columns) and the data type of the array is sufficient to perform pointer arithmetic on the first dimension.
This seemingly convoluted way of locating array elements is required because memory is one-dimensional (a linear sequence of addresses). The multiple dimension syntax (e.g. [][]) is just a convenience for the programmer.
