Multidimensional Array
General Trivia
Multidimensional arrays are like matrices in mathematics. For example:
// array[N_ROWS][N_COLUMNS]
// A 2D Array or 2X2 Matrix
int arr1[2][2] = {
{1, 2};
{3, 4}
};
Note: You can’t leave both ROWS
and COLUMNS
empty. Specifying the number of columns is necessary.
At assembly level, they are no different than a normal array. This is the assembly for the above code.
sub rsp, 16
mov DWORD PTR -16[rbp], 1
mov DWORD PTR -12[rbp], 2
mov DWORD PTR -8[rbp], 3
mov DWORD PTR -4[rbp], 4
From this, we can infer that elements are stored row-wise, i.e arr[0][0], arr[0][1], arr[1][0], arr[1][1]
.
The size for stack allocation is calculates using this formula: n(rows)*n(columns)*sizeof(type)
.
- For the above example, that would be
2*2*4 = 16
bytes. - We can also verify the formula by individually calculating the size. There are 4 integers, each of size 4, so 16 bytes are required. So, no block magic.
Accessing Elements
This one is a little tricky as there are two ways to access the elements:
- Row-major is the standard way and cache-friendly way to access elements. It is the default one.
- Column-major is the non-standard way and non-cache-friendly way to access elements.
The talk about “standard access mode” is also language dependent.
- Row-major is visually easy and aligns with how we interpret matrices in general, while column-major has its own use cases, primarily in scientific computing and advance mathematics.
- Many mainstream languages default to row-major for that reason, like C/C++, Java, Python. On the other hand, languages like Fortran, MATLAB and Julia prioritizes column-major.
This is the code for accessing elements in both ways:
#include <stdio.h>
int main(void){
// 2D Array
int arr[2][3] = {
{1, 2, 3},
{4, 5, 6}
};
// Row-Major
for (int i = 0; i < 2; i++){
for (int j = 0; j < 3; j++){
printf("%d, ", arr[i][j]);
}
printf("\n");
}
printf("\n");
// Column-Major
for (int i = 0; i < 3; i++){
for (int j = 0; j < 2; j++){
printf("%d, ", arr[j][i]);
}
printf("\n");
}
printf("\n");
}
In terms of normal mathematics, it is just matrix transformation.
Note: Elements are stored row-major only. We can’t change that.
Address Calculation
For 1D arrays, the formula to calculate the address of any position was:
addr_Of_i = base_addr + sizeof(type)*i
For example, if an array has 10 integers, the base address is 10000, the address of 7th element (0-based indexing) would be:
10000 + 4*6 = 10024
We can verify if the formula works by laying down all the addresses.
10000 10004 10008 10012 10016 10020 10024 10028 10032 10036 0 1 2 3 4 5 6 7 8 9
For multidimensional arrays, the formula to calculate the address of a position is different for both row-major and column-major.
Lets derive the formula for row-major order.
int arr[2][2] = {
{1, 2},
{3, 4}
};
10000 10004 10008 10012
ROW 0 0 1 1
COLUMN 0 1 0 1
To jump to the next row, we have to traverse the total number of elements in one row multiplied by the size of their type, i.e n(cols)*sizeof(int)
To jump to the next element within the same row, we have to add the column offset.
To traverse every element in 0th row, we have to do:
base_addr + col_off
To jump to the next row, we have to go past every element in the current row, which is given by the number of columns and then add the offset for elements in the next row.
base_addr + n(cols)*4 + col_off*4
To jump to the ith
row, we have to go past (i-1)
rows, where each row would have n(cols)
elements, that means, we have to go past i*cols*4
to reach the next row. Then we add the column offset multiplied by size. So, the final formula becomes:
base_addr + (ith_row * n(cols) * 4) + (column_offset + 4)
In simple terms,
addr_of_i = ( (i * N_COLS) + j ) * 4
Lets test this formula.
#include <stdio.h>
#define ROWS 2
#define COLS 3
int main(void){
int arr[ROWS][COLS] = {
{1, 2, 3},
{4, 5, 6}
};
int* base = &arr[0][0];
for (int i = 0; i < ROWS; i++){
for (int j = 0; j < COLS; j++){
printf("%d, ", *(base + (i*3 + j)) );
}
}
}
We are not multiplying by 4 because the pointer is of type int
and dereferencing scales to the size of the type automatically.