More on arrays

Yutaka Masuda

February 2020

More on arrays

Basic usage of arrays

Fundamental data-structure in Fortran

An array is a collection of elements (or objects), which have the same data type. The array is the fundamental data-structure that Fortran can directly use. Fortran can process the array efficiently. As an array keeps a simple structure in memory, it is relatively comfortable for the compiler to apply acceleration techniques (e.g., parallelization) to the program. Also, the program has room to improve the speed by using the array structure efficiently.

Literals

A literal array is defined with [] with a comma-separated list of literals with the same type. You must not mix different types in the list. A high-dimension literal array (matrix) is defined using intrinsic function reshape() with []. Without reshape, there is no method to create a matrix literal.

Example 1: [1,2,3] or [1.0,2.0,3.0]
Example 2: reshape([1,2,3,4,5,6],[3,2])
Example 3 (illegal): [1.0, 2, 3]

Array variables

An array variable is declared with a variable name plus () with the dimension and the rank of the array. The range of index should be lower:upper in each dimension. When omitting the lower bound of the index, 1 will be assumed.

Example 1: integer :: a(3) for 1-dimentional, 3-element integer array.
Example 2: integer :: b(3,6) for 2-dimentional integer array, with index ranging from 1 to 3 for the 1st dimension, and 1 to 6 for the 2nd dimension

To access an element of the array, specify the index in (). By specifying the range of indices, you can make a submatrix.

Example 1: a(1) or b(2,2)
Example 2: b(2:3,4:6)

Allocated arrays

An allocatable array is defined with the allocatable option in the declaration. The index should be empty, e.g., (:) or (:,:). To allocate the array, use the allocatable() statement with a range of the index. Use deallocate to remove the allocated array (to release the allocated memory). When you leave the program unit, an allocated array is deallocated if the array is defined in the unit.

Example 1: integer :: x(:,:) for declaration
Example 2: allocate(x(2,5)) for allocation
Example 3: deallocate(x) for deallocation

Systematic access to each element in array

A loop is a typical method to get access to each element in an array. The loop generates a sequence of integer numbers by a counter variable, and you can use it as the index. For example, we have a vector with 5 elements. \[ \mathbf{a}'=\left[ \begin{array}{ccccc} a_1&a_2&a_3&a_4&a_5 \end{array} \right]' \] Using the following program, you can print each of them sequentially.

program arr
   implicit none
   integer :: i
   real :: a(1:5)
   a = [1,2,3,4,5]

   do i=1,5
      ! use a counter as index
      print *,a(i)
   end do
end program arr

We are going to see this technique later in this chapter.

Array index

Arbitrary lower bound

The array index starts from 1 by default. You can declare an array with an arbitrary starting value in the index.

program arr
   implicit none
   ! range: -2 to 2
   real :: a(-2:2)
   a = [1,2,3,4,5]
   print *,a(-2),a(-1),a(0),a(1),a(2)
end program arr

In my personal opinion, you should not use this feature unless you need it (for example, a case where you have to maintain the code written by other people). It is problematic when the array is passed to functions or subroutines.

program arr
   implicit none
   real :: a(-2:2)
   a = [1,2,3,4,5]
   call print_element(a)

contains
subroutine print_element(x)
   real,intent(in) :: x(:)
   print *,x(1)
end subroutine print_element
end program arr

   1.00000000

You expect 4.00000000 but it did not because the array bounds are not be explicitly transferred to procedures. When you put x(:) in a procedure, it is interpreted as the index starts from 1. If you pass the non-standard index to a procedure, you must take care of such a possibility in the procedure. Please consult the following examples.

! explicit lower bound
subroutine print_element(x)
   real,intent(in) :: x(-2:)
   print *,x(1)
end subroutine print_element

! pass the lower-bound variable
subroutine print_element(x,bottom)
   real,intent(in) :: x(bottom:)
   integer,intent(in) :: bottom
   print *,x(1)
end subroutine print_element

There are inquiry functions to obtain the bounds of an array.

Function	description
`lbound(x)`	lower bound of array `x`
`lbound(x,1)`	row-wise lower bound of 2-dimensional array `x`
`lbound(x,2)`	column-wise lower bound of 2-dimensional array `x`
`ubound(x)`	upper bound of array `x`
`ubound(x,1)`	row-wise uppwe bound of 2-dimensional array `x`
`ubound(x,2)`	column-wise upper bound of 2-dimensional array `x`

Adjustment of index

When passing a subarray to a procedure, a programmer should be aware of the index of the subarray is different from the index of the full array. See the following piece of code.

   integer :: x(5),k
   x = [2,9,0,5,7]
   k = minloc(x(3:5))
   print *,k

You know x has 5 elements, and the 3rd element is the smallest. However, the function minloc accepts a subarray x(3:5), and it is re-interpreted as a 3-element array-like x(1:3), then it returns 1 because the first element of the subarray is the smallest. If you want to have the original array index, you must adjust the index after calling a procedure with the subarray.

   integer :: x(5),k
   x = [2,9,0,5,7]
   k = minloc(x(3:5))
   print *,k+2

In a general case, the adjustment is as follows.

   integer :: x(5)
   integer :: i,k
   x = [2,9,0,5,7]
   i = 3
   k = minloc(x(i:5))
   print *,i+k-1

The adjustment may be confusing, so please make a detailed plan before coding.

Stepped index

You can define steps in the index as (start:end:step). For example, (1:5:2) is equivalent to discrete indices 1, 3, and 5. You can make a subarray using the stepped index.

   integer :: x(6) = [10,20,30,40,50,60]
   print *,x(1:5:2)

Indirect addressing

You can use an integer vector for index. See the following example.

   integer :: x(6) = [10,20,30,40,50,60]
   integer :: idx(3)
   idx = [5,2,4]
   print *,x(idx)

      50        20        40

Zero-sized array

A tricky feature in Fortran is a zero-sized array whose length is 0, i.e., no elements in it. This array is empty, but it still exists.

program arr
   implicit none
   real :: a(0)
   print *,size(a),lbound(a),ubound(a)
end program arr

           0           1           0

The size is 0, and the lower bound is 1, and the upper is 0. So, this array is equivalent to a(1:0). Note that the zero-sized array is defined whenever the upper bound is smaller than the lower bound, and the resulting zero-sized array’s lower-bound is always 1. For example, when you define a zero-sized array, a(5:0), its lower bound is adjusted to 1, so that it is equivalent to a(1:0).

You can not assign any values to this particular array. When you try a(0)=1, the compiler stops with an error. However, you may use it in a formula, and the result should be an empty array.

program arr
   implicit none
   real :: a(0)
   print *,"a=",a
   print *,"a+1=",a+1
end program arr

 a=
 a+1=

Mathematical functions may not be failed with the zero-sized array, but you should not expect any meaningful results. For example, although sum(a) returns 0 in GFortran, it is nonsense. sqrt(a) returns the array with the same size as a, i.e., zero-sized.

The zero-sized array may appear when you allocate an array. We will see the usage of a zero-sized array in the next section.

Assignment, again

Vector with sequencial numbers

Using a similar syntax as do, you can create a vector with sequential numbers.

   real :: x(5)
   integer :: i
   x = [(i,i=1,5)]

   1.00000000       2.00000000       3.00000000       4.00000000       5.00000000

Assignment to the same array

You can relocate elements in the same array.

   real :: x(5)
   integer :: i
   x = [(i,i=1,5)]
   x(2:4) = x(1:3)

The Fortran standard guarantees to relocate without any trouble. For example, in the following code, the program exchanges rows and columns within a matrix. If you want to do it manually, you need a temporary matrix.

   real :: a(5,5)
   integer :: i,idx(5)
   idx = [3,2,5,1,4]
   a = reshape([(i,i=1,25)],shape(a))
   a = a(idx,idx)

Reallocation of arrays

Basics

When using an allocatable array, you should know the size of the array before allocating the array. If the size of the array is too small to have the data, you have to reallocate (resize) the array. Once you deallocate an array, the content will be lost, so you must keep the original content of the array. The traditional reallocation requires the following steps (here, a is the target array).

real,allocatable :: a(:),t(:)

! the target array has been allocated.
allocate(a(5))

! 1. prepare the current (old) size and the new size of a
old_size = size(a)
new_size = size(a)+1

! 2. allocate an temporary array 
allocate(t(old_size))

! 3. copy a to t
t = a

! 4. deallocate a
dallocate(a)

! 5. re-allocate a with the new size
allocate(a(new_size))

! 6. copy t back to a
a(1:old_size) = t(1:old_size)

! 7. remove the temporary array
deallocate(t)

In modern Fortran, the reallocation is automated without temporary arrays.

real,allocatable :: a(:)

allocate(a(5))

! fill the array
a = [1,2,3,4,5]

! add more elements; should be real numbers
a = [a, 6.0, 7.0]

At the end of this program, the array a should have some extra elements, and the size of a becomes 7. A trap is, it fails if you mix different numerical types in the array literal in []. For example, the following expression gives an error.

! OKAY: all values are the same type (integer)
a = [1,2,3,4,5]

! ERROR: mixed types
a = [1, 2, 3, 4.0, 5.0]

! ERROR: a is real but 6,7 are integer numbers
a = [a, 6, 7]

Reallocation works only when the size of reallocated array is known. For example, in the above case, a=[a, 6.0, 7.0] works because it is evident that the resulting size is 7. If you directly specify the index, which is out of range, it will be failing without any errors. It does not work because referring to a nonexistent index does not mean to resize the array.

allocate(a(5))
a = [1,2,3,4,5]

! DANGEROUS!
! YOU MUST NOT DO IT. IT DESTROYS SOME DATA.
a(6) = 6.0
a(7) = 7.0

! The size is still 5.
print *,size(a)

It is not apparent how to resize a 2-dimensional array. Also, a frequent resize of an array may have a performance penalty with repeated allocations.

Copy of allocated arrays

If you have 2 allocatable arrays, you can easily copy the data in one array to another array.

real,allocatable :: a(:),b(:)

allocate(a(5))
a = [1,2,3,4,5]

! copy the content of a to b
b = a
print *,b
print *,allocated(a),allocated(b)

   1.00000000       2.00000000       3.00000000       4.00000000       5.00000000
 T T

It makes a copy of a to b, which will be accordingly allocated with the same size as a, and both arrays are alive. The resize of b happens even though b has been already allocated. In the following example, a is also resized (from 3 to 5).

real,allocatable :: a(:),b(:),c(:)

allocate(a(3),b(6),c(4))
a = [1,2,3,4,5]

! copy the content of a to b and c
print *,size(a),size(b),size(c)
b = a
c = a
print *,size(a),size(b),size(c)

           5           6           4
           5           5           5

Obviously, it is dangerous; it may cause an unintended resize of allocated arrays. I believe that it is one of the most embarrassing features introduced to Modern Fortran. To avoid this issue, you can put (:) to the allocated array all the time. See the following code.

real,allocatable :: a(:),b(:),c(:)

allocate(a(3),b(6),c(4))
a = [1,2,3,4,5]

! copy the content of a to b and c without the resize
print *,size(a),size(b),size(c)
b(:) = a(:)
c(:) = a(:)
print *,size(a),size(b),size(c)

           5           6           4
           5           6           4

It keeps the original size and copies the elements as many as possible. For example, b gets all elements of a, and c get only 4 of the elements.

Transfer of allocated array

A function, move_alloc, transfers the data in an allocated array to another allocated array. The source array will be automatically deallocated after the transfer is done. This involves some allocation and deallocation steps.

real,allocatable :: a(:),b(:)

allocate(a(5))
a = [1,2,3,4,5]

! move the data from a to b
call move_alloc(a,b)

print *,allocated(a),allocated(b)

F T

When the function is called as call move_alloc(a,b), it behaves as follows.

Deallocate the destination array, b, if allocated.
Allocate b as the same size as a.
Copy all elements in a to b like b=a.
Deallocate a.

The arguments can be specified using the keywords: call move_alloc(from=a, to=b).

This function is useful when to resize an array with a temporary array keeping the original data. The following example shows that array a is expanding with 3 elements more, keeping the original data.

program arr
   implicit none
   integer :: n,m
   real,allocatable :: a(:),temporary(:)

   ! initial size of a
   n = 5
   allocate(a(n))
   a = [1,2,3,4,5]

   ! new array
   m = n + 3
   allocate(temporary(m))

   ! copy the original data to the temporary array
   temporary(1:n) = a(1:n)

   ! transfer the data from the temporary array to the original array
   ! the original array resized; temporary array deallocated
   call move_alloc(from=temporary, to=a)
end program arr

If it seems confusing, use a combination of allocate-deallocate statements manually.

Array of character variable

You can create an array of character variables. For example, the following code has a vector with 3 elements of character variable with length 10.

   character(len=10) :: a(3)

You need two kinds of indices: one for the position of the character and the other one for the index of the array. It is not a 2-dimensional array. These two indices should be separately specified so that the position of character always comes to the end of the variable. See the following example for details.

program strarray
   implicit none
   character(len=10) :: a(3)
   a(1) = "aaaaaa"
   a(2) = "bbbbbb"
   a(3) = "cccccc"

   ! the first element of the array; characters 1 and 2
   print *,a(1)(4:5)
end program strarray

aa

In a(1)(4:5), the first parentheses refer to the index of the array (the first character variable), and the second parentheses extract the 4th and 5th characters from a(1).

Some algorithms using arrays

Traversing all elements of a matrix

Full matrix

In matrix computations, we often have to look at all elements in a matrix. Although some functions like sum and minval are helpful to avoid the manual search through all the elements, there are many cases where these functions are not applicable.

A matrix has 2 indices, row, and column. When you look at all the elements, you should use a double loop; one is for row index, and the other one is for column index. The following code use i for rows and j for a column, creating the index is (i,j), to show each element in a matrix.

program mat
   implicit none
   integer,parameter :: n=3
   real :: a(n,n)
   integer :: i,j

   ! initial values
   a = 1.0

   ! make a combination of (i,j)
   do i=1,n
      do j=1,n
         print *,a(i,j)
      end do
   end do
end program mat

This program defines a \(3 \times 3\) matrix.

\[ \mathbf{A} = \left[ \begin{array}{ccc} a_{11}&a_{12}&a_{13} \\ a_{21}&a_{22}&a_{23} \\ a_{31}&a_{32}&a_{33} \end{array} \right] \]

The above program access rows in the matrix i.e., fixed row i with incrementing column j. The sequence of elements looks like this.

\[ a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}, a_{31}, a_{32}, a_{33} \]

To support the within-column access, i.e., fixed column j with incrementing row i, you just swap the order between i and j. Here is a piece of the program.

   do j=1,n
      do i=1,n
         print *,a(i,j)
      end do
   end do

The sequence of access is as follows.

\[ a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}, a_{33} \]

This sequence is the same as data order in memory. In general, within-column access is faster because of the alignment in memory.

Some operations

Using the above technique, you can perform matrix operations. For example, the following program calculates the sum of all elements of a matrix.

program mat
   implicit none
   integer,parameter :: n=3
   real :: a(n,n),s
   integer :: i,j

   ! initial values
   a = 1.0

   ! sum of all elements
   s = 0.0
   do j=1,n
      do i=1,n
         s = s + a(i,j)
      end do
   end do
   print *,"sum = ",s
end program mat

Exercises

Modify the above program to find the maximum value in a matrix.
In the above program (exercise 1), print the maximum value and its index.
Assuming the matrix is symmetric, write a program to print the sum of the elements in the lower triangular part of the matrix.

Product of matrix by vector

Methematical formulation

Suppose we have a matrix \(\mathbf{A}\) (\(m \times n\)) and a vector \(\mathbf{x}\) (\(n \times 1\)). The product is \(\mathbf{b}=\mathbf{Ax}\). The \(i\)-th element of \(\mathbf{b}\) (\(b_i\)) is available from the following formula. \[ b_i = \sum_{j=1}^{n}A_{ij}x_{j}\quad\text{for}\quad 1\leq i\leq n \] A graphical image of this formula is as follows. \[ \left[ \begin{array}{c} \vdots\\ \vdots\\ b_i \\ \vdots \\ \vdots \end{array} \right] = \left[ \begin{array}{ccccc} \vdots & \vdots & & \vdots & \vdots\\ \vdots & \vdots & & \vdots & \vdots\\ \hline A_{i1}&A_{i2} &\cdots & A_{i,(n-1)} & A_{in}\\ \hline \vdots & \vdots & & \vdots & \vdots \\ \vdots & \vdots & & \vdots & \vdots \end{array} \right] \left[ \begin{array}{c} x_1\\ x_2 \\ \vdots \\ x_{n-1}\\ x_{n} \end{array} \right] \]

Formula

The product \(\mathbf{b}\) is available by computing \(b_i\) from \(i=1\) to \(i=m\).

For \(i=1\), you have \(b_1=\sum_{j=1}^{n}A_{1j}x_{j}\) i.e. b(i)=dot_product(A(1,:),x(:)).
For \(i=2\), you have \(b_2=\sum_{j=1}^{n}A_{2j}x_{j}\) i.e. b(i)=dot_product(A(2,:),x(:)).
For \(i=n\), you have \(b_m=\sum_{j=1}^{n}A_{mj}x_{j}\) i.e. b(m)=dot_product(A(m,:),x(:)).

Coding

I show a reference code for this program.

program a_times_x
   implicit none
   integer,parameter :: m=5,n=3
   integer :: i,j
   real :: A(m,n),x(n),b(m)

   ! some arrays
   A = reshape([i=1,m*n],shape(A))
   x = [(i=1,m)]

   ! computation
   do i=1,m
      b(i) = dot_product(A(i,:),x(:))
   end do

   print *,"b=A*x: ",b
end program a_times_x

Exercises

Rewrite the above code using loops instead of dot_product.
Modify the program in Exercise 1 so that two loops are exchanged, and confirm the result is the same as the original program.
Make a function to calculate the product of a matrix by a vector.

Selection sort

Sorting is an action to change the order of elements in an array. Usually, the elements are numbers, and the order becomes increasing or decreasing. There are many sorting algorithms, and you have to use an efficient one. Nevertheless, here, we introduce a naive algorithm, selection sort, just for an educational purpose.

Idea

The selection sort is very similar to sorting behavior by a human. Imagine that you sort playing cards by number (from the smallest to the largest) on your hand. What you will do is that you search a card with the smallest number, pick it, and put it on the top of a bunch of cards. Next, you search a card with the smallest number from the remaining cards and do the same stuff. You will repeat this process until all cards are processed.

In a computer program, the algorithm is slightly different from human behavior because of the limitation of data structure. In selection sort, first, you search a card with the smallest, pick it, and swap it with the top of cards. Next, you search the smallest out of the remaining cards, pick it, and swap it with the second card of the deck. See Wikipedia about the details of this algorithm.

Formulation

Suppose we have an array x with the length \(n\). We describe this algorithm using Fortran notation.

First, search the smallest value from x(1:n), and determine the index of the element k, then swap x(1) and x(k).
Second, search the smallest value from x(2:n), and determine the index of the element k, then swap x(2) and x(k).
Third, search the smallest value from x(3:n), and determine the index of the element k, then swap x(3) and x(k).

You will see the general rule of this algorithm.

In round i, search the smallest value from x(i:n), and determine the index of the element k, then swap x(i) and x(k).
Repeat this process for i from 1 to \(n\).

You can see this algorithm needs a loop for i.

Coding

There is no swap procedure in Fortran. So, we have to prepare the subroutine as follows.

subroutine swap(a,b)
   integer,intent(inout) :: a,b
   integer :: t
   t = a
   a = b
   b = t
end subroutine swap

Here is a reference code for selection sort. Note that minloc has a subarray, and it returns an index starting from 1 instead of i.

program sort
   implicit none
   integer,parameter :: n=5
   integer :: x(n)
   integer :: i,k

   x = [2,5,1,3,4]

   do i=1,n
      k = minloc(x(i:n))
      call swap(x(i),x(i+k-1))
   end do

   print *,x

contains
subroutine swap(a,b)
   integer,intent(inout) :: a,b
   integer :: t
   t = a
   a = b
   b = t
end subroutine swap
end program sort

It is straightforward and fast enough for small arrays. However, it becomes very slow, and you should not use this algorithm in practice with larger arrays.

Exercises

Consider why it will be slow for a large array.
Make a function of selection sort for integer array based on the above code.
Make another function to support descending order (from the largest to the smallest).

Split a string by space

In a typical data analysis, the data is stored in a text file, and each field is separated by white space. The read statement reads an entire row as characters, including the spaces. Although you often need to extract a particular field separated by space, Fortran does not have such a function. We are going to implement it by ourselves.

Design of the program

We put some restrictions on input and output because of simplicity.

The input row is stored in a character variable with a certain length like character(len=100) :: input. The input characters can be altered at the end of the program.
One or more spaces separate each item. The leading spaces in input will be ignored.
The output (separated characters) will be stored in an array of character variables like character(len=100) :: item(5).
The maximum number of items should be defined by a constant like integer,parameter :: max_item=5.

In this case, the program reads space-separated characters in str, splits it by space, and returns the first item in item(1), the second, item(2), and so on. If the original row has more than max_item items, the remaining items will be discarded.

You can figure out the program behavior with the following pseudo-code.

   integer,parameter :: max_item=5
   character(len=100) :: input
   character(len=100) :: item(max_item)

   ! 6 items
   input = " 100  50  3.14  abc  1.23e-2  12345"

   (some processing)

   print *,item(1)    ! 100
   print *,item(2)    ! 50
   print *,item(3)    ! 3.14
   print *,item(4)    ! abc
   print *,item(5)    ! 1.23e-2

Naive solution

There is the simplest but imperfect solution to this issue. See the following code.

   read(input,*) item

It perfectly works only if max_item is equal to or smaller than the actual number of items in input. In the above example, if input has 4 or fewer items, the read statement fails. Although you can trap the error by iostat=, you do not know how many items you have obtained.

   integer :: info
   ...
   ! you do not know how many items you got
   read(input,*,iostat=info) item

To get the number of items, you can count the number of non-empty items using a do loop. In this case, you have to initialize item with an empty string.

   integer :: i,n,info
   ...
   item = ""
   read(input,*,iostat=info) item

   n = 0
   do i=1,max_item
      ! count only non-empty items
      if(len_trim(item(i))>0) then
         n = n + 1
      end if
   end do

Standard solution

A straightforward approach is to extract the leading non-space characters and repeat them until you extract all the non-space characters. We can write the procedure as follows.

Initialize integer n=0.
Remove leading spaces by input=adjustl(input).
Search the position of a space (p) in input from the beginning. If space is at the top of input, there is no more non-space character and exit. If found somewhere else, sub-string input(1:p-1) is the non-space item.
Increment n and put item(n)=input(1:p-1). If n equals max_item, exit.
Put spaces to input(1:p-1) as input(1:p-1)=" ".
Go to 1.

The code is as follows.

program array
   integer,parameter :: max_item=5
   character(len=100) :: input
   character(len=100) :: item(max_item)
   integer :: n,p

   input = " 100  50  3.14  abc  1.23e-2  12345"

   n = 0
   do
      ! step 1: remove the leading spaces
      input = adjustl(input)

      ! step 2: position of the non-space character
      p = scan(input," ")
      if(p==0) exit

      ! step 3: get the substring
      n = n + 1
      item(n) = input(1:p-1)
      if(n==max_item) exit

      ! step 4: erase the first non-space string
      input(1:p-1) = " "
   end do
end program array

Exercises

Make a subroutine to split input and return items and n.
Modify the subroutine not to alter the input string.

Back to index.html.