More on arrays
February 2020
Back to index.html.
More on arrays
Basic usage of arrays
Fundamental data-structure in Fortran
An array is a collection of elements (or objects), which have the same data type. The array is the fundamental data-structure that Fortran can directly use. Fortran can process the array efficiently. As an array keeps a simple structure in memory, it is relatively comfortable for the compiler to apply acceleration techniques (e.g., parallelization) to the program. Also, the program has room to improve the speed by using the array structure efficiently.
Literals
A literal array is defined with []
with a comma-separated list of literals with the same type. You must not mix different types in the list. A high-dimension literal array (matrix) is defined using intrinsic function reshape()
with []
. Without reshape
, there is no method to create a matrix literal.
- Example 1:
[1,2,3]
or[1.0,2.0,3.0]
- Example 2:
reshape([1,2,3,4,5,6],[3,2])
- Example 3 (illegal):
[1.0, 2, 3]
Array variables
An array variable is declared with a variable name plus ()
with the dimension and the rank of the array. The range of index should be lower:upper
in each dimension. When omitting the lower bound of the index, 1 will be assumed.
- Example 1:
integer :: a(3)
for 1-dimentional, 3-element integer array. - Example 2:
integer :: b(3,6)
for 2-dimentional integer array, with index ranging from 1 to 3 for the 1st dimension, and 1 to 6 for the 2nd dimension
To access an element of the array, specify the index in ()
. By specifying the range of indices, you can make a submatrix.
- Example 1:
a(1)
orb(2,2)
- Example 2:
b(2:3,4:6)
Allocated arrays
An allocatable array is defined with the allocatable
option in the declaration. The index should be empty, e.g., (:)
or (:,:)
. To allocate the array, use the allocatable()
statement with a range of the index. Use deallocate
to remove the allocated array (to release the allocated memory). When you leave the program unit, an allocated array is deallocated if the array is defined in the unit.
- Example 1:
integer :: x(:,:)
for declaration - Example 2:
allocate(x(2,5))
for allocation - Example 3:
deallocate(x)
for deallocation
Systematic access to each element in array
A loop is a typical method to get access to each element in an array. The loop generates a sequence of integer numbers by a counter variable, and you can use it as the index. For example, we have a vector with 5 elements. \[ \mathbf{a}'=\left[ \begin{array}{ccccc} a_1&a_2&a_3&a_4&a_5 \end{array} \right]' \] Using the following program, you can print each of them sequentially.
program arr
implicit none
integer :: i
real :: a(1:5)
a = [1,2,3,4,5]
do i=1,5
! use a counter as index
print *,a(i)
end do
end program arr
We are going to see this technique later in this chapter.
Array index
Arbitrary lower bound
The array index starts from 1 by default. You can declare an array with an arbitrary starting value in the index.
program arr
implicit none
! range: -2 to 2
real :: a(-2:2)
a = [1,2,3,4,5]
print *,a(-2),a(-1),a(0),a(1),a(2)
end program arr
In my personal opinion, you should not use this feature unless you need it (for example, a case where you have to maintain the code written by other people). It is problematic when the array is passed to functions or subroutines.
program arr
implicit none
real :: a(-2:2)
a = [1,2,3,4,5]
call print_element(a)
contains
subroutine print_element(x)
real,intent(in) :: x(:)
print *,x(1)
end subroutine print_element
end program arr
1.00000000
You expect 4.00000000
but it did not because the array bounds are not be explicitly transferred to procedures. When you put x(:)
in a procedure, it is interpreted as the index starts from 1. If you pass the non-standard index to a procedure, you must take care of such a possibility in the procedure. Please consult the following examples.
! explicit lower bound
subroutine print_element(x)
real,intent(in) :: x(-2:)
print *,x(1)
end subroutine print_element
! pass the lower-bound variable
subroutine print_element(x,bottom)
real,intent(in) :: x(bottom:)
integer,intent(in) :: bottom
print *,x(1)
end subroutine print_element
There are inquiry functions to obtain the bounds of an array.
Function | description |
---|---|
lbound(x) |
lower bound of array x |
lbound(x,1) |
row-wise lower bound of 2-dimensional array x |
lbound(x,2) |
column-wise lower bound of 2-dimensional array x |
ubound(x) |
upper bound of array x |
ubound(x,1) |
row-wise uppwe bound of 2-dimensional array x |
ubound(x,2) |
column-wise upper bound of 2-dimensional array x |
Adjustment of index
When passing a subarray to a procedure, a programmer should be aware of the index of the subarray is different from the index of the full array. See the following piece of code.
integer :: x(5),k
x = [2,9,0,5,7]
k = minloc(x(3:5))
print *,k
1
You know x
has 5 elements, and the 3rd element is the smallest. However, the function minloc
accepts a subarray x(3:5)
, and it is re-interpreted as a 3-element array-like x(1:3)
, then it returns 1 because the first element of the subarray is the smallest. If you want to have the original array index, you must adjust the index after calling a procedure with the subarray.
integer :: x(5),k
x = [2,9,0,5,7]
k = minloc(x(3:5))
print *,k+2
In a general case, the adjustment is as follows.
integer :: x(5)
integer :: i,k
x = [2,9,0,5,7]
i = 3
k = minloc(x(i:5))
print *,i+k-1
The adjustment may be confusing, so please make a detailed plan before coding.
Stepped index
You can define steps in the index as (start:end:step)
. For example, (1:5:2)
is equivalent to discrete indices 1
, 3
, and 5
. You can make a subarray using the stepped index.
integer :: x(6) = [10,20,30,40,50,60]
print *,x(1:5:2)
Indirect addressing
You can use an integer vector for index. See the following example.
integer :: x(6) = [10,20,30,40,50,60]
integer :: idx(3)
idx = [5,2,4]
print *,x(idx)
50 20 40
Zero-sized array
A tricky feature in Fortran is a zero-sized array whose length is 0, i.e., no elements in it. This array is empty, but it still exists.
program arr
implicit none
real :: a(0)
print *,size(a),lbound(a),ubound(a)
end program arr
0 1 0
The size is 0, and the lower bound is 1, and the upper is 0. So, this array is equivalent to a(1:0)
. Note that the zero-sized array is defined whenever the upper bound is smaller than the lower bound, and the resulting zero-sized array’s lower-bound is always 1. For example, when you define a zero-sized array, a(5:0)
, its lower bound is adjusted to 1, so that it is equivalent to a(1:0)
.
You can not assign any values to this particular array. When you try a(0)=1
, the compiler stops with an error. However, you may use it in a formula, and the result should be an empty array.
program arr
implicit none
real :: a(0)
print *,"a=",a
print *,"a+1=",a+1
end program arr
a=
a+1=
Mathematical functions may not be failed with the zero-sized array, but you should not expect any meaningful results. For example, although sum(a)
returns 0
in GFortran, it is nonsense. sqrt(a)
returns the array with the same size as a
, i.e., zero-sized.
The zero-sized array may appear when you allocate an array. We will see the usage of a zero-sized array in the next section.
Assignment, again
Vector with sequencial numbers
Using a similar syntax as do
, you can create a vector with sequential numbers.
real :: x(5)
integer :: i
x = [(i,i=1,5)]
1.00000000 2.00000000 3.00000000 4.00000000 5.00000000
Assignment to the same array
You can relocate elements in the same array.
real :: x(5)
integer :: i
x = [(i,i=1,5)]
x(2:4) = x(1:3)
The Fortran standard guarantees to relocate without any trouble. For example, in the following code, the program exchanges rows and columns within a matrix. If you want to do it manually, you need a temporary matrix.
real :: a(5,5)
integer :: i,idx(5)
idx = [3,2,5,1,4]
a = reshape([(i,i=1,25)],shape(a))
a = a(idx,idx)
Reallocation of arrays
Basics
When using an allocatable array, you should know the size of the array before allocating the array. If the size of the array is too small to have the data, you have to reallocate (resize) the array. Once you deallocate an array, the content will be lost, so you must keep the original content of the array. The traditional reallocation requires the following steps (here, a
is the target array).
real,allocatable :: a(:),t(:)
! the target array has been allocated.
allocate(a(5))
! 1. prepare the current (old) size and the new size of a
old_size = size(a)
new_size = size(a)+1
! 2. allocate an temporary array
allocate(t(old_size))
! 3. copy a to t
t = a
! 4. deallocate a
dallocate(a)
! 5. re-allocate a with the new size
allocate(a(new_size))
! 6. copy t back to a
a(1:old_size) = t(1:old_size)
! 7. remove the temporary array
deallocate(t)
In modern Fortran, the reallocation is automated without temporary arrays.
real,allocatable :: a(:)
allocate(a(5))
! fill the array
a = [1,2,3,4,5]
! add more elements; should be real numbers
a = [a, 6.0, 7.0]
At the end of this program, the array a
should have some extra elements, and the size of a
becomes 7. A trap is, it fails if you mix different numerical types in the array literal in []
. For example, the following expression gives an error.
! OKAY: all values are the same type (integer)
a = [1,2,3,4,5]
! ERROR: mixed types
a = [1, 2, 3, 4.0, 5.0]
! ERROR: a is real but 6,7 are integer numbers
a = [a, 6, 7]
Reallocation works only when the size of reallocated array is known. For example, in the above case, a=[a, 6.0, 7.0]
works because it is evident that the resulting size is 7. If you directly specify the index, which is out of range, it will be failing without any errors. It does not work because referring to a nonexistent index does not mean to resize the array.
allocate(a(5))
a = [1,2,3,4,5]
! DANGEROUS!
! YOU MUST NOT DO IT. IT DESTROYS SOME DATA.
a(6) = 6.0
a(7) = 7.0
! The size is still 5.
print *,size(a)
It is not apparent how to resize a 2-dimensional array. Also, a frequent resize of an array may have a performance penalty with repeated allocations.
Copy of allocated arrays
If you have 2 allocatable arrays, you can easily copy the data in one array to another array.
real,allocatable :: a(:),b(:)
allocate(a(5))
a = [1,2,3,4,5]
! copy the content of a to b
b = a
print *,b
print *,allocated(a),allocated(b)
1.00000000 2.00000000 3.00000000 4.00000000 5.00000000
T T
It makes a copy of a
to b
, which will be accordingly allocated with the same size as a
, and both arrays are alive. The resize of b
happens even though b
has been already allocated. In the following example, a
is also resized (from 3 to 5).
real,allocatable :: a(:),b(:),c(:)
allocate(a(3),b(6),c(4))
a = [1,2,3,4,5]
! copy the content of a to b and c
print *,size(a),size(b),size(c)
b = a
c = a
print *,size(a),size(b),size(c)
5 6 4
5 5 5
Obviously, it is dangerous; it may cause an unintended resize of allocated arrays. I believe that it is one of the most embarrassing features introduced to Modern Fortran. To avoid this issue, you can put (:) to the allocated array all the time. See the following code.
real,allocatable :: a(:),b(:),c(:)
allocate(a(3),b(6),c(4))
a = [1,2,3,4,5]
! copy the content of a to b and c without the resize
print *,size(a),size(b),size(c)
b(:) = a(:)
c(:) = a(:)
print *,size(a),size(b),size(c)
5 6 4
5 6 4
It keeps the original size and copies the elements as many as possible. For example, b
gets all elements of a
, and c
get only 4 of the elements.
Transfer of allocated array
A function, move_alloc
, transfers the data in an allocated array to another allocated array. The source array will be automatically deallocated after the transfer is done. This involves some allocation and deallocation steps.
real,allocatable :: a(:),b(:)
allocate(a(5))
a = [1,2,3,4,5]
! move the data from a to b
call move_alloc(a,b)
print *,allocated(a),allocated(b)
F T
When the function is called as call move_alloc(a,b)
, it behaves as follows.
- Deallocate the destination array,
b
, if allocated. - Allocate
b
as the same size asa
. - Copy all elements in
a
tob
likeb=a
. - Deallocate
a
.
The arguments can be specified using the keywords: call move_alloc(from=a, to=b)
.
This function is useful when to resize an array with a temporary array keeping the original data. The following example shows that array a
is expanding with 3 elements more, keeping the original data.
program arr
implicit none
integer :: n,m
real,allocatable :: a(:),temporary(:)
! initial size of a
n = 5
allocate(a(n))
a = [1,2,3,4,5]
! new array
m = n + 3
allocate(temporary(m))
! copy the original data to the temporary array
temporary(1:n) = a(1:n)
! transfer the data from the temporary array to the original array
! the original array resized; temporary array deallocated
call move_alloc(from=temporary, to=a)
end program arr
If it seems confusing, use a combination of allocate-deallocate statements manually.
Array of character variable
You can create an array of character variables. For example, the following code has a vector with 3 elements of character variable with length 10.
character(len=10) :: a(3)
You need two kinds of indices: one for the position of the character and the other one for the index of the array. It is not a 2-dimensional array. These two indices should be separately specified so that the position of character always comes to the end of the variable. See the following example for details.
program strarray
implicit none
character(len=10) :: a(3)
a(1) = "aaaaaa"
a(2) = "bbbbbb"
a(3) = "cccccc"
! the first element of the array; characters 1 and 2
print *,a(1)(4:5)
end program strarray
aa
In a(1)(4:5)
, the first parentheses refer to the index of the array (the first character variable), and the second parentheses extract the 4th and 5th characters from a(1)
.
Some algorithms using arrays
Traversing all elements of a matrix
Full matrix
In matrix computations, we often have to look at all elements in a matrix. Although some functions like sum
and minval
are helpful to avoid the manual search through all the elements, there are many cases where these functions are not applicable.
A matrix has 2 indices, row, and column. When you look at all the elements, you should use a double loop; one is for row index, and the other one is for column index. The following code use i
for rows and j
for a column, creating the index is (i,j)
, to show each element in a matrix.
program mat
implicit none
integer,parameter :: n=3
real :: a(n,n)
integer :: i,j
! initial values
a = 1.0
! make a combination of (i,j)
do i=1,n
do j=1,n
print *,a(i,j)
end do
end do
end program mat
This program defines a \(3 \times 3\) matrix.
\[ \mathbf{A} = \left[ \begin{array}{ccc} a_{11}&a_{12}&a_{13} \\ a_{21}&a_{22}&a_{23} \\ a_{31}&a_{32}&a_{33} \end{array} \right] \]
The above program access rows in the matrix i.e., fixed row i
with incrementing column j
. The sequence of elements looks like this.
\[ a_{11}, a_{12}, a_{13}, a_{21}, a_{22}, a_{23}, a_{31}, a_{32}, a_{33} \]
To support the within-column access, i.e., fixed column j
with incrementing row i
, you just swap the order between i
and j
. Here is a piece of the program.
do j=1,n
do i=1,n
print *,a(i,j)
end do
end do
The sequence of access is as follows.
\[ a_{11}, a_{21}, a_{31}, a_{12}, a_{22}, a_{32}, a_{13}, a_{23}, a_{33} \]
This sequence is the same as data order in memory. In general, within-column access is faster because of the alignment in memory.
Some operations
Using the above technique, you can perform matrix operations. For example, the following program calculates the sum of all elements of a matrix.
program mat
implicit none
integer,parameter :: n=3
real :: a(n,n),s
integer :: i,j
! initial values
a = 1.0
! sum of all elements
s = 0.0
do j=1,n
do i=1,n
s = s + a(i,j)
end do
end do
print *,"sum = ",s
end program mat
Exercises
- Modify the above program to find the maximum value in a matrix.
- In the above program (exercise 1), print the maximum value and its index.
- Assuming the matrix is symmetric, write a program to print the sum of the elements in the lower triangular part of the matrix.
Product of matrix by vector
Methematical formulation
Suppose we have a matrix \(\mathbf{A}\) (\(m \times n\)) and a vector \(\mathbf{x}\) (\(n \times 1\)). The product is \(\mathbf{b}=\mathbf{Ax}\). The \(i\)-th element of \(\mathbf{b}\) (\(b_i\)) is available from the following formula. \[ b_i = \sum_{j=1}^{n}A_{ij}x_{j}\quad\text{for}\quad 1\leq i\leq n \] A graphical image of this formula is as follows. \[ \left[ \begin{array}{c} \vdots\\ \vdots\\ b_i \\ \vdots \\ \vdots \end{array} \right] = \left[ \begin{array}{ccccc} \vdots & \vdots & & \vdots & \vdots\\ \vdots & \vdots & & \vdots & \vdots\\ \hline A_{i1}&A_{i2} &\cdots & A_{i,(n-1)} & A_{in}\\ \hline \vdots & \vdots & & \vdots & \vdots \\ \vdots & \vdots & & \vdots & \vdots \end{array} \right] \left[ \begin{array}{c} x_1\\ x_2 \\ \vdots \\ x_{n-1}\\ x_{n} \end{array} \right] \]
Formula
The product \(\mathbf{b}\) is available by computing \(b_i\) from \(i=1\) to \(i=m\).
- For \(i=1\), you have \(b_1=\sum_{j=1}^{n}A_{1j}x_{j}\) i.e.
b(i)=dot_product(A(1,:),x(:))
. - For \(i=2\), you have \(b_2=\sum_{j=1}^{n}A_{2j}x_{j}\) i.e.
b(i)=dot_product(A(2,:),x(:))
. - For \(i=n\), you have \(b_m=\sum_{j=1}^{n}A_{mj}x_{j}\) i.e.
b(m)=dot_product(A(m,:),x(:))
.
Coding
I show a reference code for this program.
program a_times_x
implicit none
integer,parameter :: m=5,n=3
integer :: i,j
real :: A(m,n),x(n),b(m)
! some arrays
A = reshape([i=1,m*n],shape(A))
x = [(i=1,m)]
! computation
do i=1,m
b(i) = dot_product(A(i,:),x(:))
end do
print *,"b=A*x: ",b
end program a_times_x
Exercises
- Rewrite the above code using loops instead of
dot_product
. - Modify the program in Exercise 1 so that two loops are exchanged, and confirm the result is the same as the original program.
- Make a function to calculate the product of a matrix by a vector.
Selection sort
Sorting is an action to change the order of elements in an array. Usually, the elements are numbers, and the order becomes increasing or decreasing. There are many sorting algorithms, and you have to use an efficient one. Nevertheless, here, we introduce a naive algorithm, selection sort, just for an educational purpose.
Idea
The selection sort is very similar to sorting behavior by a human. Imagine that you sort playing cards by number (from the smallest to the largest) on your hand. What you will do is that you search a card with the smallest number, pick it, and put it on the top of a bunch of cards. Next, you search a card with the smallest number from the remaining cards and do the same stuff. You will repeat this process until all cards are processed.
In a computer program, the algorithm is slightly different from human behavior because of the limitation of data structure. In selection sort, first, you search a card with the smallest, pick it, and swap it with the top of cards. Next, you search the smallest out of the remaining cards, pick it, and swap it with the second card of the deck. See Wikipedia about the details of this algorithm.
Formulation
Suppose we have an array x
with the length \(n\). We describe this algorithm using Fortran notation.
- First, search the smallest value from
x(1:n)
, and determine the index of the elementk
, then swapx(1)
andx(k)
. - Second, search the smallest value from
x(2:n)
, and determine the index of the elementk
, then swapx(2)
andx(k)
. - Third, search the smallest value from
x(3:n)
, and determine the index of the elementk
, then swapx(3)
andx(k)
.
You will see the general rule of this algorithm.
- In round
i
, search the smallest value fromx(i:n)
, and determine the index of the elementk
, then swapx(i)
andx(k)
. - Repeat this process for
i
from 1 to \(n\).
You can see this algorithm needs a loop for i
.
Coding
There is no swap procedure in Fortran. So, we have to prepare the subroutine as follows.
subroutine swap(a,b)
integer,intent(inout) :: a,b
integer :: t
t = a
a = b
b = t
end subroutine swap
Here is a reference code for selection sort. Note that minloc
has a subarray, and it returns an index starting from 1
instead of i
.
program sort
implicit none
integer,parameter :: n=5
integer :: x(n)
integer :: i,k
x = [2,5,1,3,4]
do i=1,n
k = minloc(x(i:n))
call swap(x(i),x(i+k-1))
end do
print *,x
contains
subroutine swap(a,b)
integer,intent(inout) :: a,b
integer :: t
t = a
a = b
b = t
end subroutine swap
end program sort
It is straightforward and fast enough for small arrays. However, it becomes very slow, and you should not use this algorithm in practice with larger arrays.
Exercises
- Consider why it will be slow for a large array.
- Make a function of selection sort for integer array based on the above code.
- Make another function to support descending order (from the largest to the smallest).
Split a string by space
In a typical data analysis, the data is stored in a text file, and each field is separated by white space. The read
statement reads an entire row as characters, including the spaces. Although you often need to extract a particular field separated by space, Fortran does not have such a function. We are going to implement it by ourselves.
Design of the program
We put some restrictions on input and output because of simplicity.
- The input row is stored in a character variable with a certain length like
character(len=100) :: input
. The input characters can be altered at the end of the program. - One or more spaces separate each item. The leading spaces in
input
will be ignored. - The output (separated characters) will be stored in an array of character variables like
character(len=100) :: item(5)
. - The maximum number of items should be defined by a constant like
integer,parameter :: max_item=5
.
In this case, the program reads space-separated characters in str
, splits it by space, and returns the first item in item(1)
, the second, item(2)
, and so on. If the original row has more than max_item
items, the remaining items will be discarded.
You can figure out the program behavior with the following pseudo-code.
integer,parameter :: max_item=5
character(len=100) :: input
character(len=100) :: item(max_item)
! 6 items
input = " 100 50 3.14 abc 1.23e-2 12345"
(some processing)
print *,item(1) ! 100
print *,item(2) ! 50
print *,item(3) ! 3.14
print *,item(4) ! abc
print *,item(5) ! 1.23e-2
Naive solution
There is the simplest but imperfect solution to this issue. See the following code.
read(input,*) item
It perfectly works only if max_item
is equal to or smaller than the actual number of items in input
. In the above example, if input
has 4 or fewer items, the read
statement fails. Although you can trap the error by iostat=
, you do not know how many items you have obtained.
integer :: info
...
! you do not know how many items you got
read(input,*,iostat=info) item
To get the number of items, you can count the number of non-empty items using a do
loop. In this case, you have to initialize item
with an empty string.
integer :: i,n,info
...
item = ""
read(input,*,iostat=info) item
n = 0
do i=1,max_item
! count only non-empty items
if(len_trim(item(i))>0) then
n = n + 1
end if
end do
Standard solution
A straightforward approach is to extract the leading non-space characters and repeat them until you extract all the non-space characters. We can write the procedure as follows.
- Initialize integer
n=0
. - Remove leading spaces by
input=adjustl(input)
. - Search the position of a space (
p
) ininput
from the beginning. If space is at the top ofinput
, there is no more non-space character and exit. If found somewhere else, sub-stringinput(1:p-1)
is the non-space item. - Increment
n
and putitem(n)=input(1:p-1)
. Ifn
equalsmax_item
, exit. - Put spaces to
input(1:p-1)
asinput(1:p-1)=" "
. - Go to 1.
The code is as follows.
program array
integer,parameter :: max_item=5
character(len=100) :: input
character(len=100) :: item(max_item)
integer :: n,p
input = " 100 50 3.14 abc 1.23e-2 12345"
n = 0
do
! step 1: remove the leading spaces
input = adjustl(input)
! step 2: position of the non-space character
p = scan(input," ")
if(p==0) exit
! step 3: get the substring
n = n + 1
item(n) = input(1:p-1)
if(n==max_item) exit
! step 4: erase the first non-space string
input(1:p-1) = " "
end do
end program array
Exercises
- Make a subroutine to split
input
and returnitems
andn
. - Modify the subroutine not to alter the input string.
Back to index.html.