Data preparation with RENUMF90

Yutaka Masuda

September 2019

Back to index.html.

Multiple-trait models

Model

For renumbering on a multiple-trait model, we still need a very similar instruction file to the single-trait model. Only differences include 1) a need to specify the model for each trait and 2) use of covariance matrix instead of a scalar value.

We will consider a 2-trait model with equal design matrices, which means all traits have the same model. The mathematical model is \[ \begin{aligned} y_{ijk:1} &= A_{i:1} + S_{j:1} + \beta_{1} x_{ijk:1} + u_{k:1} + e_{ijk:1}\\ y_{ijk:2} &= A_{i:2} + S_{j:2} + \beta_{2} x_{ijk:2} + u_{k:2} + e_{ijk:2} \end{aligned} \] where \(y_{ijk:t}\) is a observation for trait \(t\), \(A_{i:t}\) is the fixed effect for trait \(t\), \(S_{i:t}\) is also the fixed effect for trait \(t\), \(\beta_{t}\) is the fixed regression coefficient for trait \(t\), \(x_{ijk:t}\) is a covariate for trait \(t\), \(u_{k:t}\) is the additive genetic effect for trait \(t\), and \(e_{ijk:t}\) is the random residual effect. The genetic (\(\mathbf{G}_0\)) and residual (\(\mathbf{R}_0\)) covariance matrices are \[ \mathbf{G}_{0} = \left[ \begin{array}{rr} 0.50&-0.25\\ -0.25&1.00 \end{array} \right] \quad \text{and} \quad \mathbf{R}_{0} = \left[ \begin{array}{rr} 2.0&1.0\\ 1.0&1.5 \end{array} \right] \]

Required files

The raw data file is identical to the previous single-trait data except for the new column (trait 2) added. The data file should contain all required effects and observations for all traits.

  ID006  A  1  1.0  3.0  4.5
  ID009  A  2  1.0  2.0  7.5
  ID012  A  1  2.0  4.0  3.5
  ID007  B  2  2.0  6.0 -0.5
  ID010  B  1  1.0  3.0  5.5
  ID013  B  2  2.0  6.0  1.5
  ID008  C  1  2.0  6.0 -1.5
  ID011  C  2  1.0  6.0  2.5
  ID014  C  1  1.0  8.0  0.5
  ID015  C  2  2.0  4.0  4.5

The pedigree file is also the same as the previous one.

ID001        0         0
ID002        0         0
ID003        0         0
ID004        0         0
ID005        0         0
ID006        0         0
ID007    ID002     ID005
ID008    ID001     ID004
ID009    ID002     ID003
ID010    ID007     ID006
ID011    ID007     ID004
ID012    ID011     ID008
ID013    ID011     ID010
ID014    ID009     ID013
ID015    ID011     ID010

The instruction file for this 2-trait model is as follows.

DATAFILE
rawdata5.txt
TRAITS          # 2-trait model: put 2 positions
5 6
FIELDS_PASSED TO OUTPUT

WEIGHT(S)

RESIDUAL_VARIANCE
2.0 1.0
1.0 1.5
EFFECT           # 1st effect fixed
2 2 cross alpha
EFFECT           # 2nd effect fixed
3 3 cross alpha
EFFECT           # 3rd effect fixed
4 4 cov
EFFECT           # 4th effect
1 1 cross alpha
RANDOM           ## treated as a random effect
animal
FILE             ## pedigree file
rawpedigree5.txt
FILE_POS         ## animal, sire and dam IDs, and two 0s
1 2 3 0 0
(CO)VARIANCES    ## its variance component
 0.50 -0.25
-0.25  1.00

For a multiple-trait model, you can check the following points in an instruction file. Incorrect description in these statements falls in the sudden stop of RENUMF90.

Run RENUMF90 with the above files, and you can see the parameter file configured with the 2-trait model.

Missing observations

You can include missing observations in the data file. RENUMF90 does not distinguish the real and the missing observations; the program just read the raw data and just pass the observation to the renumbered file. Although RENUMF90 doesn’t care about missing observations, BLUPF90 cares about the values. So the generated parameter file (renf90.par) should include an option to define the missing code. You can write the following option in the instruction file, and RENUMF90 passes it to renf90.par.

OPTION missing -999

This example defines an observation with the value -999 as a missing code. The default is 0 (i.e. if you don’t define any missing code, the program assumes 0 as missing).

Note that RENUMF90 still recognizes 0 as a missing observation in the computations of the basic statistics (e.g. average and standard deviation) just for your information. In this case, the statistics may be inaccurate but all the files are correctly renumbered.

Different models across traits (Unequal design matrices)

Now we consider a multiple-trait model which has different mathematical models across traits. It is typically called a model with unequal design matrices. Let’s see the following 2-trait model. \[ \begin{aligned} y_{ijk:1} &= A_{i:1} + \phantom{S_{j:1} + } \beta_1 x_{ijk:1} + u_{k:1} + e_{ijk:1}\\ y_{ijk:2} &= A_{i:2} + S_{j:2} + \beta_{2} x_{ijk:2} + u_{k:2} + e_{ijk:2} \end{aligned} \] In this model, the \(S_{j:1}\) is missing for trait 1. To support it with RENUMF90, you can just change a line in the previous instruction file.

EFFECT                  # 2nd effect fixed
0 3 cross alpha

The position 0 means that this effect will not be included in the model for this trait.

We further consider the following 2-trait model. \[ \begin{aligned} y_{ijk:1} &= A_{i:1} \phantom{+ S_{j:1} } + \beta_1 x_{ijk:1} + u_{k:1} + e_{ijk:1}\\ y_{ijk:2} &= \phantom{A_{i:2} + } S_{j:2} + \beta_{2} x_{ijk:2} + u_{k:2} + e_{ijk:2} \end{aligned} \] In this case, \(S_{j:1}\) is missing for trait 1 and \(A_{i:2}\) is missing for trait 2. Applying the above principle to this case, you can figure out the following solution (just showing a piece of instruction file).

EFFECT                  # 1st effect fixed
2 0 cross alpha
EFFECT                  # 2nd effect fixed
0 3 cross alpha

Above statements are easy to understand which effects are missing for a particular trait.

There is another way to handle such a model. You can combine above 2 EFFECT lines into 1 as follows.

EFFECT                  # 1st and 2nd effects fixed
2 3 cross alpha

This statement may be useful when the different effects are considered as similar types of effects. For example, in dairy cattle, a contemporary group effect (like HYS: herd-year-season) is commonly included in the model for production traits but the definition should be different across parities. HYS in the 1st lactation and HYS in the 2nd lactation are different but considered as similar effects. With this compact definition, memory requirements would reduce. If you are confused with this notation, you don’t have to use the compact statement.

Summary

Back to index.html.