INTRODUCTION
With the continuous development of research and the search for new statistical analysis strategies that provide greater precision and accuracy in obtaining results, attention has been focused on determining which is the most appropriate for the analysis of data from experiments with repeated measures at different times in the same experimental unit.
In the agricultural and livestock field, more and more experiments with these characteristics are carried out, since repeated measurements in the same experimental unit over time are cheaper than the use of a different experimental unit for each measurement over time, less experimental units are required, sample size and costs are reduced, test power and accuracy in estimating trends over time are improved. If this type of analysis is properly applied, it emphasizes the validity of statistical conclusions, because it has greater accuracy in the estimation of parameters of the analysis model (^{Kuehl 2000}).
The design with repeated measures over time was studied with the analysis of univariate and multivariate variance (ANOVA and MANOVA), respectively (^{Fernández et al. 1996}) and other authors used the mixed linear models and the mixed generalized linear models for the advantages that they present with respect to the traditional ones (^{Balzarini and Macchiavelli 2005} and ^{Vallejo et al. 2010}).
The statistical procedure with mixed models allows to analyze correctly and efficiently the data of experiments with repeated measures, through the modeling of the structure of variance-covariance matrix that consider the correlations between repeated measures and the presence of heterogeneous variances to make more precise inferences.
The objective of this study was the proposal of a methodology for statistical analysis to guide the researcher by using experiments with repeated measures over time in the same experimental unit. It is presented through a case study with legumes as a substrate in the production of in vitro gas in the agricultural and livestock field.
MATERIALS AND METHODS
Experimental procedure. The information of an experiment belonging to the Department of Physiology of the Institute of Animal Science was carried out with repeated measures at different times in time in the same experimental units, using the in vitro gas production technique.
Three shrub legumes were evaluated: Acacia cornigera (Acacia), Albizia lebbekoides (Albizia) and Leucaena leucocephala (Leucaena). Samples were collected from fully established plants in an Arboretum of the Institute of Animal Science (San José de las Lajas, Mayabeque, Cuba) in a typical red ferralitic soil (^{Hernández et al. 2015}), without fertilization or irrigation. Leaves and small stems (smaller than 5 mm) of legumes were manually collected simulating the browsing of the animals at 1.5 m height. The plant material was dried in a forced air oven at 60 ° C for 72 h. Subsequently, it was ground in a hammer mill, at a particle size of 1 mm. The plant material was properly preserved in sealed nylon bags and sent to the University of Zaragoza (Spain) for further chemical analysis and in vitro evaluations (^{Rodríguez et al. 2014}).
The variable studied was in vitro gas production (mL g^{-1} OMinc) measured at 2, 4, 6, 8, 10, 12, 16 and 24 hours, at which point the fermentation was stopped after measuring the gas.
Statistical analysis. To determine the existence of an association among sampling schedules, the Pearson correlation matrix was obtained. The sphericity assumption was calculated through the Mauchly statistic (^{Pérez and Medrano 2010} and ^{Acosta and Sánchez 2015}). Before the breach of this, the fit of the degrees of freedom was performed by means of the epsilon of ^{Greenhouse and Geisser (1959)} and ^{Huynh and Feldt (1976)}. Compliance with the assumption of normality was verified by the tests of ^{Shapiro and Wilk (1965)} and Kolmogorov -Smirnov modified by ^{Lilliefors (1967)}.
In order to obtain estimates with lower bias and lower variance of model parameters, the variance-covariance structures were examined: Unstructured (UN), Toeplitz (TOEP), AutoRegressive of order 1 (Ar (1)), Composite Symmetry (CS) and Components of Variance (CV). These were selected from the smallest values of the information criteria: Akaike (AIC), Corrected Akaike (AICC) and Bayesian (BIC).
Parameters were estimated by the Maximum Restricted Likelihood method and means were compared with the multiple comparison test of Tukey, modified by Kramer, with a significance level for P <0.05 (^{Tukey 1956}).
The estimation method was the approach of Laplace contained in the GLIMMIX procedure of SAS (^{Gualdrón 2009} and ^{Vallejo et al. 2014}). Data processing was performed with ^{SAS (2013)} statistical package, version 9.3.
To determine the distribution followed by data, SAS Proc Severity was used and Poisson (Logarithmic), Gamma (Reciprocal), Normal Log (Log), Normal (Identity) and Binomial (Logit) distributions were analyzed with their corresponding bonding functions. The expression for the mixed generalized linear model was the following:
Where:
y_{ijk} |
- response variable |
µ |
- intercept or common mean |
α_{i} |
- fix effect of the i-th treatment (i=1,..…, n) |
β_{j} |
- fix effect of the j-th time (j= 1,........,n) |
(αβ)_{ij} |
- fix effect of the i-th treatment in interaction with the fix effect of the j-th time (ij=1, …,n) |
b_{k} |
- random effect of the k-th experimental unit (k= 1,.....,n) |
e_{ijk} |
- random error asociated to all observationes |
RESULTS AND DISCUSSION
Table 1 shows the correlation coefficients for the variable in vitro gas production where values superior to 0.82 were obtained, which evidences the existence of high correlation throughout the experiment, determined by the proximity in time among sampling schedules. Therefore, the assumption of error independence was not fulfilled. From the hour ten, correlation coefficients reached values of 1.00. At this time, the gas production already expressed its maximum value and begins a stability phase in the process.
H2 | H4 | H6 | H8 | H10 | H12 | H16 | H24 | |
H2 | 1 | |||||||
H4 | 0.95 | 1 | ||||||
H6 | 0.91 | 0.99 | 1 | |||||
H8 | 0.82 | 0.93 | 0.96 | 1 | ||||
H10 | 0.85 | 0.95 | 0.99 | 0.96 | 1 | |||
H12 | 0.85 | 0.96 | 0.99 | 0.97 | 1 | 1 | ||
H16 | 0.85 | 0.95 | 0.99 | 0.97 | 1 | 1 | 1 | |
H24 | 0.85 | 0.95 | 0.99 | 0.97 | 0.99 | 0.99 | 1 | 1 |
H: times
Another necessary assumption in repeated measures over time is sphericity, which requires the variances of differences between all pairs of observations to be equal (^{Caleja et al. 2015}). Table 2 shows the results of the calculation of W statistic of ^{Mauchly (1940)} and the correction factor (epsilon) with P=0.001, which led to reject the hypothesis that variance-covariance matrix is spherical. That is, variances were not homogeneous (^{Kirk 1982}) and it was necessary to adjust the degrees of freedom by means of the Greenhouse-Geisser and Huynh-Feldt epsilon.
Variable | W of Mauchly | Aprox.χ² | FD | Value of P | Epsilon | ||
---|---|---|---|---|---|---|---|
Greenhouse-Geisser | Huynh-Feldt | Inferior limit | |||||
In vitro PGas | 0.00 | 398 | 27 | 0.001 | 0.15 | 0.15 | 0.14 |
PGas: gas production
Table 3 shows the traditional technique that accompanies the analysis of univariate variance of fit of degrees of freedom of variability explained by times, and the one attributed to the term of error through their reduction, where it was tried to compensate the positive bias of F test when the assumption of homogeneity of variances was not fulfilled. With the aim of an approach to sphericity assumption, a reduction in the degrees of freedom was performed, with values of 2.11 and 2.79, although it was observed that, in all cases, the same value of F was obtained (the uncorrected and the three corrected). This led to the same conclusion, since the level of significance was lower than 0.05 which allowed to reject the hypothesis of equality of means (^{Frías and García 1996}).
Origin | Type III of square sum | FD | Mean square | F | Signif. | |
---|---|---|---|---|---|---|
Schedules | Assumed sphericity | 28710.17 | 7 | 4101.45 | 1453.17 | 0.00 |
Greenhouse-Geisser | 28710.17 | 2.11 | 13591.30 | 1453.17 | 0.00 | |
Huynh-Feldt | 28710.17 | 2.79 | 10271.34 | 1453.17 | 0.00 | |
Inferior limit | 28710.17 | 1.00 | 28710.17 | 1453.17 | 0.00 | |
Error (times) | Assumed sphericity | 296.35 | 105 | 2.82 | ||
Greenhouse-Geisser | 296.35 | 31.69 | 9.35 | |||
Huynh-Feldt | 296.35 | 41.93 | 7.07 | |||
Inferior limit | 296.35 | 15.00 | 19.8 |
Results of statistic tests for checking normality assumption, with P <0.0100, appear in table 4. For the variable in vitro gas production, this hypothesis was rejected, residues did not approximate to a normal distribution and this allowed the selection of the mixed generalized linear model as an alternative analysis for non-fulfilling this assumption.
Table 5 shows the variance-covariance structures and information criteria studied. UN, TOEP and CS structures showed the same performance, so that any of them could be selected. The analysis was based on the lower values of the obtained information criteria, and the same residual value. However, TOEP was selected after studying the statements by ^{Fernández et al. (1996)} and ^{Vallejo et al. (2010)}, who expressed that observations recorded from the same subject, in addition to being positive and gradually correlated, show a variance-covariance matrix among repeated measures that have a TOEP structure. That means that the closest scores have a higher correlation.
Information criteria | Variance-covariance structures | ||||
---|---|---|---|---|---|
UN | TOEP | Ar(1) | VC | CS | |
AIC | 751.84 | 751.84 | 753.84 | 753.84 | 751.84 |
AICC | 763.84 | 763.84 | 766.88 | 766.88 | 763.84 |
BIC | 746.43 | 746.43 | 748.22 | 748.22 | 746.43 |
Residual | 0.01 |
Table 6 shows that, for the variable accumulated gas production, there was interaction among the factors treatment and sampling times (P = 0.0024). With the application of this model, it was observed that in vitro gas production for Acacia and Leucaena, showed no differences between them at any sampling time and their values were always superior to those of Albizia. It is appreciated that the highest in vitro gas productions are reached at 24 hours. Acacia and Leucaena showed a similar performance, as well as acacia at 16 hours. On the other hand, the lowest in vitro gas production was obtained with albizia at the beginning of fermentation.
Times (h) | Treatment | |||
---|---|---|---|---|
Acacia | Albizia | Leucaena | SE± Sign. | |
2 | 2.15 ^{m} (8.57) | 1.54 ^{n} (4.67) | 2.23 ^{m} (9.26) | ±0.0494 P=0.0024 |
4 | 2.96 ^{ijk} (19.23) | 2.23 ^{m} (9.31) | 2.93 ^{ijk} (18.74) | |
6 | 3.37 ^{fg} (29.05) | 2.55 ^{l} (12.82) | 3.25 ^{fgh} (25.75) | |
8 | 3.63 ^{de} (37.87) | 2.76 ^{kl} (15.80) | 3.42 ^{ef} (30.58) | |
10 | 3.83 ^{cd} (45.90) | 2.91 ^{jk} (18.45) | 3.61 ^{de} (37.03) | |
12 | 3.96 ^{bc} (52.67) | 3.03 ^{hij} (20.79) | 3.75 ^{cd} (42.37) | |
16 | 4.13 ^{ab} (61.99) | 3.16 ^{ghi} (23.58) | 3.90 ^{bc} (49.36) | |
24 | 4.28 ^{a} (72.18) | 3.31 ^{fg} (27.29) | 4.07 ^{ab} (58.76) |
^{a,b,c,d,e,f,g,h,i,j,k,l,m,n} Different letters indicate significant differences for P<0.05
( ) original means
From the results, a methodological proposal is made and the steps to be followed in research in the agricultural and livestock field in which experiments with repeated measures over time in the same experimental unit are evaluated and described with greater precision:
Calculate the Pearson correlation matrix to determine the degree of association among sampling times
Analyze the fulfillment of the sphericity condition using the Mauchly test and, otherwise, apply the correction factor.
Analyze the theoretical assumption of normality with the tests of Kolmogorov-Smirnov and Shapiro-Wilk.
Examine several variance-covariance structures to obtain estimations with lower bias and lower variance of the model parameters.
Obtaining the information criteria that help to select the most appropriate variance-covariance structure.
For the best fit of the model, choose the lowest values of the information criteria, to obtain the most appropriate variance-covariance structure.
Define the model to be used for each particular situation:
If the assumption of normality was met, the Mixed Linear Model will be used.
If the assumption of normality was not met, try the variants of Poisson, Gamma, Normal Log, Normal and Binomial distributions with their respective Logarithmic, Identity and Logistic link functions. To use the mixed generalized linear model.
CONCLUSIONS
The breach of the assumption of normality of residues from the used tests defined the use of the mixed generalized linear model as an alternative of analysis in experiments with repeated measures over time in the agricultural sector. The information criteria allowed obtaining the optimal structure of the variance-covariance matrix. A work methodology is proposed for processing data with these characteristics.