Cuban Journal of Agricultural Science

versión On-line ISSN 2079-3480

Cuban J. Agric. Sci. vol.53 no.4 Mayabeque oct.-dic. 2019  Epub 05-Dic-2019

 

BIOMATHEMATICS

Statistical procedure for the analysis of experiments with repeated measures over time in the agricultural and livestock field

Sarai Gómez1*, Verena Torres1, Yoleisy García1, Magaly Herrera1, Yolaine Medina1, R. Rodríguez1

1Instituto de Ciencia Animal, Carretera Central, km 47 ½, San José de las Lajas, Mayabeque, Cuba. CP: 32 700.

ABSTRACT

The objective of this study was to propose a statistical analysis methodology to guide researchers who take repeated measurements over time on the same experimental unit, illustrated with a case study from the agricultural and livestock field in which legumes were used as substrate for in vitro gas production. The variable analyzed was in vitro gas production. The Pearson correlation matrix was calculated; coefficients above 0.82 confirmed the association among sampling times. Sphericity was tested with the Mauchly statistic and, because it was not met, the degrees of freedom were adjusted. The normality assumption was also tested and rejected (P < 0.0100), so a generalized linear mixed model was used and the Poisson, Gamma, Binomial, Normal and Lognormal variants were compared to determine the distribution followed by the data, which in this case was Gamma. The Toeplitz variance-covariance structure was selected as the best fit for the model on the basis of the lowest values of the information criteria. Verification of the theoretical assumptions required for repeated measures defined the model to be used. The generalized linear mixed model increased the accuracy of the results by properly estimating the variance-covariance structure and made it possible to analyze unbalanced data. A working methodology is proposed for processing data with repeated measures over time.

Key words: information criteria; covariance structures; correlation matrix

INTRODUCTION

With the continuous development of research and the search for statistical analysis strategies that give more precise and accurate results, attention has focused on determining which strategy is most appropriate for analyzing data from experiments with repeated measures taken at different times on the same experimental unit.

In the agricultural and livestock field, experiments of this kind are increasingly common. Measuring the same experimental unit repeatedly over time is cheaper than using a different experimental unit for each measurement: fewer experimental units are required, sample size and costs are reduced, and test power and accuracy in estimating trends over time are improved. When this type of analysis is properly applied, it strengthens the validity of statistical conclusions, because the parameters of the analysis model are estimated more accurately (Kuehl 2000).

Designs with repeated measures over time have been studied with univariate and multivariate analysis of variance (ANOVA and MANOVA, respectively) (Fernández et al. 1996). Other authors have used linear mixed models and generalized linear mixed models because of their advantages over the traditional methods (Balzarini and Macchiavelli 2005 and Vallejo et al. 2010).

The mixed model procedure allows data from experiments with repeated measures to be analyzed correctly and efficiently. It models the structure of the variance-covariance matrix, which accounts for the correlations between repeated measures and for heterogeneous variances, so that inferences are more precise.

The objective of this study was to propose a statistical analysis methodology to guide researchers who run experiments with repeated measures over time on the same experimental unit. It is presented through a case study from the agricultural and livestock field in which legumes were used as substrate for in vitro gas production.

MATERIALS AND METHODS

Experimental procedure. Data came from an experiment of the Department of Physiology of the Institute of Animal Science, in which repeated measures were taken at different times on the same experimental units using the in vitro gas production technique.

Three shrub legumes were evaluated: Acacia cornigera (Acacia), Albizia lebbekoides (Albizia) and Leucaena leucocephala (Leucaena). Samples were collected from fully established plants in an arboretum of the Institute of Animal Science (San José de las Lajas, Mayabeque, Cuba) on a typical red ferralitic soil (Hernández et al. 2015), without fertilization or irrigation. Leaves and small stems (smaller than 5 mm) were collected by hand at a height of 1.5 m, simulating animal browsing. The plant material was dried in a forced-air oven at 60 °C for 72 h and then ground in a hammer mill to a particle size of 1 mm. It was preserved in sealed nylon bags and sent to the University of Zaragoza (Spain) for chemical analysis and in vitro evaluations (Rodríguez et al. 2014).

The variable studied was in vitro gas production (mL g⁻¹ OMinc), measured at 2, 4, 6, 8, 10, 12, 16 and 24 hours. Fermentation was stopped after the gas measurement at 24 hours.

Statistical analysis. To determine whether sampling times were associated, the Pearson correlation matrix was obtained. The sphericity assumption was tested with the Mauchly statistic (Pérez and Medrano 2010 and Acosta and Sánchez 2015). When it was violated, the degrees of freedom were adjusted with the epsilon of Greenhouse and Geisser (1959) and Huynh and Feldt (1976). The normality assumption was checked with the tests of Shapiro and Wilk (1965) and Kolmogorov-Smirnov as modified by Lilliefors (1967).

To obtain estimates of the model parameters with lower bias and lower variance, the following variance-covariance structures were examined: Unstructured (UN), Toeplitz (TOEP), first-order autoregressive (AR(1)), Compound Symmetry (CS) and Variance Components (VC). The structure was selected on the basis of the lowest values of the information criteria: Akaike (AIC), corrected Akaike (AICC) and Bayesian (BIC).
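For reference, the three criteria are commonly defined as shown here. These are the textbook forms and are given as an assumption: $\ell$ (the maximized log-likelihood), $d$ (the number of estimated parameters) and $n$ (the effective sample size) are notation introduced for this sketch, and the exact values of $d$ and $n$ used by a particular software procedure may differ.

$$
\begin{aligned}
\mathrm{AIC}  &= -2\ell + 2d \\
\mathrm{AICC} &= -2\ell + \frac{2dn}{n - d - 1} \\
\mathrm{BIC}  &= -2\ell + d \ln n
\end{aligned}
$$

In every case a smaller value indicates a better trade-off between fit and number of parameters, which is why the lowest values guide the choice of structure.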

Parameters were estimated by the restricted maximum likelihood method and means were compared with Tukey's multiple comparison test, as modified by Kramer, at a significance level of P < 0.05 (Tukey 1958).

The estimation method was the Laplace approximation implemented in the GLIMMIX procedure of SAS (Gualdrón 2009 and Vallejo et al. 2014). Data were processed with the SAS (2013) statistical package, version 9.3.

To determine the distribution followed by the data, SAS Proc Severity was used, and the Poisson (logarithmic), Gamma (reciprocal), Lognormal (log), Normal (identity) and Binomial (logit) distributions were analyzed with their corresponding link functions. The expression for the generalized linear mixed model was the following:

$$y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + b_k + e_{ijk}$$

Where:

$y_{ijk}$ - response variable

$\mu$ - intercept or common mean

$\alpha_i$ - fixed effect of the i-th treatment (i = 1, …, n)

$\beta_j$ - fixed effect of the j-th time (j = 1, …, n)

$(\alpha\beta)_{ij}$ - fixed effect of the interaction between the i-th treatment and the j-th time (ij = 1, …, n)

$b_k$ - random effect of the k-th experimental unit (k = 1, …, n)

$e_{ijk}$ - random error associated with the observations
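As a hedged aside, a generalized linear mixed model is usually written on the scale of a link function rather than with an additive error. A sketch using the symbols defined above, where the link $g$, the linear predictor $\eta_{ijk}$, the variance $\sigma_b^2$ and the normality of $b_k$ are assumptions introduced here and not taken from the source:

$$g\big(E[y_{ijk} \mid b_k]\big) = \eta_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + b_k, \qquad b_k \sim N(0, \sigma_b^2)$$

With the identity link and normally distributed errors, this reduces to the additive expression given for the response variable.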

RESULTS AND DISCUSSION

Table 1 shows the correlation coefficients for in vitro gas production. All values were above 0.82, which is evidence of high correlation throughout the experiment, determined by the proximity in time among sampling times. Therefore, the assumption of independent errors was not met. From hour 10 onward, correlation coefficients reached values of 1.00. By then, gas production had already expressed its maximum value and a stability phase of the process began.

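Table 1 Correlation coefficients for the experiment of in vitro gas production

| | H2 | H4 | H6 | H8 | H10 | H12 | H16 | H24 |
|---|---|---|---|---|---|---|---|---|
| H2 | 1 | | | | | | | |
| H4 | 0.95 | 1 | | | | | | |
| H6 | 0.91 | 0.99 | 1 | | | | | |
| H8 | 0.82 | 0.93 | 0.96 | 1 | | | | |
| H10 | 0.85 | 0.95 | 0.99 | 0.96 | 1 | | | |
| H12 | 0.85 | 0.96 | 0.99 | 0.97 | 1 | 1 | | |
| H16 | 0.85 | 0.95 | 0.99 | 0.97 | 1 | 1 | 1 | |
| H24 | 0.85 | 0.95 | 0.99 | 0.97 | 0.99 | 0.99 | 1 | 1 |

H: sampling times (hours)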
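A correlation matrix of this kind can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names and the `toy` readings are hypothetical and are not the study data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {(name_i, name_j): r} for every pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Hypothetical gas readings, one value per experimental unit (NOT the study data).
toy = {
    "H2": [8.1, 9.0, 4.5, 4.9, 9.4, 8.8],
    "H4": [18.9, 19.6, 9.1, 9.6, 18.2, 19.0],
    "H6": [28.7, 29.5, 12.5, 13.0, 25.1, 26.3],
}
corr = correlation_matrix(toy)

# Print the lower triangle, in the same layout as a correlation table.
names = list(toy)
for i, row in enumerate(names):
    print(row, [round(corr[(row, col)], 2) for col in names[: i + 1]])
```

Each column holds the readings at one sampling time, so a high coefficient between two columns means that units with high gas production at one time also tend to have high production at the other.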

Another assumption required for repeated measures over time is sphericity, which requires the variances of the differences between all pairs of observations to be equal (Caleja et al. 2015). Table 2 shows the W statistic of Mauchly (1940) and the correction factors (epsilon). With P = 0.001, the hypothesis that the variance-covariance matrix is spherical was rejected. That is, variances were not homogeneous (Kirk 1982), and the degrees of freedom had to be adjusted with the Greenhouse-Geisser and Huynh-Feldt epsilon.

Table 2 Mauchly sphericity test and epsilon correction for the experiment of in vitro gas production

| Variable | Mauchly's W | Approx. χ² | df | P value | Epsilon: Greenhouse-Geisser | Epsilon: Huynh-Feldt | Epsilon: lower bound |
|---|---|---|---|---|---|---|---|
| In vitro PGas | 0.00 | 398 | 27 | 0.001 | 0.15 | 0.15 | 0.14 |

PGas: gas production
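Two of the entries in Table 2 follow directly from the number of repeated measures. The sketch below assumes the standard formulas for the degrees of freedom of Mauchly's chi-square approximation and for the lower bound of epsilon; the function names are illustrative only.

```python
def mauchly_df(k):
    """Degrees of freedom of Mauchly's chi-square approximation for k repeated measures."""
    return k * (k - 1) // 2 - 1

def epsilon_lower_bound(k):
    """Smallest possible value of the epsilon correction for k repeated measures."""
    return 1.0 / (k - 1)

k = 8  # sampling times: 2, 4, 6, 8, 10, 12, 16 and 24 hours
print(mauchly_df(k))                      # 27
print(round(epsilon_lower_bound(k), 2))   # 0.14
```

With eight sampling times, both results agree with the df and lower-bound columns of Table 2.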

Table 3 shows the traditional approach that accompanies univariate analysis of variance: the degrees of freedom of both the time effect and its error term are reduced to compensate for the positive bias of the F test when the assumption of homogeneity of variances is not met. To approximate the sphericity assumption, the degrees of freedom for time were reduced to 2.11 (Greenhouse-Geisser) and 2.79 (Huynh-Feldt). The same F value was obtained in all four cases (uncorrected and the three corrections), and each led to the same conclusion: the significance level was below 0.05, so the hypothesis of equality of means was rejected (Pascual et al. 1996).

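Table 3 Adjustment of degrees of freedom for the experiment of in vitro gas production

| Source | Correction | Type III sum of squares | df | Mean square | F | Sig. |
|---|---|---|---|---|---|---|
| Sampling times | Sphericity assumed | 28710.17 | 7 | 4101.45 | 1453.17 | 0.00 |
| Sampling times | Greenhouse-Geisser | 28710.17 | 2.11 | 13591.30 | 1453.17 | 0.00 |
| Sampling times | Huynh-Feldt | 28710.17 | 2.79 | 10271.34 | 1453.17 | 0.00 |
| Sampling times | Lower bound | 28710.17 | 1.00 | 28710.17 | 1453.17 | 0.00 |
| Error (times) | Sphericity assumed | 296.35 | 105 | 2.82 | | |
| Error (times) | Greenhouse-Geisser | 296.35 | 31.69 | 9.35 | | |
| Error (times) | Huynh-Feldt | 296.35 | 41.93 | 7.07 | | |
| Error (times) | Lower bound | 296.35 | 15.00 | 19.8 | | |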
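The reason the F value does not change under any correction can be seen with a short calculation: epsilon multiplies the numerator and denominator degrees of freedom by the same factor, so it cancels in the ratio of mean squares, and only the reference distribution used for the P value changes. The sketch below uses the sums of squares from Table 3; the epsilon values are inferred here from the corrected degrees of freedom (df / 7), which is an assumption of this example.

```python
SS_TIME, SS_ERROR = 28710.17, 296.35   # Type III sums of squares from Table 3
DF_TIME, DF_ERROR = 7, 105             # uncorrected degrees of freedom

def corrected_f(epsilon):
    """Return (df_time, df_error, F) after scaling both df by epsilon."""
    df_t = epsilon * DF_TIME
    df_e = epsilon * DF_ERROR
    f = (SS_TIME / df_t) / (SS_ERROR / df_e)
    return df_t, df_e, f

# Epsilon values implied by the corrected df reported for the time effect.
epsilons = {
    "sphericity assumed": 1.0,
    "Greenhouse-Geisser": 2.11 / 7,
    "Huynh-Feldt": 2.79 / 7,
    "lower bound": 1 / 7,
}
results = {name: corrected_f(eps) for name, eps in epsilons.items()}

for name, (df_t, df_e, f) in results.items():
    print(f"{name:20s} df_time={df_t:6.2f} df_error={df_e:7.2f} F={f:.2f}")
```

The computed F matches the tabulated value up to rounding in every row, while the degrees of freedom shrink as epsilon decreases.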

Table 4 shows the results of the statistical tests of the normality assumption, with P < 0.0100. For in vitro gas production the hypothesis of normality was rejected: the residuals did not approximate a normal distribution. The generalized linear mixed model was therefore selected as an alternative analysis when this assumption is not met.

Table 4 Normality test for the experiment of in vitro gas production

| Statistical test | P value (in vitro PGas) |
|---|---|
| Shapiro-Wilk | 0.0000 |
| Kolmogorov-Smirnov | <0.0100 |

Table 5 shows the variance-covariance structures and information criteria studied. The UN, TOEP and CS structures performed identically, with the lowest values of the information criteria and the same residual value, so any of them could have been selected. TOEP was chosen following Fernández et al. (1996) and Vallejo et al. (2010), who state that observations recorded on the same subject are positively and gradually correlated, and that the variance-covariance matrix among repeated measures has a TOEP structure. That is, the measurements closest to each other have the highest correlation.

Table 5 Variance-covariance structures and information criteria for in vitro gas production in the legumes experiment

| Information criterion | UN | TOEP | AR(1) | VC | CS |
|---|---|---|---|---|---|
| AIC | 751.84 | 751.84 | 753.84 | 753.84 | 751.84 |
| AICC | 763.84 | 763.84 | 766.88 | 766.88 | 763.84 |
| BIC | 746.43 | 746.43 | 748.22 | 748.22 | 746.43 |

Residual: 0.01
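The selection rule (lowest value of each criterion) can be expressed as a small lookup over the values in Table 5. This is only an illustrative sketch; the dictionary layout and the function name `best_structures` are assumptions of the example.

```python
# Information criteria from Table 5, keyed by variance-covariance structure.
criteria = {
    "UN":    {"AIC": 751.84, "AICC": 763.84, "BIC": 746.43},
    "TOEP":  {"AIC": 751.84, "AICC": 763.84, "BIC": 746.43},
    "AR(1)": {"AIC": 753.84, "AICC": 766.88, "BIC": 748.22},
    "VC":    {"AIC": 753.84, "AICC": 766.88, "BIC": 748.22},
    "CS":    {"AIC": 751.84, "AICC": 763.84, "BIC": 746.43},
}

def best_structures(table, criterion):
    """Return every structure that attains the lowest value of the given criterion."""
    lowest = min(row[criterion] for row in table.values())
    return sorted(name for name, row in table.items() if row[criterion] == lowest)

for criterion in ("AIC", "AICC", "BIC"):
    print(criterion, best_structures(criteria, criterion))
```

All three criteria return the same three-way tie among CS, TOEP and UN, so the criteria alone do not single out one structure and the final choice rests on the subject-matter argument given in the text.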

Table 6 shows that, for accumulated gas production, there was an interaction between treatment and sampling time (P = 0.0024). With this model, in vitro gas production of Acacia and Leucaena did not differ at any sampling time, and their values were always higher than those of Albizia. The highest in vitro gas production was reached at 24 hours with Acacia, which did not differ from Leucaena at 24 hours or from Acacia at 16 hours. The lowest in vitro gas production was obtained with Albizia at the beginning of fermentation.

Table 6 Means of the interaction between treatment and sampling times for accumulated gas production

| Time (h) | Acacia | Albizia | Leucaena |
|---|---|---|---|
| 2 | 2.15 m (8.57) | 1.54 n (4.67) | 2.23 m (9.26) |
| 4 | 2.96 ijk (19.23) | 2.23 m (9.31) | 2.93 ijk (18.74) |
| 6 | 3.37 fg (29.05) | 2.55 l (12.82) | 3.25 fgh (25.75) |
| 8 | 3.63 de (37.87) | 2.76 kl (15.80) | 3.42 ef (30.58) |
| 10 | 3.83 cd (45.90) | 2.91 jk (18.45) | 3.61 de (37.03) |
| 12 | 3.96 bc (52.67) | 3.03 hij (20.79) | 3.75 cd (42.37) |
| 16 | 4.13 ab (61.99) | 3.16 ghi (23.58) | 3.90 bc (49.36) |
| 24 | 4.28 a (72.18) | 3.31 fg (27.29) | 4.07 ab (58.76) |

SE ±0.0494; Sign. P=0.0024

a,b,c,d,e,f,g,h,i,j,k,l,m,n Different letters indicate significant differences for P<0.05

( ) original means
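The relation between the two values in each cell can be checked numerically. In the cells sampled below, the model-scale mean is close to the natural logarithm of the original mean. This is an observation drawn from the tabulated numbers, not a statement made in the source about the scale used, and the variable names are illustrative.

```python
from math import log

# (model-scale mean, original mean) for the 2 h and 24 h rows of Table 6.
table6 = {
    ("Acacia", 2): (2.15, 8.57),
    ("Albizia", 2): (1.54, 4.67),
    ("Leucaena", 2): (2.23, 9.26),
    ("Acacia", 24): (4.28, 72.18),
    ("Albizia", 24): (3.31, 27.29),
    ("Leucaena", 24): (4.07, 58.76),
}

# Largest absolute gap between ln(original mean) and the model-scale mean.
max_gap = max(abs(log(original) - model) for model, original in table6.values())

for (treatment, hour), (model, original) in table6.items():
    print(f"{treatment:9s} {hour:2d} h  ln({original}) = {log(original):.2f}  table: {model}")
```

For these cells the gap stays within the rounding of the table, which suggests that differences on the model scale correspond to ratios of gas production on the original scale.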

Based on these results, the following steps are proposed for research in the agricultural and livestock field that evaluates experiments with repeated measures over time on the same experimental unit:

  1. Calculate the Pearson correlation matrix to determine the degree of association among sampling times.

  2. Check the sphericity condition with the Mauchly test and, if it is not met, apply the correction factor.

    • The Mauchly test checks whether the variance-covariance matrix is spherical. If it is not, the probability of committing a type I error increases, so the degrees of freedom must be corrected with the Huynh-Feldt and Greenhouse-Geisser epsilon.

  3. Check the theoretical assumption of normality with the Kolmogorov-Smirnov and Shapiro-Wilk tests.

  4. Examine several variance-covariance structures to obtain estimates of the model parameters with lower bias and lower variance.

    • Unstructured (UN)

    • Toeplitz (TOEP)

    • Autoregressive (AR1)

    • Variance components (VC)

    • Compound symmetry (CS)

  5. Obtain the information criteria that help to select the most appropriate variance-covariance structure.

    • Akaike (AIC)

    • Corrected Akaike (AICC)

    • Bayesian (BIC)

  6. For the best fit of the model, choose the variance-covariance structure with the lowest values of the information criteria.

  7. Define the model to be used for each particular situation:

    1. If the assumption of normality is met, use the linear mixed model.

    2. If the assumption of normality is not met, use the generalized linear mixed model and try the Poisson, Gamma, Lognormal, Normal and Binomial distributions with their respective link functions (logarithmic, reciprocal, logarithmic, identity and logit).
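The decision points in steps 2, 3 and 7 can be summarized as a small function. This is a hypothetical sketch: the function name, the returned keys and the default threshold `alpha=0.05` are assumptions of the example, not part of the proposed methodology.

```python
def plan_analysis(mauchly_p, normality_p, alpha=0.05):
    """Map the P values of the assumption tests to the choices in steps 2 and 7."""
    sphericity_rejected = mauchly_p < alpha
    normality_rejected = normality_p < alpha
    return {
        # Step 2: correct the degrees of freedom only when sphericity is rejected.
        "adjust_df_with_epsilon": sphericity_rejected,
        # Step 7: the normality result decides which mixed model is fitted.
        "model": (
            "generalized linear mixed model"
            if normality_rejected
            else "linear mixed model"
        ),
    }

# Mauchly P value from Table 2; the normality P value is a placeholder below 0.01.
plan = plan_analysis(mauchly_p=0.001, normality_p=0.0099)
print(plan)
```

Steps 4 to 6 (comparing covariance structures with the information criteria) apply under either model and are therefore left out of this function.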

CONCLUSIONS

The tests applied showed that the residuals did not meet the normality assumption, which led to the use of the generalized linear mixed model as an alternative for analyzing experiments with repeated measures over time in the agricultural sector. The information criteria made it possible to identify the optimal structure of the variance-covariance matrix. A working methodology is proposed for processing data with these characteristics.

REFERENCES

Acosta, M.M. & Sánchez, J.P. 2015. Desempeño psicométrico de dos escalas de autoeficacia e intereses profesionales en una muestra de estudiantes de secundaria. CES Psicología. 8(2): 156-170. Available: http://www.redalyc.org/articulo.oa?id=423542417009.

Balzarini, M. & Macchiavelli, R. 2005. Aplicaciones de Modelo Lineal Mixto en agricultura y forestería. Curso Internacional Aplicaciones de Modelo Lineal Mixto en Agricultura y Foresteria. CATIE, Turrialba, Costa Rica, Mimeo, p. 189.

Caleja, C., Barros, L., Antonio, A.L., Ciric, A., Barreira, J.C.M., Sokovic, M., Oliveira, B.P.P., Santos-Buelga, C. & Ferreira, I.C.F.R. 2015. Development of a functional dairy food: Exploring bioactive and preservation effects of chamomile (Matricaria recutita L.). Journal of Functional Foods. 16: 114-124. ISSN: 1756-4646. Available: http://dx.doi.org/10.1016/j.jff.2015.04.033.

Fernández, P., Menéndez, I.A., Vallejo, G. & Herrero, J. 1996. Comparación de la potencia y robustez del AMVAR con dependencia serial en el error, cuando diferentes asunciones distribucionales son violadas. Departamento de Psicología, Universidad de Oviedo. Psicothema. 4(1): 277-290. ISSN: 0214-9915.

Greenhouse, S. & Geisser, S. 1959. On methods in the analysis of profile data. Psychometrika. 24(2): 95-112. Online ISSN: 1860-0980. Available: https://doi.org/10.1007/BF02289823.

Gualdrón, J.C. 2009. Influencia de los criterios de selección AIC Y BIC para la selección del modelo de evolución y la reconstrucción del análisis bayesiano. Available: http://tux.uis.edu.co/labsist/docencia/finales/final2009-I/2050158-20070.pdf. [Consulted: June 20, 2018].

Hernández, J.A., Pérez, J.J., Bosch, I.D. & Castro, S.N. 2015. Clasificación de los suelos de Cuba 2015. Mayabeque, Cuba. Ediciones INCA. 93 p. ISBN: 978-959-7023-77-7.

Huynh, H. & Feldt, L.S. 1976. Estimation of the Box correction for degrees of freedom from sample data in the randomized block and split-plot designs. J. Educ. Stat. 1(1): 69-82. doi:10.3102/10769986001001069. Available: https://journals.sagepub.com/

Kirk, R. 1982. Experimental design: Procedures for the behavioral sciences. 2nd Edition. Brooks Cole Publishing Company. California. p. 55. ISBN-10: 081850286X.

Kuehl, R.O. 2000. Diseño de experimentos, Principios estadísticos de diseño y análisis de investigación. Segunda edición. Ed. Thomson Learning. Universidad de Arizona, Arizona, USA. p. 492-519. ISBN-0-534-36834-4. Available: http://www.thomsonlearning.com.mx.

Lilliefors, H. 1967. On the Kolmogorov-Smirnov Test for Normality with Mean and Variance Unknown. J. Am. Stat. Assoc. 62(318): 399-402. DOI: 10.1080/01621459.1967.10482916. Available: http://www.jstor.org/stable/2283970 http://academicos.fciencias.unam.mx/wp-content/uploads/sites/91/2015/04/Lillifors_normality_ks.pdf.

Mauchly, J. 1940. Significance test of sphericity of a normal n-variate distribution. Annals of Mathematical Statistics. 11(2): 204-209. Available: https://www.jstor.org/stable/2235878

Pascual, J., Frías, D. & García, F. 1996. Manual de psicología experimental. Metodología de investigación. Libro Entero. Primera Edición. Editorial Ariel, S.A. Barcelona. p. 139-145. ISBN: 84-344-0868-6. Available: https://www.academia.edu/23242604/Manual_de_psicología_experimental_metodología_de_investigación.

Pérez, E. & Medrano, L. 2010. Análisis factorial exploratorio: Bases conceptuales y metodológicas. Revista Argentina de Ciencias del Comportamiento (RACC). 2(1): 58-66. ISSN-e 1852-4206. Available: http://www.psyche.unc.edu.ar/racc.

Rodríguez, R., de la Fuente, G., Gómez, S. & Fondevila, M. 2014. Biological effect of tannins from different vegetal origin on microbial and fermentation trait in vitro. Anim. Prod. Sci. 54(8): 1039-1046. ISSN: 1836-0939. Available: http://dx.doi.org/10.1071/AN13045.

SAS. 2013. Sistema de análisis estadístico. Universidad de Nebraska. Versión 9.3.

Shapiro, S. & Wilk, M.B. 1965. An analysis of variance test for normality (complete samples). Biometrika. 52(3/4): 591-611. doi:10.1093/biomet/52.3-4.591. JSTOR 2333709. MR 0205384. p. 593.

Tukey, J.W. 1958. Bias and confidence in not quite large samples. The Annals of Mathematical Statistics. 29(2): 614-623. doi:10.1214/aoms/1177706647. Available: https://projecteuclid.org/euclid.aoms/1177706647

Vallejo, G., Arnau, J., Bono, R., Fernández, P. & Tuero, E. 2010. Selección de modelos anidados para datos longitudinales usando criterios de información y la estrategia de ajuste condicional. Psicothema. 22(2): 323-333. ISSN: 0214-9915.

Vallejo, G., Tuero, E., Núñez, J.C. & Rosario, P. 2014. Performance evaluation of recent information criteria for selecting multilevel models in Behavioral and Social Sciences. International Journal of Clinical and Health Psychology. 14(1): 48-57. ISSN: 1697-2600. Available: https://www.redalyc.org/pdf/337/33729172006.pdf

Received: January 04, 2019; Accepted: July 05, 2019
