REVIEW
Specific residue: application of orthogonal contrasts when heteroscedasticity is present
Resíduo específico: aplicação de contrastes ortogonais na presença da heterocedasticidade
Maria Cristina Stolf Nogueira
USP/ESALQ - Depto. de Ciências Exatas, C.P. 9 - 13418-900 - Piracicaba, SP - Brasil. E-mail: <mcsnogue@esalq.usp.br>
ABSTRACT
When experimental data are submitted to analysis of variance, the assumption of homoscedasticity (variance homogeneity among treatments) associated with the adopted mathematical model must be satisfied. This verification is necessary to ensure that the tests applied in the analysis are correct. In some cases, when homoscedasticity is not observed, the analysis may be invalidated. An alternative to overcome this difficulty is the specific residue analysis, which consists of decomposing the residual sum of squares into components that adequately test the corresponding orthogonal contrasts of interest between treatment means. Although the decomposition of the residual sum of squares is seldom used, it is useful for a better understanding of the nature of the residual mean square and for validating the tests to be applied. The objective of this review is to illustrate the application of the specific residue as a valid and adequate alternative for analyzing data from experiments following completely randomized and randomized complete block designs in the presence of heteroscedasticity.
Key words: analysis of variance, completely randomized design, randomized complete block design.
RESUMO
Ao realizar-se a análise da variância de um conjunto de dados, pressupõe-se que o critério de homocedasticidade (homogeneidade de variâncias entre tratamentos), associada ao modelo matemático adotado, seja satisfeito. Esta verificação se faz necessária para a correta aplicação dos testes de significância. Quando não é satisfeita, em certos casos, compromete a normalidade dos erros. Uma alternativa para contornar essa deficiência é a aplicação do resíduo específico, que consiste em decompor a soma de quadrados do resíduo em componentes, correspondentes aos contrastes ortogonais de interesse, apropriados para testar cada contraste ortogonal entre médias de tratamentos. A decomposição da soma de quadrados do resíduo é um procedimento pouco utilizado, mas é útil para melhor compreensão da natureza do quadrado médio residual e garantir a validade dos testes aplicados. Nessa revisão avaliou-se a aplicação dos resíduos específicos como alternativa válida e adequada, na análise de dados obtidos de experimentos que seguem a estrutura dos delineamentos inteiramente casualizados e em blocos casualizados, na presença da heterocedasticidade.
Palavras-chave: análise da variância, delineamento inteiramente casualizado, delineamento em blocos casualizados.
Introduction
The analysis of variance of experimental data requires that the assumption of homoscedasticity (similar variances among treatments), associated with the adopted mathematical model, is satisfied. This verification is necessary for the correct application of the significance tests. When this condition is not met, heteroscedasticity (variance heterogeneity) prevails.
Heteroscedasticity can be classified as regular or irregular, according to Steel and Torrie (1981), based on Cochran (1947). The regular type generally originates from non-normality of the data and from some type of relationship between treatment means and variances. In this case, the data may be transformed to stabilize the variance among treatments and, as a consequence, the errors will approximately follow a normal distribution. The irregular type is characterized by certain treatments showing significantly higher variability than others, without necessarily presenting a relationship between means and variances. In this case, Cochran and Cox (1957, 1971) recommended that such high-variability treatments be omitted, or that treatments be subdivided into homoscedastic groups, so that the treatments within each group present similar variances, or else that the residual sum of squares (SSResidual) be subdivided into components applicable to the several comparisons of interest, thus obtaining specific residues.
When an analysis of variance is performed, the sum of squares of the treatments (SSTreatment) can be decomposed into components corresponding to orthogonal contrasts; in the same way, the residual sum of squares (SSResidual) can also be decomposed into its orthogonal contrast components, giving rise to the specific residues that are appropriate to test each contrast between treatment means.
The decomposition of the residual sum of squares (SSResidual) is not as usual a procedure as the decomposition of the treatment sum of squares (SSTreatment) but, according to Cochran and Cox (1957, 1971), it can be applied when there are reasons to suspect the presence of irregular heteroscedasticity. In this case, the SSResidual decomposition is useful to better understand the nature of the residual mean square (MSResidual) and to validate the tests to be applied.
A decomposition of the residual sum of squares (SSResidual) for experimental data of a randomized complete block design was presented by Steel and Torrie (1981); initially, they established a group of orthogonal contrasts for treatments and thereafter obtained the value of each contrast within each block. The authors concluded that, if the randomized complete block design is valid, any comparison within a block is not influenced by the general level of the block. As a consequence, the variance of any comparison within blocks is appropriate to test contrasts between treatment means. The procedure was shown numerically.
When a group of experiments is considered and heteroscedasticity among experiments is present, the interaction effects involving experiments (assumed to be random effects) are affected. An appropriate alternative for analyzing such experimental data is the specific residue method. To illustrate this case, Oliveira and Nogueira (2007) applied the specific residue method to sugarcane yield (t ha^{-1}) data obtained from a group of eleven experiments characterized by the presence of heteroscedasticity among experiments. Each experiment had a randomized incomplete block design, arranged in a 3^{3} NPK factorial (27 treatments = three blocks × nine experimental units). The confounding of two degrees of freedom of the NPK interaction with the block effects was considered. Blocks were not replicated.
The objective of this review is to illustrate the application of specific residues as an alternative procedure to analyze data showing heteroscedasticity among treatments.
Material and Methods
The methods, definitions and concepts on orthogonal contrasts applied to obtain specific residues can be found in Nogueira (2004). To bypass the irregular heteroscedasticity present in the experimental data of a randomized complete block design, Ferreira (1978) presented a mathematical procedure to obtain the specific residue sum of squares corresponding to the appropriate components for the comparisons (orthogonal contrasts) of interest, using the orthogonal transformation method. Thus, the specific residue sum of squares of the Y_{h} component (SSR(Y_{h})) is given by

$$SSR(Y_h) = \frac{1}{\sum_{i=1}^{I} c_{hi}^{2}} \left[ \sum_{j=1}^{J} \hat{Y}_{hj}^{2} - \frac{\left( \sum_{j=1}^{J} \hat{Y}_{hj} \right)^{2}}{J} \right],$$

with (J - 1) degrees of freedom, where $\hat{Y}_{hj}$ is the contrast estimate corresponding to the application of the Y_{h} contrast within block j, for j = 1, ..., J,

$$\hat{Y}_{hj} = \sum_{i=1}^{I} c_{hi}\, y_{ij},$$

where I is the total number of treatments, for i = 1, ..., I; c_{hi} is the coefficient associated with the i-th treatment mean in the h-th contrast; $\hat{Y}_h = \sum_{i=1}^{I} c_{hi}\,\bar{y}_i$ is the h-th contrast estimate, for h = 1, ..., (I - 1); y_{ij} is the observed value of the i-th treatment in the j-th block; $T_i = \sum_{j=1}^{J} y_{ij}$ is the total of the i-th treatment and $\bar{y}_i = T_i / J$ is the mean of the i-th treatment. Two contrasts are orthogonal when $\sum_{i=1}^{I} c_{hi}\, c_{h'i} = 0$ for h ≠ h' = 1, ..., (I - 1).

The sum of the specific residues, $\sum_{h=1}^{I-1} SSR(Y_h) = SSResidual$, has (I - 1)(J - 1) degrees of freedom, and the residual mean square for Y_{h}, $MSR(Y_h) = SSR(Y_h)/(J - 1)$, has (J - 1) degrees of freedom.

Thus, the hypothesis $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for h = 1, ..., (I - 1), is tested by the application of the F test, $F_h = MS(Y_h)/MSR(Y_h)$, with 1 and (J - 1) degrees of freedom, where MS(Y_{h}) is the mean square of the Y_{h} component, with one degree of freedom, obtained as follows:

$$MS(Y_h) = \frac{J\, \hat{Y}_h^{2}}{\sum_{i=1}^{I} c_{hi}^{2}}.$$
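The block-wise computation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 3 × 4 data matrix and the two contrasts are hypothetical and are not taken from any table in this review, and the formulas assume equal replication and contrast coefficients that sum to zero.

```python
def rcbd_specific_residue(y, c):
    """Specific residue for one contrast in a randomized complete block design.

    y[i][j] is the observation of treatment i in block j;
    c[i] is the contrast coefficient of treatment i.
    Returns (SSR, MSR, MS, F) for the contrast.
    """
    n_treat, n_block = len(y), len(y[0])
    sum_c2 = sum(ci * ci for ci in c)
    # Contrast estimate within each block: sum_i c_i * y_ij
    y_hj = [sum(c[i] * y[i][j] for i in range(n_treat)) for j in range(n_block)]
    total = sum(y_hj)
    # Specific residue sum of squares, (J - 1) degrees of freedom
    ssr = (sum(v * v for v in y_hj) - total ** 2 / n_block) / sum_c2
    msr = ssr / (n_block - 1)
    # Contrast mean square (1 df); total / J is the contrast of treatment means
    ms = total ** 2 / (n_block * sum_c2)
    return ssr, msr, ms, ms / msr


def rcbd_ss_residual(y):
    """Usual residual sum of squares: SSTotal - SSTreatment - SSBlock."""
    n_treat, n_block = len(y), len(y[0])
    grand = sum(sum(row) for row in y)
    correction = grand ** 2 / (n_treat * n_block)
    ss_total = sum(v * v for row in y for v in row) - correction
    ss_treat = sum(sum(row) ** 2 for row in y) / n_block - correction
    block_totals = [sum(y[i][j] for i in range(n_treat)) for j in range(n_block)]
    ss_block = sum(t * t for t in block_totals) / n_treat - correction
    return ss_total - ss_treat - ss_block


# Hypothetical data: 3 treatments (rows) in 4 blocks (columns)
y = [
    [10.0, 12.0, 11.0, 13.0],
    [14.0, 15.0, 13.0, 18.0],
    [20.0, 26.0, 19.0, 30.0],
]
# A complete set of (I - 1) = 2 orthogonal contrasts
contrasts = [[1, -1, 0], [1, 1, -2]]

results = [rcbd_specific_residue(y, c) for c in contrasts]
ssr_total = sum(r[0] for r in results)
ss_residual = rcbd_ss_residual(y)

for c, (ssr, msr, ms, f) in zip(contrasts, results):
    print(c, "SSR =", round(ssr, 4), "MSR =", round(msr, 4), "F =", round(f, 4))
print("sum of SSR:", round(ssr_total, 4), "SSResidual:", round(ss_residual, 4))
```

For a complete set of orthogonal contrasts, the last line lets the reader check that the specific residues add up to the usual residual sum of squares.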
In the case of a completely randomized design experiment in the presence of irregular heteroscedasticity, SSResidual is decomposed into specific residues as shown by Nogueira (1984) and Nogueira and Campos (1985). These authors developed the decomposition of SSResidual, presented the appropriate specific residues to test each contrast, and identified the specific residue sum of squares that refers to the Y_{h} component (SSR(Y_{h})). The specific residue sum of squares for the Y_{h} component was obtained by applying the mathematical expectation (E) to the SSR(Y_{h}) of the randomized complete block design experiment, as follows:

$$E[SSR(Y_h)] = \frac{(J - 1) \sum_{i=1}^{I} c_{hi}^{2}\, \sigma_i^{2}}{\sum_{i=1}^{I} c_{hi}^{2}},$$

assuming that E(t_{i}) = t_{i}, $E(e_{ij}^{2}) = \sigma_i^{2}$, E(e_{ij}) = 0 and $E(e_{ij}\, e_{i'j'}) = 0$ for distinct observations, where t_{i} is the i-th treatment effect and e_{ij} is the experimental error associated with y_{ij}. The specific residue sum of squares obtained for Y_{h} (SSR(Y_{h})) is:

$$SSR(Y_h) = \frac{\sum_{i=1}^{I} c_{hi}^{2}\, SST_i}{\sum_{i=1}^{I} c_{hi}^{2}},$$

where $SST_i = \sum_{j=1}^{J} (y_{ij} - \bar{y}_i)^{2}$ is the i-th treatment sum of squares. Thus, the residual mean square for Y_{h} (MSR(Y_{h})) is given by:

$$MSR(Y_h) = \frac{SSR(Y_h)}{J - 1},$$

with n_{h} degrees of freedom, obtained by the application of the Satterthwaite (1941, 1946) formula,

$$n_h = \frac{(J - 1) \left( \sum_{i=1}^{I} c_{hi}^{2}\, SST_i \right)^{2}}{\sum_{i=1}^{I} \left( c_{hi}^{2}\, SST_i \right)^{2}},$$

and

$$SSResidual = \sum_{h=1}^{I-1} SSR(Y_h) + SSR(\text{among replications}),$$

with I(J - 1) degrees of freedom, where SSR(among replications) is the residual sum of squares among replications,

$$SSR(\text{among replications}) = \frac{1}{I} \sum_{i=1}^{I} SST_i,$$

with (J - 1) degrees of freedom, and the residual mean square among replications (MSR(among replications)) is

$$MSR(\text{among replications}) = \frac{SSR(\text{among replications})}{J - 1}.$$

Therefore, the hypotheses H_{0}: Y_{h} = 0 vs. H_{a}: Y_{h} ≠ 0, for h = 1, ..., (I - 1), are tested by the application of the F test, and the calculated F value is obtained through the expression:

$$F_h = \frac{MS(Y_h)}{MSR(Y_h)},$$

where MS(Y_{h}) is the mean square of the Y_{h} component, with one degree of freedom, obtained as follows:

$$MS(Y_h) = \frac{J\, \hat{Y}_h^{2}}{\sum_{i=1}^{I} c_{hi}^{2}}, \qquad \hat{Y}_h = \sum_{i=1}^{I} c_{hi}\, \bar{y}_i.$$

F_{h} approximately follows the F distribution with one degree of freedom, referring to MS(Y_{h}), and n_{h} degrees of freedom, obtained by the Satterthwaite (1941, 1946) formula and referring to MSR(Y_{h}), as verified by Nogueira (1984). The verification was accomplished through the simulation method developed by Godoi (1978), based on Box and Muller (1958), for variables with univariate normal distributions. The Chi-square test was applied to verify the adherence of F_{h} to the F_{(1, n_h)} distribution.
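A minimal Python sketch of the completely randomized design computation follows. It assumes that MSR(Y_{h}) is the c_{hi}^{2}-weighted mean of the treatment variances and that the effective degrees of freedom n_{h} come from the usual Satterthwaite approximation for a linear combination of independent variance estimates with (J - 1) degrees of freedom each. The function name and the data are hypothetical and were chosen only so that the arithmetic is easy to follow.

```python
def crd_specific_residue(y, c):
    """Specific residue for one contrast in a completely randomized design.

    y[i] holds the J replicate observations of treatment i;
    c[i] is the contrast coefficient of treatment i.
    Returns (MSR, n_h, F): specific residual mean square,
    Satterthwaite effective degrees of freedom, and F value.
    """
    n_rep = len(y[0])
    means = [sum(row) / n_rep for row in y]
    # Within-treatment sums of squares SST_i
    sst = [sum((v - m) ** 2 for v in row) for row, m in zip(y, means)]
    weighted = [ci * ci * s for ci, s in zip(c, sst)]  # c_i^2 * SST_i
    sum_c2 = sum(ci * ci for ci in c)
    msr = sum(weighted) / sum_c2 / (n_rep - 1)
    # Satterthwaite effective degrees of freedom
    n_h = (n_rep - 1) * sum(weighted) ** 2 / sum(w * w for w in weighted)
    # Contrast mean square (1 df)
    estimate = sum(ci * m for ci, m in zip(c, means))
    ms = n_rep * estimate ** 2 / sum_c2
    return msr, n_h, ms / msr


# Hypothetical data: the third treatment is far more variable than the others
y = [
    [1.0, 2.0, 3.0, 4.0],
    [11.0, 12.0, 13.0, 14.0],
    [0.0, 10.0, 20.0, 30.0],
]
msr1, n1, f1 = crd_specific_residue(y, [1, -1, 0])   # treatment 1 vs treatment 2
msr2, n2, f2 = crd_specific_residue(y, [1, 1, -2])   # treatments 1 and 2 vs 3

# Pooled residual mean square with I(J - 1) degrees of freedom
n_treat, n_rep = len(y), len(y[0])
ms_residual = sum(
    sum((v - sum(row) / n_rep) ** 2 for v in row) for row in y
) / (n_treat * (n_rep - 1))

print("contrast 1: MSR =", round(msr1, 4), "n_h =", round(n1, 2), "F =", round(f1, 2))
print("contrast 2: MSR =", round(msr2, 4), "n_h =", round(n2, 2), "F =", round(f2, 2))
print("MSResidual =", round(ms_residual, 4))
```

In this made-up example the first contrast involves only the two low-variance treatments, so its specific residue is much smaller than the pooled MSResidual, while the second contrast is dominated by the high-variance treatment and its effective degrees of freedom stay close to (J - 1).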
Results and Discussion
Completely randomized design
The experimental data shown in Table 1, cited by Nogueira (1984), refer to sorghum total dry matter yield, first cropping (g per pot), obtained from a completely randomized design experiment with eight treatments and four replications. The total for each treatment is $T_i = \sum_{j=1}^{4} y_{ij}$ and the sum of squares within each treatment is

$$SST_i = \sum_{j=1}^{4} y_{ij}^{2} - \frac{T_i^{2}}{4},$$

with (4 - 1) degrees of freedom, where y_{ij} is the observed value (g per pot) of the i-th treatment in the j-th replication.
The variance for each treatment is given by $s_i^{2} = SST_i/(4 - 1)$, with (4 - 1) degrees of freedom and i = 1, ..., 8.
Preliminary analysis of variance results are presented in Table 2. The seven degrees of freedom and the sum of squares for treatments were decomposed according to the following group of orthogonal contrasts of interest: Y_{1}: control treatments versus located and incorporated P-rates; Y_{2}: among controls; Y_{3}: located versus incorporated P-rates; Y_{4}: linear effect of located P-rates; Y_{5}: quadratic effect of located P-rates; Y_{6}: linear effect of incorporated P-rates; Y_{7}: quadratic effect of incorporated P-rates.
Contrasts Y_{4} and Y_{5} provided the located-P treatment effect, and contrasts Y_{6} and Y_{7} the incorporated-P treatment effect. The coefficients of the applied contrasts and some results are shown in Table 3. As the P-rates are not equidistant, the coefficients attributed to the Y_{4}, Y_{5}, Y_{6} and Y_{7} contrasts were obtained using the orthogonal polynomial coefficient procedure for non-equidistant levels developed by Nogueira (1978) and cited by Nogueira (2007). The new analysis of variance, with F test results obtained without applying the specific residue, is presented in Table 4.
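If the homoscedasticity assumption of the model is satisfied, that is, if the treatment variances $s_i^{2}$ can all be considered statistically equal to MSResidual, the analysis presented in Table 4 is perfectly valid.
In order to verify the homoscedasticity of the experimental data, the Bartlett test was applied (among other tests), which is appropriate to test the following hypotheses:

$$H_0: \sigma_1^{2} = \sigma_2^{2} = \dots = \sigma_8^{2} \quad \text{vs.} \quad H_a: \text{at least two variances differ}.$$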
The hypothesis was rejected at the p-value < 0.005 significance level, evidencing significant differences among the variances of the replications within treatments and characterizing the presence of heteroscedasticity. Once heteroscedasticity was evidenced, a procedure should be applied to overcome this situation. One alternative was the use of the specific residue as the F test denominator to test each contrast defined in Table 3. This procedure consisted of the decomposition of all 24 residual degrees of freedom and, consequently, of the residual sum of squares, obtaining the specific residue for each contrast:

$$SSR(Y_h) = \frac{\sum_{i=1}^{8} c_{hi}^{2}\, SST_i}{\sum_{i=1}^{8} c_{hi}^{2}}, \qquad MSR(Y_h) = \frac{SSR(Y_h)}{4 - 1},$$

with n_{h} degrees of freedom obtained through the application of the Satterthwaite (1941, 1946) formula,

$$n_h = \frac{(4 - 1) \left( \sum_{i=1}^{8} c_{hi}^{2}\, SST_i \right)^{2}}{\sum_{i=1}^{8} \left( c_{hi}^{2}\, SST_i \right)^{2}}.$$

$SSResidual = \sum_{h=1}^{7} SSR(Y_h) + SSR(\text{among replications})$, with 8(4 - 1) = 24 degrees of freedom, where $SSR(\text{among replications}) = \frac{1}{8} \sum_{i=1}^{8} SST_i$, with (4 - 1) degrees of freedom, and MSR(among replications) = SSR(among replications)/(4 - 1).
Thus, the hypothesis $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for h = 1, ..., (8 - 1), is tested by the application of the F test, $F_h = MS(Y_h)/MSR(Y_h)$, with F_{h} approximately following the F_{(1, n_h)} distribution, as observed by Nogueira (1984). Results are shown in Table 5, where the values in [ ] in the DF (degrees of freedom) column refer to the effective degrees of freedom n_{h}, obtained by the Satterthwaite formula and applied in the F test.
It was observed that

$$MSResidual = MSR(\text{among replications}) = \frac{1}{7} \sum_{h=1}^{7} MSR(Y_h) = 34.1734.$$
The F values presented in Table 4 were obtained with MSResidual as the denominator, with 24 degrees of freedom. The results presented in Tables 4 and 5 differ, as do some of the conclusions. This difference is due to the presence of heteroscedasticity: in Table 4, MSResidual corresponds to the arithmetic mean of the MSR(Y_{h}) values, whereas in Table 5 the values obtained for MSR(Y_{h}) differ from one contrast to another. In the presence of homoscedasticity, the values obtained for MSR(Y_{h}) are very close to the one obtained for MSResidual. The specific residue procedure proved to be an interesting alternative when irregular heteroscedasticity is present, providing trustworthy results.
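A short derivation may help to see why MSResidual equals the arithmetic mean of the specific residual mean squares. It is a sketch that assumes equal replication J, takes MSR(Y_{h}) as the $c_{hi}^{2}$-weighted mean of the treatment variances $s_i^{2} = SST_i/(J - 1)$, and introduces the normalized coefficients $\tilde{c}_{hi}$, a notation not used elsewhere in this review. Because the (I - 1) normalized orthogonal contrasts, together with the constant vector $1/\sqrt{I}$, form an orthonormal basis, the squared coefficients attached to each treatment add up to one:

$$\tilde{c}_{hi} = \frac{c_{hi}}{\sqrt{\sum_{i'=1}^{I} c_{hi'}^{2}}}, \qquad \frac{1}{I} + \sum_{h=1}^{I-1} \tilde{c}_{hi}^{2} = 1 \quad \text{for each } i.$$

Averaging $MSR(Y_h) = \sum_{i=1}^{I} \tilde{c}_{hi}^{2}\, s_i^{2}$ over the (I - 1) contrasts then gives

$$\frac{1}{I-1} \sum_{h=1}^{I-1} MSR(Y_h) = \frac{1}{I-1} \sum_{i=1}^{I} s_i^{2} \sum_{h=1}^{I-1} \tilde{c}_{hi}^{2} = \frac{1}{I-1} \left( 1 - \frac{1}{I} \right) \sum_{i=1}^{I} s_i^{2} = \frac{1}{I} \sum_{i=1}^{I} s_i^{2} = MSResidual.$$

When the $s_i^{2}$ are similar, every MSR(Y_{h}) is close to this common value; when they differ, the individual MSR(Y_{h}) values spread around it according to which treatments each contrast involves.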
Randomized complete block design
In order to illustrate the application of the specific residue procedure to data from a randomized complete block design experiment, the following experimental data were considered: yields of eight potato varieties (t ha^{-1}) distributed in five blocks (Table 6).
The Bartlett test was applied to verify the variance homogeneity hypothesis, which was rejected, thus evidencing the presence of variance heterogeneity among treatments. Due to this fact and considering that experimental errors followed a normal distribution, the specific residue procedure was applied as an alternative for this data analysis. The initial analysis of variance is shown in Table 7.
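For readers who wish to reproduce a test of this kind, the textbook form of Bartlett's statistic can be sketched as below. The review does not state which variant of the test was used or how the treatment variances were obtained in the block design, so the function name, its inputs and the numbers are assumptions made only for illustration. The Python standard library has no chi-square distribution, so the statistic is meant to be compared with a tabulated chi-square quantile with (k - 1) degrees of freedom.

```python
import math


def bartlett_statistic(variances, dfs):
    """Bartlett's statistic for homogeneity of k variances.

    variances[i] is the sample variance of group i and dfs[i] its
    degrees of freedom. Under the null hypothesis the statistic is
    approximately chi-square distributed with (k - 1) degrees of freedom.
    """
    k = len(variances)
    total_df = sum(dfs)
    # Pooled variance, weighted by degrees of freedom
    pooled = sum(f * v for f, v in zip(dfs, variances)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        f * math.log(v) for f, v in zip(dfs, variances)
    )
    # Small-sample correction factor
    correction = 1 + (sum(1 / f for f in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction


# Hypothetical variances of three groups, each with 4 degrees of freedom
example_variances = [2.1, 3.4, 25.0]
example_dfs = [4, 4, 4]
b_stat = bartlett_statistic(example_variances, example_dfs)
print("Bartlett statistic:", round(b_stat, 3), "with", len(example_dfs) - 1, "df")
```

The statistic is zero when all sample variances coincide and grows as they diverge.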
The seven degrees of freedom and the variety sum of squares were decomposed into a group of orthogonal contrasts according to a high and low productivity criterion. The potato varieties were divided into two groups: the high productivity group consisted of the varieties (3) B152, (4) Huinkul, (5) B11651, (6) B7253 A and (7) S. Rafaela, and the low productivity group consisted of the varieties (1) Kennebec, (2) B2550E and (8) Buena Vista. Thus, the group of orthogonal contrasts built according to the productivity criterion was: Y_{1}: high productivity varieties (varieties 3, 4, 5, 6 and 7) versus low productivity varieties (varieties 1, 2 and 8); Y_{2}: variety 7 versus varieties 3, 4, 5 and 6; Y_{3}: varieties 4 and 6 versus varieties 3 and 5; Y_{4}: between varieties 4 and 6; Y_{5}: between varieties 3 and 5; Y_{6}: variety 1 versus varieties 2 and 8; Y_{7}: between varieties 2 and 8.
The orthogonal contrasts Y_{2}, Y_{3}, Y_{4} and Y_{5} provided the high productivity variety effect with four degrees of freedom, and the contrasts Y_{6} and Y_{7} provided the low productivity variety effect with two degrees of freedom. The coefficients of the applied contrasts, the contrast estimates and the sum of squares obtained are shown in Table 8.
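The verbal definitions of the seven contrasts can be turned into integer coefficients and checked for orthogonality, as in the Python sketch below. Table 8 is not reproduced here, so the signs and scaling of these coefficients are an assumption; any nonzero multiple of each row would describe the same contrast.

```python
from itertools import combinations

# Coefficients for varieties 1..8, derived from the verbal contrast
# definitions (signs and scaling are assumptions, not taken from Table 8)
contrasts = {
    "Y1": [-5, -5, 3, 3, 3, 3, 3, -5],   # high (3-7) vs low (1, 2, 8)
    "Y2": [0, 0, -1, -1, -1, -1, 4, 0],  # variety 7 vs varieties 3, 4, 5, 6
    "Y3": [0, 0, -1, 1, -1, 1, 0, 0],    # varieties 4 and 6 vs 3 and 5
    "Y4": [0, 0, 0, 1, 0, -1, 0, 0],     # variety 4 vs variety 6
    "Y5": [0, 0, 1, 0, -1, 0, 0, 0],     # variety 3 vs variety 5
    "Y6": [2, -1, 0, 0, 0, 0, 0, -1],    # variety 1 vs varieties 2 and 8
    "Y7": [0, 1, 0, 0, 0, 0, 0, -1],     # variety 2 vs variety 8
}

# Each contrast must have coefficients summing to zero
coefficient_sums = {name: sum(c) for name, c in contrasts.items()}

# With equal replication, two contrasts are orthogonal when the sum of
# the products of their coefficients is zero
non_orthogonal = [
    (a, b)
    for (a, ca), (b, cb) in combinations(contrasts.items(), 2)
    if sum(x * y for x, y in zip(ca, cb)) != 0
]

print("coefficient sums:", coefficient_sums)
print("non-orthogonal pairs:", non_orthogonal)
```

An empty list of non-orthogonal pairs confirms that the seven contrasts split the seven treatment degrees of freedom into independent single-degree-of-freedom comparisons.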
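The twenty-eight degrees of freedom and the residual sum of squares were decomposed according to the Y_{h} components, resulting in the Y_{h} specific residues given by:

$$SSR(Y_h) = \frac{1}{\sum_{i=1}^{8} c_{hi}^{2}} \left[ \sum_{j=1}^{5} \hat{Y}_{hj}^{2} - \frac{\left( \sum_{j=1}^{5} \hat{Y}_{hj} \right)^{2}}{5} \right],$$

with (5 - 1) = 4 degrees of freedom, where $\hat{Y}_{hj}$ is the contrast estimate corresponding to the application of the Y_{h} contrast in block j, for j = 1, ..., J = 5,

$$\hat{Y}_{hj} = \sum_{i=1}^{8} c_{hi}\, y_{ij},$$

where y_{ij} is the observed value of variety i in block j; $\hat{Y}_h = \sum_{i=1}^{8} c_{hi}\, \bar{y}_i$ is the h-th contrast estimate, for h = 1, ..., (8 - 1) = 7, and $\bar{y}_i = \frac{1}{5} \sum_{j=1}^{5} y_{ij}$. The y_{ij} values and the Y_{h} coefficients used in the calculations are presented in Table 9.
The $\hat{Y}_{hj}$ and $\hat{Y}_h$ estimates and the SSR(Y_{h}) values are presented in Table 10.
It was observed that $\sum_{h=1}^{7} SSR(Y_h) = SSResidual = 348.324$, with (8 - 1)(5 - 1) = 28 degrees of freedom.
Also, $MSR(Y_h) = SSR(Y_h)/(5 - 1)$, with (5 - 1) = 4 degrees of freedom.
Thus, the hypotheses $H_0: Y_h = 0$ vs. $H_a: Y_h \neq 0$, for h = 1, ..., (8 - 1), were then tested by the application of the F test, $F_h = MS(Y_h)/MSR(Y_h)$, with 1 and 4 degrees of freedom.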
The analysis of variance obtained with the specific residue procedure application is presented in Table 11. Significant F test values for Y_{1} and Y_{4} contrasts were observed, evidencing they differ from zero.
The analysis of variance without the specific residue procedure was also obtained (Table 12) in order to be compared with the previous analysis (Table 11). A significant F value was obtained for the Y_{1} contrast when calculated with MSResidual as the denominator, with 28 degrees of freedom, evidencing that this contrast differed significantly from zero. When the specific residue procedure was applied (Table 11), significant F values were obtained for the Y_{1} and Y_{4} contrasts.
Conclusion
The specific residue procedure is a valid and efficient alternative when heteroscedasticity is present, because it validates the applied tests and also allows a better understanding of the nature of the residual mean square. MSResidual corresponds to the arithmetic mean of the MSR(Y_{h}) values, although the individual values obtained for MSR(Y_{h}) can differ. In the presence of homoscedasticity, the values obtained for MSR(Y_{h}) are very close to the one obtained for MSResidual.
Received June 15, 2007
Accepted August 21, 2009
Box, G.E.P.; Muller, M.E. 1958. A note on the generation of random normal deviates. Annals of Mathematical Statistics 29: 610-611.
Cochran, W.G. 1947. Some consequences when the assumptions for the analysis of variance are not satisfied. Biometrics 3: 22-38.
Cochran, W.G.; Cox, G.M. 1957. Experimental Designs. 2ed. John Wiley, New York, NY, USA.
Cochran, W.G.; Cox, G.M. 1971. Diseños Experimentales. Editorial Trillas, Ciudad de México, México.
Ferreira, L.E.P. 1978. A decomposição do resíduo em casos de heterocedasticidade nas análises de variância de ensaios em blocos casualizados. MSc Dissertation. Universidade de São Paulo, Piracicaba, SP, Brazil. (in Portuguese with summary in English).
Godoi, C.R.M. 1978. Um algoritmo eficiente para simulação de vetores com distribuição multinormal. Ciência e Cultura 20: 701-705.
Nogueira, I.R. 1978. Método geral para obtenção de tabelas de polinômios ortogonais. Revista da Agricultura 53: 269-279.
Nogueira, M.C.S. 1984. Resíduo específico para contraste de tratamentos no delineamento inteiramente casualizado. Dr. Thesis. Universidade de São Paulo, Piracicaba, SP, Brazil. (in Portuguese with summary in English).
Nogueira, M.C.S.; Campos, H. 1985. Resíduo específico para contraste de tratamentos no delineamento inteiramente casualizado. Anais do Simpósio de Estatística Aplicada à Experimentação Agronômica 1. Fundação Cargill, Piracicaba, SP, Brazil.
Nogueira, M.C.S. 2004. Orthogonal contrasts: Definitions and concepts. Scientia Agricola 61: 118-124.
Nogueira, M.C.S. 2007. Experimentação agronômica I. Conceitos, planejamento e análise de dados. Editora MCSNogueira, Piracicaba, SP, Brazil.
Oliveira, W.; Nogueira, M.C.S. 2007. Aplicação do resíduo específico na análise de grupos de experimentos. Bragantia 66: 737-744.
Satterthwaite, F.E. 1941. Synthesis of variance. Psychometrika 6: 309-316.
Satterthwaite, F.E. 1946. An approximate distribution of estimates of variance components. Biometrics Bulletin 2: 110-114.
Steel, R.G.D.; Torrie, J.H. 1981. Principles and Procedures of Statistics. 2ed. McGraw-Hill, New York, NY, USA.
Publication Dates
Publication in this collection: 24 Mar 2010
Date of issue: Feb 2010